
# Moving an NPB Prediction System to BigQuery — BQML and Cloud Run on the Free Tier

## Background

I've been running an NPB (Japanese professional baseball) player performance prediction project for over a year.

Previous articles:

- Why Marcel Beat LightGBM: Building an NPB Player Performance Prediction System
- Annual Auto-Retraining for NPB Baseball Predictions with GitHub Actions

The setup was: GitHub Actions fetches data → trains models → saves CSVs → Streamlit displays results. Data lived in CSVs, the API ran in a Docker container on a Raspberry Pi 5, and analysis was done in local Python.

I added Google BigQuery to centralize the data, run SQL analysis, compare BQML accuracy against Python ML, and deploy the API to Cloud Run. Everything fits within GCP's free tier.

GitHub: https://github.com/yasumorishima/npb-prediction

## Why BigQuery

Pain points with the CSV-based setup:

- Full re-fetch every run: the annual pipeline re-downloads all data from scratch, with no incremental updates.
- Cross-analysis was tedious: JOINing hitter stats with park factors meant writing pandas merge code.
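To illustrate the "tedious merge" pain point, here is a minimal sketch of the kind of pandas code a cross-table question required. The table and column names (`hitters`, `park_factors`, `hr_park_factor`, etc.) are hypothetical stand-ins, not the repo's actual schema:

```python
import pandas as pd

# Hypothetical miniature versions of the project's tables;
# the real column names in the repo may differ.
hitters = pd.DataFrame({
    "player_id": [1, 2],
    "team": ["G", "T"],
    "hr": [30, 12],
})
park_factors = pd.DataFrame({
    "team": ["G", "T"],
    "hr_park_factor": [1.10, 0.95],
})

# The CSV/pandas way: an explicit merge for every cross-table question.
merged = hitters.merge(park_factors, on="team", how="left")
merged["hr_adj"] = merged["hr"] / merged["hr_park_factor"]

# In BigQuery the same question collapses to a single JOIN, e.g.:
#   SELECT h.player_id, h.hr / p.hr_park_factor AS hr_adj
#   FROM npb.hitters h
#   JOIN npb.park_factors p USING (team)
print(merged[["player_id", "hr_adj"]])
```

Each new question means another merge, another intermediate DataFrame; in SQL it is one query against centralized tables.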

