
# Moving an NPB Prediction System to BigQuery — BQML and Cloud Run on the Free Tier

## Background

I've been running an NPB (Japanese professional baseball) player performance prediction project for over a year.

Previous articles:

- Why Marcel Beat LightGBM: Building an NPB Player Performance Prediction System
- Annual Auto-Retraining for NPB Baseball Predictions with GitHub Actions

The setup was: GitHub Actions fetches data → trains models → saves CSVs → Streamlit displays results. Data lived in CSVs, the API ran in a Docker container on a Raspberry Pi 5, and analysis was done in local Python.

I added Google BigQuery to centralize the data, run SQL analysis, compare BQML accuracy against Python ML, and deploy the API to Cloud Run. Everything fits within GCP's free tier.

GitHub: https://github.com/yasumorishima/npb-prediction

## Why BigQuery

Pain points with the CSV-based setup:

- Full re-fetch every run: the annual pipeline re-downloads all data from scratch, with no incremental updates.
- Cross-analysis was tedious: JOINing hitter stats with park factors meant writing pandas merge code.
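To illustrate the "tedious merge" pain point, here is a minimal sketch of the kind of pandas code a cross-table question required. The table and column names (`hitters`, `park_factors`, `hr_park_factor`, etc.) are hypothetical stand-ins, not the repo's actual schema:

```python
import pandas as pd

# Hypothetical miniature versions of the project's tables;
# the real column names in the repo may differ.
hitters = pd.DataFrame({
    "player_id": [1, 2],
    "team": ["G", "T"],
    "hr": [30, 12],
})
park_factors = pd.DataFrame({
    "team": ["G", "T"],
    "hr_park_factor": [1.10, 0.95],
})

# The CSV/pandas way: an explicit merge for every cross-table question.
merged = hitters.merge(park_factors, on="team", how="left")
merged["hr_adj"] = merged["hr"] / merged["hr_park_factor"]

# In BigQuery the same question collapses to a single JOIN, e.g.:
#   SELECT h.player_id, h.hr / p.hr_park_factor AS hr_adj
#   FROM npb.hitters h
#   JOIN npb.park_factors p USING (team)
print(merged[["player_id", "hr_adj"]])
```

Each new question means another merge, another intermediate DataFrame; in SQL it is one query against centralized tables.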

