
Can Statcast Data Improve MLB Player Performance Predictions? — Beating Marcel with LightGBM
Introduction

This article continues my NPB Bayesian prediction series. Along the way, I reached a conclusion: "Without tracking data like Statcast, we can't break through the next wall."

In my NPB project, I added Bayesian regression (Stan/Ridge) on top of Marcel projections. At the player level there was consistent improvement (p=0.06), but at the team level the gains disappeared. The reason: Marcel's 3-year weighted average is already accurate for high-PA regulars, leaving no margin for improvement when using only aggregate stats like K%/BB%/BABIP.

MLB has Statcast. This article tests whether Statcast tracking features can beat Marcel.

GitHub: https://github.com/yasumorishima/baseball-mlops
Streamlit: https://baseball-mlops.streamlit.app/

What is Marcel?

Marcel is a simple projection system introduced by Tom Tango in the 2000s: a weighted average of the past 3 years (weights 5:4:3) + regression to the mean + age adjustment. Despite its simplicity, it's remarkably accurate, especially for regular p
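
The three Marcel steps described above (5:4:3 weighting, regression to the mean, age adjustment) can be sketched for a single rate stat as follows. This is a minimal illustration, not the code from the linked repository: the function name `marcel_rate`, the 1200 PA regression constant, the peak age of 29, and the age slopes are assumptions taken from commonly cited descriptions of Marcel for hitters, and only the 5:4:3 weights come from the article itself.

```python
from typing import Sequence


def marcel_rate(
    rates: Sequence[float],           # player's rate stat per season, most recent first
    pas: Sequence[float],             # plate appearances per season, same order
    league_rate: float,               # league-average value of the same rate stat
    age: int,                         # player's age in the projected season
    weights: Sequence[float] = (5, 4, 3),  # 5:4:3 weights from the article
    regression_pa: float = 1200.0,    # ASSUMPTION: PA of league average added
    peak_age: int = 29,               # ASSUMPTION: age at which no adjustment applies
) -> float:
    """Hypothetical Marcel-style projection of one rate stat."""
    # Step 1: PA-weighted average of the last three seasons (weights 5:4:3).
    weighted_pa = sum(w * pa for w, pa in zip(weights, pas))
    weighted_sum = sum(w * pa * r for w, pa, r in zip(weights, pas, rates))

    # Step 2: regression to the mean by blending in league-average PA.
    regressed = (weighted_sum + regression_pa * league_rate) / (
        weighted_pa + regression_pa
    )

    # Step 3: age adjustment (ASSUMPTION: +0.6% per year below the peak age,
    # -0.3% per year above it).
    slope = 0.006 if age < peak_age else 0.003
    return regressed * (1.0 + (peak_age - age) * slope)


if __name__ == "__main__":
    # A hitter with three full seasons above a .320 league average.
    print(marcel_rate([0.360, 0.350, 0.340], [650, 620, 600], 0.320, age=27))
```

With 600+ PA in each of three seasons, the weighted sample is much larger than the regression term, so the projection stays close to the player's own history. That is consistent with the article's point that Marcel leaves little room for improvement on high-PA regulars.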


