The Power of Predictive Analytics: Is it always enough?

01.18.2021

Many expected an Alabama victory in the College Football Championship Game on Monday, but the 52-24 blowout win we witnessed over Ohio State was more decisive than any sportsbook had predicted. Personally, I was securely in the camp of believers who expected a close game, perhaps even an upset. Pregame betting lines had Nick Saban’s team winning by around 8 points, roughly 20 points short of the actual 28-point margin. 2020 has been a unique year for college football, with Ohio State playing about half the number of games in a normal schedule, so some error between a game’s outcome and the pregame prediction (the betting line) can be expected. Still, it is in the best interest of the sportsbook to offer the most accurate line possible, so how were they so far off? The classic explanation of “anything can happen in sports” might not tell the whole story.

Betting lines are set through a combination of analytics and oddsmakers’ decisions. Modeling procedures attempt to predict the outcome of a specific game from statistics that describe past games and the events leading up to them. Metrics that correlate with success and point-scoring, like a team’s win/loss record, third-down efficiency, or yards per play, could help build a model to predict the winning or losing margin of a game. These correlations give us a general idea of how good a team is, but to accurately predict the outcome of a game, the interaction between different statistics must also be considered. For example, Ohio State may have had a very impressive winning margin over the course of this season, but how do the defenses they faced compare to Alabama’s?

The interactions among all the factors that can affect the outcome of a football game become so abstract that it is no longer practical to attempt to model them. How can a machine learning algorithm numerically represent the effect that three of the top ten Heisman candidates playing on the same offense have on the outcome of a single game? How does the hit Ohio State quarterback Justin Fields received from James Skalski a week before affect his expected impact on the game against Alabama? These are the types of questions that can only be properly answered by a human.

There is simply not enough data available to build a proper model with this level of specificity. The predictive power of this type of machine learning model comes from its ability to relate a specific game to data from similar historical games. There may be two or three games per year that come close to resembling this year’s Ohio State-Alabama matchup, even in terms of the most basic predictive metrics like win/loss record. With such a limited pool of historical data to draw from, it becomes extremely difficult to model a game of this magnitude.
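To make the idea of leaning on similar historical games concrete, here is a minimal sketch in Python. It is not how any sportsbook actually sets a line. The feature choices, the `PAST_GAMES` numbers, and the helper names `similar_games` and `predict_margin` are all assumptions invented for illustration. It predicts a margin by averaging the most similar past games and counts how many past games are actually close to the matchup in question.

```python
from math import dist
from statistics import mean

# Hypothetical past games: (win-pct difference, yards-per-play difference, final margin).
# Every number here is invented for illustration; none are real results.
PAST_GAMES = [
    (0.10, 0.4, 7),
    (0.05, 0.2, 3),
    (0.30, 1.1, 21),
    (0.25, 0.9, 17),
    (0.00, -0.1, -3),
    (0.15, 0.5, 10),
]


def similar_games(features, games, radius):
    """Return the past games whose two features lie within `radius` of `features`."""
    return [g for g in games if dist(features, g[:2]) <= radius]


def predict_margin(features, games, k=3):
    """Average the final margins of the k most similar past games."""
    nearest = sorted(games, key=lambda g: dist(features, g[:2]))[:k]
    return mean(g[2] for g in nearest)


upcoming = (0.12, 0.45)  # hypothetical features for the game being predicted
print(predict_margin(upcoming, PAST_GAMES))
print(len(similar_games(upcoming, PAST_GAMES, radius=0.1)))
```

With only a handful of truly comparable games inside the radius, the average rests on very few data points, which is the data-scarcity problem described above. A real model would also need to put the features on a common scale before measuring distance; this sketch skips that step.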

Even with these limitations, sportsbooks are tasked with setting a proper betting line. People and money have the final say. As sophisticated and elegant as a model might be, the opening betting line is ultimately decided by oddsmakers, not computers. From there it becomes a free market, with the line moving up and down as money is wagered on either side. While the predictive power of analytics is unquestionable, it has yet to replace the practical human knowledge of the oddsmaker.

About the Author: Will Krebs

Will is currently a Lead Data Analyst at NCS, directing the analytics team. He is the curator of the rare data sets NCS collects and maps the analytical model evolution alongside the CEO. He graduated from the University of Colorado Boulder, where he earned a bachelor’s degree in physics while mastering diverse data analytics strategies. Will is a Colorado native who enjoys the great outdoors by skiing in his free time.
