Innovation and the Use of AI in Baseball

Author: Spencer Shippel

HISTORY

 The first officially recorded baseball game in America was played in 1846. It took place in Hoboken, New Jersey, where the New York Nine club faced the Knickerbocker club and utterly humiliated them, 23-1. It was considered an “intrasquad game” and was treated much the way MLB Spring Training games are treated today. The only records kept from the game were the players who played, how many runs each player scored, the final score, and the name of the umpire. It wasn’t until 1859 that a man named Henry Chadwick began keeping a box score for each game played. This box score included each player’s hits, at-bats, and walks, along with how many innings a pitcher threw. He then used these numbers to devise some of the key statistics we still use today, such as batting average and ERA, or earned run average. He, like many others after him, believed that batting average was the best measure of a batter’s offensive skill.
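
As a rough illustration of the two statistics named above, here is a minimal Python sketch using their standard modern definitions (hits divided by at-bats, and earned runs allowed per nine innings). The function names and the sample numbers are made up for this example and do not come from any real box score.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats; returns 0.0 when there are no at-bats."""
    return hits / at_bats if at_bats else 0.0


def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed, scaled to a nine-inning game."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical stat lines, chosen only to make the arithmetic easy to follow.
print(f"{batting_average(30, 100):.3f}")    # 0.300
print(f"{earned_run_average(20, 60):.2f}")  # 3.00
```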

           Since then, and especially since the creation of MLB, statistics have been kept for every game and used to judge the prowess of players. However, it is the scouts and managers of each team who truly decide who gets drafted, who plays, where they play in the field, and where they bat in the lineup. For a very long time, these scouts and managers used what many refer to as the “eye test” to decide how good an incoming prospect or another player in the league was. They would look at how “smooth” a player’s swing was, how fast he could throw, or whether he had the “it factor”. Everything was very subjective. Granted, they used statistics to varying degrees to help make decisions, but they would disregard the numbers if they had a gut feeling or saw something odd in a player.

“MONEYBALL”

           Then, Billy Beane appeared. He was the general manager of the Oakland Athletics in the early 2000s. The Athletics were, and still are, considered one of the most frugal teams in baseball and have had one of the smallest average payrolls league-wide for the last 30 years. In 2002 they had the third-lowest opening-day payroll, roughly $39.6 million, compared with the $126 million payroll of the New York Yankees, the league’s biggest spender and a historically successful franchise. The A’s could not compete for the best players on the market with teams that were simply willing to outbid them. Beane decided to approach baseball differently and made decisions based purely on statistical reasoning. He added players such as David Justice, who was considered too old to still be playing, and Scott Hatteberg, a catcher with such severe nerve damage in his elbow that he could no longer throw a baseball effectively. His reasoning was that while these players didn’t look great, their statistics showed that they could, as he would say, “get on base”, and they were cheap because of their “defects”. He took one of the lowest-paid teams in the league to the playoffs that season, and since then every team in the league has adopted his model for building and managing a roster. His formula for building a team has even entered the language as its own word: Moneyball, the use of analytics and data science in Major League Baseball. (De Luca)

       

INNOVATION

Since then, teams around the league have begun another revolution using artificial intelligence and machine learning. MLB gathers seven terabytes of data at every single game played. It’s impossible for any human, even Billy Beane, to understand and interpret this much data and use it to truly better a team. However, teams using AI and machine learning can detect patterns in the data and gain insight into things such as pitch selection, roster construction, and player performance. Not only is this helping front offices build their teams, but it is also giving players insight into how to develop their skills. Perhaps a pitcher is dropping his elbow only an inch, or even half an inch, more than he should on his delivery. Such a flaw could make him more susceptible to an elbow injury requiring “Tommy John” surgery, and could even slow down his delivery and cost him velocity. Such a minute detail would be almost impossible for a pitching coach to diagnose with the human eye. These AI-driven models, however, can notice it as it’s happening and provide the player with the data before his game the next day. (Heater)

THE SHIFT

           However, the most notable example of AI in MLB is the defensive shift. The shift is the practice of studying a hitter’s tendencies and moving the defense to line up with where he is most likely to hit the ball. The practice has been used for decades against great sluggers, and dates back at least to attempts to slow down Ted Williams in 1941. In recent years, however, the introduction of AI has elevated the shift to another level. The AI programs in use provide such detailed information that pitchers can work out which part of the zone they need to throw to in order to induce a hit into their shifted defense. Over the last two years, 1 out of every 6 plate appearances in MLB came against a shifted defense. In 2022 the league-wide batting average on ground balls hit harder than 90 mph was .349 without a shift, compared to just .273 with one. The shift has become so successful that people have blamed declining fan viewership and participation in recent years on the dwindling number of home runs and game action resulting from it. It has drawn so much controversy that MLB has banned the defensive shift starting with the 2023 season in an attempt to rebuild viewership numbers. (Cox Communications) (Verducci)
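
To put the two ground-ball averages quoted above in perspective, the following Python sketch works through the arithmetic. The two averages come from the paragraph above. The function name and the choice of 100 ground balls are illustrative assumptions only.

```python
NO_SHIFT_AVG = 0.349  # 2022 average on 90+ mph ground balls without a shift
SHIFT_AVG = 0.273     # same kind of ground ball against a shift


def hits_saved(ground_balls: int) -> float:
    """Expected hits taken away by the shift over a number of hard ground balls."""
    return ground_balls * (NO_SHIFT_AVG - SHIFT_AVG)


# Gap between the two averages, in batting-average "points" (thousandths).
point_drop = round((NO_SHIFT_AVG - SHIFT_AVG) * 1000)

# The same gap as a share of the no-shift average.
relative_drop = (NO_SHIFT_AVG - SHIFT_AVG) / NO_SHIFT_AVG

print(point_drop)                  # 76
print(f"{hits_saved(100):.1f}")    # 7.6
print(f"{relative_drop:.1%}")      # 21.8%
```

In other words, under these figures a shifted defense turns roughly 7 or 8 of every 100 hard-hit ground balls from hits into outs.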

           The future of MLB may hold even more uses for AI. For years, players and fans alike have argued over whether umpires should be replaced with AI, specifically behind home plate. The league-wide accuracy on ball and strike calls was roughly 92% during the 2022 season. This may seem great, but when the result of a game, or even a team’s entire season, can hinge on the outcome of one pitch, an 8% error rate becomes shockingly large, especially when each umpire has different tendencies in where he calls balls and strikes. A game can play out very differently depending on who is behind the plate. One fan has even attempted to shed more light on this “epidemic” through the creation of Umpire Scorecards: a website and Twitter account that show an umpire’s accuracy for a specific game and for the season as a whole. The account takes information gathered by the AI software MLB uses and gives fans relevant information on how much an umpire affected each game. The account has kept statistics since 2015, yet it wasn’t until the 2022 World Series that Pat Hoberg recorded the first-ever “Perfect Game” by an umpire. It took seven years for any single umpire to do what an AI machine could do in every single game. (Singer) Do you think baseball has enough uses for AI already, or should the league continue to innovate and automate a job whose replacement would severely impact the sport as we know it?
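
The 92% figure above can be turned into a rough per-game estimate with a few lines of Python. This is only a back-of-the-envelope sketch: the count of 130 called pitches per game is a hypothetical number chosen for illustration, not a figure from the sources, and the model assumes every call is independent and equally likely to be right, which real umpiring is not.

```python
ACCURACY = 0.92  # league-wide ball/strike accuracy in 2022, as quoted above


def expected_missed_calls(called_pitches: int, accuracy: float = ACCURACY) -> float:
    """Average number of wrong ball/strike calls over a set of called pitches."""
    return called_pitches * (1 - accuracy)


def chance_of_perfect_game(called_pitches: int, accuracy: float = ACCURACY) -> float:
    """Probability that every call is right, assuming independent calls."""
    return accuracy ** called_pitches


CALLED_PITCHES = 130  # hypothetical pitches per game that the umpire must call

print(f"{expected_missed_calls(CALLED_PITCHES):.1f} missed calls per game")
print(f"{chance_of_perfect_game(CALLED_PITCHES):.6f} chance of a perfect game")
```

Under these simplified assumptions an average umpire would miss about ten calls a game and would almost never call a perfect one, which is consistent with how long it took for the first umpire “Perfect Game” to be recorded.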

WORKS CITED

Cox Communications. “How AI and Machine Learning Are Revolutionizing Baseball.” Cox Communications, 9 May 2022, https://www.cox.com/residential/articles/ai-machine-learning-baseball.html.

Heater, Brian. “Analytics, AI and Robotics Help MLB Teams Get a Step Closer to a Perfect Pitching Machine.” TechCrunch, 27 Aug. 2022, https://techcrunch.com/2022/08/27/analytics-ai-and-robotics-help-mlb-teams-get-a-step-closer-to-a-perfect-pitching-machine/.

De Luca, Michael, et al. Moneyball. Sony Pictures Entertainment, 2011.

Singer, Ethan. “Umpire Scorecards.” Umpire Scorecards, https://umpscorecards.com/.

Verducci, Tom. “How Banning Infield Shifts Will Change MLB.” Sports Illustrated, 21 Nov. 2022, https://www.si.com/mlb/2022/11/21/banning-infield-shifts-impact.