Friday, August 25, 2023

Analytics, Boredom, and Mis-assessing MLB Player Value

The analytics revolution in baseball was about uncovering player values. It was about moving away from relying *just* on traditional stats (RBIs, runs, wins, etc.) and trying to find ways to assess players independent of their context, say a manager who misplayed them or a pitcher who won a lot of games because his team scored a lot of runs. Since Bill James and others first started making the case for a better approach to baseball math over a generation ago, we have gained a raft of new statistical categories that sub in for traditional numbers. These include OPS, OPS+, wins above replacement (WAR), etc. There are mathematical variations in some of the formulae used to calculate these stats (not everyone determines WAR in exactly the same way), but by and large I think these new stats are remarkably useful for determining player value. But they also have their problems, particularly when combined with an MLB penchant to measure just about everything that can be measured. At least this seems to be Sportsnet's approach with the Jays, where we are treated to a daily discussion of "exit velocity," the distance home runs travel, etc. I find this fascination less useful and I also want to explain why.

What is now called analytics was about finding more effective ways to evaluate players. The original analytics writings did not dismiss older stats, but argued that they needed to be understood better and in context. For instance, imagine a player who drives in a lot of runs and hence has a lot of RBIs. Was that a product of his ability to hit in the clutch (when, say, his team had a runner in scoring position) or was it a product of the fact that the players who batted before him were on base a lot? If the latter, a player would have more RBIs without necessarily being any better at driving in runs than a player who has fewer.
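To put rough numbers on that RBI question, here is a minimal sketch in Python. The two players and all of their figures are made up for illustration, and "RBIs per runner on base" is a simplified yardstick of my own for this example, not an official stat (it ignores, for instance, that a batter drives himself in on a home run).

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    name: str
    rbis: int
    runners_on: int  # teammates on base when he batted (hypothetical figure)

    def rbi_per_runner(self) -> float:
        """Share of available baserunners this hitter drove in (simplified)."""
        return self.rbis / self.runners_on


# Hypothetical players: A bats behind teammates who are on base constantly.
player_a = Hitter("Player A", rbis=90, runners_on=450)
player_b = Hitter("Player B", rbis=75, runners_on=300)

for p in (player_a, player_b):
    print(f"{p.name}: {p.rbis} RBIs, {p.rbi_per_runner():.3f} per runner on base")
```

In this invented case Player A has the bigger RBI total, but Player B cashes in a larger share of the runners he actually sees, which is the kind of context the raw count hides.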

I'm going to come back to this, but the two most important offensive skills in baseball are the ability to get on base (hard to score without people on base) and the ability to advance runners (that is, move them from one base to the next). Some players are really good at that. The batter in front of them gets a double and is on second with none out. The next batter hits the ball behind the runner, allowing him to advance to third. It doesn't show up on standard score sheets, but that is a productive out, moving a teammate to third with one out, where he can now score on a sac fly. Analytics attempted to find ways to enumerate these skills so that a player could be assessed on their merits.

I agreed with analytics. I think its advances were understood long before baseball had analytics departments or anyone had seen Moneyball. It makes sense that the ability to get on base is important and that it does not matter exactly how one gets on base. Thus, on-base percentage is a more important statistical measure than batting average. I get it.
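For anyone who wants to see the difference between the two measures, here is a small sketch in Python using the standard formulas: batting average is hits divided by at-bats, while on-base percentage also counts walks and hit-by-pitches. The two hitters and their season lines are hypothetical, chosen only to show how the measures can disagree.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """AVG = H / AB."""
    return hits / at_bats


def on_base_percentage(hits: int, walks: int, hbp: int,
                       at_bats: int, sac_flies: int) -> float:
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


# Hypothetical season lines: X hits for a higher average, Y walks far more.
avg_x = batting_average(hits=150, at_bats=500)
obp_x = on_base_percentage(hits=150, walks=30, hbp=0, at_bats=500, sac_flies=0)

avg_y = batting_average(hits=135, at_bats=500)
obp_y = on_base_percentage(hits=135, walks=90, hbp=0, at_bats=500, sac_flies=0)

print(f"Hitter X: AVG {avg_x:.3f}, OBP {obp_x:.3f}")
print(f"Hitter Y: AVG {avg_y:.3f}, OBP {obp_y:.3f}")
```

Hitter X wins on batting average, but Hitter Y is on base more often, and a walk puts a runner on first just as surely as a single does.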

I don't think "we have gone too far" with analytics, but I worry that analytic stats are disguising other stats. I'll say it again: Bill James never rejected traditional stats. Instead, he worked with them and interpreted them. The substitution of one stat for another taken out of its context -- I'll say it again, taken out of its context -- disguises a player's value and that, it seems to me, is one of the problems that haunts major league baseball today. The Jays are a case in point and I'll give you an example: Brandon Belt.

The Jays brought Belt in to provide veteran leadership and good defence at first. He was never going to play every day but rather was a role player, doing time at DH and backing up Vladdy at first. He could be a defensive sub in tight games. It is a valuable role on the team. Belt got off to a bad start, as happens, but the Jays kept him in and he's played more and more over time. Right now, he's batting third and seems -- barring something unforeseen -- ensconced there for the rest of the year.

When you watch Jays games, the announcers make a lot of the fact that some -- often nameless -- people were down on Belt because of his slow start. I might say that anyone who knows much about baseball was not down on him and recognizes that not everyone gets out of the gates at the same pace. It is annoying (and likely more so to the coaching staff) but nothing to write home about. It is a regular part of the game. After this, however, the announcers go on to make a comment that goes something like "Belt has the second highest OPS, OPS+, etc., since date X."

This is less useful. It is good to know, but what are more traditional stats telling us about Belt? That he's not driving in runs at the pace one needs from a top-of-the-order guy. Right now Belt has 350 plate appearances (the lowest of any starter who is not a catcher and has not missed time for injury, but not a bad number). The problem is that he has only 37 RBIs and 15 of those RBIs come from homers. IOW, aside from himself, Belt has driven in only 22 other runs this year. That is a bad number no matter how you cut it. My point, of course, is not that Belt is secretly a bad player. I don't think he is. My point is that the focus on new analytic stats -- where he looks really good -- disguises the fact that the Jays have put a guy at the top of their lineup who is not doing what a guy at the top of the lineup is supposed to do.
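The arithmetic behind that complaint is easy to lay out. The short Python sketch below uses only the three figures quoted in this post (350 plate appearances, 37 RBIs, 15 of them on homers) and takes that split at face value; the variable names are mine, and the numbers are as of this writing, not re-checked against any database.

```python
# Figures quoted in this post for Belt, as of the post date.
plate_appearances = 350
total_rbis = 37
rbis_from_homers = 15

# Runs driven in other than on his own home runs, per the post's reading.
other_rbis = total_rbis - rbis_from_homers

# How many trips to the plate it takes, on average, to produce one of those.
pa_per_other_rbi = plate_appearances / other_rbis

print(f"RBIs not from homers: {other_rbis}")
print(f"Plate appearances per non-homer RBI: {pa_per_other_rbi:.1f}")
```

That works out to roughly one such run for every sixteen trips to the plate, which is the part the "second highest OPS since date X" line never mentions.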

There are a lot of qualifications, to be sure, and it would not be fair to lay the blame for the Jays' inconsistency at the feet of Belt. But it might also be fair to say that he's not well cast in the role in which the Jays have him.

Likewise, the fascination with exit velocity -- the speed of the ball coming off the bat -- and distance for HRs creates other problems. It does not matter how hard a ball is hit if that speed is not producing actual hits. Vladdy hits the ball harder than just about anyone. But that velocity is not translating itself into either hits or RBIs. Again, don't get me wrong. Vladdy is an exceptionally good younger player who I'd want on my team. He's also the opposite of Belt. He got off to a blazing start, hitting .309 at the end of April (I get my data from Baseball Reference) with an .885 OPS. We don't need to worry too much about precisely what that stat means, but it is All-Star level. He's been up and down, with a lot of down, since. Right now, he is not on pace to drive in 100 runs, something you need from your best player. In this case it is not an analytic stat that is disguising a problem but the fascination with new measurable stats.

Let me make this point clear: if a ball is a HR, what difference does it make how far it traveled? It might be interesting for fans, but it is not relevant to the score. If you need to hit the ball 400 feet to get a home run, what difference does it make if the ball is hit 401 feet or 456 feet? If you hit a home run, what difference does it make whether its exit velocity was 106 mph or 90 mph? If you said "none," you were right.

I didn't get to boredom and baseball so I'll carry that into another post, but finally, let me say a word about "best in the league since date X" syndrome. Baseball loves this kind of language and the Jays media is rife with it. But it is also a problem. Again: I get it. Media commentators are trying to show a trend -- a player heating up or someone who has made an effective adjustment. But it also misses a point and that point is this: it does not matter when you lose games if you lose them. For instance, imagine that a team needs to win, say, 90 games to make the playoffs. That means that they will lose 72 over the span of the season. Does it matter when they lost those games? Again, the answer is no. The fact that a player gets off to a bad start is not a reason to throw them away, and there will be ups and downs over a season. But the fact that a player is playing well at a certain moment and not at another moment could be a problem for a team that is looking for consistency, as the Jays are. Said differently, this stat -- or this approach to reporting stats -- disguises something important. Playing poorly and losing games at the start of the season can keep a team out of the playoffs as surely as playing poorly and losing games late in the season.
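If you want to convince yourself that the timing of the losses is irrelevant to the standings, here is a tiny sketch in Python. It builds three hypothetical 162-game seasons for the 90-win team imagined above (all the losses up front, all the losses at the end, and a random mix) and tallies each one. The seasons are invented for illustration; only the 90 and 72 come from the example in this post.

```python
import random


def final_record(results: list[int]) -> tuple[int, int]:
    """Return (wins, losses) for a season given as 1 = win, 0 = loss."""
    wins = sum(results)
    return wins, len(results) - wins


# Three hypothetical 162-game seasons with the same 90 wins and 72 losses.
slow_start = [0] * 72 + [1] * 90   # every loss comes first
late_fade = [1] * 90 + [0] * 72    # every loss comes last
mixed = slow_start[:]
random.Random(0).shuffle(mixed)    # losses scattered through the year

for label, season in [("slow start", slow_start),
                      ("late fade", late_fade),
                      ("mixed", mixed)]:
    print(label, final_record(season))
```

All three seasons finish with the same record, so a "best since date X" stat that only looks at the hot stretch tells you nothing about where the team ends up.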

