Throw It Like a Ballplayer

providing baseball commentary and ponderings since April 2010

Posts Tagged ‘sabermetrics’

Quick thought on “clutch”

Posted by dannmckeegan on December 20, 2010

A clutch hitter is the kind of guy you always want coming up with the game on the line. Big situations demand big-time players with big-time confidence. Or so conventional wisdom dictates. Sabermetrics has yielded many zealots both defending and denying the existence of clutch hitters. On one side is the sabermetric majority (and baseball fan minority, I presume) that accepts evidence that clutch performances exist despite a lack of long-term consistency in individual “clutch” numbers for a given player. The other side consists of the faithful, the true believers, those who want to believe in the Clutch Player in the same way one buys into historical clean-up jobs on national or sports heroes.

Joe Sheehan of Baseball Prospectus puts it beautifully when he writes:

In trying to get across the notion that no players possess a special ability to perform in particular situations, the usual line we use is that clutch performances exist, not clutch players. That’s wrong. The correct idea is that clutch performances exist, and clutch players exist: every last one of them.

What he’s saying is that, by virtue of making it as far as the big leagues, all of those guys have passed the tipping point at which pure skill is no longer the only factor. They have intangibles that go along with skill and a hefty side dish of luck. Another way of putting it is the analysis provided by Tom Tango on The Book blog:

[E]ven though we have determined that clutch skill exists in that population of players, it is simply too hard to identify the specific players that it makes any practical difference.

Basically, any reasonable number of plate appearances from which to glean any information is too small a sample size for clutch ability to normalize. Furthermore, the difference between a clutch hitter and not-clutch hitter is pretty much the same difference you’ll get from the platoon advantage with mirror-image doppelgangers. A ten-year career for a full-time player is the bare minimum for getting a legit sample size.
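For a rough sense of why the samples are too small, here is a minimal sketch. It assumes each plate appearance is an independent coin flip with a fixed on-base probability, which is a simplification, and the .350 on-base rate and the plate appearance counts are made-up illustration values, not any real player's numbers.

```python
from math import sqrt


def standard_error(rate: float, plate_appearances: int) -> float:
    """Standard error of an observed rate under a simple binomial model."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return sqrt(rate * (1.0 - rate) / plate_appearances)


# Hypothetical inputs: a .350 on-base hitter over samples of different sizes.
for pa in (150, 600, 6000):
    se_points = standard_error(0.350, pa) * 1000
    print(f"{pa:>5} PA: about +/- {se_points:.0f} points of OBP (one standard error)")
```

Under these assumptions the noise shrinks only with the square root of the sample, so a small-sample split can swing by tens of points on luck alone.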

One shorthand statistic has been developed, however, that is pretty useful. It’s called “Clutch,” and you can find it over on FanGraphs player pages. It is essentially the difference between Win Probability Added (scaled by the player’s average Leverage Index) and “WPA/LI,” the context-neutral version in which each play’s change in win probability is divided by that play’s Leverage Index. That’s a fancy way of saying that, through years of analysis, sabermetricians have assigned leverage values to different game situations and win probability adjustments to play results. No one denies the existence of clutch performances. I refer anyone who questions that to Game 7 of the 1991 World Series.
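The arithmetic behind that stat can be sketched in a few lines. This assumes the definition FanGraphs publishes in its glossary, Clutch = WPA / pLI - WPA/LI, where pLI is the player's average Leverage Index. The function name and the sample inputs are hypothetical and do not describe any real season.

```python
def clutch_score(wpa: float, pli: float, wpa_li: float) -> float:
    """Clutch = WPA / pLI - WPA/LI (assumed FanGraphs glossary definition).

    wpa    -- Win Probability Added for the season
    pli    -- the player's average Leverage Index (must be positive)
    wpa_li -- context-neutral wins (per-play WPA divided by per-play LI, summed)
    """
    if pli <= 0:
        raise ValueError("pLI must be positive")
    return wpa / pli - wpa_li


# Hypothetical season: with an average leverage of exactly 1.0, the formula
# reduces to the plain difference between WPA and WPA/LI.
example = clutch_score(wpa=2.0, pli=1.0, wpa_li=1.5)
print(f"Clutch: {example:+.2f}")
```

A positive number means the hitter added more win probability than his context-neutral results would predict. A negative number means he added less.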

So What?

Well, I wanted to glance through some of baseball’s “known” clutch and un-clutch hitters to gather their 2010 Clutch stats to see what kind of variation from the public perception we get in a random year. Let’s start with the most common comparison of clutch vs. un-clutch, Yankees infielders Derek Jeter and Alex Rodriguez:

A-Rod: 1.44, led team by 0.90 (6th positive year against 11 negatives; career -6.72 regular season, +0.90 postseason)

Jeter: -0.40, 3rd-lowest on team among regulars (10th negative against 6 positive; career +1.37 regular season, -1.14 postseason)

It’s interesting to see not only that A-Rod was clutchier in 2010, but also that the two men have comparable year-to-year variation yet quite different cumulative tallies. Basically, when Jeter is good, he’s truly clutch, while his Mr. November moniker might not be so well-deserved after all.

Now let’s jog over to Boston to compare the reputedly clutch immobile DH and the supposedly soft, fragile and choke-prone right fielder:

David Ortiz: -0.18 (7th negative year against 7 positives; career +2.44 regular season, +0.96 playoffs)

J.D. Drew: -0.17 (9th negative year against 4 positives; career -2.87 regular season, -0.03 playoffs)

Drew and Ortiz were equally ineffective in clutch situations, but the difference we need to recognize is that Ortiz had significantly more WPA and WPA/LI than Drew. Thus, his lack of “clutch” was counterbalanced by his greater propensity for finding himself in such situations and coming through quite often. Drew’s opportunities were far fewer and further between. Also, Drew joins both Yankees in having a more-or-less 2:1 negative-to-positive ratio of Clutch seasons. A further study, perhaps, would look across the board at long-tenured veterans with enough PA to qualify as at least close to a valid sample, to see if such a trend emerges beyond this group. That is, might we measure clutch best as a meta-narrative for a whole career? Hypothesis: while similar approach/contact can lead to varying results, over time those variances will balance out. Whether the sum total of single or multiple seasons is positive or negative, “clutchiness” might best be found within those who outperformed expectations over the course of individual seasons at a greater frequency than their peers.
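The season tallies quoted above make the ratio easy to check. This quick sketch uses only the positive and negative season counts listed in this post. The dictionary layout and variable names are just for illustration.

```python
# (positive Clutch seasons, negative Clutch seasons), as listed above.
seasons = {
    "A-Rod": (6, 11),
    "Jeter": (6, 10),
    "Ortiz": (7, 7),
    "Drew": (4, 9),
}

# Negative seasons per positive season for each hitter.
ratios = {name: neg / pos for name, (pos, neg) in seasons.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} negative seasons per positive season")
```

Ortiz is the only one of the four who breaks even. The other three land somewhere between roughly 1.7:1 and 2.25:1.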

Until next time…

Remember to tune in to The Dann and Twan Show on Slam Internet Radio this Tuesday at 8pm CST. Yes, it’s a special holiday edition, complete with the Festivus traditions of grievance-airing and the feats of strength.

Posted in Analysis, Statistics

Rethinking “Clutch”

Posted by dannmckeegan on June 6, 2010

Is there any statistic that can prove or disprove the existence of “clutch” hitting or pitching in baseball?  The simple answer is, no, despite what number-crunchers have said on the subject.  The primary issue is that the concept of “clutch” is itself a subjective one and an arbitrary one.  When we are talking about clutch hitting or clutch pitching, we do not have a specified game situation in mind.  That is, a statistical study of batting average (in the seventh inning or later with runners in scoring position with a score separated by three or fewer runs) doesn’t tell us whether or not clutch exists.

Many sabermetricians, including some of the bigger names in the coven, appear to deny that clutch exists at all, since it does not present itself in their data.  This stance essentially requires a metric quantifying clutch into a single column for there to be any consideration of its existence.  There are other, more rational, analysts in the community who are willing to accept that it might exist, and that the lack of a quantified metric is not necessarily a reason to dismiss the notion.

Clutch hitting and pitching have long been assumed to exist based on a very simple principle: people do not all handle pressure in the same way.  In a 2000 article for the New Yorker, Malcolm Gladwell provided an insight into the difference between two kinds of failure: panicking and choking.  Gladwell sums up the two types of failure quite nicely:

Panic, in this sense, is the opposite of choking. Choking is about thinking too much. Panic is about thinking too little. Choking is about loss of instinct. Panic is reversion to instinct. They may look the same, but they are worlds apart.

Are these of any use to an outsider’s approach to “clutch” hitting and pitching?

On May 25, Bill Baer of Baseball Daily Digest posted an article “Analyzing the Internet’s Impact on Sabermetrics.”  While his overall thesis – that the web’s rapid rise allowed the dissemination of research and the building of a sabermetric community – is both accurate and absurdly obvious, I take issue with a major part of his overall argument:

When someone calls Derek Jeter a “clutch” hitter, I just go to Baseball Reference on the Safari browser on my iPod touch and point out that Jeter’s career .308 batting average with runners in scoring position is actually lower than his overall career .316 batting average. (His batting average with the bases empty is also .316.)…In the example above, the person espousing Jeter’s “clutch” ability was proven wrong…If, for instance, the barroom discussion was more philosophical and focused on “clutch” ability in general — does it exist? — before the primacy of the Internet, there would have been no way the “clutch” enthusiast could be proven wrong.

The problem with this entire line of reasoning is that the so-called clutch enthusiast has not been proven wrong, and the author has shown himself to be hampered by an unfortunate bit of tunnel vision.  If a team is up 10-0 and a hitter comes up with RISP, how much pressure is he under?  Is it really a “clutch” hit?  Conversely, in a one-run game in the 8th inning, a bases-empty, lead-off walk against a tough pitcher can be a very “clutch” plate appearance.  So there is no specific category from which statisticians can mine “clutch” data to find an answer to the question.

Allow me to return to Baer’s example, expanding on it to explain my critique more fully.

Jeter’s average with RISP does nothing to dispute his supposed clutch hitting ability.  Jeter is able to perform at his normal level of ability with runners in scoring position, to the tune of a .308/.403/.432 (compared to .317/.387/.458 overall).  But we also see the 16-point spike in on-base average.  Whereas his overall SO/BB ratio is 1.67:1, his SO/BB with RISP is 1.29:1.  Now, Jeter has spent about 90% of his career batting first or second.  On any given Yankees team, that has meant lots of lineup protection.  So Jeter’s primary statistical change with RISP – raising his walk rate to 12% compared to 7.7% in other situations – is itself a possible piece of evidence contrary to Baer’s position.  In context, Jeter is trying to get on base for the lineup’s run producers.

One primary aspect of “clutch,” or handling pressure generally, is the ability to remain calm and collected.  One could argue that, in pressure situations, a hitter’s ability to increase his patience, even if it ends up costing him a few more strikeouts, as well, is a positive thing.

To elaborate on my argument, let’s look at Marcus Thames, a teammate of Jeter’s on the Yankees this season.  A career fourth outfielder or platoon starter, Thames possesses a line of .245/.311/.489 in just under 1,800 plate appearances.  He has hit 103 home runs and strikes out 3 times for every walk he draws.  Just for the sake of argument, let’s accept Baer’s RISP statistic as being meaningful.  Thames’ numbers with RISP are .241/.320/.480.  Thames performs at his career average, just like Jeter.

Now let’s use a slightly different definition of a pressure situation to measure Jeter’s and Thames’ performance.  We will add the condition of 2 outs to the RISP situation.  With two outs and runners in scoring position, Derek Jeter has a .315/.418/.453 line, a .385 average on balls in play and a 4:3 SO/BB ratio.  So his on-base, slugging, and BABIP all look really good in that situation.  The picture isn’t quite so rosy for Thames, however: he has managed only .201/.299/.387 with a .256 BABIP and a slightly better-than-normal SO/BB ratio.
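To put the splits side by side, here is a small sketch that converts each slash line into a difference, in points, from the player's overall line. The slash lines are the ones quoted in this post. The helper name and data layout are only illustrative.

```python
# (AVG, OBP, SLG) slash lines as quoted in this post.
lines = {
    "Jeter": {
        "overall": (0.317, 0.387, 0.458),
        "RISP": (0.308, 0.403, 0.432),
        "2 outs, RISP": (0.315, 0.418, 0.453),
    },
    "Thames": {
        "overall": (0.245, 0.311, 0.489),
        "RISP": (0.241, 0.320, 0.480),
        "2 outs, RISP": (0.201, 0.299, 0.387),
    },
}


def split_gap(overall, split):
    """Difference between a split and the overall line, in points (thousandths)."""
    return tuple(round((s - o) * 1000) for o, s in zip(overall, split))


for player, splits in lines.items():
    for situation in ("RISP", "2 outs, RISP"):
        avg, obp, slg = split_gap(splits["overall"], splits[situation])
        print(f"{player:<7} {situation:<13} AVG {avg:+d}  OBP {obp:+d}  SLG {slg:+d}")
```

Seen this way, Jeter's lines barely move outside of on-base percentage, while Thames drops 44 points of average and 102 points of slugging with two outs and runners in scoring position.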

How do we square these numbers with the previous findings?  The first thing we must do is remind ourselves that neither of these game situations is necessarily under heavy pressure.  In a game separated by a wide margin, these aren’t nearly as important in context as they would be in the later innings of a close game.  So they are imperfect frames to begin with.  However, they do provide us with a few initial glimpses into what might be encompassed by the “clutch” concept.  It is far more than just one statistical category.  Players are unaware of the leverage indices of a given game situation, as they should be.  But pressure, being subjective, is omnipresent.  So it is far more important to study failure in certain game situations than success.

Tying this back to the beginning, what we see in Derek Jeter is a player who very clearly shows no signs of panicking or choking.  Thames, however, does see a meaningful statistical drop when there are two outs and runners in scoring position.  He’s just not as good then.  Already a bench player, Thames is under extra pressure to perform in clutch situations.  While he isn’t the high-paid star, he is the player who takes occasional PAs away from that star when he rests.  Under the lights and in front of the fans, he has far less experience handling these situations.  He simply has fewer in-game, real life reps.  Considering the minuscule margin for error in hitting a baseball, it’s easy to see the potential for nerves inducing either a minor panic or minor choke wherein the player thinks too little or thinks too much.

Therein lies the essence of clutch, and why it is so statistically elusive.  Players acquire experience in those tough situations, and the best ones will obviously show no panic and little choke.  It comes with the territory of being a star.  For the other players, the decent to pretty good ones, we may well expect to see some of that panic and choke come through at times.  But we also know that luck comes into play.  Just consider Brooks Conrad’s grand slam to cap off an 8-run comeback some weeks back for Atlanta.  Few would truly consider that clutch, as opposed to luck.

The first step in understanding clutch is recognizing that it is not unique to baseball or sports.  It is simply a term that conveys a third party’s perception that an individual performs well under pressure.  If I take a standardized test and know the answers to 95% of the questions, a score of 95% is a clutch performance.  I didn’t make a careless error or have a brain cramp.  A higher score means I made a few lucky guesses.  A lower score means I either choked or panicked.  Apply the same analogy to any given job, from no-collar to white collar.

Clutch isn’t about going above and beyond the rational limits of accomplishment.  Rather, it is about maintaining a level head despite the increased pressure.  When the pressure increases, the clutch performer is unfazed.  It’s everyone else who can be expected to do less.

To examine this hypothesis further, a few lines of inquiry can be pursued:

1) Is there a year-to-year correlation of potentially clutch statistics (such as 2 outs RISP or late & close) for individual players?

2) Is there a greater disparity, on average, between the overall and potentially clutch stats of bench players than of lineup regulars, or of poor hitters than of good hitters?

3) What impact might sample sizes and luck have on the numbers for players with fewer plate appearances?
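The first question could be approached with a plain correlation between each player's clutch gap (his clutch-split number minus his overall number) in one season and his gap in the next. This is a minimal sketch and not a finished study: the function is an ordinary Pearson correlation, and the gap values are invented placeholders, not real player data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical clutch gaps (split OBP minus overall OBP) for five made-up
# players in back-to-back seasons. These are placeholders, not real stats.
gap_year_one = [0.020, -0.015, 0.005, -0.030, 0.010]
gap_year_two = [-0.010, 0.012, 0.000, -0.005, 0.018]

r = pearson(gap_year_one, gap_year_two)
print(f"year-to-year correlation of clutch gap: {r:+.2f}")
```

A correlation near zero across a large group of real players would suggest the gaps are mostly noise. A consistently positive one would suggest something repeatable. Question 3 matters here, since players with few plate appearances will add a lot of scatter either way.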

Posted in Analysis, Opinion