Throw It Like a Ballplayer

providing baseball commentary and ponderings since April 2010

The NL Central Quarterly, pt. 2

Posted by dannmckeegan on May 21, 2010

With the 2010 season now past the 40-game mark, a quarterly report is due. This is the second part of a two-part analysis. The first half was published here yesterday. It covered the relationships between the records of the six teams in the NL Central and their runs scored and allowed. Today’s entry will deal with the two questions left unanswered yesterday:

1) What relationship exists between the 4-run barrier and the individual teams’ runs scored and runs allowed?
2) Does the cumulative view change much when we separate runs for and against each team into categories above and below the 4-run barrier?

If you missed the article yesterday, this 4-run barrier is simply a binary split of games in which a given team’s offense was or was not able to score 4 or more runs. My interest in this split began with the Chicago media’s harping on the Cubs’ 1-17 record when they score three or fewer runs. The next angle that interested me was that a team receiving a quality start would probably be in a position to win those games with a minimum of 4 runs of support.
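For anyone who wants to try the split at home, here is a minimal Python sketch of the 4-run barrier. The `Game` tuple, the function names and the five sample scores are all made up for illustration; they are not real box scores and not the spreadsheet behind this post.

```python
from typing import NamedTuple


class Game(NamedTuple):
    runs_for: int
    runs_against: int


def split_by_barrier(games, barrier=4):
    """Split games into (scored fewer than `barrier`, scored `barrier` or more)."""
    below = [g for g in games if g.runs_for < barrier]
    at_or_above = [g for g in games if g.runs_for >= barrier]
    return below, at_or_above


def record(games):
    """Return (wins, losses) for a list of games."""
    wins = sum(1 for g in games if g.runs_for > g.runs_against)
    return wins, len(games) - wins


# Hypothetical scores, invented purely to exercise the functions.
sample = [Game(2, 5), Game(6, 3), Game(4, 7), Game(3, 1), Game(0, 2)]
low, high = split_by_barrier(sample)
print(record(low), record(high))  # prints (1, 2) (1, 1)
```

The split looks only at the team's own runs scored, so a 4-7 loss lands in the "4+" bucket and a 3-1 win lands in the "<4" bucket.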

Pythagorean Records: If You Don’t Like Math, Feel Free To Skip Ahead

My curiosity naturally drew me to Pythagorean records. Bill James, one of the first and biggest sabermetrics names, came up with the idea that it is a square law rather than a linear relationship that defines the correlation of runs and record. The Pythagorean name, along with a few derivations such as the Pythagenport and Pythagenpat formulae, comes from the similarity between the Pythagorean theorem ($a^2 + b^2 = c^2$) and the basic equation used for the runs-based winning percentage:

$$W\%_{\text{pythagorean}} = \frac{(R_{\text{for}})^2}{(R_{\text{for}})^2 + (R_{\text{against}})^2}$$

This really is no different from the form of the normal winning percentage equation (wins divided by total games). Here, though, runs rather than wins and losses are used. “Runs for” are analogous to wins, while “runs against” substitute for the loss column. The inclusion of the exponent shows that this isn’t a linear relationship. This is actually entirely intuitive: the difference between scoring 1 run and 2 runs can be big, but it’s nowhere near as huge as the divide between scoring 1 run and being shut out. So as the extremes are approached, real winning percentages also tend toward extremes.

Now, with years of data to examine, sabermetricians have found some more accurate exponents and overall calculations, but the simple square law works pretty well. According to the FAQ of Baseball-Reference.com, 1.83 is generally an accurate exponent for a basic calculation. Because I’m working with raw data rather than metrics, I’ve used 1.83 as a compromise between 2 and anything that legitimately wants me to incorporate π and θ for a free blog.
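To make the exponent choice concrete, here is a small Python sketch of the runs-based winning percentage with the exponent left as a parameter. The function name `pythagorean_pct` is my own label, not anything official, and the St. Louis run totals are the ones listed in the standings table in this post.

```python
def pythagorean_pct(runs_for, runs_against, exponent=1.83):
    """Runs-based winning percentage: RF^x / (RF^x + RA^x)."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# St. Louis through 41 games: 170 runs scored, 143 allowed (from this post's table).
square_law = pythagorean_pct(170, 143, exponent=2)
compromise = pythagorean_pct(170, 143)  # default exponent of 1.83

print(f"exponent 2.00: {square_law:.3f}")
print(f"exponent 1.83: {compromise:.3f}")
```

For a team with a positive run difference, the smaller exponent pulls the estimate slightly back toward .500. A team with equal runs for and against comes out at exactly .500 under any exponent.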

Using Pythagorean Records To See What We Can See

Because I had already broken down runs scored and runs against into the two categories of <4 and 4+ runs scored games, it was easy to take the next step and work on not only overall Pythagorean records, but also component Pythagorean records. The goal here was to see how run distribution, rather than just run difference, has factored into the NL Central’s uninspiring landscape.

I’m using data accurate as of Wednesday of this week, the same used in yesterday’s entry. Let’s quickly review the standings and runs for/against at the quarter-mark:

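| TEAM | RECORD | GAMES | RUNS FOR | RUNS AGAINST | RUN DIFF |
|---|---|---|---|---|---|
| Cincinnati | 23-17 | 40 | 192 | 195 | -3 |
| St. Louis | 23-18 | 41 | 170 | 143 | +27 |
| Chicago | 19-22 | 41 | 188 | 194 | -6 |
| Pittsburgh | 18-22 | 40 | 141 | 245 | -104 |
| Milwaukee | 15-25 | 40 | 210 | 234 | -24 |
| Houston | 14-26 | 40 | 122 | 189 | -67 |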

Seeing as St. Louis is the only team with a positive run difference, we are assured of having only one .500+ team in the Pythagorean Central. Let’s see those numbers side by side with the Pythagorean records of the teams through 40 or 41 games:

| TEAM | RECORD | WIN% | PYTHAG. WIN% | PYTHAG. RECORD (ROUNDED) |
|---|---|---|---|---|
| Cincinnati | 23-17 | .575 | .493 | 20-20 |
| St. Louis | 23-18 | .561 | .578 | 24-17 |
| Chicago | 19-22 | .463 | .486 | 20-21 |
| Pittsburgh | 18-22 | .450 | .267 | 11-29 |
| Milwaukee | 15-25 | .375 | .451 | 18-22 |
| Houston | 14-26 | .350 | .310 | 12-28 |
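The Pythagorean columns above can be reproduced from the raw run totals. What follows is a sketch, assuming the 1.83 exponent and simple rounding of expected wins to the nearest whole game; the dictionary layout and function names are mine.

```python
def pythagorean_pct(runs_for, runs_against, exponent=1.83):
    """Runs-based winning percentage: RF^x / (RF^x + RA^x)."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


# (wins, losses, runs for, runs against), as listed in this post's standings.
standings = {
    "Cincinnati": (23, 17, 192, 195),
    "St. Louis": (23, 18, 170, 143),
    "Chicago": (19, 22, 188, 194),
    "Pittsburgh": (18, 22, 141, 245),
    "Milwaukee": (15, 25, 210, 234),
    "Houston": (14, 26, 122, 189),
}


def pythagorean_record(wins, losses, runs_for, runs_against):
    """Expected (wins, losses) over the games played so far, rounded to whole games."""
    games = wins + losses
    expected_wins = round(pythagorean_pct(runs_for, runs_against) * games)
    return expected_wins, games - expected_wins


for team, line in standings.items():
    pct = pythagorean_pct(line[2], line[3])
    exp_w, exp_l = pythagorean_record(*line)
    print(f"{team:<11} {pct:.3f}  {exp_w}-{exp_l}")
```

Cincinnati is the line to watch: roughly 19.7 expected wins in 40 games rounds up to 20-20 even though the percentage itself sits just under .500.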

I know that the Cincinnati line seems odd: a percentage under .500, yet a .500 record. This is simply a matter of rounding. I chose not to give a “partial” record. Cincinnati, Chicago and Milwaukee are pulled a bit closer toward .500. The Cardinals get a bit of breathing room up top. Houston loses even more. And Pittsburgh…well, I think only former Arizona Cardinals coach Denny Green can express it.

It is really striking to see how negatively this formula views the Pirates in contrast to the mild deviations of everyone else. Houston is understandable: Thursday night, Roy Oswalt fell to 2-6 on the year despite a 2.66 ERA. But for Pittsburgh to drop from 18 to 11 wins? Well, the Pirates are #15 in the NL in both runs scored and runs allowed. It makes more sense for them to be in the cellar than clinging to mediocrity with bad stats. And Cincinnati’s fall from the top coincides nicely with their complete and utter 9th inning, 7-run meltdown against Atlanta. The Cardinals and Cubs? They each spent Thursday on opposite ends of separate one-run games. Guess which team won.

But before we get too busy with this new set of “standings,” let’s see what we learn from the 4-run barrier. Can we get any further insight by calculating “component” Pythagoreans? My expectation before running the numbers was that teams with a penchant for avoiding blowouts and losing close games would benefit, while eking out wins wouldn’t hold up to another level of scrutiny. Here are the results of the calculations:

| TEAM | WIN% | PYTHAG W% | PYTHAG W% (<4 RFOR) | PYTHAG RECORD (<4 RFOR) | PYTHAG W% (4+ RFOR) | PYTHAG RECORD (4+ RFOR) | CUMULATIVE PYTHAG RECORD | CUMULATIVE PYTHAG W% |
|---|---|---|---|---|---|---|---|---|
| Cincinnati | .575 | .493 | .270 | 4.0-11.0 | .572 | 14.3-10.7 | 18-22 | .450 |
| St. Louis | .561 | .578 | .220 | 4.0-14.0 | .729 | 16.8-6.2 | 21-20 | .512 |
| Chicago | .463 | .486 | .159 | 2.9-15.1 | .644 | 14.8-8.2 | 18-23 | .439 |
| Pittsburgh | .450 | .267 | .076 | 1.8-22.2 | .596 | 9.5-6.5 | 11-29 | .275 |
| Milwaukee | .375 | .451 | .085 | 1.6-17.4 | .706 | 14.8-6.2 | 16-24 | .400 |
| Houston | .350 | .310 | .088 | 2.1-21.9 | .622 | 10.0-6.0 | 12-28 | .300 |

Yet again, the numbers say that Pittsburgh is overachieving like a mother this year. By this measure, the Cubs and Brewers have essentially “evened out” their records with “when it rains, it pours” luck. When they score, they score, but they have also given up a lot of runs in those high-output games. They also have both underachieved in low-scoring games: luck dictates that they should be able to win some 3-2 or 2-0 games once in a while, at least one more than they have. The Reds drop back yet again. They appear to be riding a wave of good fortune. Some teams maintain that, but often those spurts are the kind of play that comes in inexplicable fits and starts. Houston simply doesn’t score 4+ often enough to support the good pitching they do get.
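For the curious, the cumulative columns in the component table appear to be the two component records added together and rounded to whole games. That is my reading of the numbers rather than a documented method, so treat this Python sketch as one plausible reconstruction. The component wins and losses are copied from the table; the function name is invented.

```python
# (wins <4, losses <4, wins 4+, losses 4+), copied from the component table in this post.
components = {
    "Cincinnati": (4.0, 11.0, 14.3, 10.7),
    "St. Louis": (4.0, 14.0, 16.8, 6.2),
    "Chicago": (2.9, 15.1, 14.8, 8.2),
    "Pittsburgh": (1.8, 22.2, 9.5, 6.5),
    "Milwaukee": (1.6, 17.4, 14.8, 6.2),
    "Houston": (2.1, 21.9, 10.0, 6.0),
}


def cumulative_record(low_w, low_l, high_w, high_l):
    """Add the two component records; return whole-game (wins, losses, win pct).

    Assumption: wins are rounded to the nearest game and the percentage
    is taken from the rounded record, which matches the table's values.
    """
    games = round(low_w + low_l + high_w + high_l)
    wins = round(low_w + high_w)
    return wins, games - wins, wins / games


for team, parts in components.items():
    w, l, pct = cumulative_record(*parts)
    print(f"{team:<11} {w}-{l}  {pct:.3f}")
```

Taking the percentage from the rounded record is why Cincinnati shows .450 (18 wins in 40 games) rather than a figure based on 18.3 wins.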

Analysis

This is all a roundabout way of analyzing a baseball division. The only methodological problem with turning out a component Pythagorean record is that the only closed interval is the “runs for” variable in the “fewer than 4” component. The “runs for” in the “4+” component and both “runs against” variables have open upper limits. The problem with this is the implicit knowledge that a 10-0 loss and a 20-0 loss will have differing effects on our “fewer than 4” component, just as a 20-0 victory is more beneficial to the “4+” component than a 10-0 win. Of course, this problem exists in any calculation that looks beyond individual game results.
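The open-ended "runs against" problem is easy to see with a toy example. The two four-game logs below are entirely hypothetical; they differ only in whether the shutout loss was 10-0 or 20-0. The helper names are mine, and the 1.83 exponent is the same assumption used throughout this post.

```python
def pythagorean_pct(runs_for, runs_against, exponent=1.83):
    """Runs-based winning percentage: RF^x / (RF^x + RA^x)."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


def component_pct(games):
    """Pythagorean percentage from the summed runs of a list of (for, against) games."""
    runs_for = sum(rf for rf, _ in games)
    runs_against = sum(ra for _, ra in games)
    return pythagorean_pct(runs_for, runs_against)


def wins(games):
    """Count actual wins in a list of (for, against) games."""
    return sum(1 for rf, ra in games if rf > ra)


# Hypothetical "fewer than 4 runs scored" games; only the final loss differs.
ten_nothing = [(3, 2), (2, 4), (1, 3), (0, 10)]
twenty_nothing = [(3, 2), (2, 4), (1, 3), (0, 20)]

print(wins(ten_nothing), f"{component_pct(ten_nothing):.3f}")
print(wins(twenty_nothing), f"{component_pct(twenty_nothing):.3f}")
```

Both logs are 1-3 in real life, but the bigger blowout drags the component percentage lower, which is exactly the distortion described above.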

You can’t put runs in the bank and cash them in on a later date. However, the use of cumulative “runs for” and “runs against” tallies allows us to even out a team’s over- and under-performance. We know that the Brewers aren’t as good as their 20-0 and 17-3 wins over the Pirates, but they also aren’t as bad as the 3 shutouts in 4 days they experienced at the hands of the Padres. Another level of study would be to further break down these numbers into a category of 4-6 runs, and a category of 7+. But that really belongs in a “Further Study” section, and this isn’t academic.

We knew before the start of the season that St. Louis looked like the front-runner, but they don’t appear to have the offense necessary to run away as predicted. The Reds and Cubs are the mediocre question marks they were assumed to be. Milwaukee is toiling to earn the respect afforded mediocrity. Pittsburgh is a train wreck masquerading as a mediocre rebuilding team. The Astros are a mess that, even by the averages, is doing slightly better than should be expected.

This numerical study resembles the gut feelings many people have about the NL Central. Without any real digging into the individual players responsible for the good and the bad, it’s been pretty easy to see what happened and under what circumstances. The only question that has really been asked here these last two days is, How well have the offenses and defenses of the several NL Central teams coordinated their efforts? The Pythagorean calculations provide a ballpark figure for how often an offense has wasted a quality start and how often the pitchers couldn’t hold on when the offense was clicking.

An in-depth study of each team might yield some answers. Aramis Ramirez looks lost and the bullpen is bad in Chicago. Houston held on to its veterans too long. Matt Holliday had better be a second-half player in St. Louis. The new-found starting pitching in Cincinnati has to hold up. Milwaukee really should consider the value of a pitching staff. And the Pirates should really think about where they are going with Octavio Dotel, Ronny Cedeno, Aki Iwamura and Brendan Donnelly on their roster. But after a quarter of a season, we still only know what a team has been. That only gives us an idea of what the next 120 games will bring. More of the same? Maybe. But if we come at this from the right angle, we might question which “same” is coming due. Will it be the prolonged luck that has buoyed the Reds and Pirates? Or will the apparent odds catch up to the division, for good or ill?

With interleague play starting, the additional variable of the DH comes into play for a short period of time. There is a lot of baseball yet to be played, and lots of time next week to begin looking forward. But there also is time to look back and see how my arguments have held up. And for even more excitement, there are lots of numbers still to be played with, many of which don’t even touch on the Cubs. Enjoy the weekend, folks. The weather should be great, the Hawks have two games – a potential clincher on Sunday, perhaps? – and the Cubs are in Arlington for three.
