By Martin Kleinbard
“And the Best Fans in Baseball Are….”
The Minnesota Twins have the best fans in baseball.
Now that I have your attention, allow me to introduce myself. My name is Martin Kleinbard, and I’ll be blogging for CBS Local on all things sports analytics. My goal is to add a dash of quantitative flair to the time-honored barstool debates that make sports fandom so much fun. Don’t worry: I’m not trying to put an end to any of these debates with one brilliant Excel spreadsheet. Anyone who’s met me knows that I love to argue far too much for that. I just hope that I can give you, the loyal reader, a few more arrows in your quarrelsome quiver.
I also hope to make this blog an interactive, fan-first one. If you have any topics that you’d like to see more analytic research devoted to, send them my way and I’ll do my best to publish them. If it helps, you can think of me as a sabermetric Dear Abby—but please, don’t ask me to create an algorithm to help fix your love life.
Which baseball team has the best fans? If you were trying to answer the question as honestly and objectively as possible (hard for many of you, I’m sure), where would you start? I imagine that home attendance—being the most visible means by which a team interacts with its fan base—would be near, if not at, the top of your list of metrics.
The problems with using raw turnstile figures as a proxy for fan loyalty are just as obvious as the case for including attendance in the first place. Let’s assume Team A and Team B both draw exactly 2 million fans. Team A won the division, plays in a beautiful new ballpark, and is located in one of the largest metropolitan areas in the country. Team B rivaled the ’62 Mets on the field, plays in a glorified parking lot, and is located in a town where everyone goes by their first name.
All else equal, we should proclaim the latter’s fans superior in spite of the nominal tie in attendance. The logic behind this intuition is that what we really care about is how many fans show up to the ballpark irrespective of these on-field, ballpark, and demographic factors that are out of the fans’ control.
But how much better are Team B’s fans than Team A’s? And what if Team A had the better record but Team B had the better ballpark—how would we break the tie? Rather than stab in the dark (and, considering that MLB consists of 30 teams, not 2, that would take a lot of stabbing), I used the power of regression analysis to control for these factors, with the goal being to isolate the true impact of each team’s fan base.
Producing the rankings was a two-step process. First, I used several on-field, ballpark, and demographic (i.e., fan-neutral) factors to predict attendance (1) through linear regression for the 2001 through 2013 seasons. Second, I used the model to predict each team’s attendance and treated the difference between actual attendance and the fan-neutral prediction as the bump (or drop) in attendance attributable to the fans.
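For the curious, here is a minimal sketch of that two-step idea in Python. It is not the model from the study: it uses a single fan-neutral predictor (winning percentage) instead of several, the team-season numbers are invented, and expressing the gap as a percentage of predicted attendance is my assumption for illustration.

```python
# Hypothetical team-season rows: (team, win_pct, attendance in millions).
# These numbers are invented for illustration; they are not the study's data.
seasons = [
    ("A", 0.40, 1.6),
    ("A", 0.50, 2.0),
    ("B", 0.50, 2.4),
    ("B", 0.60, 2.8),
    ("C", 0.45, 1.4),
    ("C", 0.55, 1.8),
]


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fan_effects(rows, intercept, slope):
    """Average each team's gap between actual and predicted attendance.

    The gap is expressed as a fraction of predicted attendance (an assumption).
    """
    gaps = {}
    for team, win_pct, actual in rows:
        predicted = intercept + slope * win_pct
        gaps.setdefault(team, []).append(actual / predicted - 1.0)
    return {team: sum(g) / len(g) for team, g in gaps.items()}


# Step 1: predict attendance from the fan-neutral factor.
intercept, slope = fit_line([s[1] for s in seasons], [s[2] for s in seasons])

# Step 2: whatever the model cannot explain is credited to (or blamed on) the fans.
effects = fan_effects(seasons, intercept, slope)
for team, effect in sorted(effects.items(), key=lambda kv: -kv[1]):
    print(f"Team {team}: {effect:+.1%}")
```

In this toy data, Team B outdraws what its record predicts and Team C falls short, so B lands on top and C at the bottom, which is the same logic behind the rankings.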
First, the results from the fan-neutral metrics. Here’s how much we’d expect attendance to increase given a 10% increase in each variable:
Of all these variables, you’re probably most suspicious of how I quantified ballpark quality. Since that could be a whole post unto itself (and maybe one day it will be), I decided to crowdsource the issue and rely on Yelp ratings (2).
While teams in large media markets have recently reaped ludicrous TV deals, my data show that market size has a relatively small effect on attendance. All else equal, a winning team and top-notch ballpark are far more beneficial to filling seats. And, as the median income variable demonstrates, you’re actually better off being in a smaller, wealthier city than a larger, poorer one.
Ballpark age also factored into the models, but not for long. I found that a park’s first year gives the home team a 17.1% attendance bump, which trails off to 7.3% in the second year. By the time the park is three years old, the “honeymoon effect” is completely gone.
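To make the “honeymoon effect” concrete, here is a small sketch of how a park’s age might be turned into binary variables. The two percentages are the ones quoted above; the function names, the 1-indexed season count, and the idea of simply adding the bump are assumptions on my part, not a description of the actual model.

```python
# Attendance bumps reported in the article, in percent.
BUMP = {"first_year": 17.1, "second_year": 7.3}


def honeymoon_dummies(season_in_park):
    """Binary indicators for a park's first and second seasons (1-indexed)."""
    return {
        "first_year": int(season_in_park == 1),
        "second_year": int(season_in_park == 2),
    }


def honeymoon_bump(season_in_park):
    """Percent attendance bump implied by the two ballpark-age indicators."""
    flags = honeymoon_dummies(season_in_park)
    return sum(BUMP[name] * flag for name, flag in flags.items())


for season in (1, 2, 3):
    print(season, honeymoon_bump(season))
```

From the third season on, both indicators are zero, so the park no longer contributes any bump.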
And finally, we reach the individual team fan base rankings. These results are the 13-year averages of each team’s gap between its actual and predicted attendance:
Like the ballpark age variables, these are all binary variables. They are meant to demonstrate how much the presence of a given team’s fan base helped or hurt its attendance, holding all the on-field, ballpark, and demographic variables constant. For example, the Twins’ fan base increased the team’s attendance by 21% relative to the league average.
You can just as easily use these figures to compare two fan bases head-to-head. For example, Dodgers fans are 2.9% more rabid/more loyal/whatever-comparative-term-you-want-to-use than Giants fans (9.9% – 7.0%). In case Cubs fans need any more reasons to prove their dominance over Chicago, they can cite a 33.9% edge over the fans of the crosstown White Sox (19.2% – (-14.7%)). And who knew the Rays had better fans than the Yankees? (3)
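If you want to run your own head-to-head matchups, the arithmetic is a simple subtraction. The sketch below uses only the four rounded figures quoted above; the dictionary and function names are mine. Because the inputs are rounded to one decimal place, a gap computed this way can differ by a tenth from one computed on unrounded estimates.

```python
# Fan-base effects quoted in the article, in percent relative to league average.
fan_effect = {
    "Dodgers": 9.9,
    "Giants": 7.0,
    "Cubs": 19.2,
    "White Sox": -14.7,
}


def edge(team_a, team_b):
    """Percentage-point gap between two fan-base effects, rounded to one decimal."""
    return round(fan_effect[team_a] - fan_effect[team_b], 1)


print(edge("Dodgers", "Giants"))
print(edge("Cubs", "White Sox"))
```

A positive result favors the first team named, and a negative one favors the second.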
If I’m looking at this from Bud Selig’s perspective (and of course he’s reading this as well), a couple of points jump out at me. First is the regional dominance at the top and bottom of the rankings. Teams in the Midwest own four of the top six spots, while two of the league’s only three teams from the Southeast (Miami and Atlanta) wind up in the bottom five. In other words: forget Vegas and Portland; why not move a team to Indianapolis? It’s the 12th largest city in the country and located in the heart of the heartland.
Another two cellar dwellers—the A’s and White Sox—suffer from their own geographic misfortune of being “little brothers” to more heralded neighbors. Maybe moving to San Jose won’t be far enough away from the Giants for the A’s to succeed at the turnstiles. Second, I’d want to take a deeper look at some of the “surprise” teams at the top of the list. We’re not shocked to see the Cardinals, Cubs, and Red Sox in the top 10, but what about the Twins, Brewers, and Astros? Maybe those teams have engineered some brand-building best practices that could be replicated by the rest of the league.
These points hint at an inherent flaw with the study. Are the rankings more a measure of fan loyalty, geography, the team’s marketing prowess, or some other factors that I haven’t yet considered? Maybe the Twins don’t actually have the best fans in baseball; maybe they just have a really good marketing department. One thing’s for certain: comparing teams’ marketing departments would make for a far less interesting barroom debate.
Coming next issue: I apply the same process to determine the best fans in the NFL. Check back to see where your team lands.
1. If I were to use only overall attendance as the outcome variable, my results would be biased toward teams with large parks. Conversely, using only percent capacity would bias the results toward teams with smaller ballparks. To account for this, I created separate models to predict both, and then averaged the coefficients of each variable.
2. While a simple solution to a complex issue, using Yelp ratings as a proxy for ballpark quality has its share of shortcomings. The first is sample size: some parks have only a few dozen ratings. Second, fans may be building the team’s performance into their ballpark ratings, which would compromise the independent nature of those two variables. Third, since this study dates back to 2001, a few now-defunct ballparks are not featured on Yelp, which required some measured approximation on my part.
3. The 0.2% difference is well within the margin of error, though, so we can’t be certain that Rays fans are in fact better. The Rays’ ranking was also helped by the fact that the team was dreadful for the first half of the time period studied. If we were to look at only the last five years, when the Rays were good and still not drawing any fans, their ranking would drop significantly.
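As a rough illustration of the averaging described in note 1, the sketch below blends the coefficients from two models, one predicting total attendance and one predicting percent capacity. The coefficient values and variable names are invented for the example and are not the study’s estimates.

```python
# Hypothetical coefficients (percent change in the outcome per 10% increase
# in each variable). Invented for illustration; not the study's estimates.
total_attendance_model = {"win_pct": 6.0, "ballpark_rating": 4.0, "market_size": 1.0}
percent_capacity_model = {"win_pct": 4.0, "ballpark_rating": 5.0, "market_size": 0.5}


def average_coefficients(model_a, model_b):
    """Average each variable's coefficient across two models with the same variables."""
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same variables")
    return {name: (model_a[name] + model_b[name]) / 2 for name in model_a}


blended = average_coefficients(total_attendance_model, percent_capacity_model)
print(blended)
```

Averaging the two keeps either outcome measure from dominating: a variable that matters mostly for big parks or mostly for small ones gets pulled toward the middle.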