Brendan Nyhan

Cat leaving bag on UC-Berkeley study

Despite their lack of social science credentials, the liberal activists at Media Matters are endorsing the highly problematic UC-Berkeley study of voting in Florida:

The mainstream media have mostly ignored a statistical study conducted by faculty and students of the University of California at Berkeley sociology department on voting irregularities in Florida in the 2004 presidential election that found major discrepancies in vote counts between counties that utilized electronic voting machines (e-voting) and those that used traditional voting methods. The study, released on November 18, determined that President George W. Bush may have wrongly been awarded between 130,000 and 260,000 extra votes in Florida — 130,000 if they were all “ghost votes” created by machine error, or twice that if votes intended for Senator John Kerry were misattributed to Bush.

And here’s where the train really goes off the tracks:

According to the Berkeley study:
-Irregularities associated with electronic voting machines may have awarded 130,000 excess votes or more to Bush in Florida.
-Compared to counties with paper ballots, counties with electronic voting machines were significantly more likely to show increases in support for Bush between 2000 and 2004. This effect cannot be explained by differences between counties in income, number of voters, change in voter turnout, or size of Hispanic/Latino population.
-In Broward County alone, Bush appears to have received approximately 72,000 excess votes.
-The Berkeley researchers can be 99.9 percent sure that these effects are not attributable to chance.

Extrapolating “excess votes” from these regressions is ridiculous, and Media Matters also uses the p-value from the UC-Berkeley regression to give a misleading picture of how precise the results really are. Consider this: when you control for changes in party registration, the supposed e-voting effect vanishes. 99.9 percent sure indeed.

Update: The Berkeley study is now on CNN.com, Salon.com and in the San Francisco Chronicle. Also, Craig Newmark, an NC State economist, has a good post on the problems with the Berkeley study. Here’s the key part:

1. They estimate a regression model. The dependent variable is “Change in % Voting for Bush from 2000 to 2004”. The observations in the analysis are all 67 counties in Florida. There are four independent, explanatory variables in their basic model: % Voting for Bush in 2000; % Voting for Bush in 2000, squared; total number of votes cast for Kerry and Bush in 2004; and–the variable of interest–a binary variable indicating whether the county used electronic voting in 2004. (15 of the 67 counties did so.)

The results from estimating the model using just these four explanatory variables do not support the hypothesis that Bush received an excess number of votes in counties using electronic voting. The coefficient on the electronic-voting variable is not positive–which would mean electronic voting raised Bush’s total beyond what would be expected–but the opposite, negative (albeit with a t-ratio of just -0.66).

2. To obtain their reported results, they have to add two other explanatory variables: % Voting for Bush in 2000 times the electronic-voting binary variable and % Voting for Bush in 2000, squared, times the electronic-voting binary variable. These variables reflect the additional claim that problems with electronic voting occurred “... proportional to the Democratic support in the county, i.e., it was especially large in Broward, Palm Beach, and Miami-Dade [counties].” (Page 1 of the Hout et al. paper.)

But they have absolutely no theory for why electronic voting problems that spuriously created votes for Bush should affect counties in proportion to their Democratic votes!

All you need to know. (Links via Alex Strashny and Newmark.)
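
For readers who want to see what the two specifications Newmark describes look like side by side, here is a minimal sketch. It is not the Berkeley authors' code or data: the 67 "counties" are synthetic, generated with no e-voting effect at all, and every name in it (ols, basic_row, extended_row, the units chosen for each variable) is a hypothetical choice made for illustration. It only shows how the basic model's four regressors and the two added interaction terms are built.

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def rss(X, y, beta):
    """Residual sum of squares for a fitted coefficient vector."""
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def basic_row(bush00, votes, evote):
    # Intercept, Bush 2000 share, its square, total 2004 votes, e-voting dummy.
    return [1.0, bush00, bush00 ** 2, votes, float(evote)]

def extended_row(bush00, votes, evote):
    # Basic model plus the two interaction terms with the e-voting dummy.
    return basic_row(bush00, votes, evote) + [evote * bush00, evote * bush00 ** 2]

# Synthetic data only (assumption): 67 counties, 15 of them flagged as e-voting.
random.seed(0)
N_COUNTIES, N_EVOTE = 67, 15
counties = []
for i in range(N_COUNTIES):
    bush00 = random.uniform(0.30, 0.75)   # Bush share in 2000, as a fraction
    votes = random.uniform(0.05, 7.0)     # 2004 Bush + Kerry votes, in 100,000s
    evote = 1 if i < N_EVOTE else 0
    # Change in Bush share; note there is NO e-voting term in this process.
    change = 0.02 + 0.03 * bush00 + random.gauss(0, 0.01)
    counties.append((bush00, votes, evote, change))

y = [c[3] for c in counties]
X_basic = [basic_row(*c[:3]) for c in counties]
X_ext = [extended_row(*c[:3]) for c in counties]

beta_basic = ols(X_basic, y)
beta_ext = ols(X_ext, y)

print("basic model, e-voting coefficient:", round(beta_basic[4], 4))
print("extended model, e-voting and interaction coefficients:",
      [round(b, 4) for b in beta_ext[4:]])
print("RSS basic vs. extended:", rss(X_basic, y, beta_basic), rss(X_ext, y, beta_ext))
```

The extended model can never fit worse than the basic one, because it only adds regressors. That is part of why Newmark's objection matters: without a theory for the interaction terms, a better fit by itself says little.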

Update 2: UCLA’s Jason Lenderman has more on the statistical problems with the Berkeley analysis.