Brendan Nyhan

  • New at CJR: The predictable failure of Americans Elect

    I have a new column up at Columbia Journalism Review on the failure of the independent group Americans Elect and the pundits who overhyped the group’s efforts. Here’s how it begins:

    On Thursday, the board of Americans Elect folded its presidential nominating process after the set of declared candidates repeatedly failed to muster the support required to receive the group’s backing. Despite spending $35 million on “swank offices,” a fancy website, and expensive ballot access drives, Americans Elect ultimately attracted neither a credible candidate nor widespread support.

    If you read the op-ed pages, you might have had different expectations…

    Read the whole thing for more.

  • The importance of NSF funding for political science

    Last week, Rep. Jeff Flake (R-AZ) sponsored an amendment in the House that would prohibit National Science Foundation support for political science. Other political scientists have already written eloquently about the important research that NSF grants have supported. I thought I would build on their points by highlighting how work I have featured on this blog has been supported by the NSF:

    1. I’ve frequently cited Keith Poole and Howard Rosenthal’s DW-NOMINATE estimates of Congressional ideal points. Despite its limitations, Poole and Rosenthal’s work, which has been supported by numerous NSF grants over the past three decades, provides the best available evidence of trends in polarization and the extremity of legislators’ voting records.

    2. Another important metric for understanding American politics is Jim Stimson’s measure of public mood, which I often invoke as a counter to prevailing mythology about presidential leadership. Stimson’s pioneering work shows that public opinion tends to move in the opposite direction of government policy, which is what led me to predict in May 2009 that “we should expect demand for government to decline substantially over the next few years given that Democrats control both Congress and the White House” even though Obama and the Democrats were ascendant. Stimson recently received a grant to collaborate with Frank Baumgartner to extend his measure of public mood to specific issue domains and to link it with data from another NSF-supported project measuring government attention to policy issues.

    3. Over the years, I have cited the classic book Why Parties?, which was written by my Ph.D. adviser John Aldrich, as a corrective to the mistaken notion that parties are passé or that a centrist third party will emerge. Aldrich’s early research on modeling party activism received NSF support.

    4. Finally, I’ve also cited NSF-supported research by Gary Cox on strategic voting, which is one of the key barriers to third-party electoral success in a two-party system, and NSF-backed research by Cox and Mat McCubbins on how majority parties prevent legislation that would split their party from being considered on the floor.

    None of these studies have direct policy implications or economic benefits, but the same is true of many of the projects funded by NSF. As Penn State’s Chris Zorn points out, Flake is targeting our discipline due to the perceived lack of value of its research – an intrusion into the scientific process that, as Flake himself admits, saves no money (see Ezra Klein and Inside Higher Ed for more on this point). I hope readers will contact their senators to help defeat this amendment.

    Update 5/16 12:31 PM: For more on the thermostatic relationship between public opinion and public policy, see also this AJPS article (gated) by Chris Wlezien and his later collaborative work with Stuart Soroka, especially their book Degrees of Democracy.

  • NPR interview on partisanship versus facts

    I was interviewed by NPR’s Shankar Vedantam for a segment that just aired on Morning Edition.

    The story features my research with Jason Reifler on self-affirmation and misperceptions. Here’s the title and abstract of our manuscript (PDF):

    Opening the Political Mind?
    The effects of self-affirmation and graphical information
    on factual misperceptions

    People often resist information that contradicts their preexisting beliefs. This
    disconfirmation bias is a particular problem in the context of political misperceptions,
    which are widespread and frequently difficult to correct. In this paper, we examine two
    possible explanations of the prevalence of misinformation. First, people tend to resist
    unwelcome information because it is threatening to their worldview or self-concept.
    Drawing from social psychology research, we therefore test whether affirming
    individuals’ self-worth and thereby buttressing them against this threat can make them
    more willing to acknowledge uncomfortable facts. Second, corrective information is
    often presented in an ineffective manner. We thus also examine whether graphical
    corrections may be more effective than text at reducing counter-arguing by individuals
    inclined to resist counter-attitudinal information. Results from three experiments show
    that self-affirmation substantially reduces reported misperceptions among those most
    likely to hold them, suggesting that people cling to false beliefs in part because giving
    them up would threaten their sense of self. Graphical corrections are also found to
    successfully reduce incorrect beliefs among potentially resistant subjects and to perform
    better than an equivalent textual correction. However, contrary to previous research,
    affirmed subjects rarely differ from unaffirmed subjects in their willingness to accept new
    counter-attitudinal information.

    For more information on our research, please see our original article on the difficulty of correcting factual misperceptions, our New America report summarizing the literature on misinformation and fact-checking, and our study of race of researcher effects and the Obama Muslim misperception (all PDFs).

  • New at CJR: Obama “evolves,” Romney “flip-flops”

    I have a new column at Columbia Journalism Review on the differing narratives that the press has applied to the changing political positions of Barack Obama and Mitt Romney. Here’s how it begins:

    Are Barack Obama and Mitt Romney so different after all? Despite the media’s portrayal of Romney as a uniquely craven politician, the recent controversy over Obama’s views on gay marriage highlights the ways that both candidates—like nearly all politicians—have adjusted their positions over their careers for political reasons.

    Read the whole thing for more.

  • NPR interview on “The Death of Facts”

    I was interviewed for NPR’s Weekend All Things Considered on “The Death Of Facts In An Age Of ‘Truthiness’” — here’s the audio (the part I’m in starts at 7:20 in the clip):


    A transcript of the interview is here.

  • More on pre-accepted academic articles

    A couple of weeks ago, I posted a proposal for four academic reforms. Most notably, I suggested that academic journals should pre-accept articles based on their design before results are available to authors, which would reduce the system’s bias toward statistically significant findings that fail to replicate in future studies.

    Unsurprisingly, other people have had similar ideas. Here are some comments on the related proposals that I’ve come across:

    1. Chris Said, a postdoc at the Center for Neural Science at NYU, emphasizes the importance of funding agencies in promoting replication:

    Granting agencies should reward scientists who publish in journals that have acceptance criteria that are aligned with good science. In particular, the agencies should favor journals that devote special sections to replications, including failures to replicate. More directly, the agencies should devote more grant money to submissions that specifically propose replications.

    The problem, however, is that most replications will continue to fail given the incentives produced by the current system. That’s why I’m most interested in his proposal for funding agencies to give preference to scientists who publish in “outcome-unbiased” journals:

    I would like to see some preference given to fully “outcome-unbiased” journals that make decisions based on the quality of the experimental design and the importance of the scientific question, not the outcome of the experiment. This type of policy naturally eliminates the temptation to manipulate data towards desired outcomes.

    He makes a convincing argument that NIH, NSF, etc. could play a key role in overcoming the collective action problem inherent in switching to a new system. I still think leading journals could make a contribution on their own, but the funding agencies could be especially influential in fields that are heavily grant-driven. In practice, this could mean both rewarding scientists who have published in outcome-unbiased journals in the past and favoring grant proposals that promise to submit the proposed study to such a journal.

    2. George Mason economist Robin Hanson proposes results-blind peer review, a potentially more general approach that could be applied to non-experimental data:

    I’d add an extra round of peer review. In the first round, all conclusions about signs, amounts, and significance would be blanked out. After a paper had passed the first round, the reviewers would see the full paper. While reviewers might then allow the conclusions to influence their evaluation, they could not as easily hide such bias. Reviewers who rejected on the second round after accepting on the first round would feel pressure to explain what about the actual results, over and above the method, suggested that the paper was poor.

    Glymour and Kawachi offered a similar proposal in BMJ in 2005:

    We offer a solution to this problem that lies at the disposal of journal editors. Preliminary editorial decisions could be based solely on the peer review of the introduction and methods sections of submitted papers. These two sections deal with the key issues on which editorial decisions would ideally be based: the importance of the research question and the potential for the study design and proposed analyses to inform that question.

    Blinding reviewers to the results and discussion sections may pose some challenges to the reviewing process because elements of these later sections are also relevant for editorial decisions. However, these difficulties would probably be outweighed by the benefits of reducing publication bias. Peer reviewers might be asked to make a preliminary recommendation to the editor (reject or continue further review) on the basis of the merit of the study design and proposed data analyses—not on the findings themselves.

    If manuscripts pass this initial stage then reviewers could be unblinded to the results and discussion sections. Our proposal could have the additional benefit of improving the clarity and detail of methods sections.

    The problem, as a commenter on Hanson’s blog notes, is that many reviewers have already read relevant papers in their field or seen talks about them at conferences, especially in social science (which is characterized by very long publication lags). Even if the reviewer has not read the paper in question before being assigned the review, it’s often easy to look it up online and find the results. As a result, this approach could only work in fields where papers are not made public before they are published.

    3. Columbia’s Macartan Humphreys, Raul Sanchez de la Sierra, and Peter van der Windt have proposed “comprehensive registration” for experiments in political science:

    Researchers in political science generally enjoy substantial latitude in selecting measures and models for hypothesis testing. Coupled with publication and related biases, this latitude raises the concern that researchers may intentionally or unintentionally select models that yield positive findings, leading to an unreliable body of published research. To combat this problem of “data fishing” in medical studies, leading journals now require preregistration of designs that emphasize the prior identification of dependent and independent variables. However, we demonstrate here that even with this level of advanced specification, the scope for fishing is considerable when there is latitude over selection of covariates, subgroups, and other elements of an analysis plan. These concerns could be addressed through the use of a form of “comprehensive registration.” We experiment with such an approach in the context of an ongoing field experiment for which we drafted a complete “mock report” of findings using fake data on treatment assignment. We describe the advantages and disadvantages of this form of comprehensive registration and propose that a comprehensive but non-binding approach be adopted as a first step in registration in political science. Likely effects of a comprehensive but non-binding registration are discussed, the principal advantage being communication rather than commitment, in particular that it generates a clear distinction between exploratory analyses and genuine tests.

    Unfortunately, the incentives to engage in this form of registration are weak. The comprehensive report format limits authors’ ability to produce the statistically significant findings that reviewers demand and may lead authors to opt out of registration or to shelve non-significant findings. That’s why it’s essential that pre-accepted articles be offered as an option to authors by top journals.

    4. MIT’s David Karger has suggested changing the submission requirements for conference papers in computer science so that evaluations of the proposed system are conducted after acceptance, increasing the incentives for evaluation and reducing incentives to report that the evaluation results were successful.

    5. Perhaps most notably, Northwestern’s Philip Greenland, the past editor of the Archives of Internal Medicine, conducted a pilot study of “mechanisms that might identify and reduce biases,” including a two-stage review process:

    First, to understand the tendency of authors to submit mostly positive studies, we assessed the percentage of positive articles that authors submitted to the Archives. Of 100 consecutive submitted manuscripts assessed in June and July of 2008, 77% reported significant primary results, based on editors’ assessments of the results. If the articles had been categorized based on the authors’ interpretation of their analyses, a higher percentage of manuscripts would have fallen into the positive category. Of the manuscripts sent out for external peer review, over 83% of positive studies were accepted by the Archives. Only 3 negative studies were sent to external review, of which only 1 was ultimately accepted. Overall, only 5.3% of all negative studies that were submitted were accepted.

    Recognizing that publication bias can result from reviewers’ enthusiasm for positive results, we next evaluated the willingness of our 58 most highly rated and prolific peer reviewers to participate in an alternate peer-review process. The proposed hypothetical alternate process involved 2 steps. First, peer reviewers would have access only to a modified abstract containing no mention of results, the full introduction describing the nature of the research question, and a complete “Methods” section to allow an evaluation of the quality of the research. With this information available, the reviewers would be asked to provide a preliminary assessment of the manuscript in the absence of the “Results” section. Following this preliminary assessment, we proposed that reviewers would gain access to the full article, including the “Results” section, and be asked to make a final evaluation of the manuscript. We hypothesized that this 2-stage procedure would force peer reviewers to make an initial evaluation solely based on the quality of the methods and that the result would be a more equitable consideration of well-performed negative studies. Of the 43 respondents, 37 (>86%) stated that they were willing to complete a full review following an abbreviated one as described herein.

    We then turned to an assessment of the role of the editorial board. Prior to peer review, editors may decide to reject articles on their face value. Furthermore, editors assign reviewers and render final decisions after receiving reviewer comments. At the Archives, an editorial estimate of study rejection without any external peer review was roughly 70% of all submissions, whereas a JAMA study reported a 50% editorial rejection rate at that journal. These substantial figures suggest that any investigation of publication bias at the journal level ought to begin with, or at least include, the editors. Consequently, the aforementioned alternate review process was applied to the editorial review that occurred prior to outside peer review. In a pilot study, among a selection of submitted articles, a study was characterized as positive if an author’s conclusion about his or her primary outcome was portrayed as such. Of the 46 articles examined, 28 were positive, and 18 were negative (with an explicit attempt to oversample negative studies in this pilot research). Ultimately, 36 of the 46 articles (>77%) were rejected, consistent with prior publication decisions at this journal. Of note, editors were consistent in their assessment of a manuscript in both steps of the review process in over 77% of cases. This suggests that most of the time the editors’ decision after reviewing the “Methods” section alone does not change after reviewing the full results.

    Although this provides some comfort, it is important to look at not only the majority of manuscripts but also the tail ends of the curve, because this is most likely where any bias would lie. In doing so, we found that over 7% of positive articles benefited from editors changing their minds between steps 1 and 2 of the alternate review process, deciding to push forward with peer review after reading the results. By contrast, in this small study, we found that this never occurred with the negative studies. Indeed, 1 negative study, which was originally queued for peer review after an editor’s examination of the introduction and “Methods” section, was removed from such consideration after the results were made available.

    We admit that these findings are neither conclusive nor definitive but rather a descriptive analysis from a pilot study. Certainly, it is reassuring that the editors were mostly consistent in their opinions regardless of the results. However, in the minority of cases in which bias matters, the influence of the results on the editor’s decision to move to peer review and ultimately to publication is still uncertain. There is a dearth of rigorous research on editorial bias and the possible interventions that may attenuate it. The alternate review process piloted at the Archives has never been performed before, to the best of our knowledge, although it has been suggested. Importantly, such a mechanism can be implemented both with editors and peer reviewers, addressing 2 sources of potential bias over which a medical journal can have the most direct impact. The negative trial by Etter et al published in this issue of the Archives was a part of our pilot study. Obviously, the editors supported peer review and publication of this study based on the rigor and quality of its methods alone, and that decision was sustained even when the negative results were revealed to them.

    Greenland is to be commended for his willingness to innovate, but the results reported above suggest some of the challenges that a two-stage review system will face and the need for further experimentation by journals. Most disappointingly, the current instructions to reviewers provided by the journal make no mention of the two-stage process, suggesting that the approach has been abandoned by his successors. Let’s hope some other journal editor out there is willing to experiment further.

    Update 4/30 10:06 AM: One challenge raised by Chris Said via email is how these approaches could be adapted to fields like neuroscience in which articles typically include several studies that build on each other. Here are two possible approaches I’ve contemplated:

    1. The journal offers repeated rounds of results-blind reviewing in which authors propose Study 1, obtain the results, and then return for a further results-blind round for the next study. This approach would ensure that each round was fully outcome-unbiased, but it would increase the burden on reviewers and editors.

    2. An alternate option would be for authors to conduct a set of exploratory studies 1…x and then submit the design and analysis plan for study x+1 on a pre-accepted basis. Readers would then be told that the results of studies 1…x were not pre-specified but that study x+1 was pre-specified.

    Also, I’ve updated the Humphreys item above to include his co-authors on the paper in question (which is not yet available publicly). Finally, see Hanson’s followup item here.

  • Obama’s second scandal: Secret Service

    A few weeks ago, I noted the arrival of the GSA scandal as the first under President Obama to meet the standard used in my research: a front-page Washington Post story that focuses on the controversy and describes it as a “scandal” in the reporter’s own voice. I then taped an interview with NPR’s On the Media about my research in which I noted the role that slow news periods play in fomenting scandal and suggested that Obama was vulnerable to executive branch scandal in the period before the fall campaign.

    Just two days after the interview was taped, news broke of Secret Service agents hiring prostitutes in Colombia. This controversy quickly became Obama’s second scandal according to the standard I’ve proposed. Indeed, it has racked up six front-page Post stories since April 17 (by comparison, the GSA scandal has had only two).* After years of avoiding scandal, President Obama is learning how easily it can engulf an administration – it’s quite a reversal.

    * The Post appears to produce different versions of its front page for different editions. To maintain consistency with my research, my posts on Obama scandal coverage in the Post use the articles and page numbers archived in the Nexis news database.

  • Academic reforms: A four-part proposal

    Academia tends to be slow to embrace change, but here are a few ideas that I think are worth considering for improving how we evaluate students, conduct research, and run our journals.

    1. The pass/fail first semester

    Two of the most significant problems we face in higher education are grade inflation and underprepared students. There are no easy answers to either problem, but one of the best approaches I’ve seen is the pass/fail first semester used at Swarthmore College (my alma mater). Let me quote from a blog post written by a first-year student there last fall, which I just came across on Google — it’s completely consistent with my experience:

    The first semester for every first-year at Swat is pass/fail. I love this system, and it’s one of so many reasons why the approach to academics at Swarthmore is fantastic.

    Taking classes pass/fail deemphasizes the importance of grades. That seems obvious, and we heard that over and over again from the administration, our advisors, and upper class students. I didn’t really internalize the significance of that, however, until just recently…

    The pass/fail semester helps first-years adjust to college. With some stress removed from academics, there’s more time to focus on other aspects of college: meeting new friends, joining interesting clubs, and trying not to get lost on the way to the fitness center (I had particular trouble with that last one). I’m not saying that this first semester is a breeze, or that it should be. It’s important to learn study habits that work for college, and figuring out how to manage your time is obviously essential (for example, spending one hour online-shopping for every half hour spent reading did not end up working for me). What’s great is being able to adjust without having to simultaneously stress out about grades.

    Grades will come next semester, but the class of 2015 will tackle our workload with a greater appreciation for the material learned, and an understanding of the importance of the learning process, not just the grade received at the end of the year. I’m so glad Swarthmore gave us this adjustment period.

    The pass/fail semester helps students get excited about learning for learning’s sake before worrying about grades, and it provides underprepared students with a chance to catch up before their performance is recorded on their permanent transcript. It’s worth considering whether the practice should be adopted both here at Dartmouth and elsewhere in higher education.

    2. The pre-accepted article

    Academics face intense pressure to publish new findings in top journals. In practice, those incentives create massive publication bias. Social scientists tend to think of medical and scientific journals as being more rigorous, but many of the results published even in those journals fail to replicate. While some fraud may occur, the problem is more likely to be one of self-deception — as human beings, we’re simply too good at rationalizing choices that produce the results we want.

    One response to this concern is preregistration of experimental trials — a practice that is mandated in some areas of medicine and is beginning to be done voluntarily by some social science researchers conducting field experiments (particularly in development economics). The idea is that the author has publicly stated his or her hypotheses before the data have been collected and that the results are therefore less likely to be spurious. The best example of this that I know of is the Oregon Health Insurance Experiment, which publicly archived its analysis plan before any data were available and explicitly labeled all unplanned analyses in its manuscript (PDF).

    Unfortunately, preregistration alone will not solve the problem of publication bias. First, authors have little incentive to engage in the practice unless it is mandated by regulators or the journal to which they are submitting. In addition, authors may still make arbitrary choices in how they code, analyze, and present the results of preregistered trials. But most fundamentally, if trial results are more likely to be published when they deliver statistically significant results, then publication bias is still likely to ensue.

    In the case of experimental data, a better practice would be for journals to accept articles before the study is conducted. The article would be written up to the point of the results section, which would then be populated using a pre-specified analysis plan submitted by the author. The journal would then allow for post-hoc analysis and interpretation by the author that would be labeled as such and distinguished from the previously submitted material. By offering such an option, journals would create a positive incentive for preregistration that would avoid file drawer bias. More published articles would have null findings, but that’s how science is supposed to work. A shift to a pre-accepted article system would also create healthy pressure on authors, editors, and reviewers to (a) focus on topics where we care about the null hypothesis; (b) keep articles short; and (c) make sure studies have enough statistical power to have a high likelihood of capturing the effect of interest (if real).

    3. The replication audit

    Ideally, every journal should follow the practice of the American Economic Review and require authors to submit a full replication archive before publication. However, my colleague Brian Greenhill has suggested a way that journals or professional associations could go even further to encourage careful research practice: conduct replication audits of a random subset of published articles. At a minimum, these audits would verify that all the results in an article could be replicated. They could conceivably go further in some cases and try to recreate the author’s data and results from publicly available sources, re-run lab experiments, etc. when possible. An audit system would of course work best for journals that require replication archives to be made available — otherwise, it could discourage authors from sharing replication data.

    4. A frequent flier system for journals

    Journals depend on the free labor provided by academics in the peer review process. Reviewing is a largely thankless task whose burden falls disproportionately on prominent and public-minded scholars, who receive little credit for the work that they do. As a result, manuscripts are often stuck in review limbo for months, slowing the publication process and stalling both the production of knowledge and the careers of the authors in question. How can we do better?

    One idea is to develop a points system for each journal analogous to frequent flier miles. Each review would earn a scholar a certain number of points, with bonuses awarded by editors for especially timely or high-quality reviews. Authors could then cash in those points when submitting to that journal to request a rapid review of their own manuscript. The journal would in turn offer those points to reviewers who review the manuscript quickly, helping to speed it through the process. The system would not be useful for reviewers who don’t submit to the journal in question, but for reviewers and authors who interact with a journal over a period of decades, it could help provide greater incentives for rapid and thoughtful reviewing.

    Update 4/27 10:16 AM: Please see my followup post for more on pre-accepted articles.

    Also, it turns out that a large group of psychologists is engaging in a collaborative replication audit, called The Reproducibility Project, of psychology articles published in top journals in 2008 – see this article in the Chronicle of Higher Education for more about the project.

    Finally, I recently discovered that the American Medical Association offers continuing medical education credits to reviewers for Archives of Internal Medicine who “have completed their review in 21 days or less with a rating of good or better.” CME credits are presumably not as strong an incentive as faster review of one’s own articles, but I assume they’re better than nothing.

  • On the Media interview on presidential scandal

    I was interviewed by Bob Garfield for On the Media about my research on the role of the news cycle in presidential scandal (PDF), including the GSA scandal that I posted about a couple of weeks ago. Here’s the audio for those who are interested:

  • New Tumblr blog

    I’ve set up a Tumblr version of this blog so that those who use the service can follow me there, reblog posts, etc. Please check it out if you’re a Tumblr user!