The art of designing sound, reproducible experiments is nearly impossible to perfect. As someone who has been involved in behavioral neuroscience research both in and out of the pain field for a few years, I often find myself contemplating the multitude of factors that could skew our findings: a particular room’s environment (all seemingly identical testing rooms are not created equal), outside disturbances (anyone ever experience the mid-testing fire alarm?), and even my mood at the time of testing (hunger definitely makes one impatient!). We try our best to control for all possible variables, but irreproducibility of research is a persistent, common problem.
The ability to reproduce studies, especially in preclinical research, concerns the scientific community for obvious reasons: if a finding holds only in one lab at one time, it cannot be assumed to hold in general and cannot be carried forward into clinical research. Recent articles, such as “Trouble at the Lab” in the October 2013 edition of The Economist, have made this crucial issue a public one. How can we as researchers take additional steps to address this widespread, increasingly public concern?
Although the problem is broad and will surely take time and a shift in science culture to improve, the four speakers at the “Enhancing Reproducibility of Neuroscience Studies” symposium at the Society for Neuroscience annual meeting in Washington, DC, US, each laid out clear, achievable steps to tackle the issue. Story Landis, former director of the National Institute of Neurological Disorders and Stroke (NINDS), co-chaired the session with Thomas Insel, director of the National Institute of Mental Health (NIMH), both at the US National Institutes of Health (NIH), Bethesda, Maryland. Landis was the lead author of a commentary published in Nature two years ago highlighting the continued, highly collaborative efforts needed to confront this challenge (Landis et al., 2012). To open the symposium, she gave a brief introduction outlining key issues that contribute to irreproducibility, including the misuse of statistics, the intrinsic difficulty of studying complex biological processes, and a lack of transparency in reporting data—all of which stall the progress of translational studies and clinical work.
NIH director Francis Collins, the first speaker, conveyed the NIH perspective on causes of and solutions to irreproducibility. To emphasize the enormity of the problem, he referenced a 2011 article in Nature Reviews Drug Discovery in which the authors had tried to replicate 67 published target validation studies (mostly in oncology) and found that a whopping 65 percent were not fully reproducible (Prinz et al., 2011).
Collins then went on to identify potential causes of poor reproducibility: the innate novelty of cutting-edge science, confounding variables, and differences in resources. He offered a simple equation of causes leading to irreproducibility: (Deficient experimental procedures) × (Lack of transparency in reporting) × (Publication bias) = Poor reproducibility.
To alter the equation and enhance reproducibility, Collins suggested focused education to improve deficient experimental procedures, better review processes to achieve transparent reporting, and openness in publication so that all outcomes may be published—even negative ones. More specifically, in terms of education, Collins proposed raising community awareness, improving formal training at all levels, and developing clear guidelines and courses for experimental design. In fact, there are now formal principles and guidelines for reporting preclinical research on the NIH website, to which more than 120 journals have agreed to adhere.
Regarding transparency of reporting, Collins and the NIH suggest that grant applications and publications involving in vivo work report blinding, randomization, the handling of all data, and justification of sample size. He also urged the need to end “p-hacking”—the practice of re-analyzing data with different statistical tests until a significant p-value emerges.
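To see why p-hacking is so corrosive, consider a toy simulation (a hypothetical sketch of mine, not anything presented in Collins's talk). Both groups below are drawn from the same distribution, so every “significant” result is a false positive; yet simply testing the data repeatedly as it accumulates and keeping the best p-value roughly doubles the nominal 5 percent false-positive rate.

```python
# Hypothetical illustration of p-hacking via "optional stopping":
# test after every batch of data and keep the smallest p-value.
import math
import random

random.seed(42)

def welch_p(a, b):
    """Two-sided p-value for a Welch-style two-sample comparison,
    using a normal approximation (adequate for n >= 30)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    z = abs(ma - mb) / math.sqrt(va / len(a) + vb / len(b))
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def hacked_experiment(peeks=(30, 40, 50)):
    """Both groups from the SAME N(0,1): peek at the data three times
    and report the most favorable p-value."""
    a = [random.gauss(0, 1) for _ in range(max(peeks))]
    b = [random.gauss(0, 1) for _ in range(max(peeks))]
    return min(welch_p(a[:k], b[:k]) for k in peeks)

trials = 2000
honest = sum(welch_p([random.gauss(0, 1) for _ in range(30)],
                     [random.gauss(0, 1) for _ in range(30)]) < 0.05
             for _ in range(trials)) / trials
hacked = sum(hacked_experiment() < 0.05 for _ in range(trials)) / trials
print(f"single pre-planned test: {honest:.3f} false-positive rate")
print(f"best of three 'peeks':   {hacked:.3f} false-positive rate")
```

The same inflation occurs when swapping among different statistical tests rather than peeking over time: each extra chance at significance raises the odds of a spurious finding.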
Importantly, Collins emphasized the need to reduce the pressure of “publish or perish,” which would entail a major cultural change in the academic world. To initiate this attitude shift, he sees a need to focus on a scientist’s accomplishments, not just publications; to fund study replications, not just novel research; and to somehow increase job stability for investigators.
Next up, Veronique Kiermer, executive editor and head of Researcher Services at Nature Publishing Group, discussed the role of journals in tackling irreproducibility. Like the other speakers, she stressed the topics that rightfully fall under the umbrella of improving reproducibility: using statistics properly, designing sound experiments, and eliminating bias and the cherry-picking of data. Addressing the problem of irreproducibility, Kiermer said, is not about addressing scientific fraud (a whole other beast). Nor does she assume that fixing replication issues will automatically solve the lack of generalization of results across model systems. In addition, attempts to enhance reproducibility should not overlook the intrinsic complexity of biological systems—researchers can reduce irreproducibility as much as possible with sound design and statistics, but many systems are inherently complicated and can introduce unexpected, even imperceptible issues.
Kiermer also discussed how Nature has already worked toward addressing the issue of reproducibility; in May 2013, the journal implemented a checklist of reporting standards, eliminated the length limit of the methods section, increased the scrutiny of statistical methods, and re-emphasized sharing of data among researchers. The role of journals regarding this problem, she summarized, should be to raise awareness, catalyze and facilitate discussion, ensure proper reporting, and respond quickly and thoroughly to criticism of published papers.
Following Kiermer, Huda Zoghbi, a professor at Baylor College of Medicine, Houston, Texas, US, and an HHMI investigator, presented approaches to enhance disease-oriented research. She outlined her method for improvement with four themes. First, she said, understand the disease: patients are our inspiration, so we must know them well to thoroughly grasp the disease. According to Zoghbi, this entails either direct evaluation of patients by the researchers themselves, or a close collaboration with a physician with first-hand knowledge of the disorder. Second, understand the value and limitations of animal models: we must recognize their limits, be careful with our interpretations, and allow only the models with clinically relevant phenotypes to persist. Third, pay careful attention to experimental design and statistics: Zoghbi emphasized that basing new work on poorly designed studies is akin to throwing money out the window, since the new research often fails.
Zoghbi also pointed out a few statistical pitfalls particular to neuroscience. In cellular studies, for example, multiple neurons taken from the same mouse for cellular recordings are not truly independent samples, yet this is how many studies are conducted. She recognized that a new mouse for every neuron would come at great cost, both monetarily and in time spent, but that this may be the only proper solution.
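Zoghbi's point about non-independence can be made concrete with a toy simulation (my own hypothetical sketch, with invented numbers, not anything she presented). When a per-animal effect is shared by every neuron recorded from that mouse, pooling the neurons as if they were independent samples makes the standard error look far smaller than a mouse-level analysis warrants:

```python
# Illustrative pseudoreplication sketch: each measurement is a shared
# mouse-level effect plus neuron-level noise, so 50 neurons from 5 mice
# carry far less information than 50 independent samples would.
import math
import random

random.seed(7)

def simulate_group(n_mice=5, neurons_per_mouse=10,
                   mouse_sd=1.0, neuron_sd=0.5):
    """One group of mice; returns a list of per-mouse neuron readouts."""
    group = []
    for _ in range(n_mice):
        mouse_effect = random.gauss(0, mouse_sd)  # shared by all neurons
        group.append([mouse_effect + random.gauss(0, neuron_sd)
                      for _ in range(neurons_per_mouse)])
    return group

def sem(values):
    """Standard error of the mean."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return math.sqrt(var / len(values))

# Average the two SEM estimates over many simulated experiments.
reps = 500
neuron_sems, mouse_sems = [], []
for _ in range(reps):
    g = simulate_group()
    neuron_sems.append(sem([x for mouse in g for x in mouse]))  # 50 "samples"
    mouse_sems.append(sem([sum(m) / len(m) for m in g]))        # 5 real samples
avg_neuron = sum(neuron_sems) / reps
avg_mouse = sum(mouse_sems) / reps
print(f"SEM treating every neuron as independent: {avg_neuron:.3f}")
print(f"SEM using one average per mouse:          {avg_mouse:.3f}")
```

The pooled analysis reports an error bar several times too small, which is exactly how spuriously significant group differences arise. Mixed-effects models offer a middle ground between pooling everything and Zoghbi's costly one-mouse-per-neuron solution.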
Her potential solutions to irreproducibility include supporting collaborations, helping funders and editors to recognize the importance of replication, encouraging sound experimental design, and promoting the quest for truth while not fearing failure.
The last speaker of the symposium, John Morrison, dean of Basic Sciences and the Graduate School of Biomedical Sciences at Mount Sinai Hospital, New York City, US, discussed solutions via better training of students and professionals alike. At the laboratory level, he emphasized the need to discuss best practices (e.g., detailed notes and methods) and to instill these rigorous procedures in lab culture by giving them high priority. When learning laboratory methods, he recommended training with the original source whenever possible, to eliminate the confusion or alterations that can be introduced as techniques are passed down. He also discussed the desperate need to increase statistical training for scientists at all levels, including postdoctoral fellows and beyond. In addition, like others before him, Morrison brought up the need to reduce the pressure for tenure; this pressure encourages many of the bad practices that perpetuate the vast issue of irreproducibility.
To conclude the symposium, Morrison assigned required reading to everyone in the audience interested in this ever-growing topic (“All of you must be interested if you’re here at 8 a.m. on a Sunday!”): “Rigor or Mortis: Best Practices for Preclinical Research in Neuroscience,” published recently in Neuron (Steward and Balice-Gordon, 2014). If you’ve made it to this point, you are likely also interested, so I suggest you read it as well. Undoubtedly, a cultural change in all realms of the scientific community is essential to improving the reproducibility of research in the field of pain, in neurobiology, and in science at large.
Chelsea Nickerson is a research assistant in the F.M. Kirby Neurobiology Center, Boston Children’s Hospital, US.