The findings are based on a survey in which both experts and novices were asked to predict whether mouse experiments in six prominent preclinical cancer studies conducted by the Reproducibility Project: Cancer Biology (RP:CB) would reproduce the effects observed in the original studies.
On average, the survey participants forecasted a 75% probability of replicating statistical significance, and a 50% probability of reproducing the same effect size as in the original study. Yet according to these criteria, none of the six studies already completed by the Reproducibility Project (the last of which was published this week in eLife) showed the same results previously reported.
One possible explanation for the optimism is that cancer scientists overestimate the replicability of major reports in their field. Another is that they underestimate the logistical and methodological complexity of independent laboratories repeating these techniques.
The work follows numerous reports exploring biomedicine’s so-called reproducibility crisis. Over the last 10 to 15 years, concerns have mounted that some of the techniques and practices used in biomedical research lead to inaccurate assessments of a drug’s clinical promise.
Given that not all studies reproduce, Kimmelman and his team wondered if cancer experts could at least sniff out which studies would not easily replicate. The finding that cancer researchers’ ability to do so “was really limited” suggests that there may be inefficiencies in the process by which science “self-corrects.”
There is, however, strong community concern that, due to process-related issues and potential methodological differences, the replication studies themselves may not be an entirely reliable measure of replication outcome. Kimmelman emphasizes that the findings don’t indicate that scientists who participated in the study don’t understand what’s going on in their field, nor do they diminish the importance of funding research and making policy on the basis of scientific consensus. Some scientists were highly accurate in their predictions, and participants were new to forecasting, which can be challenging.
The results do, however, raise the possibility that training might help many scientists overcome certain cognitive biases that affect their interpretation of scientific reports.
“If the research community believes a finding to be reliable, it might start building on that finding only to later discover the foundations are rotten. If scientists suspect a claim to be spurious, they are more likely to test that claim directly before building on it.”
“This is the first study of its type, but it warrants further investigation to understand how scientists interpret major reports,” Kimmelman says. “I think there is probably good reason to think that some of the problems we have in science are not because people are sloppy at the bench, but because there is room for improvement in the way they interpret findings.”