Papers That Scientists Could Not Replicate Were Cited More, Study Says


Bengaluru: Scientists have cited studies that could not be reproduced – and were therefore likelier to be false – more often than those that could, according to a new study published on May 21, 2021.

Science is additive. Most findings we now accept as facts were built up over many years of research, by various groups of scientists working around the world. Sometimes generations passed before a real-world application of an early discovery came into view. But all of this progress hinged on whether each incremental finding, from the first one, was true. This very nature of science makes it necessary for scientific studies to be reproducible under the same experimental conditions.

One of the most important tenets of the scientific method is that the results of any experiment need to be reproducible. That is, if one group of scientists conducts an experiment under certain fixed conditions, another group conducting the same experiment under the same conditions needs to arrive at the same results.

The inability to reproduce the same results has laid many a claim to rest. Prominent examples include claims that GFAJ-1 bacteria could substitute arsenic for some of the phosphorus atoms in their bodies, that MMR vaccines cause autism, and that striking ‘power poses’ could make individuals more assertive.

Such claims don’t just pad scientists’ résumés (until they’re found out). They can have debilitating real-world consequences, too. For example, scientists can develop a cancer drug only if they identify a molecule to which the drug can bind to block the disease’s spread. But efforts to develop such a drug could meet a dead end if researchers find themselves unable to replicate the study that claimed the molecule contributes to worsening the disease.

In the early 2010s, the scientific community began to confront a ‘replication crisis’, some years after John Ioannidis’s famous 2005 paper. Ioannidis had argued that most published research findings may be false because of the kind of statistical analyses scientists used to test them. Scientists were unable to reproduce the results of many previously published studies under the same experimental conditions – especially in psychology, but extending to medicine and economics as well.

In 2015, one group of scientists reported that they had tried to replicate a hundred psychology studies – and found that they couldn’t reproduce the results of two-thirds of them. They also wrote that even in the studies they could replicate, the effects were not as stark as the original authors had claimed.

In the new study, Marta Serra-Garcia and Uri Gneezy, members of the faculty of management at the University of California, San Diego, wanted to assess the impact of this and two other replication efforts on scientific practice.

In 2016, a team tried to reproduce 18 economics studies published in two leading journals, and failed to replicate seven. In 2018, another group attempted to replicate 21 social-science papers published in Nature and Science, and found that only 13 held up. And again, in both instances, there was evidence that the original findings could have been overstated.

“We wanted to better understand the consequences of the replication crisis – if it had changed how, and how often, the non-replicable papers were cited compared to replicable ones,” Serra-Garcia told The Wire Science.

Björn Brembs, a professor of neurogenetics at the University of Regensburg, Bavaria, said Serra-Garcia and Gneezy had made a good decision by trading off sample size for lower selection bias.

When scientists writing a paper invoke the results of another, already published, paper, it’s called a citation. And how many times a paper is cited has become a measure of the paper’s ‘impact’ in the research field. Citations also provide a rough estimate of how many other scientists are working on similar research topics. Sometimes, citations can also be negative – they can be used to highlight a study’s failure. For example:

“Even though their report triggered considerable vaccine hesitancy in the community, Wakefield et al (1998)16 failed to have their findings replicated, and their claim that the MMR vaccine causes autism is considered today to be false.”

Serra-Garcia and Gneezy classified the citations of the papers scrutinised in the projects as ‘positive/neutral’ or ‘negative’. They also tested the quality of citations – whether the same authors were citing their previous studies and the ‘impact factor’ of the journal in which the papers citing the older studies were published.

(The impact factor of a journal is a number that denotes the average number of times its recent papers – typically those published in the previous two years – were cited in a given year. Many journals tout their impact factors as a sign of their ‘prestige’.)
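The standard two-year impact-factor calculation can be sketched as simple arithmetic; the figures below are invented purely for illustration.

```python
def impact_factor(citations_this_year: int, papers_prior_two_years: int) -> float:
    """Two-year impact factor: citations received this year to items
    published in the previous two years, divided by the number of
    citable items published in those two years."""
    return citations_this_year / papers_prior_two_years

# A hypothetical journal that published 200 papers over the two prior
# years, which together collected 900 citations this year:
print(impact_factor(900, 200))  # 4.5
```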

The study found that the papers researchers could not replicate had been cited 153 more times, on average, than the replicable ones. This citation trend persisted even after researchers established that the studies couldn’t be replicated.

In addition, only 12% of the citations acknowledged a failure to replicate. The papers that cited non-replicable studies were also published in journals of similar impact factor as the ones that cited replicable studies.

This scenario prevailed even though experts could predict which studies were less likely to replicate. Such studies were still accepted for publication and cited more than those likelier to be replicated. Each year, studies that researchers could not replicate attracted 16 more citations, on average, than replicable ones.

The authors said that reviewers may have applied different standards in the peer-review process when the results were “interesting”.

Brembs agreed that the authors’ speculation is in line with the journals’ selection criteria before reliability became an issue. He pointed, by way of example, to Nature’s instructions to reviewers in 2015 – which pushed the reliability of a submission’s contents to the bottom of the priority list. “What counted were big headlines, irrespective of whether they held up to scrutiny or not,” he said.

These guidelines were changed two years later when two of the three replication projects were published, according to Brembs.

Serra-Garcia said anecdotal evidence indicated a change in how research has been practised since the replication crisis. Many papers have pre-registered experiments and share the data and the analysis codes, she explained. “This in turn could have affected how referees review papers, and they may be more alert,” Serra-Garcia said. “If so, we may not be able to see this until more years have passed since the replication crisis (and all the related work, including ours, about it).”

A 2018 commentary in the journal Nature also pointed to changes in the same direction to “ease replication attempts”. The authors of this piece had surveyed mid-tier journals – as ranked by impact factor – of economics, sociology and psychology.

“Those in sociology and psychology rarely asked” for ready access to data and code, the authors wrote. “By contrast, almost all of the top-tier journals in economics have policies that require software code and data to be made available to editors before publication.”

Until such measures are in place, one must actively look out for replication data, Serra-Garcia said in a statement: “Whenever researchers cite work that is more interesting or has been cited a lot, we hope they will check if replication data is available and what those findings suggest.”

Brembs suggested that we could replace journals with modern platforms and, when we do, we could implement alerts: “When did a study we cited fail to replicate? Is a study I would like to cite on a list to be replicated? Has a paper I’m in the process of citing failed to replicate?”

He also recommended implementing a citation ontology like CiTO – a system that lets a citation specify its relationship to the cited paper, for example that the citing work disputes or extends it. “It’s been around for almost a decade now and nearly every journal ignores it,” Brembs said.

The problem, however, does not rest solely with publishers; the scientific community also carries some of the blame. In Brembs’s words, “we should not throw stones: we have sheepishly been making matters worse by preferentially hiring people who publish in such journals, essentially making ‘hype’ a precondition for staying in science.”

“In hindsight, this was a monumentally stupid idea. The only thing more stupid than that is that it is now 2021 and not only do these journals still exist – publishing in them is still massively career-relevant.”

Amir Rubin and Eran Rubin, a professor of finance and an associate professor of management respectively, elucidated this bias in science in a new study. They investigated how a journal’s shuttering affected citations to its papers. They found that papers published in the Journal of Business received 20% fewer citations after it was discontinued in 2006 than similar articles published in four other so-called ‘top-tier’ journals.

The authors wrote, “the exact same article with the exact same accreditation by reviewers is considered significantly more valuable when the outlet is in the ‘most-desired journals to publish in’ list compared to the period in which that is no longer the case.”

Joel P. Joseph is a science writer.
