In published academic research, publication bias occurs when the outcome of an experiment or research study biases the decision to publish or otherwise distribute it. Publishing only results that show a significant finding disturbs the balance of findings in favor of positive results.[1] The study of publication bias is an important topic in metascience.
Despite similar quality of execution and design,[2] papers with statistically significant results are three times more likely to be published than those with null results.[3] This unduly motivates researchers to manipulate their practices to ensure statistically significant results, such as by data dredging.[4]
Many factors contribute to publication bias.[5] For instance, once a scientific finding is well established, it may become newsworthy to publish reliable papers that fail to reject the null hypothesis.[6] Most commonly, investigators simply decline to submit results, leading to non-response bias. Investigators may also assume they made a mistake, find that the null result fails to support a known finding, lose interest in the topic, or anticipate that others will be uninterested in the null results. The nature of these issues and the resulting problems form the five diseases that threaten science: "significosis, an inordinate focus on statistically significant results; neophilia, an excessive appreciation for novelty; theorrhea, a mania for new theory; arigorium, a deficiency of rigor in theoretical and empirical work; and finally, disjunctivitis, a proclivity to produce many redundant, trivial, and incoherent works."[7]
Attempts to find unpublished studies often prove difficult or are unsatisfactory.[8] In an effort to combat this problem, some journals require that studies submitted for publication be pre-registered (before data collection and analysis) with organizations like the Center for Open Science.
Other proposed strategies to detect and control for publication bias include p-curve analysis[9] and disfavoring small and non-randomized studies due to high susceptibility to error and bias.
Publication bias occurs when the publication of research results depends not just on the quality of the research but also on the hypothesis tested, and the significance and direction of effects detected.[10] The subject was first discussed in 1959 by statistician Theodore Sterling, in reference to fields in which "successful" research is more likely to be published. As a result, "the literature of such a field consists in substantial part of false conclusions resulting from errors of the first kind in statistical tests of significance".[11] In the worst case, false conclusions could be canonized as true if the publication rate of negative results is too low.[12]
Publication bias is sometimes called the file-drawer effect, or file-drawer problem. This term suggests that results not supporting the hypotheses of researchers often go no further than the researchers' file drawers, leading to a bias in published research.[13] The term "file drawer problem" was coined by psychologist Robert Rosenthal in 1979.[14]
Positive-results bias, a type of publication bias, occurs when authors are more likely to submit, or editors are more likely to accept, positive results than negative or inconclusive results.[15] Outcome reporting bias occurs when multiple outcomes are measured and analyzed, but the reporting of these outcomes is dependent on the strength and direction of their results. A generic term coined to describe these post-hoc choices is HARKing ("Hypothesizing After the Results are Known").[16]
There is extensive meta-research on publication bias in the biomedical field. Investigators following clinical trials from the submission of their protocols to ethics committees (or regulatory authorities) until the publication of their results observed that those with positive results are more likely to be published.[17] [18] [19] In addition, studies often fail to report negative results when published, as demonstrated by research comparing study protocols with published articles.[20] [21]
The presence of publication bias was investigated in meta-analyses. The largest such analysis investigated the presence of publication bias in systematic reviews of medical treatments from the Cochrane Library.[22] The study showed that statistically positive significant findings are 27% more likely to be included in meta-analyses of efficacy than other findings. Results showing no evidence of adverse effects have a 78% greater probability of inclusion in safety studies than statistically significant results showing adverse effects. Evidence of publication bias was found in meta-analyses published in prominent medical journals.[23]
Meta-analyses (reviews) have been performed in the field of ecology and environmental biology. In a study of 100 meta-analyses in ecology, only 49% tested for publication bias.[24] While multiple tests have been developed to detect publication bias, most perform poorly in the field of ecology because of high levels of heterogeneity in the data and because observations are often not fully independent.[25]
"No trial published in China or Russia/USSR found a test treatment to be ineffective."[26]
Where publication bias is present, published studies are no longer a representative sample of the available evidence. This bias distorts the results of meta-analyses and systematic reviews. For example, evidence-based medicine is increasingly reliant on meta-analysis to assess evidence.
Meta-analyses and systematic reviews can account for publication bias by including evidence from unpublished studies and the grey literature. The presence of publication bias can also be explored by constructing a funnel plot in which the estimate of the reported effect size is plotted against a measure of precision or sample size. The premise is that the scatter of points should reflect a funnel shape, indicating that the reporting of effect sizes is not related to their statistical significance. However, when small studies are predominantly in one direction (usually the direction of larger effect sizes), asymmetry will ensue, and this may be indicative of publication bias.
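The asymmetry described above can be illustrated with a small simulation. The following is a minimal sketch, not a published method: the function names, the range of standard errors, and the selection rule (publish only positive estimates with z > 1.96) are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_published_effects(n_studies=20000, true_effect=0.0, seed=0):
    """Simulate studies and apply a hypothetical publication filter.

    Each study gets a random standard error (small = precise, large study)
    and an estimate drawn around the true effect. Assumed selection rule:
    a study is "published" only if its estimate is positive and significant
    at the two-sided 5% level (estimate / se > 1.96).
    """
    rng = random.Random(seed)
    all_studies, published = [], []
    for _ in range(n_studies):
        se = rng.uniform(0.05, 0.5)
        estimate = rng.gauss(true_effect, se)
        all_studies.append((estimate, se))
        if estimate / se > 1.96:
            published.append((estimate, se))
    return all_studies, published


def mean_effect(studies, se_low, se_high):
    """Mean estimate among studies with standard error in [se_low, se_high)."""
    values = [est for est, se in studies if se_low <= se < se_high]
    return statistics.mean(values)


if __name__ == "__main__":
    all_studies, published = simulate_published_effects()
    print("all studies, mean effect:", mean_effect(all_studies, 0.0, 1.0))
    print("published, precise studies:", mean_effect(published, 0.05, 0.2))
    print("published, imprecise studies:", mean_effect(published, 0.35, 0.5))
```

Under these assumptions, the full set of simulated studies averages close to the true effect of zero, while the published imprecise studies report larger effects than the published precise ones. This is the pattern that appears as a lopsided funnel plot.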
Because an inevitable degree of subjectivity exists in the interpretation of funnel plots, several tests have been proposed for detecting funnel plot asymmetry.[27] [28] [29] These are often based on linear regression, including the popular Egger's regression test,[30] and may adopt a multiplicative or additive dispersion parameter to adjust for the presence of between-study heterogeneity. Some approaches may even attempt to compensate for the (potential) presence of publication bias,[31] [32] which is particularly useful to explore the potential impact on meta-analysis results.[33] [34] [35]
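As a rough illustration of the regression-based approach, the sketch below implements the basic form of Egger's regression: each study's standardized effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error) by ordinary least squares, and an intercept far from zero suggests funnel plot asymmetry. The function name and return values are assumptions for this example, and the dispersion adjustments mentioned above are not included.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares fit of (effect / SE) on (1 / SE).

    Returns (intercept, slope, intercept_std_error). The intercept is the
    asymmetry measure; the slope estimates the underlying effect.
    """
    if len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must have the same length")
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three studies")

    y = [e / s for e, s in zip(effects, std_errors)]  # standardized effects
    x = [1.0 / s for s in std_errors]                 # precisions

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)
    intercept_se = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, intercept_se
```

In practice the ratio of the intercept to its standard error is compared with a t distribution on n − 2 degrees of freedom. That step is omitted here to keep the example free of third-party dependencies.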
In ecology and environmental biology, a study found that publication bias impacted the effect size, statistical power, and magnitude. The prevalence of publication bias distorted confidence in meta-analytic results, with 66% of initially statistically significant meta-analytic means becoming non-significant after correcting for publication bias.[36] Ecological and evolutionary studies consistently had low statistical power (15%) with a 4-fold exaggeration of effects on average (Type M error rates = 4.4).
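The link between low power and exaggerated effects can be sketched by simulation. The example below is a simplified illustration, not the method used in the cited study: it assumes normally distributed estimates with a known standard error, a two-sided 5% significance threshold, and a hypothetical function name.

```python
import random


def exaggeration_ratio(true_effect, std_error, n_sims=100_000, seed=1):
    """Estimate power and the Type M (magnitude) error by simulation.

    Draws estimates from Normal(true_effect, std_error), keeps those with
    |estimate / std_error| > 1.96, and returns
    (share of significant results, mean |significant estimate| / |true effect|).
    """
    if true_effect == 0:
        raise ValueError("true_effect must be non-zero")
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate / std_error) > 1.96:
            significant.append(abs(estimate))
    if not significant:
        raise ValueError("no significant results; increase n_sims")
    power = len(significant) / n_sims
    type_m = (sum(significant) / len(significant)) / abs(true_effect)
    return power, type_m


if __name__ == "__main__":
    # A true effect of 0.9 standard errors gives power of roughly 15%.
    power, type_m = exaggeration_ratio(true_effect=0.9, std_error=1.0)
    print(f"power = {power:.2f}, exaggeration ratio = {type_m:.2f}")
```

When only significant results are retained, the surviving estimates overstate the true effect. The size of the overstatement depends on the assumed effect and standard error, so the simulated ratio is not expected to reproduce the figure reported above.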
The presence of publication bias can be detected by time-lag bias tests. Time-lag bias occurs when larger or statistically significant effects are published more quickly than smaller or non-statistically significant effects. It can manifest as a decline in the magnitude of the overall effect over time. The key feature of time-lag bias tests is that, as more studies accumulate, the mean effect size is expected to converge on its true value.[25]
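One simple way to look for such a decline is a cumulative meta-analysis, in which the pooled effect is recomputed each time a study is added in order of publication. The sketch below assumes a fixed-effect model with inverse-variance weights. The function name and the example data are hypothetical.

```python
def cumulative_pooled_effects(studies):
    """Pooled effect after each study is added in chronological order.

    studies: iterable of (year, effect, std_error) tuples.
    Returns a list of (year, pooled_effect) using fixed-effect
    inverse-variance weighting (weight = 1 / std_error**2).
    """
    ordered = sorted(studies, key=lambda study: study[0])
    weight_sum = 0.0
    weighted_effect_sum = 0.0
    trajectory = []
    for year, effect, std_error in ordered:
        weight = 1.0 / std_error ** 2
        weight_sum += weight
        weighted_effect_sum += weight * effect
        trajectory.append((year, weighted_effect_sum / weight_sum))
    return trajectory


if __name__ == "__main__":
    # Hypothetical data: early studies are small and report large effects.
    example = [
        (2001, 0.8, 0.4),
        (2003, 0.5, 0.3),
        (2006, 0.2, 0.2),
        (2010, 0.1, 0.1),
    ]
    for year, pooled in cumulative_pooled_effects(example):
        print(year, round(pooled, 3))
```

A pooled estimate that shrinks steadily as later studies are added, as in this made-up data set, is the pattern a time-lag bias test looks for.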
Two meta-analyses of the efficacy of reboxetine as an antidepressant demonstrated attempts to detect publication bias in clinical trials. Based on positive trial data, reboxetine was originally approved as a treatment for depression in many countries in Europe and the UK in 2001 (though in practice it is rarely used for this indication). A 2010 meta-analysis concluded that reboxetine was ineffective and that the preponderance of positive-outcome trials reflected publication bias, mostly due to trials published by the drug manufacturer Pfizer. A subsequent meta-analysis published in 2011, based on the original data, found flaws in the 2010 analyses and suggested that the data indicated reboxetine was effective in severe depression (see Reboxetine § Efficacy). Examples of publication bias are given by Ben Goldacre[37] and Peter Wilmshurst.
In the social sciences, a study of published papers exploring the relationship between corporate social and financial performance found that "in economics, finance, and accounting journals, the average correlations were only about half the magnitude of the findings published in Social Issues Management, Business Ethics, or Business and Society journals".[38]
One example cited as an instance of publication bias is the refusal by The Journal of Personality and Social Psychology (the original publisher of Bem's article) to publish attempted replications of Bem's work, which claimed evidence for precognition.[39]
An analysis[40] comparing studies of gene-disease associations originating in China to those originating outside China found that those conducted within the country reported a stronger association and a more statistically significant result.[41]
John Ioannidis argues that "claimed research findings may often be simply accurate measures of the prevailing bias."[42] He lists the following factors as those that make a paper with a positive result more likely to enter the literature and suppress negative-result papers:
Other factors include experimenter bias and white hat bias.
Publication bias can be contained through better-powered studies, enhanced research standards, and careful consideration of true and non-true relationships. Better-powered studies refer to large studies that deliver definitive results or test major concepts and lead to low-bias meta-analysis. Enhanced research standards such as the pre-registration of protocols, the registration of data collections and adherence to established protocols are other techniques. To avoid false-positive results, the experimenter must consider the chances that they are testing a true or non-true relationship. This can be undertaken by properly assessing the false positive report probability based on the statistical power of the test[43] and reconfirming (whenever ethically acceptable) established findings of prior studies known to have minimal bias.
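The false positive report probability can be computed directly from the significance level, the statistical power, and the prior probability that the tested relationship is true. The sketch below uses the standard Bayes-rule form of this quantity. The function and parameter names are assumptions for illustration.

```python
def false_positive_report_probability(alpha, power, prior):
    """Probability that a statistically significant finding is a false positive.

    alpha: significance level (probability of a significant result when the
           relationship is not true)
    power: probability of a significant result when the relationship is true
    prior: prior probability that the tested relationship is true

    FPRP = alpha * (1 - prior) / (alpha * (1 - prior) + power * prior)
    """
    for name, value in (("alpha", alpha), ("power", power), ("prior", prior)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    false_positives = alpha * (1.0 - prior)
    true_positives = power * prior
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Same prior and threshold, different power.
    print(false_positive_report_probability(alpha=0.05, power=0.8, prior=0.1))
    print(false_positive_report_probability(alpha=0.05, power=0.2, prior=0.1))
```

For a fixed significance level and prior, lower power yields a higher probability that a significant result is false, which is one reason better-powered studies are recommended.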
In September 2004, editors of prominent medical journals (including the New England Journal of Medicine, The Lancet, Annals of Internal Medicine, and JAMA) announced that they would no longer publish results of drug research sponsored by pharmaceutical companies unless that research was registered in a public clinical trials registry database from the start.[44] Furthermore, some journals (e.g. Trials) encourage publication of study protocols in their journals.[45]
The World Health Organization (WHO) agreed that basic information about all clinical trials should be registered at the study's inception, and that this information should be publicly accessible through the WHO International Clinical Trials Registry Platform. Additionally, public availability of complete study protocols, alongside reports of trials, is becoming more common for studies.[46]
In a megastudy, a large number of treatments are tested simultaneously. Given inclusion of different interventions in the study, a megastudy's publication likelihood is less dependent on the statistically significant effect of any specific treatment, so it has been suggested that megastudies may be less prone to publication bias.[47] For example, an intervention found to be ineffective would be easier to publish as part of a megastudy as just one of many studied interventions, whereas it might go unreported due to the file-drawer problem if it were the sole focus of a contemplated paper. For the same reason, the megastudy research design may encourage researchers to study not only the interventions they consider more likely to be effective but also those interventions that researchers are less certain about and that they would not pick as the sole focus of the study due to the perceived high risk of a null effect.