Friday, February 25, 2011

Fighting publication bias #1

Short version: When a hierarchy of journals accepts work partly based on a prediction of how important/novel the work seems to be to a few referees and an editor, researchers will

  • try too hard to find results that will seem to be novel/important
  • try too hard to reproduce new results and show that they too have found this new novel/important thing
  • shelve their work (because it seems flawed or because it will at best be publishable only in less interesting lower-tier journals) if they fail to reproduce the new novel/important things

The current academic publishing system with peer-reviewed journals is an attempt to achieve a lot of different goals at the same time:

  • Facilitate scientific progress, by
    • ensuring quality of published research by weeding out work that is riddled with errors, poor methodology etc. through anonymous peer-review by relevant experts
    • assessing/predicting importance of research and thus how “high up” in the journal hierarchy it should be published,
    • making research results broadly accessible so that disciplines can build their way brick-by-brick to greater truths
    • promoting a convergence towards consensus by ensuring reproducibility of research and promoting academic dialogue and debate
  • Simplify the evaluation of individual researchers (given the above, the number of articles weighted by journal type is a proxy for the importance and quality of your research)
  • Generate huge profits for publishing houses. To quote an article from Journal of Economic Perspectives: “The six most-cited economics journals listed in the Social Science Citation Index are all nonprofit journals, and their library subscription prices average about $180 per year. Only five of the 20 most-cited journals are owned by commercial publishers, and the average price of these five journals is about $1660 per year.”

Now, clearly, not all of these goals are compatible – most obviously, it is hard to square rocketing subscription costs with the goal of making research results more accessible. However, the ranking of academics based on where in a hierarchy of journals they have published seems likely to lead to issues as well.

If you want to get ahead as a researcher, you need to be published, preferably in good journals. If you want to be published in a good journal you need to do something surprising and interesting. You need to either show that something people think is smart is stupid, or that something people think is stupid is smart. As a result, you get a kind of publication bias that can be illustrated by a simple thought experiment:

Imagine that the world is exactly as we think it is. If you drew a number of random samples, the estimates for various parameters of interest would tend to be distributed rather nicely around the true values. Only the researchers “lucky” enough to draw the outlier samples whose estimated parameters were surprising would be able to write rigorously done research that supported new (and false) models of the world that were in line with these (non-representative) results. This is actually not a very subtle point: On average, one out of twenty samples will, by construction of the test, reject a true null hypothesis at the 5% significance level.
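The thought experiment is easy to run as a small simulation. The sketch below is only an illustration under assumptions of my own: every "study" draws 50 observations from a standard normal distribution whose true mean is zero, and tests the (true) null hypothesis of a zero mean with a two-sided z-test at the 5% level. The function name, the sample size and the number of studies are all made up for the example.

```python
import math
import random


def false_positive_rate(n_studies=10_000, sample_size=50, seed=0):
    """Share of simulated studies that reject a null hypothesis that is true.

    Assumptions (illustrative only): observations are N(0, 1), the standard
    deviation is known to be 1, and each study uses a two-sided z-test.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% critical value of the standard normal
    rejections = 0
    for _ in range(n_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        mean = sum(sample) / sample_size
        # Standard error is 1 / sqrt(sample_size) because sigma = 1.
        z = mean * math.sqrt(sample_size)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_studies


rate = false_positive_rate()
print(f"Share of studies rejecting the true null: {rate:.3f}")
```

The printed share should land close to 0.05: roughly one "lucky" researcher in twenty gets a surprising result even though nothing surprising is going on.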

OK, so let us say ideological bias, fashions and trends in modeling approaches etc. are irrelevant, so the result is published. Right away, this becomes a hot new topic, and anyone else able to reproduce it (read: anyone else drawing random but non-representative samples) gets published. And then, gradually, the pendulum shifts – and the interesting and novel thing is to disprove the new result.

Now, clearly the above thought experiment is too simple. For one thing, we don’t know the truth. But the recent New Yorker essay on “The decline effect” suggests that this might be part of what’s going on:

all sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. It’s as if our facts were losing their truth: claims that have been enshrined in textbooks are suddenly unprovable. This phenomenon doesn’t yet have an official name, but it’s occurring across a wide range of fields, from psychology to ecology. In the field of medicine, the phenomenon seems extremely widespread, affecting not only antipsychotics but also therapies ranging from cardiac stents to Vitamin E and antidepressants

The essay discusses a number of explanations (some of them sort of mystical and New Age-ish), but also notes the explanation above. When biologist Leigh Simmons failed to replicate an interesting new result, he found it hard to get his null results published:

“But the worst part was that when I submitted these null results I had difficulty getting them published. The journals only wanted confirming data. It was too exciting an idea to disprove, at least back then.” For Simmons, the steep rise and slow fall of fluctuating asymmetry is a clear example of a scientific paradigm, one of those intellectual fads that both guide and constrain research: after a new paradigm is proposed, the peer-review process is tilted toward positive results. But then, after a few years, the academic incentives shift—the paradigm has become entrenched—so that the most notable results are now those that disprove the theory.

It seems to me that this is an almost unavoidable result of the current journal system, but not an unavoidable result of peer-reviewed journals as such. The problem seems to me to stem from the hierarchy of journals, and from the two tasks we give to referees (assess quality and assess importance/interest). The new open-access mega-journals (PLOS One, Sage Open, etc.) that aim to publish all competently done research independently of how “important” it seems should at least mitigate the problem. Not necessarily by making it less important to have a “breakthrough” paper with a seemingly important result, but by making it easier to publish null results.
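To make the point about null results concrete, here is a rough sketch comparing two stylized publication rules. Everything in it is an assumption for illustration: a small true effect of 0.2, studies of 25 observations with known standard deviation 1, and a caricature in which the "hierarchy" publishes only statistically significant estimates while the "publish everything" rule prints them all. It is not a model of any actual journal.

```python
import math
import random


def simulate_estimates(true_effect=0.2, n_studies=5_000, sample_size=25, seed=0):
    """Return (all_estimates, significant_estimates) from simulated studies.

    Hypothetical setup: observations are N(true_effect, 1) and each study
    tests a zero effect with a two-sided z-test at the 5% level.
    """
    rng = random.Random(seed)
    std_error = 1.0 / math.sqrt(sample_size)
    all_estimates, significant = [], []
    for _ in range(n_studies):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(sample_size)]
        estimate = sum(sample) / sample_size
        all_estimates.append(estimate)
        if abs(estimate / std_error) > 1.96:
            significant.append(estimate)
    return all_estimates, significant


all_estimates, significant = simulate_estimates()
mean_all = sum(all_estimates) / len(all_estimates)
mean_significant = sum(significant) / len(significant)
print(f"Mean estimate if everything is published:      {mean_all:.2f}")
print(f"Mean estimate if only significant results are: {mean_significant:.2f}")
```

Under these assumptions the average of all estimates sits near the true effect, while the average of the significant ones is clearly larger. A literature built only from the second group would overstate the effect, and later, larger studies would look like a "decline".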