Recently, two somewhat different topics on the business of science came across my Twitter feed.
The first was yet another push for “reproducibility in science” by 72 authors whose latest prescription was to set p = 0.005 (under Neyman-Pearson hypothesis testing) as the new threshold for “significance.” This paper was picked up by the usual Nature et al. press and, of course, generated lots of thumb-time. Without irony, an accompanying Center for Open Science blog post suggests:
“the fact that this paper is authored by statisticians and scientists from a range of disciplines—[…]—indicates that the proposal now has broad support.” (my italics)
That is, the fact that 72 self-selected individuals out of hundreds of thousands of researchers signed on indicates that the proposal has broad support.
Well, regardless, reproducible science, like apple pie, has to be a good thing (maybe). The paper is careful to point out that the proposal is not about publication standards or policy but about standards of evidence in science. That is, it is about science.
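For what it is worth, the arithmetic of the proposal is easy to see. Below is a minimal simulation sketch (mine, not the authors’): when there is no real effect at all, about 5% of experiments clear the old threshold and about 0.5% clear the new one. The per-group sample size of 30 and the 10,000 repetitions are arbitrary choices for illustration.

```python
# A minimal simulation sketch (mine, not the paper's): with NO real effect,
# how often does a two-sample t-test cross each threshold?
# Sample sizes and the number of experiments are arbitrary assumptions.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_experiments, n_per_group = 10_000, 30

false_pos_05 = false_pos_005 = 0
for _ in range(n_experiments):
    # Both groups come from the same distribution: any "effect" is noise.
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(0.0, 1.0, n_per_group)
    p = ttest_ind(a, b).pvalue
    false_pos_05 += p < 0.05
    false_pos_005 += p < 0.005

print(f"false positives at p < 0.05:  {false_pos_05 / n_experiments:.4f}")   # ~0.05
print(f"false positives at p < 0.005: {false_pos_005 / n_experiments:.4f}")  # ~0.005
```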
So then, who in science needs this proscription on the use of the word “significance?” Suppose there is a risky and expensive experiment or maybe even a new dissertation project. I am trying to imagine a student arguing (against advice of caution)
“But, but, the article said it was significant!”
In fact, even for very well executed, ground-breaking studies, typical journal clubs are exercises in skepticism, take-downs, and “what-about-isms” (probably to the detriment of the discussants). Do grant review committees take at face value an applicant’s claims of “significant” preliminary results? Do hiring committees?
Q1: To whom is regulating the use of the term “significant” valuable?
I want to bring up the second topic before trying to answer the above question. This involved another (often repeated) discussion on whether one should cite papers on preprint servers like the new bioRxiv. Without regurgitating the various pros and cons, much of the argument against citing a paper on a preprint server came down to “giving validation” to something that might not deserve such validation. That is, it was again supposed to be about standards of evidence in science. Science will benefit when we only acknowledge that which has been okayed by three people—including the notorious Reviewer 2.
This was interesting, as I had always thought the role of citation was to establish the “source of information” from which I was deriving my own thesis. I thought the worry about citations was about missing possibly relevant sources (and pissing off somebody) or citing sources that might be too ephemeral. We used to cite “pers. comm.”, which I think is okay as long as the “citee” doesn’t die. Preprints do carry some of the second risk, but a preprint is probably as durable as any online journal. So, why worry about citing non-peer-reviewed papers?
Q2: To whom is regulating what is cited valuable?
Any human activity, regardless of whether it is art or science, acquires an economic structure. Efficient operation of the economy requires exchange tokens; we would rather not cart around bushels of corn to exchange for milk, so we make up credit papers (i.e., money). In science, the economy is supposed to be organized to enable the exchange of ideas. But we also need efficiency, so rather than read a candidate’s papers, we look up their h-index. Time saved. But the use of money, or of tokens like citations, requires the establishment of valuations (how much is that puppy in the window?). So, one might think scientists who only want to cite “peer-reviewed” work are attempting to create accurate valuations—despite the Dutch tulip prices created with drive-by-citations (see Lior Pachter’s discussion of this wonderful term from Andrew Perrin). But what value is being assessed by citations as the valuation mechanism?
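As an aside on how lossy such tokens are, here is a minimal sketch of the h-index computation (the citation counts are hypothetical): the h-index is the largest h such that h of one’s papers each have at least h citations, and rather different bodies of work can collapse to the same number.

```python
# A minimal sketch of the h-index token: the largest h such that
# h papers each have at least h citations. Citation counts below
# are hypothetical, chosen to show the information the token discards.
def h_index(citations: list[int]) -> int:
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

print(h_index([100, 90, 5, 5, 4]))  # 4: two landmark papers, little else
print(h_index([5, 5, 4, 4, 4]))     # 4: uniformly modest papers, same token
```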
This brings me back to the above “significance” issue. Who cares about this language use? As I mentioned, I have never seen anybody change their negative opinion because some authors stated that their results were “significant,” statistically or otherwise, at least not if they actually read the paper. If well-regulated use of the term “significance” did lead to reproducible results in print, it would save people’s time. But the original 72 authors of the p = 0.005 paper state that they are not talking about publication standards but about language descriptors (valuations), and they suggest adding other descriptors like “suggestive.”
But, hey, why even try to convert a metric scale (real-valued probabilities) into some vague ordinal scale? Because the journals care; more specifically, the non-expert editors proliferating in current commercial journals care. Significance is the bouncer behind the velvet rope they use to enshrine the (high-impact-factor) journal corpus. In fact, many of the journals explicitly ask reviewers about “significance.” And those polite rejection letters mentioning “more specialized journals” always mention “significance.” Yes, I know statistical significance is not the same thing as these uses of the word “significance.” Or is it?
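To make that metric-to-ordinal point concrete, here is a minimal sketch (my framing, not the 72 authors’): thresholding bins a real-valued probability into coarse labels, and nearly identical p-values land in different bins on either side of an arbitrary cut.

```python
# A minimal sketch of collapsing a metric scale into ordinal descriptors.
# The labels follow the proposal's suggestion; the cutoffs do the binning.
def label(p: float) -> str:
    if p < 0.005:
        return "significant"      # the proposed new threshold
    if p < 0.05:
        return "suggestive"       # the proposed descriptor for the old range
    return "not significant"

for p in (0.004, 0.006, 0.049, 0.051):
    print(f"p = {p:.3f} -> {label(p)}")
# 0.004 vs 0.006 differ by 0.002 yet get different labels; so do 0.049 vs 0.051.
```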
No one can argue against the idea that science demands that everybody do their due diligence. But the specific concern has been focused on journal publications. Who made the printed words in peer-reviewed journals the matter of record? That is, when did putting scientific work into print become canonization instead of a form of communication? And who decided citations of such printed works should be a valuation token? I don’t know who decided all this, but I do know who benefits (the answer to Q1 and Q2 above): The Journals.
For the journals, there is a clear self-interest in establishing themselves as “all that is fit to print” and the “matter of record.” That is, nothing would make the journals happier than being the gatekeepers of Truth. But should this also be true for science and scientists? If we look at papers from 10, 20, or 50 years ago, what percent of them hold up? Should Newton have been prevented from publishing the Principia because he didn’t take gravitational curvature into account? Science is the never-ending search for, and refinement of, understandings of nature, which we pursue through the exchange of ideas. We communicate these ideas to each other through the printed medium because it adds precision and distributional efficiency. We established the tradition of peer review because it helps increase (in some undefinable manner) the quality of the communications.

Into this economy came commercial journals and journal empires, which, mirroring the rise of the financial sector in the real economy, established a derivative market where the number of publications, the citations of those publications, and the impact of those publications are made to replace the actual value of science itself. The more they can convince us that “validated citations” are important and that only “significant” results get published in significant journals, the more they solidify their position and substitute publication for science. They would rather see you bite into the credit-default-swap papers of “altmetrics” than into an actual apple.
We are all human, and to some extent efficiency and expediency make us all admire a CV with a long list of Nature, Science, and their baby-critter journals. But we need to remember that the interests of these journals are not the same as the interests of science. Journals literally bank on our asking them to validate us.
Don’t let the journals dictate our values. Don’t let them win. Resist.
Addendum: A more serious question is what percent of published papers should be reproducible. If you think the answer is 100%, you have never thought about the problem of optimization over a rugged landscape. And Nature is rugged indeed.
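A minimal sketch of the intuition, on a toy one-dimensional objective of my own choosing: greedy hill-climbing from random starting points stalls on local peaks most of the time, so only a fraction of runs reach the global peak, and that fraction is a property of the landscape, not of the climbers’ diligence.

```python
# A minimal sketch: greedy ascent on a toy rugged landscape. Most runs
# end on a local peak; demanding that 100% reach the global peak would
# restrict us to smooth landscapes only. The objective is arbitrary.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 1001)
landscape = np.sin(3 * x) + 0.5 * np.sin(7 * x) + 0.1 * x  # many local peaks

def hill_climb(i: int) -> int:
    # Move to the higher neighbor on the grid until no neighbor is higher.
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(x)]
        best = max(neighbors, key=lambda j: landscape[j])
        if landscape[best] <= landscape[i]:
            return i  # a peak, quite possibly a local one
        i = best

runs = 200
global_peak = landscape.max()
hits = sum(
    np.isclose(landscape[hill_climb(int(rng.integers(len(x))))], global_peak)
    for _ in range(runs)
)
print(f"runs reaching the global peak: {hits}/{runs}")
```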