Back to p-Values

Alas ecology has slipped lower on the totem-pole of serious sciences by an article that has captured the attention of the media:

Low-Décarie, E., Chivers, C., and Granados, M. 2014. Rising complexity and falling explanatory power in ecology. Frontiers in Ecology and the Environment 12(7): 412-418. doi: 10.1890/130230.

There is much that is positive in this paper, so you should read it if only to decide whether or not to use it in a graduate seminar in statistics or in ecology. Much of what is concluded is certainly true, that there are more p-values in papers now than there were some years ago. The question then comes down to what these kinds of statistics mean and how this would justify a conclusion captured by the media that explanatory power in ecology is declining over time, and the bottom line of what to do about falling p-values. Since as far as I can see most statisticians today seem to believe that p-values are meaningless (e.g. Ioannidis 2005), one wonders what the value of showing this trend is. A second item that most statisticians agree about is that R2 values are a poor measure of anything other than the items in a particular data set. Any ecological paper that contains data to be analysed and reported summarizes many tests providing p-values and R2 values of which only some are reported. It would be interesting to do a comparison with what is recognized as a mature science (like physics or genetics) by asking whether the past revolutions in understanding and prediction power in those sciences corresponded with increasing numbers of p-values or R2 values.

To ask these questions is to ask what is the metric of scientific progress? At the present time we confuse progress with some indicators that may have little to do with scientific advancement. As journal editors we race to increase their impact factor which is interpreted as a measure of importance. For appointments to university positions we ask how many citations a person has and how many papers they have produced. We confuse scientific value with some numbers which ironically might have a very low R2 value as predictors of potential progress in a science. These numbers make sense as metrics to tell publication houses how influential their journals are, or to tell Department Heads how fantastic their job choices are, but we fool ourselves if we accept them as indicators of value to science.

If you wish to judge scientific progress you might wish to look at books that have gathered together the most important papers of the time, and examine a sequence of these from the 1950s to the present time. What is striking is that papers that seemed critically important in the 1960s or 1970s are now thought to be concerned with relatively uninteresting side issues, and conversely papers that were ignored earlier are now thought to be critical to understanding. A list of these changes might be a useful accessory to anyone asking about how to judge importance or progress in a science.

A final comment would be to look at the reasons why a relatively mature science like geology has completely failed to be able to predict earthquakes in advance and even to specify the locations of some earthquakes (Steina et al. 2012; Uyeda 2013). Progress in understanding does not of necessity dictate progress in prediction. And we ought to be wary of confusing progress with p-and R2 values.

Ioannidis, J.P.A. 2005. Why most published research findings are false. PLoS Medicine 2(8): e124.

Steina, S., Gellerb, R.J., and Liuc, M. 2012. Why earthquake hazard maps often fail and what to do about it. Tectonophysics 562-563: 1-24. doi: 10.1016/j.tecto.2012.06.047.

Uyeda, S. 2013. On earthquake prediction in Japan. Proceedings of the Japan Academy, Series B 89(9): 391-400. doi: 10.2183/pjab.89.391.

One thought on “Back to p-Values

  1. squirrelsnz

    The paper does not consider (indeed does not even mention) the increasing use of likelihood-based inference (e.g. AIC; p-values may or may not be reported) and Bayesian statistics in ecology – as just two examples. Of course, one could debate the merits of these approaches equally as much as the use of approaches that generate p and R2 values. However, their use is now common if not routine, which Low-Décarie and Granados have failed entirely to mention. Perhaps an alternative hypothesis is that ecologists long ago recognised the complexity of the ecosystems they seek to understand, and have responded by adopting a wider range of statistical approaches when analysing the results of their experiments. The use of such approaches may – or may not – have increased our power to explain and understand how the natural world works… it would therefore be a laughable exercise to search for the term ‘credible interval’ (or some such) in eminent ecological journals, and offer its increasing use (if indeed it does increase) as evidence that explanatory power in ecology is increasing!

    Reply

Leave a Reply