
On Questionable Research Practices

Ecologists and evolutionary biologists have been tarred and feathered along with the many other scientists accused of questionable research practices. So says this article in The Conversation:
https://theconversation.com/our-survey-found-questionable-research-practices-by-ecologists-and-biologists-heres-what-that-means-94421

Read this article if you have time but here is the essence of what they state:

“Cherry picking or hiding results, excluding data to meet statistical thresholds and presenting unexpected findings as though they were predicted all along – these are just some of the “questionable research practices” implicated in the replication crisis psychology and medicine have faced over the last half a decade or so.

“We recently surveyed more than 800 ecologists and evolutionary biologists and found high rates of many of these practices. We believe this to be the first documentation of these behaviours in these fields of science.

“Our pre-print results have certain shock value, and their release attracted a lot of attention on social media.

  • 64% of surveyed researchers reported they had at least once failed to report results because they were not statistically significant (cherry picking)
  • 42% had collected more data after inspecting whether results were statistically significant (a form of “p hacking”)
  • 51% reported an unexpected finding as though it had been hypothesised from the start (known as “HARKing”, or Hypothesising After Results are Known).”

It is worth looking at these claims a bit more analytically. First, the fact that more than 800 ecologists and evolutionary biologists were surveyed tells you nothing about the precision of these results unless you can be convinced that this was a random sample. Most surveys of this kind are convenience samples, not random samples, yet they are reported as though they were random and reliable.
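To see what a sample of 800 would buy you if (and only if) it were a simple random sample, here is a back-of-envelope sketch of the margin of error on the 64% figure (a standard normal-approximation calculation; the numbers are from the quoted survey, the calculation is mine):

```python
import math

def proportion_half_width(p_hat, n, z=1.96):
    """Half-width of a 95% confidence interval for a proportion,
    valid only under simple random sampling."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# The survey's headline figure: 64% of roughly 800 respondents
half_width = proportion_half_width(0.64, 800)
print(f"64% +/- {100 * half_width:.1f} percentage points")
```

So the headline rate looks precise, about plus or minus 3 percentage points, but that calculation is meaningless for a convenience sample; non-random selection can bias the estimate by far more than the nominal margin of error.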

Failing to report results is common in science for a variety of reasons that have nothing to do with questionable research practices. Many graduate theses contain results that are never published. Does this mean their data are being hidden? Many results are not reported because they did not find an expected result. This sounds awful until you realize that journals often turn down papers for not being exciting enough, even though the results are completely reliable. Other results are not reported because the investigator realized, once the study was complete, that it had not run long enough, and the money had run out to do more research. One would have to have considerable detail about each study to know whether these 64% of researchers were “cherry picking”.

Alas the next problem is more serious. The 42% who are accused of “p-hacking” were possibly just using sequential sampling, or using a pilot study to estimate the statistical parameters needed for a power analysis. Any study that uses replication in time, a highly desirable attribute of an ecological study, would be vilified by this rule. This complaint echoes the statistical advice not to use p-values at all (Ioannidis 2005, Bruns and Ioannidis 2016) and refers back to complaints about inappropriate uses of statistical inference (Amrhein et al. 2017, Forstmeier et al. 2017). The appropriate solution to this problem is to have a defined experimental design with specified hypotheses and predictions rather than an open-ended observational study.
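The distinction matters because the sin is not adding data per se, but testing at the nominal threshold after every look. A small simulation sketch (my own illustration, not from the cited survey) of a one-peek design under a true null hypothesis shows the inflation:

```python
import random
import statistics

random.seed(42)

def peeking_trial(n1=20, n2=20, z_crit=1.96):
    """One experiment under a true null (mean 0, sd 1): z-test after n1
    observations; if not significant, add n2 more and test again
    (the 'collect more data after inspecting significance' step)."""
    xs = [random.gauss(0, 1) for _ in range(n1)]
    z = statistics.fmean(xs) * (n1 ** 0.5)   # z-test with known sd = 1
    if abs(z) > z_crit:
        return True
    xs += [random.gauss(0, 1) for _ in range(n2)]
    z = statistics.fmean(xs) * (len(xs) ** 0.5)
    return abs(z) > z_crit

trials = 20000
false_pos = sum(peeking_trial() for _ in range(trials)) / trials
print(f"False-positive rate with one peek: {false_pos:.3f} (nominal 0.05)")
```

Proper sequential designs, or a pilot study feeding a power analysis with the confirmatory test run once, avoid this inflation by adjusting the threshold for the number of looks; naive peeking does not.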

The third problem, about unexpected findings, hits at an important aspect of science, the uncovering of interesting and important new results. It is an important point and was warned about long ago by Medawar (1963) and emphasized recently by Forstmeier et al. (2017). The general solution should be that novel results in science must be considered tentative until they can be replicated, so that science becomes a self-correcting process. But the temptation to emphasize a new result is hard to restrain in an era of difficult job searches and media attention to novelty. Perhaps the message is that you should read any “unexpected findings” in Science and Nature with a degree of skepticism.

The cited article in The Conversation goes on to discuss some possible interpretations of what these survey results mean. The authors lean over backwards to indicate that these survey results do not mean we should distrust the conclusions of science, which unfortunately is exactly what some parts of the public media have emphasized. Distrust of science can be a justification for rejecting climate change data and rejecting the value of immunizations against diseases. In an era of declining trust in science, these kinds of trivial surveys have shock value but are of little use to scientists trying to sort out the details of how ecological and evolutionary systems operate.

A significant source of these concerns flows from the literature on medical fads and ‘breakthroughs’ announced every day by media searching for ‘news’ (e.g. “eat butter”, “do not eat butter”). The result is an almost comical model of how good scientists really operate. An essential assumption of science is that scientific results are not written in stone but are always subject to additional testing and to modification or rejection. One consequence is that we get a parody of science that says “you can’t trust anything you read” (e.g. Ashcroft 2017). Perhaps we just need to remind ourselves to be critical, that good science is evidence-based, and then remember George Bernard Shaw’s comment:

Success does not consist in never making mistakes but in never making the same one a second time.

Amrhein, V., Korner-Nievergelt, F., and Roth, T. 2017. The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research. PeerJ  5: e3544. doi: 10.7717/peerj.3544.

Ashcroft, A. 2017. The politics of research-Or why you can’t trust anything you read, including this article! Psychotherapy and Politics International 15(3): e1425. doi: 10.1002/ppi.1425.

Bruns, S.B., and Ioannidis, J.P.A. 2016. p-Curve and p-Hacking in observational research. PLoS ONE 11(2): e0149144. doi: 10.1371/journal.pone.0149144.

Forstmeier, W., Wagenmakers, E.-J., and Parker, T.H. 2017. Detecting and avoiding likely false-positive findings – a practical guide. Biological Reviews 92(4): 1941-1968. doi: 10.1111/brv.12315.

Ioannidis, J.P.A. 2005. Why most published research findings are false. PLOS Medicine 2(8): e124. doi: 10.1371/journal.pmed.0020124.

Medawar, P.B. 1963. Is the scientific paper a fraud? Pp. 228-233 in The Threat and the Glory. Edited by P.B. Medawar. Harper Collins, New York. ISBN 978-0-06-039112-6.

On Mauna Loa and Long-Term Studies

If there is one important element missing in many of our current ecological paradigms it is long-term studies. This observation boils down to the lack of proper controls for our observations. If we do not know the background of our data sets, we lack the critical perspective needed to interpret short-term studies. We should have learned this from the paleoecologists, whose many studies of plant pollen profiles and other time series from the geological record show that the models of stability that occupy most of the superstructure of ecological theory are not very useful for understanding what is happening in the real world today.

All of this got me wondering what it might have been like for Charles Keeling when he began to measure CO2 levels on Mauna Loa in Hawaii in 1958. Let us do a thought experiment and suppose that he was at that time a typical postgraduate student, told by his professors to get his research done in 4 or at most 5 years and write his thesis. Consider the basic data he would have obtained if he had been restricted to this framework.

Keeling would have had an interesting seasonal pattern of change that could be discussed, leading to the recommendation of having more CO2 monitoring stations around the world. And he might have thought that CO2 levels were increasing slightly, but this trend would not have been statistically significant, especially if he had been cut off after 4 years of work. In fact the US government closed the Mauna Loa observatory in 1964 to save money, but fortunately Keeling’s program was rescued after a few months of closure (Harris 2010).

Charles Keeling could in fact be a “patron saint” for aspiring ecology graduate students. In 1957, as a postdoc, he worked on developing the best way to measure CO2 in the air with an infrared gas analyzer, and in 1958 he had one of these instruments installed at the top of Mauna Loa in Hawaii (3394 m, 11,135 ft) to measure pristine air. By that time he had 3 published papers (Marx et al. 2017). By 1970, at age 42, his publication list had grown to 22 papers with an accumulated total of about 50 citations to his research. It was not until 1995 that his citation rate began to exceed 100 citations per year, and only after 1995, when he was 67, did it climb steeply. So, to extend the thought experiment: in the modern era he could never even apply for a postdoctoral fellowship, much less a permanent job. Marx et al. (2017) have an interesting discussion of why Keeling was undercited and unappreciated for so long on what is now considered one of the world’s most critical environmental issues.

What is the message for mere mortals? For postgraduate students, do not judge the importance of your research by its citation rate. Worry about your measurement methods. Do not conclude too much from short-term studies. For professors, let your bright students loose with guidance but without being a dictator. For granting committees and appointment committees, do not be fooled into thinking that citation rates are a sure metric of excellence. For theoretical ecologists, be concerned about the precision and accuracy of the data you build models about. And for everyone, be aware that good science was carried out before the year 2000.

And CO2 levels yesterday were 407 ppm while Nero is still fiddling.

Harris, D.C. (2010) Charles David Keeling and the story of atmospheric CO2 measurements. Analytical Chemistry, 82, 7865-7870. doi: 10.1021/ac1001492

Marx, W., Haunschild, R., French, B. & Bornmann, L. (2017) Slow reception and under-citedness in climate change research: A case study of Charles David Keeling, discoverer of the risk of global warming. Scientometrics, 112, 1079-1092. doi: 10.1007/s11192-017-2405-z

A Modest Proposal for a New Ecology Journal

I read the occasional ecology paper and ask myself how this particular paper ever got published when it is full of elementary mistakes and shows no understanding of the literature. But alas we can rarely do anything about this as individuals. If you object to what a particular paper has concluded because of its methods or analysis, it is usually impossible to submit a critique that the relevant journal will publish. After all, which editor would like to admit that he or she let a hopeless paper through the publication screen? There are some exceptions to this rule, and I list two examples below, the papers by Barraquand (2014) and Clarke (2014). But if you search the Web of Science you will find few such critiques of published ecology papers.

One solution jumped to mind for this dilemma: start a new ecology journal, perhaps entitled Misleading Ecology Papers: Critical Commentary Unfurled. Papers submitted to this new journal would be restricted to a total of 5 pages and 10 references, and all polemics and personal attacks would be forbidden. The key for submissions would be to state a critique succinctly, and to suggest a better way to construct the experiment or study, a more rigorous method of analysis, or key papers that were missed because they were published before 2000. These rules would potentially leave a large gap through which some very poor papers could avoid criticism, papers that would require a critique longer than the original paper. Perhaps one very long critique could be distinguished each year as a Review of the Year; alternatively, some long critiques could be published in book form (Peters 1991) and bypass this new journal.

The Editor would require all critiques to be signed by their authors, but would permit authors, in exceptional circumstances, to remain anonymous to prevent job losses or, in more extreme cases, execution by the Mafia. Critiques of earlier critiques would be permitted, but infinite regress would be discouraged. Book reviews could also be the subject of a critique; the great shortage of critical book reviews in the current publication blitz is another aspect of ecological science largely missing from the current journals. This new journal would of course be electronic, so there would be no page charges, and all articles would be open access. All the major bibliographic databases like the Web of Science would be encouraged to catalog the publications, and a DOI would be assigned to each paper through CrossRef.

If this new journal became highly successful, it would no doubt be purchased by Wiley-Blackwell or Springer for several million dollars, and if this occurred, the profits would accrue proportionally to all the authors who had published papers to make this journal popular. The sale of course would be contingent on the purchaser guaranteeing not to cancel the entire journal to prevent any criticism of their own published papers.

At the moment criticism of ecological science does not appear until several years after a poor paper is published, and by that time the Donald Rumsfeld Effect has taken hold: the conclusions of the poor work have acquired the status of truth. For one example, most of the papers critiqued by Clarke (2014) were more than 10 years old. By making the feedback loop much tighter, certainly within one year of a poor paper appearing, budding ecologists could be intercepted before being led off course.

This journal would not be popular with everyone. Older ecologists often strive mightily to prevent any criticism of their prior conclusions, and some young ecologists make their careers by pointing out how misleading some of the papers of the older generation are. This new journal would assist in creating a more egalitarian ecological world by producing humility in older ecologists and a greater sense of achievement in young ecologists who must build up their status in the science. Finally, the new journal would be a focal point for graduate seminars in ecology by bringing together and identifying the worst of the current crop of poor papers. Progress would be achieved.

 

Barraquand, F. 2014. Functional responses and predator–prey models: a critique of ratio dependence. Theoretical Ecology 7(1): 3-20. doi: 10.1007/s12080-013-0201-9.

Clarke, P.J. 2014. Seeking global generality: a critique for mangrove modellers. Marine and Freshwater Research 65(10): 930-933. doi: 10.1071/MF13326.

Peters, R.H. 1991. A Critique for Ecology. Cambridge University Press, Cambridge, England. 366 pp. ISBN:0521400171

 

On Statistical Progress in Ecology

There is a general belief that science progresses over time, and given that the number of scientists is increasing, this is a reasonable first approximation. The use of statistics in ecology has seen ever-increasing improvement in methods of analysis, accompanied by bandwagons. It is one of these bandwagons that I want to discuss here by raising the general question:

Has the introduction of new methods of analysis in biological statistics led to advances in ecological understanding?

This is a very general question and could be discussed at many levels, but I want to concentrate on the top levels of statistical inference: old-style frequentist statistics, Bayesian methods, and information-theoretic methods. I am prompted to ask this question by my reviewing of many papers submitted to ecological journals in which the data are so buried by the statistical analysis that the reader is left confused about whether any progress has been made. Being amazed by the methodology is not the same as being impressed by the advance in ecological understanding.

Old-style frequentist statistics (read the Sokal and Rohlf textbook) has been criticized for concentrating on null hypothesis testing when everyone knows the null hypothesis is not correct. This has led to refinements in methods of inference that rely on effect sizes and predictive power, which are now standard in new statistical texts. Information-theoretic methods came in to fill the gap by making the data primary (rather than the null hypothesis) and asking which of several hypotheses best fits the data (Anderson et al. 2000). The key here was to recognize that one should have prior expectations, or several alternative hypotheses, in any investigation, as recommended in 1897 by Chamberlin. Bayesian analysis furthered the discussion not only by allowing several alternative hypotheses but also by the ability to use prior information in the analysis (McCarthy and Masters 2005). Implicit in both information-theoretic and Bayesian analysis is the recognition that all of the alternative hypotheses might be incorrect, and that the hypothesis selected as ‘best’ might have very low predictive power.

Two problems have arisen as a result of this change of focus in model selection. The first is testability. There is an implicit disregard for the old idea that models or conclusions from an analysis should be tested with further data, preferably data obtained independently of the original data used to find the ‘best’ model. The assumption might be made that if we get further data, we should add it to the prior data and update the model so that it somehow begins to approach the ‘perfect’ model. This was the original definition of passive adaptive management (Walters 1986), which is now regarded as a poor model for natural resource management. The second problem is that the model selected as ‘best’ may be of little use for natural resource management because it has little predictive power. In management issues for the conservation or exploitation of wildlife there may be many variables that affect population changes, and it may not be possible to conduct active adaptive management for all of them.

The take-home message is that, whatever statistical methods we use, the conclusions of our papers need a measure of the progress made in ecological insight. The significance of our research will not be measured by the number of p-values, AIC values, BIC values, or complicated tables. The key question must be: what new ecological insights have been achieved by these methods?

Anderson, D.R., Burnham, K.P., and Thompson, W.L. 2000. Null hypothesis testing: problems, prevalence, and an alternative. Journal of Wildlife Management 64(4): 912-923.

Chamberlin, T.C. 1897. The method of multiple working hypotheses. Journal of Geology 5: 837-848 (reprinted in Science 148: 754-759 in 1965). doi:10.1126/science.148.3671.754.

McCarthy, M.A., and Masters, P.I.P. 2005. Profiting from prior information in Bayesian analyses of ecological data. Journal of Applied Ecology 42(6): 1012-1019. doi:10.1111/j.1365-2664.2005.01101.x.

Walters, C. 1986. Adaptive Management of Renewable Resources. Macmillan, New York.

 

On Improving Canada’s Scientific Footprint – Breakthroughs versus Insights

In Maclean’s Magazine on November 25, 2015, Professor Lee Smolin of the Perimeter Institute for Theoretical Physics, an adjunct professor of physics at the University of Waterloo and a member of the Royal Society of Canada, wrote an article “Ten Steps to Make Canada a Leader in Science” (http://www.macleans.ca/politics/ottawa/ten-steps-to-make-canada-a-leader-in-science/). Some of the general points in this article are very good, but others support a view of science as big business, which leaves ecology and environmental science in the dust. We comment here on a few points of disagreement with Professor Smolin. The quotations are from the Maclean’s article.

  1. Choose carefully.

“Mainly invest in areas of pure science where there is a path to world leadership. This year’s Nobel prize shows that when we do this, we succeed big.” We suggest that the Nobel Prizes are possibly the worst example of scientific achievement currently available, because of their disregard for the environment. This recommendation is at complete variance with how the environmental sciences advance.

  2. Aim for breakthroughs.

“No “me-too” or catch-up science. Don’t hire the student of famous Prof. X at an elite American university just because of the proximity to greatness. Find our own path to great science by recruiting scientists who are forging their own paths to breakthroughs.” But the essence of science has always been replication. Long-term monitoring is a critical part of good ecology, as Henson (2014) points out for oceanographic research. We do agree on the need to recruit excellent young scientists in all areas.

  3. Embrace risk.

“Learn from business that it takes high risk to get high payoff. Don’t waste money doing low-risk, low-payoff science. Treat science like venture capital.” That advice would remove most of the ecologists who obtain NSERC funding. It is one more economic view of science. Besides, most successful businesses are based on hard work, sound financial practices, and insights into the needs of their customers.

  4. Recruit and invest in young leaders-to-be.

“Be savvy and proactive about choosing them…. Resist supporting legacies and entitlements. Don’t waste money on people whose best work is behind them.” We agree. Spending money to fund a limited number of middle-aged white males in the Canada Excellence Research Chairs was the antithesis of this recommendation. See “The Folly of Big Science” by Vinay Prasad (2015). Predicting in advance who will become a leader is unreliable; leadership is best identified by giving many people the opportunity to succeed and letting leaders arise from among them.

  5. Recruit internationally.

“Use graduate fellowships and postdoctoral positions as recruitment tools to bring the most ambitious and best-educated young scientists to Canada to begin their research here, and then target the most promising of these by creating mechanisms to ensure that their best opportunities to build their careers going forward are here.” This seems attractive but means Canadian scientists have little hope of obtaining jobs here, since we are < 0.1% of the world’s scientists. A better idea – how about Canada producing the “best-educated” young scientists?

  6. Resist incrementalism.

“If you spread new money around widely, little new science gets done. Instead, double-down on strategic fields of research where the progress is clear and Canada can have an impact.” Fortin and Currie (2013) show that spreading the money around is exactly the way to go, since less gets wasted and no one can predict where the “breakthroughs” will happen. This point also rests on one’s view of the world of the future and what “breakthroughs” will contribute to the sustainability of the earth.

  7. Empower ambitious, risk-taking young scientists.

“Give them independence and the resources they need to develop their own ideas and directions. Postdocs are young leaders with their own ideas and research programs.” This is an excellent recommendation, but it conflicts with the practice of many universities around the world of bringing in old scientists to establish institutes and offering incentives to established senior scientists.

  8. Embrace diversity.

“Target women and visible minorities. Let us build a Canadian scientific community that looks like Canada.” All agreed on this one.

  9. Speak the truth.

“Allow no proxies for success, no partial credit for “progress” that leaves unsolved problems unsolved. Don’t count publications or citations, count discoveries that have increased our knowledge about nature. We do research because we don’t know the answer; don’t force us to write grant proposals in which we have to pretend we do.” This confounds the scientists’ code of ethics with the requirements of bureaucracies like NSERC for accounting for the taxpayers’ dollars. Surely publications record the increased knowledge about nature recommended by Professor Smolin.

  10. Consider the way funding agencies do business.

“We scientists know that panels can discourage risk-taking, encourage me-too and catch-up science, and reinforce longstanding entitlements and legacies. Such a system may incentivize low-risk, incremental work and limit the kind of out-of-the-box ideas that … leads to real breakthroughs. So create ambitious programs, empower the program officers to pick out and incubate the brightest and most ambitious risk-takers, and reward them when the scientists they invest in make real discoveries.” What is the evidence that program officers in NSERC or NSF have the vision to pick winners? This is difficult advice for ecologists, who are asked for opinions on support for research projects in fields that require long-term studies to produce increases in ecological understanding or better management of biodiversity. It does seem like a recipe for scientific charlatans.

The bottom line: we think the good ideas in this article are overwhelmed by poor suggestions with regard to ecological research. We come from an ecological world faced with three critical problems that will determine the fate of the Earth – food security, biodiversity loss, and overpopulation. While we all like ‘breakthroughs’ that give us an iPhone 6S or an electric car, few of the discoveries that have increased our knowledge about nature would be considered breakthroughs. So do we say goodbye to taxonomic research, biodiversity monitoring, investigating climate change impacts on Canadian ecosystems, or investing in biological control of pests? Perhaps we can add the provocative word “breakthrough” to our ecological papers and media reports more frequently, but our real goal is to acquire greater insight into achieving a sustainable world.

As a footnote to this discussion, Dev (2015) raises the issue of the unsolved major problems in biology. None of them involve environmental or ecological issues.

Dev, S.B. (2015) Unsolved problems in biology—The state of current thinking. Progress in Biophysics and Molecular Biology, 117, 232-239.

Fortin, J.-M. & Currie, D.J. (2013) Big science vs. little science: How scientific impact scales with funding. PLoS ONE, 8, e65263.

Henson, S.A. (2014) Slow science: the value of long ocean biogeochemistry records. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 372 (2025). doi: 10.1098/rsta.2013.0334.

Prasad, V. (2015) The folly of big science. New York Times, October 2, 2015. (http://www.nytimes.com/2015/10/03/opinion/the-folly-of-big-science-awards.html?_r=0)

 

A Survey of Strong Inference in Ecology Papers: Platt’s Test and Medawar’s Fraud Model

In 1897 Chamberlin wrote an article in the Journal of Geology on the method of multiple working hypotheses as a way of testing scientific ideas experimentally (Chamberlin 1897, reprinted in Science). Ecology was scarcely invented at that time, and this stimulated my quest to see whether current ecology journals subscribe to Chamberlin’s approach to science. Platt (1964) formalized this approach as “strong inference” and argued that it was the best way for science to progress rapidly. If this is the case (and some do not agree that this approach is suitable for ecology), then we might use this model to check now and then on the state of ecology via published papers.

I did a very small survey in the Journal of Animal Ecology for 2015. Most ecologists, I hope, would classify this as one of our leading journals. I asked the simple question of whether the Introduction to each paper stated explicit hypotheses and explicit alternative hypotheses, and categorized each paper as ‘yes’ or ‘no’. There is certainly a problem here in that many papers stated a hypothesis or idea they wanted to investigate but never discussed what the alternative was, or indeed whether there was an alternative hypothesis. As a potential set of covariates, I tallied how many times the word ‘hypothesis’ or ‘hypotheses’ occurred in each paper, as well as the words ‘test’, ‘prediction’, and ‘model’. Most uses of ‘model’ and ‘test’ were in the context of statistical models or statistical tests of significance. Singular and plural forms of these words were all counted.

This is not a publication and I did not want to spend the rest of my life looking at all the other ecology journals and many issues, so I concentrated on the Journal of Animal Ecology, volume 84, issues 1 and 2, in 2015. I obtained the following results for the 51 articles in these two issues (word counts are the number of times the word appeared per article):

Explicit hypothesis and alternative hypotheses:  Yes 22%,  No 78%  (51 articles)

          “Hypothesis”   “Test”   “Prediction”   “Model”
Mean          3.1          7.9        6.5          32.5
Median        1            6          4            20
Range         0-23         0-37       0-27         0-163
There are lots of problems with a simple analysis like this, and perhaps its utility lies in stimulating a more sophisticated analysis of a wider variety of journals. It is certainly not a random sample of the ecology literature. But maybe it gives us a few insights into ecology in 2015.

I found the results quite surprising in that many papers failed Platt’s Test for strong inference. Many papers stated hypotheses but failed to state alternative hypotheses. In some cases the implied alternative was the now-discredited null hypothesis (Johnson 2002). One possible reason for the failure to state hypotheses clearly was discussed by Medawar many years ago (Howitt and Wilson 2014; Medawar 1963). He pointed out that most scientific papers are written backwards: the data are analysed, the conclusions are drawn, and the introduction is then written in full knowledge of the results to follow. A significant number of papers in the issues I looked at seem to have been written following Medawar’s “fraud model”.

But make of these data what you will, and I appreciate that many people write papers in a less formal style than Medawar or Platt would prefer. Many have alternative hypotheses in mind but do not write them down clearly. And perhaps many referees do not think we should be restricted to the hypothetico-deductive approach to science. All of these points of view should be discussed rather than ignored. I note that some ecological journals now turn back papers that have no clear statement of a hypothesis in the introduction.

The word ‘model’ is the most common word in this analysis, typically in the context of a statistical model evaluated by AIC-type statistics. The word ‘test’ was most commonly used for statistical tests (‘t-test’). Indeed, virtually all of these papers overflow with statistical estimates of various kinds. Few, however, come back in the conclusions to state exactly what progress the paper has made, and even fewer say what should be done next. From this small survey there is considerable room for improvement in ecological publications.

Chamberlin, T.C. 1897. The method of multiple working hypotheses. Journal of Geology 5: 837-848 (reprinted in Science 148: 754-759 in 1965). doi:10.1126/science.148.3671.754

Howitt, S.M., and Wilson, A.N. 2014. Revisiting “Is the scientific paper a fraud?”. EMBO reports 15(5): 481-484. doi:10.1002/embr.201338302

Johnson, D.H. 2002. The role of hypothesis testing in wildlife science. Journal of Wildlife Management 66(2): 272-276. doi: 10.2307/3803159

Medawar, P.B. 1963. Is the scientific paper a fraud? In “The Threat and the Glory”. Edited by P.B. Medawar. Harper Collins, New York. pp. 228-233. (Reprinted by Harper Collins in 1990. ISBN: 9780060391126.)

Platt, J.R. 1964. Strong inference. Science 146: 347-353. doi:10.1126/science.146.3642.347

On Indices of Population Abundance

I am often surprised at ecological meetings by how many studies rely on indices rather than direct measures. The most obvious cases involve population abundance. Two common criteria for declaring a species endangered are that its population has declined by more than 70% in the last ten years (or three generations) or that it numbers fewer than 2500 mature individuals. The criteria are many, and every attempt is made to make them quantitative. But too often the methods used to estimate changes in population abundance are based on an index of population size, and all too rarely is the index calibrated against known abundances. If an index increases 2-fold, e.g. from 20 to 40 counts, it is not at all clear that the population itself has increased 2-fold. I think many ecologists begin their career thinking that indices are useful and reliable, and end it wondering whether indices are giving us a correct picture of population changes.

The subject of indices has been discussed many times in ecology, particularly among applied ecologists. Anderson (2001, p. 1295) challenged wildlife ecologists to remember that indices include an unmeasured term, detectability:

“While common sense might suggest that one should estimate parameters of interest (e.g., population density or abundance), many investigators have settled for only a crude index value (e.g., “relative abundance”), usually a raw count. Conceptually, such an index value (c) is the product of the parameter of interest (N) and a detection or encounter probability (p): then c = pN.”

He noted that many indices used by ecologists rest on the large assumption that the probability of encounter is constant over time, space, and individual observers. Much of the subsequent discussion of detectability flowed from these early papers (Williams, Nichols & Conroy 2002; Southwell, Paxton & Borchers 2008). There is an interesting exchange over Anderson’s (2001) paper by Engeman (2003), followed by a retort by Anderson (2003) that ended with this blast at small-mammal ecologists:
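Anderson’s point is easy to demonstrate numerically. Here is a minimal sketch (all numbers invented for illustration) of how a change in detection probability alone can masquerade as a population decline under the index relation c = pN:

```python
# Sketch: why a raw-count index can mislead when detectability varies.
# Anderson (2001): a count c confounds abundance N with detection p, c = p * N.
# All numbers below are invented for illustration only.

N = 500  # true population size, held constant across both surveys

p_year1 = 0.30  # detection probability in year 1 (e.g., open habitat)
p_year2 = 0.15  # detection probability in year 2 (e.g., denser cover)

c_year1 = p_year1 * N  # expected raw count, year 1 (about 150)
c_year2 = p_year2 * N  # expected raw count, year 2 (about 75)

# The index halves even though the population has not changed at all.
print("counts:", c_year1, c_year2)
print("apparent change:", c_year2 / c_year1)  # roughly 0.5
print("true change:", 1.0)
```

Without an independent estimate of p in each survey, nothing in the counts themselves distinguishes this scenario from a genuine 50% decline.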

“Engeman (2003) notes that McKelvey and Pearson (2001) found that 98% of the small-mammal studies reviewed resulted in too little data for valid mark-recapture estimation. This finding, to me, reflects a substantial failure of survey design if these studies were conducted to estimate population size. … O’Connor (2000) should not wonder “why ecology lags behind biology” when investigators of small-mammal communities commonly (i.e., over 700 cases) achieve sample sizes <10. These are empirical methods; they cannot be expected to perform well without data.” (page 290)

Take that you small mammal trappers!

The warnings are clear about index data. In some cases they may be useful but they should never be used as population abundance estimates without careful validation. Even by small mammal trappers like me.

Anderson, D.R. (2001) The need to get the basics right in wildlife field studies. Wildlife Society Bulletin, 29, 1294-1297.

Anderson, D.R. (2003) Index values rarely constitute reliable information. Wildlife Society Bulletin, 31, 288-291.

Engeman, R.M. (2003) More on the need to get the basics right: population indices. Wildlife Society Bulletin, 31, 286-287.

McKelvey, K.S. & Pearson, D.E. (2001) Population estimation with sparse data: the role of estimators versus indices revisited. Canadian Journal of Zoology, 79, 1754-1765.

O’Connor, R.J. (2000) Why ecology lags behind biology. The Scientist, 14, 35.

Southwell, C., Paxton, C.G.M. & Borchers, D.L. (2008) Detectability of penguins in aerial surveys over the pack-ice off Antarctica. Wildlife Research, 35, 349-357.

Williams, B.K., Nichols, J.D. & Conroy, M.J. (2002) Analysis and Management of Animal Populations. Academic Press, New York.

Citation Analysis Gone Crazy

Perhaps we should stop and look at the evils of citation analysis in science. Citation analysis began some 15 or 20 years ago with a useful thought that it might be nice to know if one’s scientific papers were being read and used by others working in the same area. But now it has morphed into a Godzilla that has the potential to run our lives. I think the current situation rests on three principles:

  1. Your scientific ability can be measured by the number of citations you receive. This is patent nonsense.
  2. The importance of your research is determined by which journals accept your papers. More nonsense.
  3. Your long-term contribution to ecological science can be measured precisely by your h-score or some variant.
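For readers unfamiliar with the h-score (h-index) mentioned in point 3, it is simple to compute, which is exactly why administrators like it. A minimal sketch (the citation counts are invented):

```python
def h_index(citations):
    """h-index (Hirsch 2005): the largest h such that the researcher has
    at least h papers, each cited at least h times."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank  # at least `rank` papers have >= `rank` citations
        else:
            break
    return h

# Invented citation counts for ten papers:
print(h_index([50, 18, 9, 6, 5, 4, 3, 2, 1, 0]))  # 5
```

Note what the score throws away: the paper with 50 citations counts no more toward h than the paper with 5, which is part of why a single number is such a blunt summary of a career.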

These principles appeal greatly to the administrators of science and to many of the people who dish out the money for scientific research: you can justify your decisions with numbers, an excellent way to make the research enterprise look quantitative. The contrary view, which I hope is held by many scientists, rests on three different principles:

  1. Your scientific ability is difficult to measure and can only be approximately evaluated by another scientist working in your field. Science is a human enterprise not unlike music.
  2. The importance of your research is impossible to determine in the short term of a few years, and in a subject like ecology probably will not be recognized for decades after it is published.
  3. Your long-term contribution to ecological science will have little to do with how many citations you accumulate.

It will take a good historian to evaluate these alternative views of our science.

This whole issue would not matter except that it is eroding science hiring and science funding. The latest I have heard is that Norwegian universities are now given a large amount of money by the government if they publish a paper in SCIENCE or NATURE, and a very small amount if they publish the same results in the CANADIAN JOURNAL OF ZOOLOGY or – God forbid – the CANADIAN FIELD NATURALIST (or equivalent ‘lower class’ journals). I am not sure how many other universities will adopt this kind of reward-based publication scoring. All of this is done, I think, because we do not wish to involve human judgment in decision making. I suppose you could argue that this is a grand experiment like climate change (with no controls): use these scores for 30 years and then see whether they worked better than the old system based on human judgment. But how does one evaluate such an experiment?

NSERC (the Natural Sciences and Engineering Research Council) in Canada has been trending in that direction over the last several years. In the good old days scientists read research proposals and made judgments about the problem, the approach, and the likelihood of success of a research program, and they took time to discuss at least some of the issues. But we are now moving to quantitative scores that replace human judgment, which I believe to be a very large mistake.

I view ecological research and practice much as I view medical research and practice. We do not know in advance how well particular studies and experiments will work, any more than a surgeon knows exactly whether a particular technique or treatment will succeed, or whether a particular young doctor will become a good surgeon; we gain by experience in a mostly non-quantitative manner. Meanwhile we should encourage young scientists to try new ideas and studies, and give them opportunities based on judgments rather than on counts of papers or citations. Currently we want to rank every scientist and every university like sporting teams and find out the winner. This is a destructive paradigm for science. It works for tennis but not for ecology.


Is Ecology like Economics?

One statement in Thomas Piketty’s book on economics struck me as a possible description of ecology’s development. On page 32 he states:

“To put it bluntly, the discipline of economics has yet to get over its childish passion for mathematics and for purely theoretical and often highly ideological speculation at the expense of historical research and collaboration with the other social sciences. Economists are all too often preoccupied with petty mathematical problems of interest only to themselves. This obsession with mathematics is an easy way of acquiring the appearance of scientificity without having to answer the far more complex questions posed by the world we live in.”

If this is at least a partially correct summary of ecology’s history, we could argue that finally, in the last 20 years, ecology has begun to analyze the far more complex questions posed by the ecological world. But it does so against a background of oversimplified models, whether verbal or mathematical, into which we continually try to fit our data. Square pegs into round holes.

Part of this problem arises from the hierarchy of science in which physics and in particular mathematics are ranked as the ideals of science to which we should all strive. It is another verbal model of the science world constructed after the fact with little attention to the details of how physics and the other hard sciences have actually progressed over the past three centuries.

Sciences also rank high in the public mind when they provide humans with more gadgets and better cars and airplanes, so that technology and science are always confused. Physics led to engineering which led to all our modern gadgets and progress. Biology has assisted medicine in continually improving human health, and natural history has enriched our lives by raising our appreciation of biodiversity. But ecology has provided a less clearly articulated vision for humans with a new list of commandments that seem to inhibit economic ‘progress’. Much of what we find in conservation biology and wildlife management simply states the obvious that humans have made a terrible mess of life on Earth – extinctions, overharvesting, pollution of lakes and the ocean, and invasive weeds among other things. In some sense ecologists are like the priests of old, warning us that God or some spiritual force will punish us if we violate some commandments or regulations. In our case it is the Earth that suffers from poorly thought out human alterations, and, in a nutshell, CO2 is the new god that will indeed guarantee that the end is near. No one really wants to hear or believe this, if we accept the polls taken in North America.

So the bottom line for ecologists should be to concentrate on the complex questions posed by the biological world, and try first to understand the problems and second to suggest some way to solve them. Much easier said than done, as we can see from the current economic mess in what might be a sister science.

Piketty, T. 2014. Capital in the Twenty-First Century. Belknap Press of Harvard University Press, Cambridge, Massachusetts. 696 pp. ISBN 9780674430006

Back to p-Values

Alas, ecology has slipped lower on the totem pole of serious sciences thanks to an article that has captured the attention of the media:

Low-Décarie, E., Chivers, C., and Granados, M. 2014. Rising complexity and falling explanatory power in ecology. Frontiers in Ecology and the Environment 12(7): 412-418. doi: 10.1890/130230.

There is much that is positive in this paper, so you should read it if only to decide whether or not to use it in a graduate seminar in statistics or in ecology. Much of what it concludes is certainly true: there are more p-values in papers now than there were some years ago. The question then comes down to what these kinds of statistics mean, how they would justify the conclusion captured by the media that explanatory power in ecology is declining over time, and what, if anything, should be done about that decline. Since, as far as I can see, most statisticians today seem to believe that p-values are meaningless (e.g. Ioannidis 2005), one wonders what the value of showing this trend is. A second item on which most statisticians agree is that R² values are a poor measure of anything other than the items in a particular data set. Any ecological paper that analyses and reports data summarizes many tests providing p-values and R² values, of which only some are reported. It would be interesting to compare with a recognized mature science (like physics or genetics) by asking whether past revolutions in understanding and predictive power in those sciences corresponded with increasing numbers of p-values or R² values.
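One quick way to see why an R² value describes only the data set at hand: an ordinary least-squares fit of pure noise gains R² with every random predictor added, even though there is nothing to explain. A minimal simulated sketch (all data invented; the Gram-Schmidt projection is just one convenient way to compute in-sample R² without any statistics library):

```python
import random

def dot(a, b):
    return sum(x * z for x, z in zip(a, b))

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def r_squared(y, predictors):
    """In-sample R² of an OLS fit with intercept: project the centered
    response onto an orthonormal basis (Gram-Schmidt) of the centered
    predictors and take the explained fraction of total variation."""
    yc = center(y)
    total = dot(yc, yc)
    basis = []
    for p in predictors:
        v = center(p)
        for q in basis:  # remove components already in the basis
            coef = dot(v, q)
            v = [a - coef * b for a, b in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > 1e-12:
            basis.append([a / norm for a in v])
    explained = sum(dot(yc, q) ** 2 for q in basis)
    return explained / total

random.seed(42)
n = 25
y = [random.gauss(0, 1) for _ in range(n)]  # response: pure noise
xs = [[random.gauss(0, 1) for _ in range(n)] for _ in range(20)]  # 20 noise predictors

for k in (1, 5, 10, 20):
    print(k, "predictors -> in-sample R² =", round(r_squared(y, xs[:k]), 2))
```

In-sample R² can never decrease as predictors are added, so throwing 20 random variables at 25 noise observations produces an impressively high R² that predicts nothing about any new data.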

To ask these questions is to ask: what is the metric of scientific progress? At present we confuse progress with indicators that may have little to do with scientific advancement. As journal editors we race to increase our journal’s impact factor, which is interpreted as a measure of importance. For appointments to university positions we ask how many citations a person has and how many papers they have produced. We confuse scientific value with numbers that, ironically, might have a very low R² value as predictors of potential progress in a science. These numbers make sense as metrics to tell publishing houses how influential their journals are, or to tell Department Heads how fantastic their hiring choices are, but we fool ourselves if we accept them as indicators of value to science.

If you wish to judge scientific progress, you might look at the books that have gathered together the most important papers of the time, and examine a sequence of them from the 1950s to the present. What is striking is that papers that seemed critically important in the 1960s or 1970s are now thought to concern relatively uninteresting side issues, and conversely papers that were ignored earlier are now thought to be critical to understanding. A list of these changes might be a useful accessory for anyone asking how to judge importance or progress in a science.

A final comment: consider why a relatively mature science like geology has completely failed to predict earthquakes in advance, or even to specify the locations of some earthquakes (Stein et al. 2012; Uyeda 2013). Progress in understanding does not of necessity mean progress in prediction. And we ought to be wary of confusing progress with p-values and R² values.

Ioannidis, J.P.A. 2005. Why most published research findings are false. PLoS Medicine 2(8): e124.

Stein, S., Geller, R.J., and Liu, M. 2012. Why earthquake hazard maps often fail and what to do about it. Tectonophysics 562-563: 1-24. doi:10.1016/j.tecto.2012.06.047

Uyeda, S. 2013. On earthquake prediction in Japan. Proceedings of the Japan Academy, Series B 89(9): 391-400. doi: 10.2183/pjab.89.391.