
Why do Scientists Reinvent Wheels?

We may reinvent wheels by repeating research that has already been completed and published elsewhere. In one sense there is no great harm in this: statisticians would call it replication of the first study, and the more replication, the more convinced we are that the results of the study are robust. A problem arises when the repeated study reaches different results from the first study. If this occurs, there is a need to do another study to determine whether there is a general pattern in the results, or whether different habitats give different answers to the question being investigated. But after a series of studies has been done, it is time to do something else, since the original question has been answered and replicated. Such repeated studies are often the subject of M.Sc. or Ph.D. theses, which have a limited 1-3-year time window to reach completion. The only general warning for these kinds of replicated studies is to read all the old literature on the subject. This is too often neglected, and reviewers often notice missing references for a repeated study. Science is an ongoing process, but that does not mean that all the important work has been carried out in the last 5 years.

There is a valid time and place to repeat a study, for example when the habitat has been greatly fragmented or altered by human land use, or when climate change has made a strong impact on the ecosystem under study. The problem in this case is to have an adequate background of data that allows you to interpret your current data. If there is a fundamental problem with ecological studies to date, it is that we have an inadequate baseline for comparison for many ecosystems. We can conclude that a particular ecosystem is losing species (due to land-use change or climate) only if we know what species comprised this ecosystem in past years and how much the species composition fluctuated over time. The time frame desirable for background data may be only 5 years for some species or communities, but for many communities it may be 20-40 years or more. We are too often mired in the assumption that communities and ecosystems were in equilibrium in the past, so that any fluctuations now seen are unnatural. This time-frame problem bedevils calls for conservation action when data are deficient.

The Living Planet Report of 2018 has been widely quoted as stating that global wildlife populations have decreased 60% in the last 4 decades. The analysis is based on changes in 4000 vertebrate species. There are about 70,000 vertebrate species on Earth, so this statement rests on about 6% of the vertebrates. The purpose of the Living Planet Report is to educate us about conservation issues and encourage political action. No ecologist in their right mind would question this 60% figure lest they be cast out of the profession, but it is a challenge to the graduate students of today to analyze this statistic and determine how reliable it is. We all ‘know’ that elephants and rhinos are declining, but they are hardly a random sample. The problem in a nutshell is that we have reliable long-term data on perhaps 0.01% or less of all vertebrate species. By long term I suggest we set a minimal limit of 10 generations. As another sobering test of these kinds of statements, I suggest picking your favorite animal, reading all you can on how to census the species, and then locating how many studies of that species meet the criteria of a good census. The African elephant could be a good place to start, since everyone is convinced that it has declined drastically. The information in the Technical Supplement is a good starting point for a discussion about data accuracy in a conservation class.
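The sample-fraction arithmetic behind that 6% figure is easy to check. A minimal sketch in Python, using the round numbers quoted above (both counts are approximations, not exact species tallies):

```python
# Round numbers as quoted in the text, not exact species counts.
species_in_index = 4_000      # vertebrate species used in the Living Planet analysis
vertebrate_species = 70_000   # approximate number of vertebrate species on Earth

fraction = species_in_index / vertebrate_species
print(f"{fraction:.1%}")  # 5.7%, i.e. roughly 6% of vertebrates
```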

My advice is that ecologists should not, without careful thought, repeat studies that have already been carried out many times on common species. Look for gaps in the current wisdom. Many of our species of concern are indeed declining and need action, but we need knowledge of what kinds of management actions are helpful and possible. Many of our species have not been studied long enough to know whether they are under threat or not. It is not helpful to ‘cry wolf’ if indeed there is no wolf there. We need precision and accuracy now more than ever.

World Wildlife Fund. 2018. Living Planet Report – 2018: Aiming Higher. Grooten, M. and Almond, R.E.A. (Eds). WWF, Gland, Switzerland. ISBN: 978-2-940529-90-2.
https://wwf.panda.org/knowledge_hub/all_publications/living_planet_report_2018/

On Questionable Research Practices

Ecologists and evolutionary biologists have been tarred and feathered along with many other scientists accused of questionable research practices. So says this article in “The Conversation” on the web:
https://theconversation.com/our-survey-found-questionable-research-practices-by-ecologists-and-biologists-heres-what-that-means-94421?utm_source=twitter&utm_medium=twitterbutton

Read this article if you have time, but here is the essence of what they state:

“Cherry picking or hiding results, excluding data to meet statistical thresholds and presenting unexpected findings as though they were predicted all along – these are just some of the “questionable research practices” implicated in the replication crisis psychology and medicine have faced over the last half a decade or so.

“We recently surveyed more than 800 ecologists and evolutionary biologists and found high rates of many of these practices. We believe this to be first documentation of these behaviours in these fields of science.

“Our pre-print results have certain shock value, and their release attracted a lot of attention on social media.

  • 64% of surveyed researchers reported they had at least once failed to report results because they were not statistically significant (cherry picking)
  • 42% had collected more data after inspecting whether results were statistically significant (a form of “p hacking”)
  • 51% reported an unexpected finding as though it had been hypothesised from the start (known as “HARKing”, or Hypothesising After Results are Known).”

It is worth looking at these claims a bit more analytically. First, the fact that more than 800 ecologists and evolutionary biologists were surveyed tells you nothing about the precision of these results unless you can be convinced that this is a random sample. Most surveys are non-random and yet are reported as though they were random, reliable samples.
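The distinction between a large sample and a random sample can be illustrated with a short simulation. All the numbers below are invented for illustration: the true prevalence of a behaviour is set at 30%, but researchers who engage in it are assumed to be twice as likely to answer the survey. The resulting bias dwarfs the margin of error that the sample size alone would suggest:

```python
import math
import random

random.seed(42)

# Hypothetical population of researchers; 30% truly engage in the behaviour.
TRUE_RATE = 0.30
N_POP = 100_000
population = [random.random() < TRUE_RATE for _ in range(N_POP)]

# 95% margin of error for a genuinely random sample of ~800 (worst case p = 0.5):
n = 800
moe = 1.96 * math.sqrt(0.5 * 0.5 / n)
print(f"Random-sample margin of error: +/-{moe:.1%}")  # about +/-3.5%

# Non-random sample: engagers respond at 2%, non-engagers at 1%
# (an arbitrary illustrative assumption about response propensity).
respondents = [x for x in population if random.random() < (0.02 if x else 0.01)]
observed = sum(respondents) / len(respondents)
print(f"True rate: {TRUE_RATE:.0%}, biased-sample estimate: {observed:.0%}")
```

The biased estimate lands far above the true 30%, well outside the nominal margin of error, even though the self-selected sample is comfortably larger than 800.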

Failing to report results is common in science for a variety of reasons that have nothing to do with questionable research practices. Many graduate theses contain results that are never published. Does this mean their data are being hidden? Many results are not reported because they did not find an expected result. This sounds awful until you realize that journals often turn down papers because they are not exciting enough, even though the results are completely reliable. Other results are not reported because the investigator realized, once the study was complete, that it had not been carried on long enough, and the money had run out to do more research. One would need considerable detail about each study to know whether these 64% of researchers were “cherry picking”.

Alas, the next problem is more serious. The 42% who are accused of “p-hacking” were possibly just using sequential sampling, or using a pilot study to get the statistical parameters needed to conduct a power analysis. Any study which uses replication in time, a highly desirable attribute of an ecological study, would be vilified by this rule. This complaint echoes the statistical advice not to use p-values at all (Ioannidis 2005, Bruns and Ioannidis 2016) and refers back to complaints about inappropriate uses of statistical inference (Amrhein et al. 2017, Forstmeier et al. 2017). The appropriate solution to this problem is to have a defined experimental design with specified hypotheses and predictions rather than an open-ended observational study.
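The legitimate practice described above, using a pilot study to fix the sample size in advance rather than peeking at p-values as data accumulate, can be sketched with a standard normal-approximation power calculation for comparing two group means. All the numbers below are invented for illustration:

```python
import math

# Pilot-study numbers are illustrative assumptions, not from any real study.
pilot_sd = 4.0         # standard deviation estimated from a small pilot study
effect_size = 2.0      # smallest biologically meaningful difference in means
z_alpha = 1.96         # two-sided normal critical value for alpha = 0.05
z_beta = 0.84          # one-sided normal value for 80% power

# Normal-approximation sample size per group for a two-sample comparison:
n_per_group = 2 * (z_alpha + z_beta) ** 2 * pilot_sd ** 2 / effect_size ** 2
print(math.ceil(n_per_group))  # 63 animals per group
```

Fixing this number before the main study begins is what separates a planned sequential design from optional stopping: the decision of when to stop sampling is made from the pilot variance, not from the p-value of the accumulating data.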

The third problem, about unexpected findings, hits at an important aspect of science, the uncovering of interesting and important new results. It is an important point and was warned about long ago by Medawar (1963) and emphasized recently by Forstmeier et al. (2017). The general solution should be that novel results in science must be considered tentative until they can be replicated, so that science becomes a self-correcting process. But the temptation to emphasize a new result is hard to restrain in the era of difficult job searches and media attention to novelty. Perhaps the message is that you should read any “unexpected findings” in Science and Nature with a degree of skepticism.

The cited article published in “The Conversation” goes on to discuss some possible interpretations of what these survey results mean. And the authors lean over backwards to indicate that these survey results do not mean that we should not trust the conclusions of science, which unfortunately is exactly what some aspects of the public media have emphasized. Distrust of science can be a justification for rejecting climate change data and rejecting the value of immunizations against diseases. In an era of declining trust in science, these kinds of trivial surveys have shock value but are of little use to scientists trying to sort out the details about how ecological and evolutionary systems operate.

A significant source of these concerns flows from the literature that focuses on medical fads and ‘breakthroughs’ that are announced every day by the media searching for ‘news’ (e.g. “eat butter”, “do not eat butter”). The result is an almost comical caricature of how good scientists really operate. An essential assumption of science is that scientific results are not written in stone but are always subject to additional testing and modification or rejection. But one result is that we get a parody of science that says “you can’t trust anything you read” (e.g. Ashcroft 2017). Perhaps we just need to remind ourselves to be critical, that good science is evidence-based, and then remember George Bernard Shaw’s comment:

Success does not consist in never making mistakes but in never making the same one a second time.

Amrhein, V., Korner-Nievergelt, F., and Roth, T. 2017. The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research. PeerJ 5: e3544. doi: 10.7717/peerj.3544.

Ashcroft, A. 2017. The politics of research-Or why you can’t trust anything you read, including this article! Psychotherapy and Politics International 15(3): e1425. doi: 10.1002/ppi.1425.

Bruns, S.B., and Ioannidis, J.P.A. 2016. p-Curve and p-Hacking in observational research. PLoS ONE 11(2): e0149144. doi: 10.1371/journal.pone.0149144.

Forstmeier, W., Wagenmakers, E.-J., and Parker, T.H. 2017. Detecting and avoiding likely false-positive findings – a practical guide. Biological Reviews 92(4): 1941-1968. doi: 10.1111/brv.12315.

Ioannidis, J.P.A. 2005. Why most published research findings are false. PLOS Medicine 2(8): e124. doi: 10.1371/journal.pmed.0020124.

Medawar, P.B. 1963. Is the scientific paper a fraud? Pp. 228-233 in The Threat and the Glory. Edited by P.B. Medawar. Harper Collins, New York. ISBN 978-0-06-039112-6