Reproducible Science: What, Why, How

Reproducibility is a hot topic in science nowadays (e.g. see this Nature special). Some argue that we are in the middle of a ‘reproducibility crisis’, and scientists are therefore being strongly encouraged to increase the reproducibility of their research. Doing reproducible research also provides many benefits to the individual researcher. This is a recent topic, however, and there is little literature to guide newcomers and people who want to improve their reproducibility (e.g. there is virtually nothing written in Spanish).

We have just published a review paper on reproducible science: what it is, why it is important, and how we can improve the reproducibility of our research. It is published in Ecosistemas, the official journal of the Spanish Terrestrial Ecology Association (AEET). Hence it is written in Spanish, although some parts are in English (e.g. abstract, figures). We also prepared an appendix with resources for learning more about reproducible science, data management, version control…

We hope this paper helps make more scientists aware of the importance of doing reproducible research, and helps them get started with reproducible workflows.
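As a minimal illustration of what a reproducible workflow can look like (a sketch only, not taken from the paper), the script below regenerates its reported numbers directly from the raw data and records a fingerprint of that data, so a result can always be traced back to its exact input. The data values, column names, and function names (`load_richness`, `analyse`) are hypothetical, and Python is used here purely for illustration; the same idea applies in R or any other language.

```python
import csv
import hashlib
import io
import statistics

# Hypothetical raw data, embedded so the sketch is self-contained.
# In a real project this would be a read-only file (e.g. a raw CSV).
RAW_CSV = """site,species_richness
A,12
B,15
C,9
D,14
"""


def load_richness(raw_text):
    """Parse the raw CSV text into a list of integer richness values."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [int(row["species_richness"]) for row in reader]


def analyse(raw_text):
    """Regenerate every reported number directly from the raw data."""
    values = load_richness(raw_text)
    return {
        "n_sites": len(values),
        "mean_richness": statistics.mean(values),
        # Fingerprint of the input, so results can be traced to exact data.
        "data_sha256": hashlib.sha256(raw_text.encode("utf-8")).hexdigest(),
    }


if __name__ == "__main__":
    for name, value in analyse(RAW_CSV).items():
        print(f"{name}: {value}")
```

Because no number is typed in by hand, rerunning the script on the same data gives the same results, and any change to the data shows up as a different fingerprint.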

Rodríguez-Sánchez, F., Pérez-Luque, A.J., Bartomeus, I., Varela, S. 2016. Reproducible science: what, why, how. Ecosistemas 25(2): 83-92.

Most scientific papers are not reproducible: it is really hard, if not impossible, to understand how results are derived from data, and to regenerate them in the future (even for the same researchers). However, traceability and reproducibility of results are indispensable elements of high-quality science, and an increasing requirement of many journals and funding sources. Reproducible studies include code able to regenerate results from the original data. This practice not only provides a perfect record of the whole analysis but also reduces the probability of errors and facilitates code reuse, thus accelerating scientific progress. Doing reproducible science also brings many benefits to the individual researcher, including savings in time and effort, improved collaborations, and higher quality and impact of final publications. In this article we introduce reproducible science, explain why it is important, and show how we can improve the reproducibility of our work. We introduce principles and tools for data management, analysis, version control, and software management that help us achieve reproducible workflows in the context of ecology.

Keywords: data analysis; ecoinformatics; ecology; open science; programming; R; reproducibility

Francisco Rodríguez-Sánchez

Computational Ecologist & Data Scientist.