data science

New Ecoinformatics working group

Earlier this year, a few colleagues (Ignasi Bartomeus, Sara Varela, Antonio J. Pérez-Luque) and I created a new working group on Ecoinformatics within the Spanish Terrestrial Ecology Association (AEET). Our main goals are to promote knowledge and training, and to exchange experiences, on all aspects of ecoinformatics, including data management, statistical modelling, and programming.

Reproducible Science: What, Why, How

Reproducibility is a hot topic in science nowadays (e.g. see this Nature special). Some argue that we are in the middle of a ‘reproducibility crisis’, and thus scientists are being strongly encouraged to increase the reproducibility of their research.

Reproducible workflows

As a side product (or trailer) of our paper on reproducible science, we made a video promoting reproducible workflows. In particular, it shows how using Git and Rmarkdown makes your research and scientific collaboration much easier and better, compared to a typical (non-reproducible) workflow involving Excel, Word, some figure production software, and a lot of manual steps.

Reproducible science: what, why, how

Most scientific papers are not reproducible: it is really hard, if not impossible, to understand how results are derived from data, and to regenerate them in the future (even for the same researchers). However, traceability and …

Writing papers in Rmarkdown

Rmarkdown is a great tool for reproducible science. You can combine text and code to produce dynamic reports that generate updated results with a single click, as in the example below.

Should supervisors review their students' code?

I supervised my first master's project this year. The project is coming to an end, and I am very happy with the results as well as with the fully reproducible workflow we followed: everything was developed on GitHub using an R package structure and Rmarkdown.

Toward a more reproducible ecology: calculating plant cover in vegetation transects in R

Science has a big reproducibility problem: hardly anyone can reproduce (i.e. re-run, re-obtain) the results of most published papers (including the authors themselves!). That is a big problem not only for science as a collective enterprise but also for scientists' everyday life (‘how did I do this?

Reproducible Research with Rmarkdown: data management, analysis and reporting all-in-one

pacotools: Miscellaneous tools I often need and forget about

My very first R package, from 2012!