Toward a more reproducible ecology: calculating plant cover in vegetation transects in R

Science has a big reproducibility problem: hardly anyone can reproduce (i.e. re-run and re-obtain) the results of most published papers, including the authors themselves! That is a problem not only for science as a collective enterprise but also for scientists' everyday lives (‘how did I do this?’ is a question we frequently ask ourselves).

In the last couple of years I have tried to implement and follow reproducible workflows: I do all my analysis in Rmarkdown, and follow the R package structure as a template for my research projects.

For an ongoing project studying vegetation dynamics in Los Alcornocales Natural Park, we have been measuring plant cover in many vegetation transects across the park. This is done by simply extending a long tape on the forest floor and recording where each species appears and disappears along the transect.

Then, the individual stretches of presence are summed up (often manually, with the help of a calculator) to obtain the total cover (in length) of each plant species. That manual step is not only time-consuming but also error-prone (as a great many calculations are involved) and, most importantly, not reproducible.

As I could not find any function to do these calculations in R, I have written my own function to calculate plant cover (and bare ground cover too) directly from field data. Basically, the function takes a data frame with the positions where a given plant species started to be present and where it disappeared, and returns another data frame with the total cover per species (plus bare ground, not covered by any plant species).
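To make the input and output concrete, here is a minimal sketch of this kind of calculation in base R. It is not the actual function: the column names (`species`, `start`, `end`), the function names `union_length()` and `plant_cover()`, the example values, and the 10 m transect length are all assumptions made up for illustration. Overlapping stretches are merged before summing, so bare ground is whatever part of the tape no species covers.

```r
# Hypothetical field data: one row per stretch where a species touches the tape.
# Column names and values are assumptions for this sketch.
transect <- data.frame(
  species = c("Quercus suber", "Pistacia lentiscus", "Quercus suber", "Erica arborea"),
  start   = c(0.0, 1.5, 6.0, 7.0),
  end     = c(2.0, 3.0, 8.0, 9.5)
)
transect_length <- 10  # metres; assumed known from the field protocol

# Length of the union of a set of intervals, so that overlapping
# stretches are not counted twice.
union_length <- function(start, end) {
  if (length(start) == 0) return(0)
  ord <- order(start)
  start <- start[ord]
  end <- end[ord]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end) {
      # Stretch overlaps the current block: extend the block if needed
      cur_end <- max(cur_end, end[i])
    } else {
      # Gap found: close the current block and start a new one
      total <- total + (cur_end - cur_start)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start)
}

# Total cover per species, plus bare ground (tape not covered by any species).
plant_cover <- function(data, transect_length) {
  per_species <- sapply(
    split(data, data$species),
    function(d) union_length(d$start, d$end)
  )
  covered <- union_length(data$start, data$end)
  data.frame(
    species = c(names(per_species), "bare ground"),
    cover = c(unname(per_species), transect_length - covered),
    row.names = NULL
  )
}

cover <- plant_cover(transect, transect_length)
print(cover)
```

With these made-up data, the species covers add up to more than the covered length of the tape, because different species can overlap; bare ground is computed from the merged stretches of all species together rather than from the per-species totals.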

I have included several checks within the function to detect likely errors in the raw data, which have also proven very helpful to us.
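The post does not list which checks the function performs, so the following is only a sketch of the kind of validation that could catch common data-entry mistakes. The function name `check_transect()`, the column names (`species`, `start`, `end`), and the three rules are assumptions for illustration.

```r
# Hypothetical checks on raw transect data; returns a character vector
# describing the problems found (empty if none).
check_transect <- function(data, transect_length) {
  problems <- character(0)

  # Missing positions usually mean a value was skipped when typing in the data
  if (anyNA(data$start) || anyNA(data$end)) {
    problems <- c(problems, "missing start or end values")
  }

  # Run the remaining checks only on rows with both positions present
  complete <- !is.na(data$start) & !is.na(data$end)
  start <- data$start[complete]
  end <- data$end[complete]

  # A stretch must end after it starts
  if (any(end <= start)) {
    problems <- c(problems, "end is not greater than start")
  }

  # Positions must fall on the tape
  if (any(start < 0 | end > transect_length)) {
    problems <- c(problems, "positions outside the transect")
  }

  problems
}

# Made-up example with two deliberate errors:
# row 2 has start and end swapped, row 3 runs past a 10 m tape.
raw <- data.frame(
  species = c("Quercus suber", "Erica arborea", "Pistacia lentiscus"),
  start   = c(0.0, 4.0, 8.0),
  end     = c(2.0, 3.0, 12.0)
)
found <- check_transect(raw, transect_length = 10)
print(found)
```

Returning the list of problems, rather than stopping at the first one, lets all the errors in a field sheet be reviewed in a single pass.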

In summary, it is a silly function, but I hope it fills a gap in a very common task for plant ecologists. Field data can now be recorded on paper or in electronic notebooks, and this function will automatically calculate total plant species cover in a reproducible way. Please let me know on GitHub if you find problems or have feedback.

Francisco Rodríguez-Sánchez

Computational Ecologist & Data Scientist.