Should supervisors review their students' code?

I supervised my first master’s project this year. The project is coming to an end, and I am very happy with the results as well as with the fully reproducible workflow we followed: everything was developed on GitHub, using an R package structure and R Markdown.

However, I realise I have spent several weeks working full-time on revising my student’s code before even starting to revise the actual manuscript. My first thought (nothing new, really) is that reviewing someone else’s code is hard. It is even harder with students and beginning programmers, whose code needs a lot of tidying up before you can even try to figure out whether it works correctly. Most importantly, I found several bugs in the data and code that raised no warning or error message yet produced wrong results, such as incorrect species richness or diversity estimates.
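
To illustrate the kind of silent bug I mean, here is a minimal sketch in Python (our project used R; the records, the site label and the function names below are all made up for illustration, not taken from the project). Inconsistently typed species names inflate the richness estimate, and nothing warns you about it.

```python
# Hypothetical survey records: (site, species name as typed in the data sheet).
records = [
    ("A", "Quercus ilex"),
    ("A", "Quercus ilex "),   # trailing space
    ("A", "quercus ilex"),    # different capitalisation
    ("A", "Pinus pinea"),
]


def richness_naive(records, site):
    """Count distinct species strings exactly as recorded."""
    return len({name for s, name in records if s == site})


def richness_cleaned(records, site):
    """Count distinct species after trimming whitespace and normalising case."""
    return len({name.strip().lower() for s, name in records if s == site})


print(richness_naive(records, "A"))    # 4: runs without complaint, but is wrong
print(richness_cleaned(records, "A"))  # 2: the two species actually present
```

Both functions run cleanly and return a plausible-looking number, which is why this sort of error only tends to surface when someone reads the code or checks the data directly.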

Don’t get me wrong: my student is awesome, and I’m sure anyone could find bugs in my own code too, so that’s not the point. What troubles me is that if I hadn’t spent so much time reviewing every line of code used in the project, the manuscript would now probably contain wrong results. And it worries me that, as far as I can tell, few supervisors review their master’s/PhD students’ code, for obvious reasons: the time drain is considerable. I’ve had to park all other projects for a few weeks (manuscripts that have to be submitted, etc.). And time is so valuable to scientists…

What happens more often is that supervisors get a Word document with inserted figures and tables, so any bugs are likely to go unnoticed unless very strange results pop out. That’s also how most collaborative papers are done: as a coauthor you get a draft manuscript with inserted results, and you ‘have to’ trust they are correct. If coauthors are building a paper on incorrect results, nobody knows (and few seem to ask).

It’s a fact that bugs (in data, code or both) do occur. Shouldn’t we then spend more time reviewing our students/coauthors’ code, trying to minimise them?

If I, as an author, can’t be confident that the results in our paper are sound, why should future readers be?

Francisco Rodríguez-Sánchez

Computational Ecologist & Data Scientist.