Our paper on tumor evolution in ovarian cancer (see here) came with a nice knitr file to reproduce the survival results, which I used as an example in my recent talk about reproducibility (see here).
I thought that was a nice test scenario to see if I could reproduce the results I got more than a year ago.
How reproducible am I?
Downloading the Rnw from the journal webpage (link) was easy but, of course, it didn’t run smoothly.
LaTeX failed and there were several R error messages.
The joys and frustrations of reproducibility
First of all, I had linked to a BibTeX file instead of just copying the bibliography into the Rnw, as I should have done.
Second, I ran into problems with the survival analysis, because one of the packages had changed.
rms::survplot() used to allow plotting a survfit object through the survplot.survfit() method. However, this method has been deprecated as of version 4.2 of the package.
Luckily, I found an easy workaround: just use npsurv() instead of survfit().
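Here is a minimal sketch of what that swap looks like. It uses the lung data set from the survival package as a stand-in, not the data from the paper, and the model formula is only an illustration:

```r
library(survival)  # provides the lung example data and Surv()
library(rms)       # provides npsurv() and survplot()

# Stand-in data: 'lung' ships with the survival package.
# Before: fit <- survfit(Surv(time, status) ~ sex, data = lung)
# After:  the same call, with npsurv() in place of survfit()
fit <- npsurv(Surv(time, status) ~ sex, data = lung)

# survplot() dispatches on the npsurv object
survplot(fit, xlab = "Days")
```

The call to npsurv() takes the same formula and data arguments as survfit(), so in my case nothing else in the chunk had to change.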
The updated Rnw is here on my webpage, together with a PDF so you can see what the output should look like.
Take-home message for me: Even with a knitr file I wrote myself, reproducibility is not a one-click thing.
To make reproducibility sustainable, I would have to check all published analysis scripts at regular intervals (e.g. once a year or every six months). Am I prepared to do this? And for how long?
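For what it's worth, part of such a periodic check could be scripted. The sketch below is only an illustration: check_rnw() is a hypothetical helper, not something that exists in knitr, and it assumes all published Rnw files sit in one directory. It re-runs the R chunks of each file and reports which ones still run. It does not compile the LaTeX, so a missing BibTeX file would still slip through.

```r
# Hypothetical helper: re-knit every .Rnw file in a directory and
# report which ones still run without an R error.
check_rnw <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.Rnw$", full.names = TRUE)

  # knitr::knit() keeps going after chunk errors by default;
  # make errors stop the run so that tryCatch() can see them.
  old <- knitr::opts_chunk$get("error")
  knitr::opts_chunk$set(error = FALSE)
  on.exit(knitr::opts_chunk$set(error = old))

  ok <- vapply(files, function(f) {
    tryCatch({
      # Write the .tex output to a temporary file; we only care
      # whether the R code runs, not about the result itself.
      knitr::knit(f, output = tempfile(fileext = ".tex"),
                  quiet = TRUE, envir = new.env())
      TRUE
    }, error = function(e) {
      message(basename(f), ": ", conditionMessage(e))
      FALSE
    })
  }, logical(1))

  data.frame(file = basename(files), ok = unname(ok),
             stringsAsFactors = FALSE)
}
```

Calling check_rnw("path/to/published") once or twice a year (the path is a placeholder) would at least tell me which scripts need attention, though fixing them remains manual work.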