Making YOUR Code Reproducible: Tips and Tricks

When we were putting together the British Ecological Society’s Guide to Reproducible Code we asked the community to send us their advice on how to make code reproducible. We got a lot of excellent responses and we tried to fit as many as we could into the Guide. Unfortunately, we ran out of space and there were a few that we couldn’t include.

Luckily, we have a blog where we can post all of those tips and tricks so that you don’t miss out. A massive thanks to everyone who contributed their tips and tricks for making code reproducible – we really appreciate it. Without further ado, here’s the advice that we were sent about making code reproducible that we couldn’t squeeze into the Guide:

Organising Code

©Leejiah Dorward

“Don’t overwrite data files. If data files change, create a new file. At the top of an analysis file define paths to all data files (even if they are not read in until later in the script).” – Tim Lucas, University of Oxford
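As a minimal sketch of this tip in Python (the same idea carries over to R): every path is declared once at the top of the script, and a small helper refuses to overwrite an existing file. The file names and the `write_once` helper are hypothetical illustrations, not something taken from the Guide.

```python
from pathlib import Path

# All data paths are defined once, at the top of the analysis file,
# even if some are not read until much later in the script.
# These file names are hypothetical placeholders.
DATA_DIR = Path("data")
RAW_COUNTS = DATA_DIR / "raw_counts.csv"
CLEAN_COUNTS = DATA_DIR / "clean_counts.csv"


def write_once(path: Path, text: str) -> None:
    """Write text to a new file; raise FileExistsError rather than overwrite."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" means exclusive creation: it fails if the file already exists.
    with path.open("x") as handle:
        handle.write(text)
```

If the cleaned data changes, calling `write_once` with the old path raises an error, which is a prompt to choose a new file name instead of silently replacing the old file.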

“Keep one copy of all code files, and keep this copy under revision management.” – April Wright, Iowa State University

“Learn how to write simple functions – they save your ctrl c & v keys from getting worn out.” – Bob O’Hara, NTNU

“For complex figures, it can make sense to pre-compute the items to be plotted as its own intermediate output data structure. The code to do the calculation then only needs to be adjusted if an analysis changes, while the things to be plotted can be reused any number of times while you tweak how the figure looks.” – Hao Ye, UC San Diego
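One way to picture this separation, sketched in Python under some assumptions: the analysis reduces the raw values to exactly what the figure needs and saves that to a small JSON file, and the plotting code only ever reads that file. The function names and the choice of a per-site mean are hypothetical.

```python
import json
from pathlib import Path
from statistics import mean


def compute_plot_data(measurements):
    """Analysis step: reduce raw values per site to the mean that will be plotted."""
    return {site: mean(values) for site, values in measurements.items()}


def save_plot_data(plot_data, path):
    """Store the intermediate result so the figure code never reruns the analysis."""
    Path(path).write_text(json.dumps(plot_data, indent=2, sort_keys=True))


def load_plot_data(path):
    """Figure step: read back the saved values, ready to pass to a plotting library."""
    return json.loads(Path(path).read_text())
```

With this layout, tweaking colours or axis labels only involves `load_plot_data`; `compute_plot_data` is rerun only when the analysis itself changes.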

“Use version control (GitHub). Academic accounts have free private repos if you worry about that sort of thing.” – Brian O’Meara, University of Tennessee, Knoxville

“Shorten the name of variables and/or names of objects.” – Gbadamassi G. O. Dossa, Kunming Institute of Botany

“Write functions to do repeated operations or complex tasks. A modular code that calls different functions is much more readable and safer. It is much easier to test functions than debugging a long, unstructured script.” – Francisco Rodríguez-Sánchez, Estación Biológica de Doñana (CSIC)

“Use informative file names, and use _ and - strategically in the file names so you can use regex on filenames to subset them.” – Ben Marwick, University of Washington
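To illustrate why the separators matter, here is a small Python sketch. It assumes a hypothetical naming scheme, `<site>_<date>_<measurement>.csv`, in which `_` separates fields and `-` joins words inside a field; the scheme and the function names are made up for this example.

```python
import re

# Hypothetical scheme: <site>_<YYYY-MM-DD>_<measurement>.csv
# "_" separates the fields, "-" joins the parts within one field.
FILENAME_PATTERN = re.compile(
    r"^(?P<site>[a-z-]+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<measurement>[a-z-]+)\.csv$"
)


def parse_filename(name):
    """Return the fields encoded in a file name, or None if it does not fit the scheme."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None


def select_files(names, site):
    """Keep only the file names that belong to the given site."""
    selected = []
    for name in names:
        fields = parse_filename(name)
        if fields is not None and fields["site"] == site:
            selected.append(name)
    return selected
```

Because the two separators play different roles, a single regular expression can pull out the site, date and measurement without any guessing.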

“Use packrat, checkpoint or docker to ensure the versions of programs and packages used for the entire process.” – Karlo Gregório Guidoni Martins, Universidade Federal de Goiás

Writing Code

©Leda Cal

“Avoid premature optimization (even if it is fun). Use spaces instead of tabs. Be consistent in your coding style! Pick the best language for the task. Easier-to-read-but-slightly-slower code is preferable to faster-but-indecipherable code.” – Joseph Brown, University of Michigan

“You can google for answers, but talking to somebody will often save you days of frustration.” – Karl Cottenie, University of Guelph (This one’s a good tip for life in general!)

“Break your code into chunks i.e. ‘functionize’. Identify the sub-tasks involved and write a function for each one and then write a function that brings them together to complete your specific task. This makes your code much easier to read and to troubleshoot whilst also allowing you to include each sub-task into multiple pieces of code and if you need to make modifications you only need to make them once.” – Samantha Price, Clemson University
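A toy version of this structure in Python might look like the following. The data format (lines of `species,count`) and all function names are hypothetical; the point is only the shape: one small function per sub-task, plus one function that brings them together.

```python
def parse_counts(lines):
    """Sub-task 1: turn 'species,count' text lines into (species, count) pairs."""
    records = []
    for line in lines:
        species, count = line.strip().split(",")
        records.append((species, int(count)))
    return records


def drop_zero_counts(records):
    """Sub-task 2: remove records where nothing was observed."""
    return [(species, count) for species, count in records if count > 0]


def total_by_species(records):
    """Sub-task 3: add up the counts for each species."""
    totals = {}
    for species, count in records:
        totals[species] = totals.get(species, 0) + count
    return totals


def run_analysis(lines):
    """Bring the sub-tasks together to complete the specific task."""
    return total_by_species(drop_zero_counts(parse_counts(lines)))
```

Each sub-task can be tested and reused on its own, and a change to, say, the parsing rules only has to be made in `parse_counts`.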

“ALWAYS comment your code! Do this from the beginning of the process. Think of your coworkers who will use the code later.” – Karlo Gregório Guidoni Martins

“Keep track of the versions of libraries you used. Make note of this in your comments.” – April Wright
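For Python users, a rough sketch of recording versions with the standard library is shown below (in R, `sessionInfo()` plays a similar role). The `record_versions` helper is a hypothetical name, and the packages you pass to it are whatever your own analysis imports.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def record_versions(packages):
    """Return the Python version plus the installed version of each named package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            # Flag missing packages explicitly instead of failing.
            versions[name] = "not installed"
    return versions
```

The resulting dictionary can be printed at the top of a log or pasted into a comment, so the versions travel with the code.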

“Keep lines of code fairly short (say <60 characters), and let RStudio indent your code for you.” – Ben Marwick

“Use logical variable names, even if they are a bit longer.” – Phil Wilkes, UCL

“Document your functions using roxygen2 tags, even if you are not building your code into a package. @example is incredibly useful to show how the function should be used.” – Tim Lucas
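roxygen2 is specific to R, but the habit translates to other languages. As a loose Python analogue (an assumption on our part, not something the contributor suggested), a docstring can carry a worked example in the same spirit as an @example tag. The `shannon_index` function is a hypothetical illustration.

```python
import math


def shannon_index(counts):
    """Return the Shannon diversity index for a list of species counts.

    Species with a count of zero are ignored.

    Example:
        >>> round(shannon_index([10, 10]), 3)
        0.693
    """
    total = sum(counts)
    proportions = [count / total for count in counts if count > 0]
    return -sum(p * math.log(p) for p in proportions)
```

The example in the docstring shows a reader how the function is meant to be called, and tools such as Python's `doctest` module can check that it still gives the stated result.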

“Write code that makes sense to others, and it will definitely make sense to future you as well.” – Gbadamassi G. O. Dossa

Report Writing

©Isla Myers-Smith

“Markdown is your friend. It is easy to write, and pleasant to read. PDFs are not as useful, as copy-paste often gets corrupted.” – Joseph Brown

“When writing documentation write for a novice user. Assume no prior knowledge and provide worked examples (e.g. vignettes in R) to demonstrate how to implement analyses using your code.” – Samantha Price

“Always record the software your project depends upon (including package versions). Think about ways to ensure these dependencies will be available in the future or on a different computer.” – Francisco Rodríguez-Sánchez

“Make examples simple. You are not showing off how wonderful you are at writing code.” – Bob O’Hara

“RMarkdown parameterized reports are a thing now. Use them!” – Brian O’Meara

“Always download packages with dependencies. If you are using RStudio, it is just a matter of one click.” – Gbadamassi G. O. Dossa

“Keep it simple. And if you’re using python, use conda.” – Phil Wilkes

“Use packrat to download the used packages and use checkpoint for package versions.” – Karlo Gregório Guidoni Martins

“Use packrat/MRAN/docker to control the versions of packages that you’re using.” – Ben Marwick

Version Control

©David J. Bird

“We all do some kind of version control. But iteratively changing file names for versioning is very inefficient, and quickly leads to excessive proliferation of files with cryptic changes. Let’s embrace much more efficient tools (like git) used and developed by expert programmers.” – Francisco Rodríguez-Sánchez

“Use Git from within RStudio, and use GitHub/GitLab/BitBucket. Commit early and often.” – Ben Marwick

“If you are making changes always consider backward compatibility. Sometimes creating a new function (with a new name) not a new version is more appropriate.” – Samantha Price

“There are several established workflows for collaborating using different version control systems. Using “branches” in Git is a good way to allow multiple people to work on variants of a common codebase without stepping on each other’s toes.” – Hao Ye

“Avoid versioning files with initials and numbers as a suffix.” – Luis Verde, Universidad Austral de Chile

“You can freeze the ability of other users to add code or delete code from your repository. This can be very useful when your collaborators are new to coding.” – April Wright

“If the project is complex it is beneficial to have “master” and “development” branches, merging updates to the master branch only after a code review.” – Joseph Brown

“Use version control in your project folder. This controls the input and output of files and code modifications.” – Karlo Gregório Guidoni Martins

Archiving Code

©Leejiah Dorward

“GitHub and BitBucket are the way to go. Plus they more easily enable community input.” – Joseph Brown

“GitHub is the standard “cloud” repository for code – it gives you the greatest visibility and there are lots of tutorials available. More importantly, there are good tutorials for how to link it with figshare and zenodo to create permanent repositories with DOIs. Without this, it is really difficult for others to cite your code or use it in a reproducible way.” – Hao Ye

“Use a simple and widely-known directory structure, such as an R package.” – Ben Marwick

“Make sure to read and make note of any funder requirements for sharing code and data.” – April Wright

“Annotate code so that a web search could locate particular functions or actions.” – Luis Verde

“Make the final version succinct, so it’s easy to follow the full workflow later.” – Bob O’Hara

Once again thanks to everyone who contributed their tips and tricks for making code reproducible.

The full BES Guide to Reproducible Code, like all of the BES Guides to Better Science, is freely available online.

A limited number of hard copies will be available from the BES stand at Ecology Across Borders (printing sponsored by Methods in Ecology and Evolution).

