Making YOUR Code Reproducible: Tips and Tricks

When we were putting together the British Ecological Society’s Guide to Reproducible Code we asked the community to send us their advice on how to make code reproducible. We got a lot of excellent responses and we tried to fit as many as we could into the Guide. Unfortunately, we ran out of space and there were a few that we couldn’t include.

Luckily, we have a blog where we can post all of those tips and tricks so that you don’t miss out. A massive thanks to everyone who contributed their tips and tricks for making code reproducible – we really appreciate it. Without further ado, here’s the advice that we were sent about making code reproducible that we couldn’t squeeze into the Guide:

Organising Code

©Leejiah Dorward

“Don’t overwrite data files. If data files change, create a new file. At the top of an analysis file define paths to all data files (even if they are not read in until later in the script).” – Tim Lucas, University of Oxford

“Keep one copy of all code files, and keep this copy under revision management.” – April Wright, Iowa State University

“Learn how to write simple functions – they save your ctrl c & v keys from getting worn out.” – Bob O’Hara, NTNU

“For complex figures, it can make sense to pre-compute the items to be plotted as its own intermediate output data structure. The code to do the calculation then only needs to be adjusted if an analysis changes, while the things to be plotted can be reused any number of times while you tweak how the figure looks.” – Hao Ye, UC San Diego
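Several of the tips above can be combined in one short script. The following is only a minimal sketch in Python, not code from the Guide or from the contributors: the file names, the example data and the helper functions (`write_new_file`, `read_rows`, `mean_by_group`) are all hypothetical. It defines every path at the top, refuses to overwrite existing files, wraps repeated steps in small functions, and saves the values to be plotted as an intermediate file.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical project layout: every path is defined once, at the top.
PROJECT_DIR = Path(tempfile.mkdtemp())
RAW_DATA = PROJECT_DIR / "raw_counts.csv"      # never modified once written
PLOT_DATA = PROJECT_DIR / "plot_means_v1.csv"  # intermediate output for the figure


def write_new_file(path, header, rows):
    """Write rows to a new CSV file, refusing to overwrite an existing one."""
    # Mode "x" raises FileExistsError if the file already exists.
    with open(path, "x", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def read_rows(path):
    """Read a CSV file written by write_new_file, skipping the header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)
        return [tuple(row) for row in reader]


def mean_by_group(rows):
    """Return the mean count per site as a sorted list of (site, mean) pairs."""
    totals = {}
    for site, count in rows:
        total, n = totals.get(site, (0.0, 0))
        totals[site] = (total + float(count), n + 1)
    return [(site, total / n) for site, (total, n) in sorted(totals.items())]


# Made-up example data standing in for a real raw data file.
write_new_file(RAW_DATA, ["site", "count"], [("A", 2), ("A", 4), ("B", 10)])

# Analysis step: run once and save the values the figure needs.
write_new_file(PLOT_DATA, ["site", "mean_count"], mean_by_group(read_rows(RAW_DATA)))

# Plotting step: reads only the intermediate file, so the look of the figure
# can be tweaked repeatedly without rerunning the analysis.
plot_values = [(site, float(mean)) for site, mean in read_rows(PLOT_DATA)]
```

If the analysis changes, the new results would go to a new file (for example a hypothetical `plot_means_v2.csv`) rather than replacing the old one.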

A Guide to Reproducible Code in Ecology and Evolution

Post provided by Natalie Cooper and Pen-Yuan Hsing

Cover image by David J. Bird

The way we do science is changing — data are getting bigger, analyses are getting more complex, and governments, funding agencies and the scientific method itself demand more transparency and accountability in research. One way to deal with these changes is to make our research more reproducible, especially our code.

Although most of us now write code to perform our analyses, it’s often not very reproducible. We’ve all come back to a piece of work we haven’t looked at for a while and had no idea what our code was doing or which of the many “final_analysis” scripts truly was the final analysis! Unfortunately, the number of tools for reproducibility and all the jargon can leave new users feeling overwhelmed, with no idea how to start making their code more reproducible. So, we’ve put together the Guide to Reproducible Code in Ecology and Evolution to help.

How Can We Quantify the Strength of Migratory Connectivity?

Technological advancements in the past 20 years or so have spurred rapid growth in the study of migratory connectivity (the linkage of individuals and populations between seasons of the annual cycle). A new article in Methods in Ecology and Evolution provides methods to help make quantitative comparisons of migratory connectivity across studies, data types, and taxa to better understand the causes and consequences of the seasonal distributions of populations.

In a new video, Emily Cohen, Jeffrey Hostetler and Michael Hallworth explain what migratory connectivity is and how the methods in their new article – ‘Quantifying the strength of migratory connectivity’ – can help you to study it. They also introduce and give a quick tutorial on their new R package MigConnectivity.

This video is based on the article ‘Quantifying the strength of migratory connectivity’ by Cohen et al.

Improved and Harmless Demethylation Method for Ecological Epigenetic Experiments

In a new Methods in Ecology and Evolution video, Javier Puy outlines a new method of experimental plant DNA demethylation for ecological epigenetic experiments. While the traditionally-used approach causes underdeveloped root systems and high mortality of treated plants, this new one overcomes the unwanted effects while maintaining the demethylation efficiency. The authors demonstrate its application for ecological epigenetic experiments: testing transgenerational effects of plant–plant competition.

This novel method could be better suited for experimental studies seeking valuable insights into ecological epigenetics. As it’s based on periodical spraying of azacytidine on established plants, it’s suitable for clonal species reproducing asexually, and it opens the possibility of community-level experimental demethylation of plants.

This video is based on the article ‘Improved demethylation in ecological epigenetic experiments: Testing a simple and harmless foliar demethylation application’ by Puy et al.

Solving YOUR Ecology Challenges with R: Ecology Hackathon in Ghent

©2016 The R Foundation

Scientific software is an increasingly important part of scientific research, and ecologists have been at the forefront of developing open source tools for ecological research. Much of this software is distributed via R packages – there are over 200 R packages for ecology and evolution on CRAN alone. Methods regularly publishes Application articles introducing R packages (and other software) that enable ecological research, and we’re always looking for new ways to enable even more and better ecological software.

This December, we will be teaming up with rOpenSci and special interest groups from BES, GfÖ and NecoV to hold our first Ecology Hackathon at the Ecology Across Borders conference in Ghent. The hackathon will be held as a one-day pre-conference workshop on Monday 11th December. Together, the attendees will identify some challenges for ecological research, and team up to build R packages that help solve them.

We’ve started compiling potential topics for new R packages in a collaborative document, but we need more. Are you having any difficulties in your research that could be solved with an R package? Is there a package that you wish existed but have never been able to find? If so, WE WANT TO HEAR FROM YOU!

Please take a look at our current list of challenges and add your suggestions!

The Power of Infinity: Using 3D Fractal Geometry to Study Irregular Organisms

Post provided by Jessica Reichert, André R. Backes, Patrick Schubert and Thomas Wilke

The Problem with the Shape

More than anything else, the phenotype of an organism determines how it interacts with the environment. It’s subject to natural selection, and may help to unravel the underlying evolutionary processes. So shape traits are key elements in many ecological and biological studies.

The growth form of corals is highly variable. ©Jessica Reichert

Commonly, basic parameters like distances, areas, angles, or derived ratios are used to describe and compare the shapes of organisms. These parameters usually work well in organisms with a regular body plan. The shape of irregular organisms – such as many plants, fungi, sponges or corals – is mainly determined by environmental factors and often lacks the distinct landmarks needed for traditional morphometric methods. Applying these methods to such organisms is therefore problematic, and shapes are more often categorised than actually measured.

As scientists though, we favour independent statistical analyses, and there’s an urgent need for reliable shape characterisation based on numerical approaches. So, scientists often determine complexity parameters such as surface/volume ratios, rugosity, or the level of branching. However, these parameters all share the same drawback: each is reduced to a single number that draws on only one or a few spatial scales, so essential information is lost.

Issue 8.10

Issue 8.10 is now online!

The October issue of Methods is now online!

This double-sized issue contains three Applications articles and two Open Access articles. These five papers are freely available to everyone, no subscription required.

 Phylogenetic Trees: The fields of phylogenetic tree and network inference have advanced independently, with only a few attempts to bridge them. Schliep et al. provide a framework, implemented in R, to transfer information between trees and networks.

 Emon: Studies, surveys and monitoring are often costly, so small investments in preliminary data collection and systematic planning of these activities can help to make best use of resources. To meet recognised needs for accessible tools to plan some aspects of studies, surveys and monitoring, Barry et al. developed the R package emon, which includes routines for study design through power analysis and feature detection.

 Haplostrips: A tool to visualise polymorphisms of a given region of the genome in the form of independently clustered and sorted haplotypes. Haplostrips is a command-line tool written in Python and R that uses variant call format files as input and generates a heatmap view.

Issue 8.8

Issue 8.8 is now online!

The August issue of Methods is now online!

This issue contains two Applications articles and two Open Access articles. These four papers are freely available to everyone, no subscription required.

 Paco: An R package that assesses the phylogenetic congruence, or evolutionary dependence, of two groups of interacting species using both ecological interaction networks and their phylogenetic history.

 Open MEE: Open Meta-analyst for Ecology and Evolution (Open MEE) addresses the need for advanced, easy-to-use software for meta-analysis and meta-regression. It offers a suite of advanced meta-analysis and meta-regression methods for synthesizing continuous and categorical data, including meta-regression with multiple covariates and their interactions, phylogenetic analyses, and simple missing data imputation.

Conditional Occupancy Design Explained

Occupancy surveys are widely used in ecology to study wildlife and plant habitat use. To account for imperfect detection probability, many researchers use occupancy models. But occupancy probability estimates for rare species tend to be biased because we’re unlikely to observe the animals at all and, as a result, the data aren’t very informative.
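To see why imperfect detection matters, a little arithmetic helps. The sketch below is not taken from the article: it assumes the textbook setup of independent repeat visits with a constant detection probability, and the values of `psi`, `p` and `n_visits` are invented for illustration. It computes the share of sites where the species would be recorded at least once, which is what a naive estimate that ignores detection would report.

```python
def prob_detected_at_least_once(psi, p, n_visits):
    """Probability that a site is occupied AND the species is detected
    on at least one of n_visits independent visits."""
    return psi * (1.0 - (1.0 - p) ** n_visits)


# Hypothetical values chosen only for illustration.
psi = 0.10      # true occupancy probability of a rare species
p = 0.30        # detection probability per visit at an occupied site
n_visits = 3    # repeat visits per site

naive_occupancy = prob_detected_at_least_once(psi, p, n_visits)
```

Under these assumed values the species would be recorded at only about 6.6% of sites, even though it occupies 10% of them. With so few sites yielding any detections, there is little information from which a model can separate occupancy from detection.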

In their new article – ‘Occupancy surveys with conditional replicates: An alternative sampling design for rare species’ – Specht et al. developed a new “conditional” occupancy survey design to improve occupancy estimates for rare species. They also compare it to standard and removal occupancy study designs. In this video two of the authors, Hannah Specht and Henry Reich, explain how their new conditional occupancy survey design works.

This video is based on the article ‘Occupancy surveys with conditional replicates: An alternative sampling design for rare species’ by Specht et al.