Code-Based Methods and the Problem of Accessibility

Post provided by Jamie M. Kass, Matthew E. Aiello-Lammens, Bruno Vilela, Robert Muscarella, Cory Merow and Robert P. Anderson

The namesake of our software and founder of the field of biogeography, Alfred Russel Wallace. Photo ©G. W. Beccaloni

In ecology, new methods are increasingly being accompanied by code, and sometimes even full command-line software packages (usually in R). This is great, as it makes analyses more reproducible and transparent, which is essential for the development of open science. In an ideal world, code would have informative annotation, generalized functions for multipurpose use, and be written in a legible and consistent manner. After all, the code may be used by ecologists with a wide range of programming experience.

In reality, code is often poorly commented (or not commented at all!), hard to reuse for other projects, and difficult to interpret. To add to that, most code isn’t actively maintained, so users are on their own if they try to commandeer it for new purposes. Further, ecologists with little or no programming knowledge are unlikely to benefit from methods that exist only as poorly documented code. In a positive development, some new methods are accessible through software with graphical user interfaces (GUIs) developed by programmers spending significant time and effort. But too often these end up as tools with flashy controls and insufficient instruction manuals.


Issue 9.2

Issue 9.2 is now online!

The February issue of Methods is now online!

This double-size issue contains six Applications articles (one of which is Open Access) and two Open Access research articles. These eight papers are freely available to everyone, no subscription required.

 Temperature Manipulation: Welshofer et al. present a modified International Tundra Experiment (ITEX) chamber design for year-round outdoor use in warming taller-stature plant communities up to 1.5 m tall. This design is a valuable tool for examining the effects of in situ warming on understudied taller-stature plant communities.

 Zoon: The disjointed nature of the current species distribution modelling (SDM) research environment hinders evaluation of new methods, synthesis of current knowledge and the dissemination of new methods to SDM users. The zoon R package aims to overcome these problems by providing a modular framework for constructing reproducible SDM workflows.

 BIEN R Package: The Botanical Information and Ecology Network (BIEN) database comprises an unprecedented wealth of cleaned and standardised botanical data. The BIEN R package allows users to access the multiple types of data in the BIEN database. This represents a significant achievement in biological data integration, cleaning and standardisation.

Policy on Publishing Code Virtual Issue

In January 2018, Methods in Ecology and Evolution launched a Policy on Publishing Code. The main objective of this policy is to make sure that high quality code is readily available to our readers. The policy sets out four key principles to help achieve this. It also explains what code outputs we publish, gives some examples of things that make it easier to review code, and offers advice on how to store code once it’s been published.

To help people to understand how to meet the guidelines and principles of the new policy, a group of our Applications Associate Editors (Nick Golding, Sarah Goslee, Tim Poisot and Samantha Price) have put together a Virtual Issue of Applications articles published over the past couple of years that have followed at least one aspect of the guidelines particularly well.

Ending the Terror of R Errors

Post provided by Paul Mensink

Last year, I introduced R to petrified first-year biology students in a set of tutorials. I quickly realised that students were getting bogged down in error messages (even on very simple tasks), so most of my time was spent jumping between students like a wayward Markov chain. I would often find a desperate face at the end of a raised hand looking hopelessly towards their R console muttering some version of “What the $%# does this mean?”. I instantly morphed from teacher to translator and our class progress was slower than a for-loop caught in the second Circle.

Error messages are often not very helpful

Fast forward to Ecology Across Borders last December in Ghent, where rOpenSci and special interest groups from the BES, GfÖ and NecoV, together with Methods in Ecology and Evolution, co-hosted a pre-conference R hackathon. I was elated to see that one of the challenges was focused on translating R error messages into “Plain English” (thanks to @DanMcGlinn for the original suggestion!).

The BES Quantitative Ecology SIG: Who We Are, What We Do and What to Look Out for at #EAB2017

Post provided by Susan Jarvis and Laura Graham

Ecologists are increasingly in need of quantitative skills and the British Ecological Society Quantitative Ecology Special Interest Group (QE SIG) aims to support skills development, sharing of good practice and highlighting novel methods development within quantitative ecology. We run events throughout the year, as well as contributing to the Annual Meeting and providing a mailing list to share events, jobs and quantitative news.

Ecology Hackathon

The run-up to the Ecology Across Borders joint Annual Meeting in Ghent this month is an exciting time for the SIG as we look forward to catching up with existing members as well as hopefully meeting some new recruits! Several of our SIG committee members will be in attendance and if you’ve been lucky enough to get a place at the Hackathon on the Monday you’ll meet most of us there. The Hackathon has been jointly developed by us and two of our allied groups: the GfÖ Computational Ecology Working Group and the NecoV Ecological Informatics SIG. It is being sponsored by Methods in Ecology and Evolution. We’ll be challenging participants to work together to produce R packages suggested by the ecological community. You can see the list of package suggestions here. If you weren’t able to book a place at the Hackathon, but are interested in writing your own packages, you may be interested in the new Guide to Reproducible Code from the BES.

Making YOUR Code Reproducible: Tips and Tricks

When we were putting together the British Ecological Society’s Guide to Reproducible Code we asked the community to send us their advice on how to make code reproducible. We got a lot of excellent responses and we tried to fit as many as we could into the Guide. Unfortunately, we ran out of space and there were a few that we couldn’t include.

Luckily, we have a blog where we can post all of those tips and tricks so that you don’t miss out. A massive thanks to everyone who contributed their tips and tricks for making code reproducible – we really appreciate it. Without further ado, here’s the advice that we were sent about making code reproducible that we couldn’t squeeze into the Guide:

Organising Code

©Leejiah Dorward

“Don’t overwrite data files. If data files change, create a new file. At the top of an analysis file define paths to all data files (even if they are not read in until later in the script).” – Tim Lucas, University of Oxford

“Keep one copy of all code files, and keep this copy under revision management.” – April Wright, Iowa State University

“Learn how to write simple functions – they save your ctrl c & v keys from getting worn out.” – Bob O’Hara, NTNU

“For complex figures, it can make sense to pre-compute the items to be plotted as their own intermediate output data structure. The code to do the calculation then only needs to be adjusted if an analysis changes, while the things to be plotted can be reused any number of times while you tweak how the figure looks.” – Hao Ye, UC San Diego
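Several of these tips can be combined in one short script. The sketch below is only an illustration, written in Python although the same pattern applies in R. The file names, column names and the `summarise_counts` and `load_plot_data` helpers are all hypothetical. Every path is defined at the top, the raw data file is only ever read, and the values to be plotted are saved to a separate intermediate file.

```python
import csv
from collections import defaultdict
from pathlib import Path

# All paths are defined up front, even those used much later (hypothetical names).
DATA_DIR = Path("data")
RAW_COUNTS = DATA_DIR / "raw_counts.csv"          # read-only input, never overwritten
PLOT_DATA = DATA_DIR / "site_means_for_plot.csv"  # derived file, safe to regenerate


def summarise_counts(raw_path, out_path):
    """Compute the mean count per site and save it as an intermediate file."""
    totals = defaultdict(float)
    n_rows = defaultdict(int)
    with open(raw_path, newline="") as handle:
        for row in csv.DictReader(handle):
            totals[row["site"]] += float(row["count"])
            n_rows[row["site"]] += 1

    means = {site: totals[site] / n_rows[site] for site in sorted(totals)}

    # Write to a new file rather than modifying the raw data.
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["site", "mean_count"])
        for site, mean in means.items():
            writer.writerow([site, mean])
    return means


def load_plot_data(path):
    """Read the pre-computed values; plotting code can call this repeatedly."""
    with open(path, newline="") as handle:
        return {row["site"]: float(row["mean_count"]) for row in csv.DictReader(handle)}


if __name__ == "__main__":
    # Only runs if the hypothetical raw data file is present.
    if RAW_COUNTS.exists():
        summarise_counts(RAW_COUNTS, PLOT_DATA)
        print(load_plot_data(PLOT_DATA))
```

With this split, `summarise_counts` only needs to be rerun when the analysis changes, while figure code can call `load_plot_data` as often as needed while the look of the plot is adjusted.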

A Guide to Reproducible Code in Ecology and Evolution

Post provided by Natalie Cooper and Pen-Yuan Hsing

Cover image by David J. Bird

The way we do science is changing — data are getting bigger, analyses are getting more complex, and governments, funding agencies and the scientific method itself demand more transparency and accountability in research. One way to deal with these changes is to make our research more reproducible, especially our code.

Although most of us now write code to perform our analyses, it’s often not very reproducible. We’ve all come back to a piece of work we haven’t looked at for a while and had no idea what our code was doing or which of the many “final_analysis” scripts truly was the final analysis! Unfortunately, the number of tools for reproducibility and all the jargon can leave new users feeling overwhelmed, with no idea how to start making their code more reproducible. So, we’ve put together the Guide to Reproducible Code in Ecology and Evolution to help.

Solving YOUR Ecology Challenges with R: Ecology Hackathon in Ghent

©2016 The R Foundation

Scientific software is an increasingly important part of scientific research, and ecologists have been at the forefront of developing open source tools for ecological research. Much of this software is distributed via R packages – there are over 200 R packages for ecology and evolution on CRAN alone. Methods regularly publishes Applications articles introducing R packages (and other software) that enable ecological research, and we’re always looking for new ways to enable even more and better ecological software.

This December, we will be teaming up with rOpenSci and special interest groups from BES, GfÖ and NecoV to hold our first Ecology Hackathon at the Ecology Across Borders conference in Ghent. The hackathon will be held as a one-day pre-conference workshop on Monday 11th December. Together, the attendees will identify some challenges for ecological research, and team up to build R packages that help solve them.

We’ve started compiling potential topics for new R packages in a collaborative document, but we need more. Are you having any difficulties in your research that could be solved with an R package? Is there a package that you wish existed but have never been able to find? If so, WE WANT TO HEAR FROM YOU!

Please take a look at our current list of challenges and add your suggestions!

Editor Recommendation – HistMapR: Rapid Digitization of Historical Land-Use Maps in R

Post provided by Sarah Goslee

For an ecologist interested in long-term dynamics, one of the most thrilling experiences is discovering a legacy dataset stashed away somewhere.

For an ecologist interested in long-term dynamics, one of the most daunting experiences is figuring how to turn that box full of paper into usable data.

The new tool HistMapR, described in ‘HistMapR: Rapid digitization of historical land-use maps in R’ by Alistair Auffret and colleagues, makes one part of that task much easier.

Examples of input (©Lantmäteriet) and output maps from (a–b) the District Economic map and (c–d) the Economic map.

Historical maps with coloured areas denoting different land cover or use are a valuable record, but difficult to analyse. This R package automates much of the time-consuming and tedious process of turning paper maps into classified categorical raster maps.

A map is scanned, imported into R, and the software is trained by clicking in different areas of each category. It then automatically classifies pixels based on which colour they are most similar to. The resulting classification is assessed manually. The process can be repeated with slightly different parameters until a good fit is achieved.
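The idea of classifying pixels by their most similar colour can be sketched in a few lines. This is not HistMapR's code or API. It is a minimal Python illustration that assumes similarity means Euclidean distance in RGB space to the mean colour of each category's training pixels; the function names, category names and colour values are all hypothetical.

```python
import math


def mean_colour(samples):
    """Average a list of (r, g, b) training pixels into one reference colour."""
    n = len(samples)
    return tuple(sum(pixel[i] for pixel in samples) / n for i in range(3))


def classify_pixel(pixel, references):
    """Return the category whose reference colour is closest in RGB space."""
    return min(references, key=lambda name: math.dist(pixel, references[name]))


def classify_image(image, training):
    """Classify every pixel of a 2-D grid of (r, g, b) tuples."""
    references = {name: mean_colour(samples) for name, samples in training.items()}
    return [[classify_pixel(pixel, references) for pixel in row] for row in image]


# Hypothetical training clicks: a few pixels sampled from each map category.
training = {
    "forest": [(30, 120, 40), (40, 130, 50)],
    "arable": [(230, 210, 90), (240, 220, 100)],
}

# A tiny 2 x 2 stand-in for a scanned map.
image = [
    [(35, 125, 45), (235, 215, 95)],
    [(50, 140, 60), (220, 200, 110)],
]

classified = classify_image(image, training)
print(classified)
```

In a sketch like this, repeating the process "with slightly different parameters" would mean changing the training pixels or the distance measure and classifying again.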

The authors found 80-90% agreement between HistMapR classification and manual digitisation (sources of error included clarity of original maps and scan quality). Using HistMapR reduced the time needed for digitising a series of historical land cover maps from two months to two days. Ecologists interested in long-term dynamics should be cheering!

The HistMapR package is available on GitHub and you can find example scripts on Figshare, so you can get right to work.

‘HistMapR: Rapid digitization of historical land-use maps in R’ by Auffret et al. is a freely available Applications article (no subscription required).

Building Universal PCR Primers for Aquatic Ecosystem Assessments

Post provided by Vasco Elbrecht

Many things can negatively affect stream ecosystems – water abstraction, eutrophication and fine sediment influx are just a few. However, only intact freshwater ecosystems can sustainably deliver the ecosystem services – such as particle filtration, food biomass production and the supply of drinking water – that we rely on. Because of this, stream management and restoration have often been the focus of environmental legislation worldwide. Macrozoobenthic communities are often key biological components of stream ecosystems. As many taxa within these communities are sensitive to stressors introduced by humans, they’re ideal for assessing the quality of water.

Unfortunately, most macrozoobenthic taxa – such as stone-, may-, and caddisflies as well as most other invertebrates – are usually found in juvenile larval life stages in these ecosystems, so they’re often difficult to identify based on morphology. With DNA-based metabarcoding, though, almost all taxa in a stream can be reliably identified to species level using a standardised gene fragment. One key component of this strategy is the development of universal markers, which allow detection of the diverse macrozoobenthic groups.

Our new R package PrimerMiner provides a framework for obtaining sequence data from available reference databases and identifying suitable primer binding sites for marker amplification. The package makes this process quicker and easier. In the following pictures, we summarise the key steps of DNA metabarcoding.

To find out more about PrimerMiner, read our Methods in Ecology and Evolution article ‘PrimerMiner: an r package for development and in silico validation of DNA metabarcoding primers’. Like all Applications articles, this paper is freely available to everyone.