Post provided by Carola Gómez-Rodríguez & Alfried P. Vogler
Time flies… in the blink of an eye! And even more so in science. The molecular lab work we were used to two decades ago seems like ancient history to today’s PhD students. Sequencing technology is changing so fast that it is hard to imagine what our daily work will look like in 10 years’ time. But in the field of biodiversity assessment, we have very good clues. Next-generation sequencing (NGS) is quickly becoming our workhorse for ambitious projects of species and genetic inventories.
One by One Approach to Studying Biodiversity
For decades, most initiatives measured biodiversity in the same way: collect a sample of many individuals in the field, sort the specimens, identify them to a Linnaean species one at a time (if there was a good taxonomist in the group, which unfortunately takes some luck these days!), and count them. Or, if identification was based on molecular data, each specimen was subjected to DNA extraction so that one (or several) short DNA markers could be sequenced. This involved countless hours of work that could be saved if, instead of inventorying biodiversity specimen-by-specimen, we followed a sample-by-sample approach. To do this now, we just have to make a “biodiversity soup”.
But, what’s a biodiversity soup? Basically, all specimens in the sample are kept together to allow a pooled DNA extraction. You can focus on a short portion of the genome and amplify it with universal primers (as in the metabarcoding approach, e.g. Yu et al. 2012) or you can directly shotgun sequence their mixed genomes after they have been chopped into pieces of no more than 300 bp. The result: a crazy puzzle of millions of short DNA reads of multiple specimens and species all mixed together.
Amazingly though, standard genome assemblers are able to string together these reads and reconstruct (at least partially) the mitochondrial genomes of a large number of the species present in your sample. This ‘genome skimming’ approach, called mitochondrial metagenomics (MMG; Zhou et al. 2013, Crampton-Platt et al. 2015), favours the assembly of mitochondrial genomes over most nuclear DNA regions because of their high copy numbers and lack of allelic variation.
Validating Mitochondrial Metagenomics
Our recent papers in Methods in Ecology and Evolution – ‘Validating the power of mitochondrial metagenomics for community ecology and phylogenetics of complex assemblages’ and ‘Intraspecific genetic variation in complex assemblages from mitochondrial metagenomics: comparison with DNA barcodes’ – have successfully validated this approach under a challenging scenario: highly diverse samples containing on average 250 specimens of some 45 species, all from a single beetle family, the leaf beetles (Chrysomelidae).
First, the assembler software generally produced one mitogenome assembly per species: this is a critical feature that allows us to conduct diversity assessment at the species level. Notably, different species in the mixture usually each produced a separate mitogenome, while closely related individuals (i.e. different haplotypes of the same species) were almost always subsumed into a single sequence. In most cases, this assembled mitogenome matched an actual haplotype present in the population; where it did not, the differences resulted from chimera formation among very similar haplotypes and were small enough to disregard. Next, the proportion of reads assigned to each species was correlated with that species’ biomass. This allowed us to estimate relative abundance, an assemblage property at the core of many ecological theories (McGill et al. 2007).
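To make the read-proportion idea concrete, here is a minimal sketch in Python. It assumes you already have, for one bulk sample, a count of reads assigned to each species' mitogenome. The function name and the counts are hypothetical and are not taken from our papers. Bear in mind that the result is a proxy for relative biomass, not a count of individuals.

```python
from typing import Dict


def relative_read_abundance(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Return each species' share of all mitochondrial reads in one sample."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any species")
    return {species: n / total for species, n in read_counts.items()}


# Hypothetical read counts for a single 'soup' (illustration only).
counts = {"species_A": 6000, "species_B": 3000, "species_C": 1000}
proportions = relative_read_abundance(counts)

for species, share in sorted(proportions.items()):
    print(f"{species}: {share:.2f}")
```

The proportions sum to one within a sample, so they can be compared across samples that were sequenced to different depths.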
The resulting mitochondrial genomes were strong phylogenetic markers that defined the evolutionary context for each species in the ‘soup’. Once we had these, we could study species richness patterns, such as the differentiation among sites at species (beta diversity) and clade (phylogenetic beta diversity) levels. Understanding community differentiation is a major step towards understanding the distribution of biodiversity. Finally, and here it becomes trickier, genetic variation for each species can be extracted, but more on this below.
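For readers less familiar with beta diversity, the sketch below computes a pairwise dissimilarity between two sites from their species lists. It uses the Sørensen index as one common choice; this is an assumption made for illustration rather than a statement of the exact index used in our analyses, and the site contents are hypothetical.

```python
from typing import Set


def sorensen_dissimilarity(site_a: Set[str], site_b: Set[str]) -> float:
    """Sørensen dissimilarity: 0 for identical species lists, 1 for no overlap."""
    shared = len(site_a & site_b)
    unique_a = len(site_a - site_b)
    unique_b = len(site_b - site_a)
    denominator = 2 * shared + unique_a + unique_b
    if denominator == 0:
        raise ValueError("both sites are empty")
    return (unique_a + unique_b) / denominator


# Hypothetical species lists recovered from two bulk samples.
site_1 = {"sp1", "sp2", "sp3"}
site_2 = {"sp2", "sp3", "sp4"}
beta = sorensen_dissimilarity(site_1, site_2)
print(f"Sørensen dissimilarity: {beta:.3f}")
```

A phylogenetic version would replace counts of shared and unique species with shared and unique branch lengths on the mitogenome tree, which is where the assembled mitogenomes become useful.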
Back to the Future of Biodiversity Assessment
So, are we already ‘in the future’? The paradigm shift brought about by next-generation biodiversity assessment is the focus on the community, instead of the individual, as mass-trapped specimens are subjected to genetic analysis without being pre-selected. This not only changes the workflow, but also the questions we may want to ask about biodiversity. For example, a problem of conventional single-species studies was that each one examined a single, unique evolutionary history. Biodiversity is fuzzy, and only the study of many such histories will reveal the broader trends, such as the correlation of species richness with climate, the drivers of community assembly, or the evolutionary origin of local and regional species assemblages.
Similarly, comparative phylogeography seeks to infer palaeoclimatic events from common distributions of multiple species. Novel ecological patterns can also be unveiled if biodiversity is assessed at multiple hierarchical levels of genotypes, genealogies and species, as each of them would represent different scales of evolutionary time and geographical area (‘multihierarchical macroecology’; Baselga et al. 2013, Baselga et al. 2015). Having these goals in mind, a sample-by-sample approach through metagenomics can be a game changer for the study of biodiversity.
So far, virtually all of the promises of the mitochondrial metagenomics approach for studying biodiversity patterns have been upheld. The last major challenge was the analysis of genetic variation, which is the focus of our most recent paper, ‘Intraspecific genetic variation in complex assemblages from mitochondrial metagenomics: comparison with DNA barcodes’. As discussed above, the assembly-based approach, which aims to reconstruct the mitogenomes, precludes the assessment of intraspecific variation because the assembly itself erases the variation among very similar reads. Instead, we used a read-based approach, in which reads are mapped against a reference library (cox1-5’ barcodes in our case) and genetic diversity is assessed from single nucleotide polymorphisms (SNPs). Because each sample we analyse corresponds to a local assemblage, measuring intraspecific diversity for all local species seems to be within reach.
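As an illustration of the read-based idea, the sketch below estimates nucleotide diversity for one species from the allele counts observed at each position of its reference barcode. It is a simplified, textbook-style estimator under stated assumptions, not our actual pipeline: the input format (one dictionary of base counts per site) and the function names are hypothetical, and no filtering for sequencing errors is applied.

```python
from typing import Dict, List


def site_heterozygosity(allele_counts: Dict[str, int]) -> float:
    """Unbiased heterozygosity at one site from read counts per base."""
    n = sum(allele_counts.values())
    if n < 2:
        return 0.0  # too few reads to observe any difference
    sum_squared_freqs = sum((c / n) ** 2 for c in allele_counts.values())
    return (n / (n - 1)) * (1.0 - sum_squared_freqs)


def nucleotide_diversity(pileup: List[Dict[str, int]]) -> float:
    """Average per-site heterozygosity across all reference positions."""
    if not pileup:
        raise ValueError("pileup is empty")
    return sum(site_heterozygosity(site) for site in pileup) / len(pileup)


# Hypothetical pileup for a 4-bp stretch: one SNP (A/G) at the second site.
pileup = [
    {"A": 10},
    {"A": 5, "G": 5},
    {"C": 10},
    {"T": 10},
]
pi = nucleotide_diversity(pileup)
print(f"nucleotide diversity: {pi:.4f}")
```

In a pooled sample the reads are not independent draws from individuals, and sequencing errors can mimic rare variants, so in practice some minimum coverage and allele-frequency thresholds would be needed before trusting such values.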
In our pilot study, nucleotide diversity for each species in the bulk sample was strongly correlated with “traditional” assessments based on Sanger sequencing of single specimens. This allowed reliable estimates of genetic diversity gradients for particular species of leaf beetles in the Iberian Peninsula. For example, Exosoma lusitanicum showed consistently greater genetic diversity, and was present in more locations, than Smaragdina concolor. This suggests that Smaragdina concolor has smaller population sizes (and therefore stronger genetic drift), less migration, or both, consistent with predictions from population genetics.
The community-level analysis of genetic variation offers new exciting opportunities, such as the detection of commonalities in population differentiation across regions or ecosystems, the correlation of genetic diversity with species diversity in communities, or the interactions of genetic diversity among species across the trophic cascade.
We cannot finish without saying how excited we are about the new avenues that NGS offers to biodiversity assessment, even below the species level, as we believe the holistic analysis of community diversity will reveal novel, emergent properties of biodiversity at multiple scales from genotypes to ecosystems. Today we can make biodiversity soups from our field samples and easily assess species occurrence, phylogenetic relationships, abundance and even the genetic diversity of individuals in the sample. Isn’t that what we would ask of the future?
To find out more about mitochondrial metagenomics, read our Methods in Ecology and Evolution articles ‘Validating the power of mitochondrial metagenomics for community ecology and phylogenetics of complex assemblages’ and ‘Intraspecific genetic variation in complex assemblages from mitochondrial metagenomics: comparison with DNA barcodes’.