Phylogenies, Trait Evolution and Fancy Glasses

Post provided by Daniel S. Caetano

Phylogenetic trees represent the evolutionary relationships among different lineages. These trees give us two crucial pieces of information:

  1. the relationships between lineages (which we can tell from the branching pattern, i.e. the topology)
  2. the point when lineages separated from a common ancestor (which we can tell from the length of the branches, when estimated from genetic sequences and fossils).
Phylogeny of insects inferred from genetic sequences showing the time of divergence between ants and bees.

As systematic biologists, we are interested in the evolutionary history of life. We use phylogenetic trees to uncover the past, understand the present, and predict the future of biodiversity on the planet. Among the tools for this thrilling job are the comparative methods, a broad set of statistical tools built to help us understand and interpret the tree of life.

Here’s a Tree, Now Tell Me Something

The comparative methods we use to study the evolution of traits are mainly based on the idea that since species share a common evolutionary history, the traits observed on these lineages will share this same history. In the light of phylogenetics, we can always make a good bet about how a species will look if we know how closely related it is to another species or group. Comparative models aim to quantify the likelihood of our bet being right and use the same principle to estimate how fast evolutionary changes accumulate over time.

Simply put, we try to work out the difference in trait values between species, taking the time since their divergence into account. We’re looking for the amount of divergence accumulated between two lineages after evolving independently for a given time interval. Rates of evolution should be higher in lineages that become more distinct than others in the same amount of time.
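
As a rough illustration of this idea, here is a minimal Python sketch for a single pair of sister lineages. It assumes a Brownian motion model, under which each lineage accumulates variance in proportion to time, so the expected squared difference between the two is twice the rate multiplied by the time since divergence. The function name and the trait values are hypothetical, chosen only for this example.

```python
def bm_rate_from_pair(trait_a: float, trait_b: float, divergence_time: float) -> float:
    """Estimate a Brownian motion rate from two sister lineages.

    Under Brownian motion each lineage accumulates variance rate * t,
    so the difference between the two has expected square 2 * rate * t.
    """
    if divergence_time <= 0:
        raise ValueError("divergence_time must be positive")
    return (trait_a - trait_b) ** 2 / (2.0 * divergence_time)


# Hypothetical values: both pairs split 2 time units ago.
slow_pair = bm_rate_from_pair(1.0, 3.0, 2.0)  # trait difference of 2
fast_pair = bm_rate_from_pair(1.0, 7.0, 2.0)  # trait difference of 6
print(slow_pair, fast_pair)
```

The second pair diverged more in the same amount of time, so its estimated rate is higher. A single pair gives a very noisy estimate; real analyses combine information from the whole tree.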

This basic concept has led to the development of many phylogenetic comparative models of trait evolution to deal with a variety of questions. In the last decade, the number of such models has expanded rapidly. For example, we now have comparative models for discrete traits, continuous traits, and the relationship between traits and lineage diversification, among others. Most of these methods were designed to deal with a single trait fitted to the phylogenetic tree. But it’s hard to represent the morphology of any species with a single trait and, importantly, evolutionary correlation among traits is common and can arise from different processes such as correlated selection gradients, growth constraints or shared genetics (i.e. pleiotropy).

Multivariate Models of Trait Evolution

The main focus of multivariate models of trait evolution is to accommodate several traits in the same analysis and allow researchers to study the evolutionary interactions among these traits throughout the branches of the phylogeny. There are a lot of multivariate models currently available and, like the majority of tools for comparative methods, they are implemented in the R statistical environment. Revell and Harmon (2008) described a model for multivariate Brownian motion, Bartoszek and colleagues (2012) described a method for fitting multivariate Ornstein–Uhlenbeck models, and Clavel et al. (2015) made a comprehensive package to estimate a plethora of multivariate models of trait evolution.

Luke Harmon and I also recently published our own multivariate implementation in a paper that appears in Methods in Ecology and Evolution. Unlike previous efforts, our method uses Bayesian statistics as a means to redirect the focus of the analyses to estimating the parameters of the models (rather than choosing between different models) and to naturally incorporate the uncertainty around such estimates.

The common framework among these models is that we can use a variance-covariance matrix to estimate the rates of trait evolution through time, following a phylogenetic tree. Variance is a measure of the overall spread of data. As a rate, we can use variance to measure how fast the data (i.e. the species traits) spread over time, which allows us to translate evolution (i.e. changes over time) into statistics. Covariance describes the joint variation of two elements. We use the covariance between each pair of traits as a measure of how tightly linked their evolution is (their evolutionary correlation). Different strengths of evolutionary correlation among traits can produce very different outcomes of trait evolution.
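
To make the role of the matrix concrete, here is a minimal Python sketch that draws trait changes for two traits under multivariate Brownian motion. It is a simplification under stated assumptions: the lineages are treated as independent (there is no tree), the two rate matrices are illustrative values I picked for this example, and the function names are hypothetical. The diagonal of each matrix holds the rates (variances) and the off-diagonal holds the covariance.

```python
import math
import random


def simulate_bm_changes(rate_matrix, elapsed_time, n_lineages, seed=1):
    """Draw two-trait changes accumulated under multivariate Brownian motion.

    Each draw is bivariate normal with mean zero and covariance
    elapsed_time * rate_matrix (independent lineages, no tree).
    """
    (var_x, cov_xy), (_, var_y) = rate_matrix
    # Cholesky factor of the 2 x 2 rate matrix, written out by hand.
    l11 = math.sqrt(var_x)
    l21 = cov_xy / l11
    l22 = math.sqrt(var_y - l21 ** 2)
    rng = random.Random(seed)
    scale = math.sqrt(elapsed_time)
    changes = []
    for _ in range(n_lineages):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        changes.append((scale * l11 * z1, scale * (l21 * z1 + l22 * z2)))
    return changes


def correlation(pairs):
    """Pearson correlation between the two columns of a list of pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Illustrative rate matrices: same variances, different covariance.
uncorrelated = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.9], [0.9, 1.0]]
r_uncorrelated = correlation(simulate_bm_changes(uncorrelated, 1.0, 5000))
r_correlated = correlation(simulate_bm_changes(correlated, 1.0, 5000))
print(round(r_uncorrelated, 2), round(r_correlated, 2))
```

With the same variances on the diagonal, changing only the covariance moves the simulated traits from an unstructured cloud to a tight diagonal band.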

Two traits simulated to evolve following the phylogenetic tree on the left. Simulation 1 (top) shows results with no correlation between the traits across the whole tree. Simulation 2 (bottom) shows the result of strong positive evolutionary correlation between the traits on the red clade (shift in the mode). Matrices show the corresponding variances (on the diagonal) and covariances (on the off-diagonals) used in the simulations. ©DS Caetano

Estimating Parameters is Just like Going to Your Optometrist

A model is a simplification of what happens in nature. Often there’s a weak link between the rules that dictate a model and the processes that generated the data. A model can still help us predict and interpret the patterns we observe though. This is especially relevant to comparative models, since we’re interested in the evolution of traits on the scale of thousands to millions of years, when we often only have information about topology, divergence times and trait data for living species. We estimate the parameters of a model to find the values that make the model explain the variation in the observed data as well as possible. I like to think of a model as the lenses in a pair of glasses used to view the world. A good fit to the data results in clean, perfect vision. With a bad fit though, things can get rather blurry.

In fact, fitting the parameters of a model to observed data is very similar to a first visit to an optometrist. If you wear glasses (or even just contact lenses) you’ll be familiar with the following scene: You’re trying to read letters on a wall while the optometrist tests multiple combinations of lenses and asks you which is the best. Most lenses aren’t going to help. Some will be good, perhaps a few are very good. At some point, it becomes difficult to distinguish among the top three or four pairs of lenses because they all seem to have a similar effect. In this example, ‘the different lenses’ can be interpreted as different combinations of parameter values of a model and ‘how clearly you can see the letters’ is similar to how well the model fits the data.

Representation of fitting a model to the data. Each piece that the optometrist needs to adjust is a parameter of the model. Looks complicated, right?

This analogy also applies to how our ‘ratematrix’ package searches for the evolutionary variance-covariance matrix that describes rates of evolution and the evolutionary integration among traits. A Markov chain Monte Carlo (MCMC) sampler acts as the optometrist trying different combinations of lenses, and the likelihood function of the model tells us whether the fit is good or bad. Each proposal of the MCMC tries a new combination of lenses; we keep the ones that help and reject the ones that are bad. The posterior distribution returned by our method is a collection of parameter combinations weighted by how well they fit the data. It’s as if you could come home from your optometrist visit with all the top three lenses that seemed just right for you instead of making a hard decision in a rush.
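
The try-a-lens, keep-or-reject loop can be sketched in a few lines of Python. This is not the sampler implemented in ‘ratematrix’. It is a generic random-walk Metropolis sampler for a single Brownian motion rate, assuming independent trait changes over a known time interval and a flat prior on the log of the rate. The data are simulated, and all names and settings are hypothetical.

```python
import math
import random


def log_likelihood(log_rate, changes, elapsed_time):
    """Log-likelihood of independent normal changes with variance rate * time."""
    variance = math.exp(log_rate) * elapsed_time
    return sum(
        -0.5 * math.log(2.0 * math.pi * variance) - x * x / (2.0 * variance)
        for x in changes
    )


def metropolis(changes, elapsed_time, n_steps=5000, step_size=0.2, seed=7):
    """Random-walk Metropolis sampler on the log of the rate (flat prior)."""
    rng = random.Random(seed)
    current = 0.0  # start the chain at rate = 1
    current_ll = log_likelihood(current, changes, elapsed_time)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)  # try a new lens
        proposal_ll = log_likelihood(proposal, changes, elapsed_time)
        # Accept with probability min(1, likelihood ratio); else keep the old lens.
        if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
            current, current_ll = proposal, proposal_ll
        samples.append(math.exp(current))
    return samples


# Simulated data (an assumption for this example): 200 changes, true rate 2.
data_rng = random.Random(42)
true_rate = 2.0
changes = [data_rng.gauss(0.0, math.sqrt(true_rate)) for _ in range(200)]

posterior = sorted(metropolis(changes, elapsed_time=1.0)[1000:])  # drop burn-in
mean_rate = sum(posterior) / len(posterior)
interval = (posterior[int(0.025 * len(posterior))], posterior[int(0.975 * len(posterior))])
print(round(mean_rate, 2), tuple(round(v, 2) for v in interval))
```

The result is a whole set of plausible rates, summarised here by a mean and a 95% interval, rather than one best value.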

Care for Fancy Frames or Do You Just Want to See Better?

In our article, Luke and I estimate the parameters of a multivariate Brownian motion model of trait evolution, allowing for discrete shifts in rates across the tree, using Bayesian MCMC. Our focus is on the parameter values rather than on answering which is the best among a set of candidate models (i.e. model selection). This differs from some maximum likelihood approaches, which often select a single best model from a pool of models by using model choice methods such as the Akaike Information Criterion (AIC).

Glasses come in very different frames. They all function in the same way. Frames are just for the style. © Patrick Murphy

Our method compares the rates of evolution on different regions of the phylogenetic tree and asks whether there’s enough information in the observed data to support the claim that rates are distinct by contrasting parameter values using summary statistics. Model choice focuses on the trade-off between model complexity and goodness-of-fit by asking whether a model with, say, two rates adds enough information to compensate for the extra parameters when compared to a model with only a single rate. Also, our use of a posterior distribution of parameter estimates lets us compare rate estimates while taking into account the uncertainty in the fit of the model. Using AIC does not directly incorporate this information.
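
The contrast between the two approaches can be shown with a small Python sketch. The AIC formula (twice the number of parameters minus twice the maximised log-likelihood) is standard. The second function is only a simple stand-in for a posterior summary statistic, not the one used in our package, and every number below is hypothetical.

```python
def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike Information Criterion: lower values indicate the preferred model."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


def prob_first_rate_higher(rates_a, rates_b):
    """Share of paired posterior samples in which regime A evolves faster than B."""
    pairs = list(zip(rates_a, rates_b))
    return sum(a > b for a, b in pairs) / len(pairs)


# Hypothetical maximised log-likelihoods for a one-rate and a two-rate model.
aic_one_rate = aic(-100.0, n_parameters=1)
aic_two_rates = aic(-98.5, n_parameters=2)

# Hypothetical posterior samples of the rate in two regimes of the tree.
regime_a = [1.9, 2.3, 2.1, 2.6, 1.7]
regime_b = [1.1, 2.4, 1.5, 1.8, 1.2]
print(aic_one_rate, aic_two_rates, prob_first_rate_higher(regime_a, regime_b))
```

AIC reduces each model to a single score and picks a winner, while the posterior comparison works directly with the estimated rates and carries their uncertainty along.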

When fitting phylogenetic comparative models to trait data we want to better understand the patterns and, hopefully, the processes that lead to the evolution of biodiversity as we see it today. Our models are simplifications that may or may not fit the observed data well, but they certainly do not perfectly capture the evolutionary processes working generation after generation over millions of years. In the same way that finding frames that suit your style shouldn’t be your main aim when going to the optometrist, the question of whether we can pick a specific model in a pool that best fits our data might not be the most useful question. Our focus should be on how informative the parameter values of the models are and how well they can help us understand the world. We should aim for a pair of glasses that helps us see more clearly. At the end of the day it doesn’t matter too much which frame you use.

To find out more, read our Methods in Ecology and Evolution article ‘ratematrix: An R package for studying evolutionary integration among several traits on phylogenetic trees’ (No Subscription Required).
