Demography and Big Data

Post provided by BRITTANY TELLER, KRISTIN HULVEY and ELISE GORNISH

Follow Brittany (@BRITTZINATOR) and Elise (@RESTORECAL) on Twitter

To understand how species survive in nature, demographers pair field-collected life history data on survival, growth and reproduction with statistical inference. Demographic approaches have significantly contributed to our understanding of population biology, invasive species dynamics, community ecology, evolutionary biology and much more.

As ecologists begin to ask questions about demography at broader spatial and temporal scales and collect data at higher resolutions, demographic analyses and new statistical methods are likely to shed even more light on important ecological mechanisms.

Population Processes

Midsummer Opuntia cactus in eastern Idaho, USA. © B. Teller.

Traditionally, demographers collect life history data on species in the field under one or more environmental conditions. This approach has significantly improved our understanding of basic biological processes. For example, rosette size is a significant predictor of survival for plants like wild teasel (Werner 1975 – links to all articles are at the end of the post), and desert annual plants hedge their bets against poor years by optimizing germination strategies (Gremer & Venable 2014).

Demographers also include temporal and spatial variability in their models to help make realistic predictions of population dynamics. We now know that temporal variability in carrying capacity dramatically improves population growth rates for perennial grasses, and that models which include it fit the data better than models with varying growth rates (Fowler & Pease 2010). Moreover, spatial heterogeneity and environmental stochasticity have similar consequences for plant populations (Crone 2016).

Understanding basic demographic processes can give us critical insights into how to manage ecological systems. In the management of endangered species, guidelines for reintroducing captive mammals to the wild must consider demographic parameters (Kleiman 1989), and evaluations of endangered species recovery should consider age structure and recruitment rates (Gerber & Hatch 2002, Mace & Lande 1991, Mace et al. 2008).

In loggerhead sea turtles, demographic approaches identified limitations in conventional management strategies and suggested more effective approaches (Crouse et al. 1987). In plant systems, regeneration of oak woodlands may depend on recruitment rates, and understanding the effects of climate on demographic rates could be critical for managing these systems in the future (Zavaleta et al. 2007). Significantly for human health, demographic approaches have helped us understand the seasonality and spread of childhood diseases (Metcalf et al. 2009) and shown that human population age structure can mediate the effectiveness of vaccination strategies in different parts of the world (McKee et al. 2015).

In the future, studies that inform predictions of long-term temporal and spatial patterns are likely to improve our ability to both conserve species and improve human health.

Evolutionary Processes

Juvenile pronghorn covered with parasites in Grand Teton National Park, WY, USA. © B. Teller.

Demographic research has much to contribute to the study of evolution as well (Metcalf & Pavard 2007). For example, climatic extremes can select for the evolution of self-compatibility (Evans et al. 2011), and growth form is strongly correlated with the diversification of seed mass (Moles et al. 2005). The importance of life history transitions has significant phylogenetic components (Burns et al. 2010). Evolutionary processes, genetics and demography are also likely to be important in the context of species invasions (Turner et al. 2014), especially during colonization phases (Szucs et al. 2014).

Evaluating the role of evolution in ecological systems could help us understand how environmental factors, like climate change or invasive species, will affect species genetic diversity.

Invasive Species

Invasive species are an interesting problem for ecologists because they perform well in regions where they didn’t originally evolve and often cause economic harm (Parker et al. 1999). Efforts to determine what makes invasive species successful have shown that a species’ abundance in its native range is a good predictor of its abundance in introduced ranges (Firn et al. 2011). Humans also often play a direct role in mediating invasion success (Buckley & Catford 2016).

Pollinators visit flowers of the invasive thistle species, Carduus nutans, in central PA, USA. © B. Teller.

Invasive species perform differently in their invasive versus native ranges (Williams et al. 2008). Changes in climate and elevated soil nitrogen may affect invasive species performance and have long-term effects on community composition (Zhang et al. 2011, Gornish 2014). Broad community context may be critical to understanding how invasive species succeed in new communities (Shea & Chesson 2002), and how to manage invasive species appropriately (Louda et al. 2003). Management of invaders can be challenging when population density and age structure interact with invasive species management strategies (Pardini et al. 2009), but management plans for some invasive species, like pines, can be robust despite uncertainty in some demographic parameters (Buckley et al. 2005).

As ecologists accumulate more data about invasive species population persistence and spread through new and differing landscapes, valuable opportunities will arise to evaluate the roles of spatial heterogeneity and community structure in invasive species success. Demographic methods and models will be key to these studies.

Trophic and Community Processes

Demographic studies are useful well beyond the population level, and can also contribute insights into processes that occur among competitors, mutualists and/or trophic levels. Competition is a major structuring process in many ecological communities and its effects can be mediated by demography (Goldberg & Barton 1992, Lee et al. 2011). For example, in tropical forests, crowding affects sapling growth and can ultimately affect community diversity (Uriarte et al. 2004). Global observations of density-dependent mortality show patterns along latitude and may have broader effects on patterns of latitudinal diversity (Hille Ris Lambers et al. 2002).

Midsummer sagebrush steppe landscape in eastern Idaho, USA. © B. Teller.

Positive associations between species (such as mutualisms and commensalisms) extend species’ environmental tolerances (Afkhami et al. 2014), and mutualist pollinators can affect, and/or be affected by, ecological invasions (Harmon-Threatt et al. 2009, Powell et al. 2011, Russo et al. 2016). Antagonistic trophic interactions by predators or herbivores can also be linked to demography. For example, long-lived native plants in temperate understory communities become threatened when herbivory is high (Knight et al. 2009), and endangered lupines are heavily threatened by invasion due to competition, which is influenced by a native rodent (Dangremond et al. 2010). Also, the interactions between demography and disease transmission affect community composition in grasslands and forests (Borer et al. 2007, Needham et al. 2016).

Future work that can effectively examine the multivariate composition of ecological communities, and the mechanisms that allow these species to coexist, will help us better understand the factors that contribute to diversity and ecosystem stability.

New Methods for a New Era

Ecologists who ask questions about basic biology, conservation and management are beginning to conduct research at larger spatial and temporal scales (e.g. Angert et al. 2011). This is partially because ecological populations and communities are increasingly threatened by human-driven global change (Parmesan et al. 1999, Hellmann et al. 2008). As in the past, demographic methods are likely to provide valuable insights into the mechanisms by which species persist and coexist in variable landscapes.

As the resolution and extent of available data become larger, the number of potential covariates also increases (such as daily precipitation and temperature records, or the spatial locations of many differently sized individuals). Under these circumstances, statistical model selection becomes difficult because even large demographic data sets could have fewer observations than potential covariates (Dalgleish et al. 2011). So, there is a need in demography for new analytical approaches that can accommodate large numbers of covariates.
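
To see why having fewer observations than covariates is a problem, here is a toy sketch. The numbers are made up for illustration and do not come from any of the studies cited here. With two observations and three covariates, two quite different sets of coefficients reproduce the data exactly, so the data alone cannot tell us which covariates matter.

```python
# Toy illustration with hypothetical numbers: 2 observations, 3 covariates.
X = [
    [1.0, 0.0, 1.0],  # observation 1: values of the three covariates
    [0.0, 1.0, 1.0],  # observation 2
]
y = [1.0, 1.0]  # observed responses


def predict(X, beta):
    """Linear predictor: sum of coefficient * covariate for each observation."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


beta_a = [1.0, 1.0, 0.0]  # attributes the response to covariates 1 and 2
beta_b = [0.0, 0.0, 1.0]  # attributes the response to covariate 3 only

fits_a = predict(X, beta_a)
fits_b = predict(X, beta_b)

# Both coefficient sets fit the observations perfectly.
print(fits_a == y, fits_b == y)
```

Methods such as LASSO or penalized splines deal with this by adding extra constraints (sparsity or smoothness) that single out one answer among the many that fit.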

Functional data analysis is a branch of statistics that can identify mechanisms that occur in continuous domains. These methods could provide solutions to big data problems in ecology and evolutionary biology (e.g. Ding & Wang 2008, Heckman 2003). To improve models of long-term demographic data, our contribution to the Demography Beyond the Population Special Feature (Teller et al. 2016) compares the utility of functional spline models with other machine learning methods (LASSO and random forests).

In our additive spline models, the sum of the fitted coefficients multiplied by the lagged covariates represents the contribution of precipitation or temperature to growth rates or survival. In a natural system, this model allows us to see whether the amount of rain that falls affects growth proportionately to how sensitive individuals are to water addition at that time. Models could show that water addition in the winter when plants are dormant may not contribute much to growth (and have a coefficient of zero), but additional summer precipitation may contribute strongly to growth (and have a high and positive coefficient).
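
The "sum of coefficients multiplied by lagged covariates" can be sketched in a few lines. This is only an illustration: the monthly coefficients and precipitation values below are hypothetical, not fitted values from Teller et al. (2016), and the function name `climate_contribution` is ours.

```python
def climate_contribution(coefs, covariate):
    """Sum of coefficient * covariate over all time steps."""
    if len(coefs) != len(covariate):
        raise ValueError("need one coefficient per time step")
    return sum(c * x for c, x in zip(coefs, covariate))


def with_extra(precip, month, extra_mm):
    """Copy of the precipitation series with extra water added in one month."""
    out = list(precip)
    out[month] += extra_mm
    return out


# Hypothetical coefficients for the preceding 12 months (index 0 = January).
# Winter months get zero weight; summer months get the largest weights.
coefs = [0.0, 0.0, 0.01, 0.02, 0.03, 0.05, 0.05, 0.04, 0.02, 0.01, 0.0, 0.0]
precip = [50.0] * 12  # hypothetical monthly precipitation in mm

baseline = climate_contribution(coefs, precip)
winter_gain = climate_contribution(coefs, with_extra(precip, 0, 10.0)) - baseline
summer_gain = climate_contribution(coefs, with_extra(precip, 6, 10.0)) - baseline

print("extra 10 mm in January changes the contribution by", winter_gain)
print("extra 10 mm in July changes the contribution by", summer_gain)
```

In this sketch, the same 10 mm of extra water changes nothing when added in January but increases the climate contribution when added in July, which is the pattern described above.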

Since the function that represents the coefficients must be smooth, coefficients for lags that are close in time must be similar, but could be very different from coefficients at other times of the year (see figures above). We show that functional models are proficient with data at higher resolutions (such as weekly lagged covariates), though fitting the models takes more computational time. We conclude that splines (Wahba 1990, Heckman 2012) are likely to be useful for analysing demography in the context of local competition and climate precisely because spline methods have the notable advantage of assuming that covariates are smooth functions of time or space (Şentürk & Müller 2010, Staicu et al. 2012).
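
One common way to express "neighbouring coefficients must be similar" is a roughness penalty. The sketch below uses squared second differences, which is a discrete stand-in for the kind of smoothness penalty used with splines and is an assumption on our part, not necessarily the penalty used in the paper. The two coefficient curves are hypothetical.

```python
def roughness(coefs):
    """Sum of squared second differences: small for smooth curves."""
    return sum(
        (coefs[k - 1] - 2.0 * coefs[k] + coefs[k + 1]) ** 2
        for k in range(1, len(coefs) - 1)
    )


# Two hypothetical coefficient curves over nine consecutive lags.
smooth = [0.0, 0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.1, 0.0]  # rises, then falls
jagged = [0.0, 0.4, 0.0, 0.4, 0.0, 0.4, 0.0, 0.4, 0.0]  # flips at every lag

print("roughness of smooth curve:", roughness(smooth))
print("roughness of jagged curve:", roughness(jagged))
```

A fitting procedure that penalizes roughness would strongly prefer the first curve, while still letting coefficients in summer differ from those in winter.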

Functional linear models could provide an exciting view of ecological mechanisms for demographers, but not all machine learning methods performed equally well in our tests. As a result, we recommend exercising caution when using novel statistical methods and using simulation approaches to identify methods that successfully return correct answers in realistic, simulated data.
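
The logic of a simulation test is simple: generate data where the right answer is known, run each candidate method, and see how close it gets. Below is a deliberately minimal sketch with a single covariate. The ordinary least squares slope and the `always_zero` strawman stand in for whatever methods are being compared, and all names and settings are illustrative assumptions rather than the tests we ran.

```python
import random


def simulate(true_slope, n, noise_sd, rng):
    """Simulate a response that depends linearly on one covariate."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def ols_slope(xs, ys):
    """Closed-form ordinary least squares slope."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def always_zero(xs, ys):
    """A deliberately bad 'method' that ignores the data."""
    return 0.0


def mean_abs_error(fit, true_slope, n_reps=200, n=100, noise_sd=0.5, seed=1):
    """Average distance between the estimate and the known truth."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_reps):
        xs, ys = simulate(true_slope, n, noise_sd, rng)
        errors.append(abs(fit(xs, ys) - true_slope))
    return sum(errors) / n_reps


err_ols = mean_abs_error(ols_slope, true_slope=2.0)
err_bad = mean_abs_error(always_zero, true_slope=2.0)
print("OLS error:", err_ols, "strawman error:", err_bad)
```

In practice the simulated data should resemble the real system (sample sizes, noise levels, number of covariates), so that a method passing the test is one we can trust on the real data.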

Ecological systems are necessarily multivariate because they involve many genetically non-identical individuals, different species, and different trophic levels. They’re seasonal, occur under widely varying conditions and are often affected by history. As more data about these systems are collected, analyses are likely to become more complex too. In the future, data-driven approaches like functional data analysis could help demographers identify important mechanisms in their own big data, and continue to advance ecological discoveries for basic science, conservation and management.

This article references 50 publications led by first authors with female-identifying first names.

  1. Afkhami, M.E., McIntyre, P.J. and Strauss, S.Y., 2014. Mutualist‐mediated effects on species’ range limits across large geographic scales. Ecology Letters, 17(10), pp.1265-1273.
  2. Angert, A.L., Crozier, L.G., Rissler, L.J., Gilman, S.E., Tewksbury, J.J. and Chunco, A.J., 2011. Do species’ traits predict recent shifts at expanding range edges? Ecology Letters, 14(7), pp.677-689.
  3. Borer, E.T., Hosseini, P.R., Seabloom, E.W. and Dobson, A.P., 2007. Pathogen-induced reversal of native dominance in a grassland community. Proceedings of the National Academy of Sciences, 104(13), pp.5473-5478.
  4. Buckley, Y.M. and Catford, J., 2016. Does the biogeographic origin of species matter? Ecological effects of native and non‐native species and the use of origin to guide management. Journal of Ecology, 104(1), pp.4-17.
  5. Buckley, Y.M., Brockerhoff, E., Langer, L., Ledgard, N., North, H. and Rees, M., 2005. Slowing down a pine invasion despite uncertainty in demography and dispersal. Journal of Applied Ecology, 42(6), pp.1020-1030.
  6. Burns, J.H., Blomberg, S.P., Crone, E.E., Ehrlen, J., Knight, T.M., Pichancourt, J.B., Ramula, S., Wardle, G.M. and Buckley, Y.M., 2010. Empirical tests of life‐history evolution theory using phylogenetic analysis of plant demography. Journal of Ecology, 98(2), pp.334-344.
  7. Crone, E.E., 2016. Contrasting effects of spatial heterogeneity and environmental stochasticity on population dynamics of a perennial wildflower. Journal of Ecology, 104(2), pp.281-291.
  8. Crouse, D.T., Crowder, L.B. and Caswell, H., 1987. A stage-based population model for loggerhead sea turtles and implications for conservation. Ecology, 68(5), pp.1412-1423.
  9. Dalgleish, H.J., Koons, D.N., Hooten, M.B., Moffet, C.A. and Adler, P.B., 2011. Climate influences the demography of three dominant sagebrush steppe plants. Ecology, 92(1), pp.75-85.
  10. Dangremond, E.M., Pardini, E.A. and Knight, T.M., 2010. Apparent competition with an invasive plant hastens the extinction of an endangered lupine. Ecology, 91(8), pp.2261-2271.
  11. Ding, J. and Wang, J.L., 2008. Modeling longitudinal data with nonparametric multiplicative random effects jointly with survival data. Biometrics, 64(2), pp.546-556.
  12. Evans, M.E., Hearn, D.J., Theiss, K.E., Cranston, K., Holsinger, K.E. and Donoghue, M.J., 2011. Extreme environments select for reproductive assurance: evidence from evening primroses (Oenothera). New Phytologist, 191(2), pp.555-563.
  13. Firn, J., Moore, J.L., MacDougall, A.S., Borer, E.T., Seabloom, E.W., HilleRisLambers, J., Harpole, W.S., Cleland, E.E., Brown, C.S., Knops, J.M. and Prober, S.M., 2011. Abundance of introduced species at home predicts abundance away in herbaceous communities. Ecology Letters, 14(3), pp.274-281.
  14. Fowler, N.L. and Pease, C.M., 2010. Temporal variation in the carrying capacity of a perennial grass population. The American Naturalist, 175(5), pp.504-512.
  15. Gerber, L.R. and Hatch, L.T., 2002. Are we recovering? An evaluation of recovery criteria under the US Endangered Species Act. Ecological Applications, 12(3), pp.668-673.
  16. Goldberg, D.E. and Barton, A.M., 1992. Patterns and consequences of interspecific competition in natural communities: a review of field experiments with plants. American Naturalist, pp.771-801.
  17. Gornish, E.S., 2014. Demographic effects of warming, elevated soil nitrogen and thinning on the colonization of a perennial plant. Population Ecology, 56(4), pp.645-656.
  18. Gremer, J.R. and Venable, D.L., 2014. Bet hedging in desert winter annual plants: optimal germination strategies in a variable environment. Ecology Letters, 17(3), pp.380-387.
  19. Harmon-Threatt, A.N., Burns, J.H., Shemyakina, L.A. and Knight, T.M., 2009. Breeding system and pollination ecology of introduced plants compared to their native relatives. American Journal of Botany, 96(8), pp.1544-1550.
  20. Heckman, N.E., 2003. Functional data analysis in evolutionary biology. Recent Advances and Trends in Nonparametric Statistics, pp.49-60.
  21. Heckman, N., 2012. The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy. Statistics Surveys, 6, pp.113-141.
  22. Hellmann, J.J., Byers, J.E., Bierwagen, B.G. and Dukes, J.S., 2008. Five potential consequences of climate change for invasive species. Conservation Biology, 22(3), pp.534-543.
  23. Kleiman, D.G., 1989. Reintroduction of Captive Mammals for Conservation Guidelines for reintroducing endangered species into the wild. BioScience, 39(3), pp.152-161.
  24. Knight, T.M., Caswell, H. and Kalisz, S., 2009. Population growth rate of a common understory herb decreases non-linearly across a gradient of deer herbivory. Forest Ecology and Management, 257(3), pp.1095-1103.
  25. Lambers, J.H.R., Clark, J.S. and Beckage, B., 2002. Density-dependent mortality and the latitudinal gradient in species diversity. Nature, 417(6890), pp.732-735.
  26. Lee, C.T., Miller, T.E. and Inouye, B.D., 2011. Consumer effects on the vital rates of their resource can determine the outcome of competition between consumers. The American Naturalist, 178(4), pp.452-463.
  27. Louda, S.M., Pemberton, R.W., Johnson, M.T. and Follett, P., 2003. Nontarget Effects-The Achilles’ heel of Biological Control? Retrospective Analyses to Reduce Risk Associated with Biocontrol Introductions. Annual Review of Entomology, 48(1), pp.365-396.
  28. Mace, G.M. and Lande, R., 1991. Assessing extinction threats: toward a reevaluation of IUCN threatened species categories. Conservation Biology, 5(2), pp.148-157.
  29. Mace, G.M., Collar, N.J., Gaston, K.J., Hilton‐Taylor, C., Akçakaya, H.R., Leader‐Williams, N., Milner‐Gulland, E.J. and Stuart, S.N., 2008. Quantification of extinction risk: IUCN’s system for classifying threatened species. Conservation Biology, 22(6), pp.1424-1442.
  30. Metcalf, C.J.E. and Pavard, S., 2007. Why evolutionary biologists should be demographers. Trends in Ecology & Evolution, 22(4), pp.205-212.
  31. Metcalf, C.J.E., Bjørnstad, O.N., Grenfell, B.T. and Andreasen, V., 2009. Seasonality and comparative dynamics of six childhood infections in pre-vaccination Copenhagen. Proceedings of the Royal Society of London B: Biological Sciences, 276(1676), pp.4111-4118.
  32. McKee, A., Ferrari, M.J. and Shea, K., 2015. The effects of maternal immunity and age structure on population immunity to measles. Theoretical Ecology, 8(2), pp.261-271.
  33. Moles, A.T. and Westoby, M., 2004. What do seedlings die from and what are the implications for evolution of seed size? Oikos, 106(1), pp.193-199.
  34. Needham, J., Merow, C., Butt, N., Malhi, Y., Marthews, T.R., Morecroft, M. and McMahon, S.M., 2016. Forest community response to invasive pathogens: the case of ash dieback in a British woodland. Journal of Ecology, 104(2), pp.315-330.
  35. Parker, I.M., Simberloff, D., Lonsdale, W.M., Goodell, K., Wonham, M., Kareiva, P.M., Williamson, M.H., Von Holle, B.M.P.B., Moyle, P.B., Byers, J.E. and Goldwasser, L., 1999. Impact: toward a framework for understanding the ecological effects of invaders. Biological Invasions, 1(1), pp.3-19.
  36. Pardini, E.A., Drake, J.M., Chase, J.M. and Knight, T.M., 2009. Complex population dynamics and control of the invasive biennial Alliaria petiolata (garlic mustard). Ecological Applications, 19(2), pp.387-397.
  37. Parmesan, C., Ryrholm, N., Stefanescu, C., Hill, J.K., Thomas, C.D., Descimon, H., Huntley, B., Kaila, L., Kullberg, J., Tammaru, T. and Tennent, W.J., 1999. Poleward shifts in geographical ranges of butterfly species associated with regional warming. Nature, 399(6736), pp.579-583.
  38. Powell, K.I., Krakos, K.N. and Knight, T.M., 2011. Comparing the reproductive success and pollination biology of an invasive plant to its rare and common native congeners: a case study in the genus Cirsium (Asteraceae). Biological Invasions, 13(4), pp.905-917.
  39. Russo, L.A., Nichol, C. and Shea, K., 2016. Pollinator floral provisioning by a plant invader: quantifying beneficial effects of detrimental species. Diversity and Distributions, 22(2), pp.189-198.
  40. Şentürk, D. and Müller, H.G., 2010. Functional varying coefficient models for longitudinal data. Journal of the American Statistical Association, 105(491), pp.1256-1264.
  41. Shea, K. and Chesson, P., 2002. Community ecology theory as a framework for biological invasions. Trends in Ecology & Evolution, 17(4), pp.170-176.
  42. Staicu, A.M., Crainiceanu, C.M., Reich, D.S. and Ruppert, D., 2012. Modeling functional data with spatially heterogeneous shape characteristics. Biometrics, 68(2), pp.331-343.
  43. Szűcs, M., Melbourne, B.A., Tuff, T. and Hufbauer, R.A., 2014. The roles of demography and genetics in the early stages of colonization. Proceedings of the Royal Society of London B: Biological Sciences, 281(1792), p.2014
  44. Teller, B.J., Adler, P.B., Edwards, C.B., Hooker, G. and Ellner, S.P., 2016. Linking demography with drivers: climate and competition. Methods in Ecology and Evolution, 7, pp.171-183. doi: 10.1111/2041-210X.12486
  45. Turner, K.G., Hufbauer, R.A. and Rieseberg, L.H., 2014. Rapid evolution of an invasive weed. New Phytologist, 202(1), pp.309-321.
  46. Uriarte, M., Condit, R., Canham, C.D. and Hubbell, S.P., 2004. A spatially explicit model of sapling growth in a tropical forest: does the identity of neighbours matter? Journal of Ecology, 92(2), pp.348-360.
  47. Wahba, G., 1990. Spline models for observational data (Vol. 59). Siam.
  48. Werner, P.A., 1975. Predictions of fate from rosette size in teasel (Dipsacus fullonum L.). Oecologia, 20(3), pp.197-201.
  49. Williams, J.L., Auge, H. and Maron, J.L., 2008. Different gardens, different results: native and introduced populations exhibit contrasting phenotypes across common gardens. Oecologia, 157(2), pp.239-248.
  50. Zhang, R., Jongejans, E. and Shea, K., 2011. Warming increases the spread of an invasive thistle. PLoS One, 6(6), p.e21725.
  51. Zavaleta, E.S., Hulvey, K.B. and Fulfrost, B., 2007. Regional patterns of recruitment success and failure in two endemic California oaks. Diversity and Distributions, 13(6), pp.735-745.
