
Electronic data

  • Methods_comparison_paper_V5

    886 KB, Word document

    Available under license: None


Species distribution models: A comparison of statistical approaches for livestock and disease epidemics

Research output: Contribution to Journal/Magazine > Journal article > peer-review

Published
Article number: e0183626
Journal publication date: 24/08/2017
Journal: PLoS ONE
Issue number: 8
Volume: 12
Number of pages: 19
Publication Status: Published
Original language: English

Abstract

In livestock industries, reliable, up-to-date spatial distribution and abundance records for animals and farms are critical for governments to manage and respond to risks. Yet few, if any, countries can afford to maintain comprehensive, up-to-date agricultural census data. Statistical modelling can be used as a proxy for such data, but comparative modelling studies have rarely been undertaken for livestock populations. Widespread species, including livestock, can be difficult to model effectively because their complex spatial distributions do not respond predictably to environmental gradients. We assessed three machine learning species distribution models (SDMs) for their capacity to estimate national-level farm animal population numbers within property boundaries: boosted regression trees (BRT), random forests (RF) and K-nearest neighbour (K-NN). The models were built from a commercial livestock database and environmental and socio-economic predictor data for New Zealand. We used two spatial data stratifications to test (i) support for decision making in an emergency response situation, and (ii) the ability of the models to predict to new geographic regions. The performance of the three model types varied substantially, but the best-performing models showed very high accuracy. BRTs had the best performance overall, but RFs performed equally well or better in many simulations; RFs were superior at predicting livestock numbers for all but very large commercial farms. K-NN performed poorly relative to both RF and BRT in all simulations. The predictions of both multi-species and single-species models for farms and within hypothetical quarantine zones were very close to observed data. These models are generally applicable for livestock estimation, with broad applications in disease risk modelling, biosecurity, policy and planning.
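
The difference between testing a model on randomly held-out farms and testing it on an entirely unseen geographic region can be illustrated with a minimal sketch. It is not the authors' code or data: the farms are synthetic, the two predictors, the region labels and every function name are hypothetical, and only a plain K-NN regressor is shown because it is the simplest of the three model types to write without external libraries. The two splits are also only a loose analogue of the paper's two stratifications.

```python
import math
import random


def knn_predict(train, query, k=5):
    """Mean target of the k training farms closest to `query` in predictor space."""
    nearest = sorted(train, key=lambda row: math.dist(row["x"], query))[:k]
    return sum(row["y"] for row in nearest) / len(nearest)


def mean_abs_error(train, test, k=5):
    """Mean absolute error of K-NN predictions over the test farms."""
    errors = [abs(knn_predict(train, row["x"], k) - row["y"]) for row in test]
    return sum(errors) / len(errors)


def make_farms(n=300, seed=0):
    """Synthetic farms (assumption): two predictors, a region label, a stock count."""
    rng = random.Random(seed)
    farms = []
    for _ in range(n):
        region = rng.choice(["north", "central", "south"])
        pasture = rng.uniform(0, 1)  # hypothetical predictor, e.g. pasture fraction
        slope = rng.uniform(0, 1)    # hypothetical predictor, e.g. scaled slope
        stock = 300 + 500 * pasture - 200 * slope + rng.gauss(0, 20)
        farms.append({"region": region, "x": (pasture, slope), "y": stock})
    return farms


def random_split(farms, test_fraction=0.25, seed=1):
    """Hold out a random fraction of farms, ignoring geography."""
    rng = random.Random(seed)
    shuffled = farms[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * test_fraction)
    return shuffled[cut:], shuffled[:cut]


def region_split(farms, held_out):
    """Hold out every farm in one region, so the model predicts to new territory."""
    train = [f for f in farms if f["region"] != held_out]
    test = [f for f in farms if f["region"] == held_out]
    return train, test


farms = make_farms()
train_r, test_r = random_split(farms)
train_g, test_g = region_split(farms, "south")
mae_random = mean_abs_error(train_r, test_r)
mae_region = mean_abs_error(train_g, test_g)
print(f"random holdout MAE: {mae_random:.1f}")
print(f"held-out region MAE: {mae_region:.1f}")
```

Because the synthetic stock counts do not depend on region, the two errors should be similar here. With real data, where regions differ in ways the predictors do not capture, the held-out-region error is the more demanding test.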