Threshold-free statistical methods for the analysis of continuous health outcomes, with applications to malaria serology

Data Science Institute

Associated organisational unit

DSI - Health

Electronic data

2021kyomuhangiphd.pdf
Final published version, 3.09 MB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Text available via DOI:

https://doi.org/10.17635/lancaster/thesis/1491
Final published version

Keywords

binary data, geostatistics, prevalence, malaria serology, reversible catalytic model, antibody acquisition model, malaria, seroprevalence, disease mapping, mixture model


Research output: Thesis › Doctoral Thesis

Published

Irene Kyomuhangi


Publication date	2021
Number of pages	77
Qualification	PhD
Awarding Institution	Lancaster University
Supervisors/Advisors	Giorgi, Emanuele, Supervisor; Keegan, Thomas, Supervisor
Publisher	Lancaster University
Original language	English

Abstract

Continuous measurements of health outcomes are often dichotomized into binary (i.e. positive/negative) data for diagnosis and subsequent statistical analysis. The disadvantages of dichotomizing continuous data for statistical inference are well established in the literature, yet the practice remains commonplace in health research.

In this thesis, we investigate the impact of dichotomization when the aim of the analysis is to determine disease prevalence and risk, and propose solutions to some of the main challenges that dichotomization introduces in the context of global health research.

First, using model-based geostatistics, we show how dichotomization reduces the predictive performance of geostatistical models through loss of information and by reducing the reliability of parameter estimates. We demonstrate this using a simulation study and by mapping the prevalence and risk of anaemia in Ethiopia and of stunting in Ghana.
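The loss of information from dichotomization can be illustrated with a small, non-spatial simulation. This is only a toy sketch and not the geostatistical model used in the thesis: it assumes a normally distributed outcome, and the mean, standard deviation, cutoff and sample size are hypothetical values chosen for illustration. It compares the sampling variability of two prevalence estimates, one computed from the dichotomized data and one computed from a normal distribution fitted to the continuous data.

```python
import random
import statistics
from statistics import NormalDist


def simulate(n=100, reps=2000, mu=12.0, sigma=1.5, cutoff=11.0, seed=0):
    """Return the empirical standard deviation of two prevalence estimators.

    All parameter values are illustrative assumptions, not taken from the thesis.
    """
    rng = random.Random(seed)
    binary_estimates = []
    continuous_estimates = []
    for _ in range(reps):
        y = [rng.gauss(mu, sigma) for _ in range(n)]
        # Estimator 1: dichotomize at the cutoff, then take the proportion.
        binary_estimates.append(sum(v < cutoff for v in y) / n)
        # Estimator 2: fit a normal distribution to the continuous values
        # and read off the probability of falling below the cutoff.
        fitted = NormalDist(statistics.fmean(y), statistics.stdev(y))
        continuous_estimates.append(fitted.cdf(cutoff))
    return statistics.stdev(binary_estimates), statistics.stdev(continuous_estimates)


sd_binary, sd_continuous = simulate()
print(f"SD of dichotomized estimator: {sd_binary:.4f}")
print(f"SD of continuous estimator:   {sd_continuous:.4f}")
```

Under these assumptions the estimator based on the continuous data varies less across replicates, which is the kind of efficiency loss the paragraph above describes. The advantage depends on the distributional assumption being reasonable.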

We then explore the limitations that dichotomization introduces to the estimation of malaria transmission in serology models, and propose a novel, flexible and unified modelling framework which uses continuous antibody measurements instead of dichotomized data to estimate transmission intensity. Using data from Western Kenya, we demonstrate the properties of this new approach.
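For context, the reversible catalytic model listed in the keywords is the conventional serology model fitted to dichotomized (seropositive/seronegative) data. A minimal sketch of its standard form follows, in which seroprevalence at a given age depends on a constant seroconversion rate and a constant seroreversion rate. The function name and the rate values are illustrative assumptions, and this is the conventional model rather than the unified framework proposed in the thesis.

```python
import math


def seroprevalence(age, seroconversion_rate, seroreversion_rate):
    """Expected proportion seropositive at a given age under the
    reversible catalytic model with constant rates."""
    total = seroconversion_rate + seroreversion_rate
    plateau = seroconversion_rate / total  # long-run seroprevalence
    return plateau * (1.0 - math.exp(-total * age))


# Hypothetical rates per year, chosen only for illustration.
ages = [1, 5, 10, 20, 40]
curve = [seroprevalence(a, 0.1, 0.02) for a in ages]
for a, p in zip(ages, curve):
    print(f"age {a:>2}: expected seroprevalence {p:.3f}")
```

The curve starts at zero, rises with age and levels off at the ratio of the seroconversion rate to the sum of the two rates. Fitting it requires each individual to be classified as seropositive or seronegative first, which is where the choice of threshold enters.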

Finally, we address the use of thresholds for dichotomizing continuous antibody measurements when the goal is to estimate malaria seroprevalence. We utilize the principles of the unified modelling framework to develop a threshold-free approach to estimating seroprevalence. Using the same Western Kenyan dataset, we show how this new approach improves model fit and provides more consistent estimates than traditional methods.
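One generic way to estimate seroprevalence without fixing a threshold is to fit a two-component mixture to the continuous antibody measurements and take the weight of the higher component as the proportion seropositive. The sketch below is a plain Gaussian mixture fitted by expectation-maximization on simulated values. It is an assumption-laden illustration of the mixture-model idea listed in the keywords, not the method developed in the thesis, and the simulated means, standard deviations and true prevalence are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(x, iterations=200):
    """EM for a two-component Gaussian mixture.

    Returns (weight of the higher component, (mean_low, mean_high),
    (sd_low, sd_high)).
    """
    n = len(x)
    ordered = sorted(x)
    # Crude starting values: quartiles as means, overall SD for both components.
    mean_low, mean_high = ordered[n // 4], ordered[3 * n // 4]
    overall_mean = sum(x) / n
    sd_low = sd_high = math.sqrt(sum((v - overall_mean) ** 2 for v in x) / n)
    weight_high = 0.5
    for _ in range(iterations):
        # E-step: probability that each observation belongs to the higher component.
        resp = []
        for v in x:
            high = weight_high * normal_pdf(v, mean_high, sd_high)
            low = (1.0 - weight_high) * normal_pdf(v, mean_low, sd_low)
            resp.append(high / (high + low))
        # M-step: update the weight, means and standard deviations.
        total_high = sum(resp)
        total_low = n - total_high
        weight_high = total_high / n
        mean_high = sum(r * v for r, v in zip(resp, x)) / total_high
        mean_low = sum((1.0 - r) * v for r, v in zip(resp, x)) / total_low
        sd_high = math.sqrt(
            sum(r * (v - mean_high) ** 2 for r, v in zip(resp, x)) / total_high
        )
        sd_low = math.sqrt(
            sum((1.0 - r) * (v - mean_low) ** 2 for r, v in zip(resp, x)) / total_low
        )
    return weight_high, (mean_low, mean_high), (sd_low, sd_high)


# Simulated log antibody levels; all values here are hypothetical.
rng = random.Random(1)
true_prevalence = 0.3
data = [
    rng.gauss(1.0, 0.8) if rng.random() < true_prevalence else rng.gauss(-2.0, 0.5)
    for _ in range(2000)
]
estimated_prevalence, means, sds = fit_two_component_mixture(data)
print(f"estimated seroprevalence: {estimated_prevalence:.3f}")
```

No individual is ever classified as positive or negative here. Each observation contributes a probability of belonging to the higher component, so the estimate does not hinge on where a cutoff is placed.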

Together, these investigations demonstrate the significant impact that dichotomization of continuous data has on statistical inference across different areas of health research, and indicate that this practice should be avoided where possible.
