Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning

Research output: Contribution to Journal/Magazine > Journal article > peer-review

Published

Standard

Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning. / van der Westhuizen, Stephan; Heuvelink, Gerard B. M.; Hofmeyr, David P. et al.
In: European Journal of Soil Science, Vol. 75, No. 5, e13589, 31.10.2024.

Harvard

van der Westhuizen, S, Heuvelink, GBM, Hofmeyr, DP, Poggio, L, Nussbaum, M & Brungard, C 2024, 'Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning', European Journal of Soil Science, vol. 75, no. 5, e13589. https://doi.org/10.1111/ejss.13589

APA

van der Westhuizen, S., Heuvelink, G. B. M., Hofmeyr, D. P., Poggio, L., Nussbaum, M., & Brungard, C. (2024). Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning. European Journal of Soil Science, 75(5), Article e13589. https://doi.org/10.1111/ejss.13589

Vancouver

van der Westhuizen S, Heuvelink GBM, Hofmeyr DP, Poggio L, Nussbaum M, Brungard C. Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning. European Journal of Soil Science. 2024 Oct 31;75(5):e13589. Epub 2024 Oct 7. doi: 10.1111/ejss.13589

Author

van der Westhuizen, Stephan ; Heuvelink, Gerard B. M. ; Hofmeyr, David P. et al. / Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning. In: European Journal of Soil Science. 2024 ; Vol. 75, No. 5.

Bibtex

@article{ebb2db592bca4d13a3ef0443c2284510,
title = "Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning",
abstract = "In digital soil mapping, modelling soil thickness poses a challenge due to the prevalent issue of right‐censored data. This means that the true soil thickness exceeds the depth of sampling, and neglecting to account for the censored nature of the data can lead to poor model performance and underestimation of the true soil thickness. Survival analysis is a well‐established domain of statistical modelling that can deal with censored data. The random survival forest is a notable example of a survival‐related machine learning approach used to address right‐censored soil property data in digital soil mapping. Previous studies that employed this model either focused on mapping the probability of soil thickness exceeding certain depths, and thereby not mapping soil thickness itself, or dismissed it due to perceived poor performance. In this study, we propose an alternative survival model to map soil thickness that is based on the inverse probability of censoring weighting. In this approach, calibration data are weighted by the inverse of the probability that soil thickness exceeds a certain depth, that is, a survival probability. These weights can then be used with most machine learning models. We used the weights with a regular random forest, and compared it with a random survival forest, and other strategies for handling right‐censored data, through a comprehensive synthetic simulation study and two real‐world case studies. The results suggest that the weighted random forest model produces competitive predictions, establishing it as a viable option for mapping right‐censored soil property data.",
keywords = "random survival forest, inverse probability of censoring weighting, digital soil mapping, survival analysis, soil depth",
author = "{van der Westhuizen}, Stephan and Heuvelink, {Gerard B. M.} and Hofmeyr, {David P.} and Poggio, Laura and Nussbaum, Madlene and Brungard, Colby",
year = "2024",
month = oct,
day = "31",
doi = "10.1111/ejss.13589",
language = "English",
volume = "75",
journal = "European Journal of Soil Science",
issn = "1351-0754",
publisher = "Wiley-Blackwell",
number = "5",
pages = "e13589",
}

RIS

TY - JOUR

T1 - Mapping soil thickness by accounting for right‐censored data with survival probabilities and machine learning

AU - van der Westhuizen, Stephan

AU - Heuvelink, Gerard B. M.

AU - Hofmeyr, David P.

AU - Poggio, Laura

AU - Nussbaum, Madlene

AU - Brungard, Colby

PY - 2024/10/31

Y1 - 2024/10/31

N2 - In digital soil mapping, modelling soil thickness poses a challenge due to the prevalent issue of right‐censored data. This means that the true soil thickness exceeds the depth of sampling, and neglecting to account for the censored nature of the data can lead to poor model performance and underestimation of the true soil thickness. Survival analysis is a well‐established domain of statistical modelling that can deal with censored data. The random survival forest is a notable example of a survival‐related machine learning approach used to address right‐censored soil property data in digital soil mapping. Previous studies that employed this model either focused on mapping the probability of soil thickness exceeding certain depths, and thereby not mapping soil thickness itself, or dismissed it due to perceived poor performance. In this study, we propose an alternative survival model to map soil thickness that is based on the inverse probability of censoring weighting. In this approach, calibration data are weighted by the inverse of the probability that soil thickness exceeds a certain depth, that is, a survival probability. These weights can then be used with most machine learning models. We used the weights with a regular random forest, and compared it with a random survival forest, and other strategies for handling right‐censored data, through a comprehensive synthetic simulation study and two real‐world case studies. The results suggest that the weighted random forest model produces competitive predictions, establishing it as a viable option for mapping right‐censored soil property data.

AB - In digital soil mapping, modelling soil thickness poses a challenge due to the prevalent issue of right‐censored data. This means that the true soil thickness exceeds the depth of sampling, and neglecting to account for the censored nature of the data can lead to poor model performance and underestimation of the true soil thickness. Survival analysis is a well‐established domain of statistical modelling that can deal with censored data. The random survival forest is a notable example of a survival‐related machine learning approach used to address right‐censored soil property data in digital soil mapping. Previous studies that employed this model either focused on mapping the probability of soil thickness exceeding certain depths, and thereby not mapping soil thickness itself, or dismissed it due to perceived poor performance. In this study, we propose an alternative survival model to map soil thickness that is based on the inverse probability of censoring weighting. In this approach, calibration data are weighted by the inverse of the probability that soil thickness exceeds a certain depth, that is, a survival probability. These weights can then be used with most machine learning models. We used the weights with a regular random forest, and compared it with a random survival forest, and other strategies for handling right‐censored data, through a comprehensive synthetic simulation study and two real‐world case studies. The results suggest that the weighted random forest model produces competitive predictions, establishing it as a viable option for mapping right‐censored soil property data.

KW - random survival forest

KW - inverse probability of censoring weighting

KW - digital soil mapping

KW - survival analysis

KW - soil depth

U2 - 10.1111/ejss.13589

DO - 10.1111/ejss.13589

M3 - Journal article

VL - 75

JO - European Journal of Soil Science

JF - European Journal of Soil Science

SN - 1351-0754

IS - 5

M1 - e13589

ER -
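
The abstract describes weighting calibration data by the inverse of a survival probability so that the weights can be used with an ordinary machine learning model. As a rough illustration only, the sketch below uses the textbook form of inverse probability of censoring weighting: each uncensored profile is weighted by the inverse of a Kaplan-Meier estimate of the probability of remaining uncensored up to its observed depth, and each censored profile gets weight zero. This may differ in detail from the weighting used in the paper. The function names `censoring_survival` and `ipcw_weights` and the toy depths are hypothetical and do not come from the article.

```python
def censoring_survival(depths, observed):
    """Kaplan-Meier estimate of G(t) = P(not yet censored just after depth t).

    depths   : recorded depth per profile (true thickness if observed,
               sampling depth if right-censored)
    observed : True if the true soil thickness was reached, False if censored
    Returns a list of (depth, G) steps sorted by depth.
    """
    steps = []
    g = 1.0
    for t in sorted(set(depths)):
        # Profiles still "at risk" of being censored at depth t.
        at_risk = sum(1 for d in depths if d >= t)
        # In the censoring distribution, the "event" is censoring itself.
        censored_here = sum(
            1 for d, o in zip(depths, observed) if d == t and not o
        )
        g *= 1.0 - censored_here / at_risk
        steps.append((t, g))
    return steps


def ipcw_weights(depths, observed):
    """Weight 1 / G(d-) for observed profiles, 0 for censored ones."""
    steps = censoring_survival(depths, observed)
    weights = []
    for d, o in zip(depths, observed):
        if not o:
            weights.append(0.0)
            continue
        # G(d-): value of the step function just before depth d.
        g_before = 1.0
        for t, g in steps:
            if t < d:
                g_before = g
            else:
                break
        weights.append(1.0 / g_before)
    return weights


if __name__ == "__main__":
    # Toy data (assumed, in cm): False marks a pit that stopped before bedrock.
    toy_depths = [30, 50, 50, 80, 100, 100]
    toy_observed = [True, False, True, True, False, True]
    for d, o, w in zip(toy_depths, toy_observed,
                       ipcw_weights(toy_depths, toy_observed)):
        print(d, "observed" if o else "censored", round(w, 3))
```

Weights of this kind could then be passed as per-sample weights to any learner that accepts them, for example through the `sample_weight` argument of scikit-learn's `RandomForestRegressor.fit`. Deeper observed profiles receive larger weights because they also stand in for the similar profiles that were censored before reaching that depth.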