A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus. / Rodriguez-Rodriguez, Ignacio; Rodriguez, José-Victor; Woo, Wai Lok et al.
In: Applied Sciences, Vol. 11, No. 4, 1742, 16.02.2021.

Harvard

Rodriguez-Rodriguez, I, Rodriguez, J-V, Woo, WL, Wei, B & Pardo-Quiles, D-J 2021, 'A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus', Applied Sciences, vol. 11, no. 4, 1742. https://doi.org/10.3390/app11041742

APA

Rodriguez-Rodriguez, I., Rodriguez, J-V., Woo, W. L., Wei, B., & Pardo-Quiles, D-J. (2021). A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus. Applied Sciences, 11(4), Article 1742. https://doi.org/10.3390/app11041742

Vancouver

Rodriguez-Rodriguez I, Rodriguez J-V, Woo WL, Wei B, Pardo-Quiles D-J. A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus. Applied Sciences. 2021 Feb 16;11(4):1742. doi: 10.3390/app11041742

Author

Rodriguez-Rodriguez, Ignacio ; Rodriguez, José-Victor ; Woo, Wai Lok et al. / A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus. In: Applied Sciences. 2021 ; Vol. 11, No. 4.

Bibtex

@article{d75af30763dd40d18bdd6b44a81ed5a3,
title = "A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus",
abstract = "Type 1 diabetes mellitus (DM1) is a metabolic disease derived from falls in pancreatic insulin production resulting in chronic hyperglycemia. DM1 subjects usually have to undertake a number of assessments of blood glucose levels every day, employing capillary glucometers for the monitoring of blood glucose dynamics. In recent years, advances in technology have allowed for the creation of revolutionary biosensors and continuous glucose monitoring (CGM) techniques. This has enabled the monitoring of a subject{\textquoteright}s blood glucose level in real time. On the other hand, few attempts have been made to apply machine learning techniques to predicting glycaemia levels, but dealing with a database containing such a high level of variables is problematic. In this sense, to the best of the authors{\textquoteright} knowledge, the issues of proper feature selection (FS)—the stage before applying predictive algorithms—have not been subject to in-depth discussion and comparison in past research when it comes to forecasting glycaemia. Therefore, in order to assess how a proper FS stage could improve the accuracy of the glycaemia forecasted, this work has developed six FS techniques alongside four predictive algorithms, applying them to a full dataset of biomedical features related to glycaemia. These were harvested through a wide-ranging passive monitoring process involving 25 patients with DM1 in practical real-life scenarios. From the obtained results, we affirm that Random Forest (RF) as both predictive algorithm and FS strategy offers the best average performance (Root Median Square Error, RMSE = 18.54 mg/dL) throughout the 12 considered predictive horizons (up to 60 min in steps of 5 min), showing Support Vector Machines (SVM) to have the best accuracy as a forecasting algorithm when considering, in turn, the average of the six FS techniques applied (RMSE = 20.58 mg/dL).",
keywords = "diabetes mellitus, machine learning, feature selection, time series forecasting",
author = "Ignacio Rodriguez-Rodriguez and Jos{\'e}-Victor Rodriguez and Woo, {Wai Lok} and Bo Wei and Domingo-Javier Pardo-Quiles",
year = "2021",
month = feb,
day = "16",
doi = "10.3390/app11041742",
language = "English",
volume = "11",
journal = "Applied Sciences",
issn = "2076-3417",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "4",
pages = "1742",
}

RIS

TY - JOUR

T1 - A Comparison of Feature Selection and Forecasting Machine Learning Algorithms for Predicting Glycaemia in Type 1 Diabetes Mellitus

AU - Rodriguez-Rodriguez, Ignacio

AU - Rodriguez, José-Victor

AU - Woo, Wai Lok

AU - Wei, Bo

AU - Pardo-Quiles, Domingo-Javier

PY - 2021/2/16

Y1 - 2021/2/16

N2 - Type 1 diabetes mellitus (DM1) is a metabolic disease derived from falls in pancreatic insulin production resulting in chronic hyperglycemia. DM1 subjects usually have to undertake a number of assessments of blood glucose levels every day, employing capillary glucometers for the monitoring of blood glucose dynamics. In recent years, advances in technology have allowed for the creation of revolutionary biosensors and continuous glucose monitoring (CGM) techniques. This has enabled the monitoring of a subject’s blood glucose level in real time. On the other hand, few attempts have been made to apply machine learning techniques to predicting glycaemia levels, but dealing with a database containing such a high level of variables is problematic. In this sense, to the best of the authors’ knowledge, the issues of proper feature selection (FS)—the stage before applying predictive algorithms—have not been subject to in-depth discussion and comparison in past research when it comes to forecasting glycaemia. Therefore, in order to assess how a proper FS stage could improve the accuracy of the glycaemia forecasted, this work has developed six FS techniques alongside four predictive algorithms, applying them to a full dataset of biomedical features related to glycaemia. These were harvested through a wide-ranging passive monitoring process involving 25 patients with DM1 in practical real-life scenarios. From the obtained results, we affirm that Random Forest (RF) as both predictive algorithm and FS strategy offers the best average performance (Root Median Square Error, RMSE = 18.54 mg/dL) throughout the 12 considered predictive horizons (up to 60 min in steps of 5 min), showing Support Vector Machines (SVM) to have the best accuracy as a forecasting algorithm when considering, in turn, the average of the six FS techniques applied (RMSE = 20.58 mg/dL).

KW - diabetes mellitus

KW - machine learning

KW - feature selection

KW - time series forecasting

U2 - 10.3390/app11041742

DO - 10.3390/app11041742

M3 - Journal article

VL - 11

JO - Applied Sciences

JF - Applied Sciences

SN - 2076-3417

IS - 4

M1 - 1742

ER -