Available under license: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Data-Driven Insights
T2 - Boosting Algorithms to Uncover Electricity Theft Patterns in AMI
AU - Khan, Inam Ullah
AU - Ali, Arshid
AU - Taylor, C. James
AU - Ma, Xiandong
PY - 2025
Y1 - 2025
N2 - This study introduces a sophisticated supervised machine learning method for electricity theft detection utilizing a customized Histogram Gradient Boosting (HGB) algorithm. Comprehensive preprocessing, including imputation, normalization, outlier management, and resampling, ensures the time-series data is accurately prepared for analysis. The SMOTE-ENN algorithm corrects class imbalances, preparing the data for the feature optimization stage, in which key features are selected and extracted. The HGB algorithm, enhanced through Bayesian optimization, is central to the training process, resulting in a model that precisely classifies electricity consumption patterns as genuine or fraudulent. The robustness of the model is evaluated against other recognized boosting methods, such as Adaptive Boosting (ADB), Gradient Boosting Decision Tree (GBDT), and LightGBM, alongside various ensemble and traditional machine learning models. Utilizing key performance metrics like accuracy, F1 score, and AUC for validation, the proposed model yields very promising results, with 93% accuracy, 95% F1 score, and 98% AUC, outperforming the comparison group under similar dataset and hyperparameter conditions. This underscores the model’s potential as a highly accurate tool for combating electricity theft within an advanced metering infrastructure (AMI).
KW - Electricity Theft Detection
KW - Class Balancing
KW - Feature Engineering
KW - Boosting Algorithms
KW - Advanced Metering Infrastructure
KW - Smart Grid
U2 - 10.1109/TIM.2025.3557097
DO - 10.1109/TIM.2025.3557097
M3 - Journal article
VL - 74
SP - 1
EP - 12
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
SN - 0018-9456
M1 - 2524212
ER -