Rights statement: ©2021 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Accepted author manuscript, 1.05 MB, PDF document
Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
TY - GEN
T1 - Big Data Analytics for Electricity Theft Detection in Smart Grids
AU - Khan, Inam Ullah
AU - Javaid, Nadeem
AU - Taylor, C. James
AU - Gamage, Kelum
AU - Ma, Xiandong
PY - 2021/7/29
Y1 - 2021/7/29
N2 - In Smart Grids (SG), Electricity Theft Detection (ETD) is of great importance because it makes the SG cost-efficient. Existing methods for ETD cannot efficiently handle data imbalance, missing values, variance and non-linear data problems in smart meter data. Therefore, an effective integrated strategy is required to address these underlying issues and accurately detect electricity theft using big data. In this work, a simple yet effective approach is proposed that integrates two modules, data pre-processing and classification, in a single framework. The first module involves data imputation, outlier handling, standardization and class balancing steps to generate quality data for classifier training. The second module classifies honest and dishonest users with a Support Vector Machine (SVM) classifier. To improve the classifier’s learning trend and accuracy, a Bayesian optimization algorithm is used to tune the SVM’s hyperparameters. Simulation results confirm that the proposed framework for ETD significantly outperforms previous machine learning approaches such as random forest, logistic regression and SVM in terms of accuracy.
AB - In Smart Grids (SG), Electricity Theft Detection (ETD) is of great importance because it makes the SG cost-efficient. Existing methods for ETD cannot efficiently handle data imbalance, missing values, variance and non-linear data problems in smart meter data. Therefore, an effective integrated strategy is required to address these underlying issues and accurately detect electricity theft using big data. In this work, a simple yet effective approach is proposed that integrates two modules, data pre-processing and classification, in a single framework. The first module involves data imputation, outlier handling, standardization and class balancing steps to generate quality data for classifier training. The second module classifies honest and dishonest users with a Support Vector Machine (SVM) classifier. To improve the classifier’s learning trend and accuracy, a Bayesian optimization algorithm is used to tune the SVM’s hyperparameters. Simulation results confirm that the proposed framework for ETD significantly outperforms previous machine learning approaches such as random forest, logistic regression and SVM in terms of accuracy.
KW - Big data
KW - Electricity theft detection
KW - Feature engineering
KW - Data classification
KW - Smart grid
U2 - 10.1109/PowerTech46648.2021.9495000
DO - 10.1109/PowerTech46648.2021.9495000
M3 - Conference contribution/Paper
BT - 2021 IEEE Madrid PowerTech - 14th IEEE Power and Energy Society PowerTech Conference
PB - IEEE
ER -
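
The four pre-processing steps named in the abstract (imputation, outlier handling, standardization, class balancing) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which techniques are used, so mean imputation, a three-sigma upper clip, per-customer z-scoring and random oversampling are stand-ins chosen here, and all function names are hypothetical. The SVM classifier and its Bayesian hyperparameter tuning are not shown.

```python
import random
import statistics


def impute(series):
    """Fill missing readings (None) with the mean of the observed ones.
    Mean imputation is an assumed strategy, not taken from the paper."""
    observed = [v for v in series if v is not None]
    fill = statistics.fmean(observed) if observed else 0.0
    return [fill if v is None else v for v in series]


def clip_outliers(series, k=3.0):
    """Cap readings above mean + k * std (an assumed three-sigma style rule)."""
    upper = statistics.fmean(series) + k * statistics.pstdev(series)
    return [min(v, upper) for v in series]


def standardize(series):
    """Z-score one customer's series; a constant series maps to all zeros."""
    mu = statistics.fmean(series)
    sigma = statistics.pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(v - mu) / sigma for v in series]


def oversample(rows, labels, seed=0):
    """Balance classes by randomly duplicating minority-class rows
    (random oversampling is an assumed balancing method)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def preprocess(rows, labels):
    """Run the four steps in the order the abstract lists them."""
    cleaned = [standardize(clip_outliers(impute(r))) for r in rows]
    return oversample(cleaned, labels)
```

Each row is assumed to hold one customer's consumption readings, with `None` marking a missing value, and each label marks the customer as honest or dishonest. The balanced output would then be passed to a classifier for training.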