Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data

Management Science

Electronic data

EOR16774
Rights statement: This is the author’s version of a work that was accepted for publication in European Journal of Operational Research. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in European Journal of Operational Research, 290, 1, 2020 DOI: 10.1016/j.ejor.2020.09.028
Accepted author manuscript, 1.3 MB, PDF document
Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Text available via DOI:

https://doi.org/10.1016/j.ejor.2020.09.028
Final published version

Keywords

Data mining, Feature selection, Redundancy, Complementarity, Lower bounds

Research output: Contribution to Journal/Magazine › Journal article › peer-review

E-pub ahead of print

Standard

Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data. / Zhang, Yishi; Zhu, Ruilin; Chen, Zhijun et al.
In: European Journal of Operational Research, Vol. 290, No. 1, 06.10.2020, p. 235-247.

Harvard

Zhang, Y, Zhu, R, Chen, Z, Gao, J & Xia, D 2020, 'Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data', European Journal of Operational Research, vol. 290, no. 1, pp. 235-247. https://doi.org/10.1016/j.ejor.2020.09.028

APA

Zhang, Y., Zhu, R., Chen, Z., Gao, J., & Xia, D. (2020). Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data. European Journal of Operational Research, 290(1), 235-247. Advance online publication. https://doi.org/10.1016/j.ejor.2020.09.028

Vancouver

Zhang Y, Zhu R, Chen Z, Gao J, Xia D. Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data. European Journal of Operational Research. 2020 Oct 6;290(1):235-247. Epub 2020 Oct 6. doi: 10.1016/j.ejor.2020.09.028

Author

Zhang, Yishi ; Zhu, Ruilin ; Chen, Zhijun et al. / Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data. In: European Journal of Operational Research. 2020 ; Vol. 290, No. 1. pp. 235-247.

Bibtex

@article{c9a52f7430a24a7baa62f6b0ce0c21d9,

title = "Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data",

abstract = "Feature selection is an important preprocessing and interpretable method in the fields where big data plays an essential role. In this paper, we first reformulate and analyze some representative information theoretic feature selection methods from the perspective of approximations of feature inner correlations, and indicate that many of these methods cannot guarantee any theoretical bounds of feature inner correlations. We thus introduce two lower bounds that have very simple forms for feature redundancy and complementarity, and verify that they are closer to the optima than the existing lower bounds applied by some state-of-the-art information theoretic methods. A simple and effective feature selection method based on the proposed lower bounds is then proposed and empirically verified with a wide scope of real-world datasets. The experimental results show that the proposed method achieves promising improvement on feature selection, indicating the effectiveness of the feature criterion consisting of the proposed lower bounds of redundancy and complementarity.",

keywords = "Data mining, Feature selection, Redundancy, Complementarity, Lower bounds",

author = "Yishi Zhang and Ruilin Zhu and Zhijun Chen and Jie Gao and De Xia",

note = "This is the author{\textquoteright}s version of a work that was accepted for publication in European Journal of Operational Research. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in European Journal of Operational Research, 290, 1, 2020 DOI: 10.1016/j.ejor.2020.09.028",

year = "2020",

month = oct,

day = "6",

doi = "10.1016/j.ejor.2020.09.028",

language = "English",

volume = "290",

pages = "235--247",

journal = "European Journal of Operational Research",

issn = "0377-2217",

publisher = "Elsevier Science B.V.",

number = "1",

}

RIS

TY - JOUR

T1 - Evaluating and selecting features via information theoretic lower bounds of feature inner correlations for high-dimensional data

AU - Zhang, Yishi

AU - Zhu, Ruilin

AU - Chen, Zhijun

AU - Gao, Jie

AU - Xia, De

N1 - This is the author’s version of a work that was accepted for publication in European Journal of Operational Research. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in European Journal of Operational Research, 290, 1, 2020 DOI: 10.1016/j.ejor.2020.09.028

PY - 2020/10/6

Y1 - 2020/10/6

N2 - Feature selection is an important preprocessing and interpretable method in the fields where big data plays an essential role. In this paper, we first reformulate and analyze some representative information theoretic feature selection methods from the perspective of approximations of feature inner correlations, and indicate that many of these methods cannot guarantee any theoretical bounds of feature inner correlations. We thus introduce two lower bounds that have very simple forms for feature redundancy and complementarity, and verify that they are closer to the optima than the existing lower bounds applied by some state-of-the-art information theoretic methods. A simple and effective feature selection method based on the proposed lower bounds is then proposed and empirically verified with a wide scope of real-world datasets. The experimental results show that the proposed method achieves promising improvement on feature selection, indicating the effectiveness of the feature criterion consisting of the proposed lower bounds of redundancy and complementarity.

AB - Feature selection is an important preprocessing and interpretable method in the fields where big data plays an essential role. In this paper, we first reformulate and analyze some representative information theoretic feature selection methods from the perspective of approximations of feature inner correlations, and indicate that many of these methods cannot guarantee any theoretical bounds of feature inner correlations. We thus introduce two lower bounds that have very simple forms for feature redundancy and complementarity, and verify that they are closer to the optima than the existing lower bounds applied by some state-of-the-art information theoretic methods. A simple and effective feature selection method based on the proposed lower bounds is then proposed and empirically verified with a wide scope of real-world datasets. The experimental results show that the proposed method achieves promising improvement on feature selection, indicating the effectiveness of the feature criterion consisting of the proposed lower bounds of redundancy and complementarity.

KW - Data mining

KW - Feature selection

KW - Redundancy

KW - Complementarity

KW - Lower bounds

U2 - 10.1016/j.ejor.2020.09.028

DO - 10.1016/j.ejor.2020.09.028

M3 - Journal article

VL - 290

SP - 235

EP - 247

JO - European Journal of Operational Research

JF - European Journal of Operational Research

SN - 0377-2217

IS - 1

ER -
