Rights statement: The final publication is available at Springer via http://dx.doi.org/10.1007/s12530-017-9195-7
Accepted author manuscript, 626 KB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Final published version
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - A new type of distance metric and its use for clustering
AU - Gu, Xiaowei
AU - Angelov, Plamen Parvanov
AU - Kangin, Dmitry
AU - Principe, Jose
N1 - The final publication is available at Springer via http://dx.doi.org/10.1007/s12530-017-9195-7
PY - 2017/9
Y1 - 2017/9
N2 - To address high dimensional problems, this paper introduces a new ‘direction-aware’ metric. The new distance combines two components: i) the traditional Euclidean distance and ii) an angular/directional divergence derived from the cosine similarity. It thus combines the advantages of the Euclidean metric and the cosine similarity while remaining defined over the Euclidean space domain. The direction-aware distance has a wide range of applicability and can be used as an alternative distance measure in various traditional clustering approaches to enhance their ability to handle high dimensional problems. The paper also proposes a new evolving clustering algorithm based on this distance. Numerical examples on benchmark datasets show that the direction-aware distance can effectively improve the clustering quality of the k-means algorithm on high dimensional problems, and that the proposed evolving clustering algorithm is an effective tool for processing high dimensional data streams.
AB - To address high dimensional problems, this paper introduces a new ‘direction-aware’ metric. The new distance combines two components: i) the traditional Euclidean distance and ii) an angular/directional divergence derived from the cosine similarity. It thus combines the advantages of the Euclidean metric and the cosine similarity while remaining defined over the Euclidean space domain. The direction-aware distance has a wide range of applicability and can be used as an alternative distance measure in various traditional clustering approaches to enhance their ability to handle high dimensional problems. The paper also proposes a new evolving clustering algorithm based on this distance. Numerical examples on benchmark datasets show that the direction-aware distance can effectively improve the clustering quality of the k-means algorithm on high dimensional problems, and that the proposed evolving clustering algorithm is an effective tool for processing high dimensional data streams.
KW - cosine similarity
KW - distance metric
KW - metric space
KW - clustering
KW - high dimensional data streams processing
U2 - 10.1007/s12530-017-9195-7
DO - 10.1007/s12530-017-9195-7
M3 - Journal article
VL - 8
SP - 167
EP - 177
JO - Evolving Systems
JF - Evolving Systems
SN - 1868-6478
IS - 3
ER -
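
The abstract in the record above describes the direction-aware distance only in words: a Euclidean component combined with an angular component derived from the cosine similarity. The following is a minimal sketch of one way such a combination could look. It assumes the two components are merged as a weighted root-sum-of-squares and that the angular term is sqrt(1 - cosine similarity); neither detail is stated in this record, so the paper itself should be consulted for the exact definition. The function name and the weights `lam_euclid` and `lam_angle` are hypothetical.

```python
import math


def direction_aware_distance(x, y, lam_euclid=1.0, lam_angle=1.0):
    """Illustrative distance mixing a Euclidean and an angular component.

    ASSUMPTION: the weighted root-sum-of-squares combination and the
    sqrt(1 - cos) angular term are illustrative choices, not taken from
    the citation record.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    # Euclidean (magnitude) component.
    euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

    # Cosine similarity between the two vectors.
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    dot = sum(a * b for a, b in zip(x, y))
    cos_sim = dot / (norm_x * norm_y)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_sim = max(-1.0, min(1.0, cos_sim))

    # Angular (direction) component: 0 for parallel vectors.
    angular = math.sqrt(1.0 - cos_sim)

    # Weighted combination of both components.
    return math.sqrt((lam_euclid * euclid) ** 2 + (lam_angle * angular) ** 2)
```

Under these assumptions, two vectors pointing the same way differ only by their Euclidean distance, while vectors of similar length but different direction are pushed further apart by the angular term. A function of this shape could be passed to any clustering routine that accepts a custom distance.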