Incremental estimation of low-density separating hyperplanes for clustering large data sets

School of Mathematical Sciences

Text available via DOI:

https://doi.org/10.1016/j.patcog.2023.109471
Final published version

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Incremental estimation of low-density separating hyperplanes for clustering large data sets. / Hofmeyr, David P.
In: Pattern Recognition, Vol. 139, 109471, 31.07.2023.

BibTeX

@article{237ebb77e20b4f2fae943d075dc70998,
  title = "Incremental estimation of low-density separating hyperplanes for clustering large data sets",
  abstract = "An efficient unsupervised method for obtaining low-density hyperplane separators is proposed. The method is based on a modified stochastic gradient descent applied on a convolution of the empirical distribution function with a smoothing kernel. Low-density hyperplanes are motivated by the fact that they avoid intersecting high density regions, and so tend to pass between high density clusters, thus separating them from one another, while keeping the individual clusters intact. Multiple hyperplanes can be combined in a hierarchical model to obtain a complete clustering solution. A simple post-processing of solutions induced by large collections of hyperplanes yields an efficient and accurate clustering method, capable of automatically selecting the number of clusters. Experiments show that the proposed method is highly competitive in terms of both speed and accuracy when compared with relevant benchmarks. Code is available in the form of an R package at https://github.com/DavidHofmeyr/iMDH",
  author = "Hofmeyr, {David P.}",
  year = "2023",
  month = jul,
  day = "31",
  doi = "10.1016/j.patcog.2023.109471",
  language = "English",
  volume = "139",
  journal = "Pattern Recognition",
  issn = "0031-3203",
  publisher = "Elsevier Ltd",
}

RIS

TY  - JOUR
T1  - Incremental estimation of low-density separating hyperplanes for clustering large data sets
AU  - Hofmeyr, David P.
PY  - 2023/7/31
Y1  - 2023/7/31
AB  - An efficient unsupervised method for obtaining low-density hyperplane separators is proposed. The method is based on a modified stochastic gradient descent applied on a convolution of the empirical distribution function with a smoothing kernel. Low-density hyperplanes are motivated by the fact that they avoid intersecting high density regions, and so tend to pass between high density clusters, thus separating them from one another, while keeping the individual clusters intact. Multiple hyperplanes can be combined in a hierarchical model to obtain a complete clustering solution. A simple post-processing of solutions induced by large collections of hyperplanes yields an efficient and accurate clustering method, capable of automatically selecting the number of clusters. Experiments show that the proposed method is highly competitive in terms of both speed and accuracy when compared with relevant benchmarks. Code is available in the form of an R package at https://github.com/DavidHofmeyr/iMDH
U2  - 10.1016/j.patcog.2023.109471
DO  - 10.1016/j.patcog.2023.109471
M3  - Journal article
VL  - 139
JO  - Pattern Recognition
JF  - Pattern Recognition
SN  - 0031-3203
M1  - 109471
ER  - 
