Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Clustering by Minimum Cut Hyperplanes
AU - Hofmeyr, David
PY - 2017/8/31
Y1 - 2017/8/31
N2 - Minimum normalised graph cuts are highly effective ways of partitioning unlabeled data, having been made popular by the success of spectral clustering. This work presents a novel method for learning hyperplane separators which minimise this graph cut objective, when data are embedded in Euclidean space. The optimisation problem associated with the proposed method can be formulated as a sequence of univariate subproblems, in which the optimal hyperplane orthogonal to a given vector is determined. These subproblems can be solved in log-linear time, by exploiting the trivial factorisation of the exponential function. Experimentation suggests that the empirical runtime of the overall algorithm is also log-linear in the number of data. Asymptotic properties of the minimum cut hyperplane, both for a finite sample, and for an increasing sample assumed to arise from an underlying probability distribution are discussed. In the finite sample case the minimum cut hyperplane converges to the maximum margin hyperplane as the scaling parameter is reduced to zero. Applying the proposed methodology, both for fixed scaling, and the large margin asymptotes, is shown to produce high quality clustering models in comparison with state-of-the-art clustering algorithms in experiments using a large collection of benchmark datasets.
AB - Minimum normalised graph cuts are highly effective ways of partitioning unlabeled data, having been made popular by the success of spectral clustering. This work presents a novel method for learning hyperplane separators which minimise this graph cut objective, when data are embedded in Euclidean space. The optimisation problem associated with the proposed method can be formulated as a sequence of univariate subproblems, in which the optimal hyperplane orthogonal to a given vector is determined. These subproblems can be solved in log-linear time, by exploiting the trivial factorisation of the exponential function. Experimentation suggests that the empirical runtime of the overall algorithm is also log-linear in the number of data. Asymptotic properties of the minimum cut hyperplane, both for a finite sample, and for an increasing sample assumed to arise from an underlying probability distribution are discussed. In the finite sample case the minimum cut hyperplane converges to the maximum margin hyperplane as the scaling parameter is reduced to zero. Applying the proposed methodology, both for fixed scaling, and the large margin asymptotes, is shown to produce high quality clustering models in comparison with state-of-the-art clustering algorithms in experiments using a large collection of benchmark datasets.
U2 - 10.1109/TPAMI.2016.2609929
DO - 10.1109/TPAMI.2016.2609929
M3 - Journal article
VL - 39
SP - 1547
EP - 1560
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
SN - 0162-8828
IS - 8
ER -
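
Note (not part of the exported record): the abstract's central computational claim, that the optimal hyperplane orthogonal to a fixed vector can be found in log-linear time by exploiting the factorisation of the exponential function, can be illustrated with a short sketch. The Python below is not taken from the paper; it assumes an exponential (Laplace-type) similarity w_ij = exp(-|p_i - p_j| / sigma) on the projected points, and the function name, the centring step, and the self-similarity w_ii = 1 are illustrative choices rather than details of Hofmeyr's algorithm. It evaluates the normalised cut value of every hyperplane orthogonal to a given direction v using prefix and suffix sums, so the sort dominates and the sweep costs O(n log n).

import numpy as np

def ncut_along_direction(X, v, sigma=1.0):
    """Normalised cut values of all hyperplane splits orthogonal to v.

    Assumes the Laplace-type similarity w_ij = exp(-|p_i - p_j| / sigma)
    on the projections p_i = v.x_i / ||v||. The factorisation
    exp(-(p_j - p_i)/sigma) = exp(p_i/sigma) * exp(-p_j/sigma)  (for p_i <= p_j)
    lets every split point be scored from prefix/suffix sums.
    """
    p = np.sort(X @ v) / np.linalg.norm(v)
    p = p - p.mean()                      # centre projections to keep the exponentials stable
    a = np.exp(p / sigma)                 # "left" factors exp(p_i / sigma)
    b = np.exp(-p / sigma)                # "right" factors exp(-p_j / sigma)

    A = np.cumsum(a)                      # A[k] = sum_{i <= k} a[i]
    B = np.cumsum(b[::-1])[::-1]          # B[k] = sum_{j >= k} b[j]
    B_after = np.append(B[1:], 0.0)       # sum_{j > i} b[j]

    deg = b * A + a * B_after             # degrees d_i = sum_j w_ij (includes w_ii = 1)
    vol = np.cumsum(deg)                  # vol[k] = volume of the points left of split k

    cut = A[:-1] * B[1:]                  # Cut(k) = (sum_{i <= k} a[i]) * (sum_{j > k} b[j])
    ncut = cut / vol[:-1] + cut / (vol[-1] - vol[:-1])
    return p, ncut

# Example usage: score every split of random data along a random direction
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
v = rng.normal(size=5)
p, ncut = ncut_along_direction(X, v, sigma=0.5)
k = int(np.argmin(ncut))
threshold = 0.5 * (p[k] + p[k + 1])       # best split of the (centred) projections along v

The paper's method additionally optimises over the direction v itself; the sketch only covers the univariate subproblem described in the abstract, in which v is held fixed.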