
Electronic data

  • SCPP: accepted author manuscript, 2.29 MB, PDF document. Available under license: CC BY-NC (Creative Commons Attribution-NonCommercial 4.0 International).


Minimum spectral connectivity projection pursuit: Divisive clustering using optimal projections for spectral clustering

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Minimum spectral connectivity projection pursuit: Divisive clustering using optimal projections for spectral clustering. / Hofmeyr, David; Pavlidis, Nicos Georgios; Eckley, Idris Arthur.
In: Statistics and Computing, Vol. 29, No. 2, 01.03.2019, p. 391–414.


Vancouver

Hofmeyr D, Pavlidis NG, Eckley IA. Minimum spectral connectivity projection pursuit: Divisive clustering using optimal projections for spectral clustering. Statistics and Computing. 2019 Mar 1;29(2):391–414. Epub 2018 May 4. doi: 10.1007/s11222-018-9814-6

Bibtex

@article{aa1c851cd5b14bd191d7afc74cbe8c67,
title = "Minimum spectral connectivity projection pursuit: Divisive clustering using optimal projections for spectral clustering",
abstract = "We study the problem of determining the optimal low-dimensional projection for maximising the separability of a binary partition of an unlabelled dataset, as measured by spectral graph theory. This is achieved by finding projections which minimise the second eigenvalue of the graph Laplacian of the projected data, which corresponds to a non-convex, non-smooth optimisation problem. We show that the optimal univariate projection based on spectral connectivity converges to the vector normal to the maximum margin hyperplane through the data, as the scaling parameter is reduced to zero. This establishes a connection between connectivity as measured by spectral graph theory and maximal Euclidean separation. The computational cost associated with each eigen problem is quadratic in the number of data. To mitigate this issue, we propose an approximation method using microclusters with provable approximation error bounds. Combining multiple binary partitions within a divisive hierarchical model allows us to construct clustering solutions admitting clusters with varying scales and lying within different subspaces. We evaluate the performance of the proposed method on a large collection of benchmark datasets and find that it compares favourably with existing methods for projection pursuit and dimension reduction for data clustering. Applying the proposed approach for a decreasing sequence of scaling parameters allows us to obtain large margin clustering solutions, which are found to be competitive with those from dedicated maximum margin clustering algorithms.",
keywords = "Spectral Clustering, dimension reduction, projection pursuit, maximum margin clustering",
author = "David Hofmeyr and Pavlidis, {Nicos Georgios} and Eckley, {Idris Arthur}",
year = "2019",
month = mar,
day = "1",
doi = "10.1007/s11222-018-9814-6",
language = "English",
volume = "29",
pages = "391--414",
journal = "Statistics and Computing",
issn = "0960-3174",
publisher = "Springer Netherlands",
number = "2",
}
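The objective described in the abstract — minimise the second eigenvalue of the graph Laplacian of the projected data — can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian similarity kernel, the scaling parameter `sigma`, and the toy data below are assumptions made for illustration only.

```python
import numpy as np

def spectral_connectivity(X, v, sigma=1.0):
    """Second-smallest eigenvalue of the graph Laplacian of X projected onto v."""
    v = v / np.linalg.norm(v)            # unit-norm projection direction
    p = X @ v                            # univariate projected data
    d = p[:, None] - p[None, :]          # pairwise differences
    W = np.exp(-(d / sigma) ** 2)        # Gaussian similarity graph
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W       # unnormalised graph Laplacian
    return np.linalg.eigvalsh(L)[1]      # eigvalsh returns ascending eigenvalues

# Two clusters separated only along the x-axis: the separating direction
# gives a much smaller lambda_2 than an uninformative one.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, (40, 2))
X[:20, 0] -= 5.0                         # cluster 1 centred at x = -5
X[20:, 0] += 5.0                         # cluster 2 centred at x = +5
print(spectral_connectivity(X, np.array([1.0, 0.0])))  # small: clusters split
print(spectral_connectivity(X, np.array([0.0, 1.0])))  # larger: clusters mixed
```

The paper's projection pursuit minimises this quantity over the direction `v`; the abstract's convergence result concerns the behaviour of the minimiser as `sigma` is driven to zero.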

RIS

TY - JOUR

T1 - Minimum spectral connectivity projection pursuit

T2 - Divisive clustering using optimal projections for spectral clustering

AU - Hofmeyr, David

AU - Pavlidis, Nicos Georgios

AU - Eckley, Idris Arthur

PY - 2019/3/1

Y1 - 2019/3/1

N2 - We study the problem of determining the optimal low-dimensional projection for maximising the separability of a binary partition of an unlabelled dataset, as measured by spectral graph theory. This is achieved by finding projections which minimise the second eigenvalue of the graph Laplacian of the projected data, which corresponds to a non-convex, non-smooth optimisation problem. We show that the optimal univariate projection based on spectral connectivity converges to the vector normal to the maximum margin hyperplane through the data, as the scaling parameter is reduced to zero. This establishes a connection between connectivity as measured by spectral graph theory and maximal Euclidean separation. The computational cost associated with each eigen problem is quadratic in the number of data. To mitigate this issue, we propose an approximation method using microclusters with provable approximation error bounds. Combining multiple binary partitions within a divisive hierarchical model allows us to construct clustering solutions admitting clusters with varying scales and lying within different subspaces. We evaluate the performance of the proposed method on a large collection of benchmark datasets and find that it compares favourably with existing methods for projection pursuit and dimension reduction for data clustering. Applying the proposed approach for a decreasing sequence of scaling parameters allows us to obtain large margin clustering solutions, which are found to be competitive with those from dedicated maximum margin clustering algorithms.

AB - We study the problem of determining the optimal low-dimensional projection for maximising the separability of a binary partition of an unlabelled dataset, as measured by spectral graph theory. This is achieved by finding projections which minimise the second eigenvalue of the graph Laplacian of the projected data, which corresponds to a non-convex, non-smooth optimisation problem. We show that the optimal univariate projection based on spectral connectivity converges to the vector normal to the maximum margin hyperplane through the data, as the scaling parameter is reduced to zero. This establishes a connection between connectivity as measured by spectral graph theory and maximal Euclidean separation. The computational cost associated with each eigen problem is quadratic in the number of data. To mitigate this issue, we propose an approximation method using microclusters with provable approximation error bounds. Combining multiple binary partitions within a divisive hierarchical model allows us to construct clustering solutions admitting clusters with varying scales and lying within different subspaces. We evaluate the performance of the proposed method on a large collection of benchmark datasets and find that it compares favourably with existing methods for projection pursuit and dimension reduction for data clustering. Applying the proposed approach for a decreasing sequence of scaling parameters allows us to obtain large margin clustering solutions, which are found to be competitive with those from dedicated maximum margin clustering algorithms.

KW - Spectral Clustering

KW - dimension reduction

KW - projection pursuit

KW - maximum margin clustering

U2 - 10.1007/s11222-018-9814-6

DO - 10.1007/s11222-018-9814-6

M3 - Journal article

VL - 29

SP - 391

EP - 414

JO - Statistics and Computing

JF - Statistics and Computing

SN - 0960-3174

IS - 2

ER -