A fully autonomous data density based clustering algorithm

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN > Conference contribution/Paper > peer-review

Published
Publication date: 9/12/2014
Host publication: Evolving and Autonomous Learning Systems (EALS), 2014 IEEE Symposium on
Place of Publication: Piscataway, N.J.
Publisher: IEEE
Pages: 116-123
Number of pages: 8
ISBN (print): 9781479944958
Original language: English
Event: IEEE - Florida, Orlando, United States
Duration: 9/12/2014 to 12/12/2014

Conference

Conference: IEEE
Country/Territory: United States
City: Orlando
Period: 9/12/14 to 12/12/14

Abstract

Data Density based Clustering (DDC) is a recently introduced approach to clustering that automatically determines the number of clusters. By using Recursive Density Estimation for each point, it significantly reduces the number of calculations in offline mode and is also suitable for online use. However, DDC requires the user to enter an initial cluster radius.
Using a different radius per feature (dimension) creates axis-orthogonal hyper-ellipsoid clusters, which gives greater differentiation between clusters when the clusters are highly asymmetrical. In this paper we update the DDC method to derive suitable initial radii automatically. The selection is data driven and requires no user input.
We compare the performance of the updated method, DDCAR, with DDC and other standard clustering techniques across a selection of standard datasets and test datasets designed to probe the abilities of the technique. By automatically estimating the initial radii, we show that data can be clustered effectively with no user input. The results demonstrate the validity of the proposed approach as an autonomous, data-driven clustering technique. We also demonstrate the speed and accuracy of the method on large datasets.
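
The abstract credits Recursive Density Estimation with reducing the number of calculations. The sketch below illustrates the general idea only. It assumes the Cauchy-type density commonly associated with recursive density estimation, in which a point's density is the inverse of one plus its mean squared distance to all points seen so far. The abstract does not state the formula, so this form and the function name `recursive_density` are assumptions, not the paper's confirmed formulation.

```python
def recursive_density(points):
    """Density of each point relative to all points seen up to and including it.

    Assumed form: 1 / (1 + mean squared distance to the points seen so far),
    computed from two running statistics instead of a pairwise loop.
    """
    dim = len(points[0])
    mean = [0.0] * dim      # running mean of the samples
    mean_sq_norm = 0.0      # running mean of the squared norms
    densities = []
    for k, x in enumerate(points, start=1):
        # Update both running statistics in O(dim) per sample.
        mean = [m + (xi - m) / k for m, xi in zip(mean, x)]
        sq_norm = sum(xi * xi for xi in x)
        mean_sq_norm += (sq_norm - mean_sq_norm) / k
        # Mean squared distance from x to all samples seen so far equals
        # ||x - mean||^2 + mean_sq_norm - ||mean||^2.
        dist_to_mean = sum((xi - m) ** 2 for xi, m in zip(x, mean))
        spread = mean_sq_norm - sum(m * m for m in mean)
        densities.append(1.0 / (1.0 + dist_to_mean + spread))
    return densities
```

Each new sample updates only the running mean and the running mean of squared norms, so the cost per point does not grow with the number of points already processed. This is consistent with the abstract's claim that the approach suits online use.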