Discriminant analysis with singular covariance matrices. A method incorporating cross-validation and efficient randomized permutation tests

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published
Journal publication date: 1996
Journal: Journal of Chemometrics
Issue number: 3
Volume: 10
Number of pages: 25
Pages (from-to): 189-213
Publication Status: Published
Original language: English

Abstract

A computationally efficient approach has been developed to perform two-group linear discriminant analysis using high-dimensional data. The analysis is based on Fisher's method and incorporates two important validation stages: (1) full leave-one-observation-out cross-validation and (2) randomized permutation distribution testing. The resulting algorithm and software are known as CREDIT (cross-validated random-permutation-tested efficient discrimination based on an adjusted generalized inverse for the sample total covariance matrix). The algorithm has been implemented in the SAS/IML matrix programming language and provides dramatic improvements in computational efficiency compared with existing software for discriminant analysis incorporating validation stages 1 and 2 above. Application of CREDIT to nine multivariate data sets indicates that the predictive performance of the approach, assessed using cross-validation, is comparable with that of other methods for discriminant analysis. Comparisons with two specific methods are included. Randomized permutation tests show that success rates using the true response classes are almost always better than success rates using random permutations of the classes. This gives confidence that a useful linear discriminant relationship is present in the data being analysed. For a randomly selected training set (used to construct the discriminant rule), the success rates for CREDIT are unbiased predictive success rates for allocating other observations to groups. Predicting group memberships for future observations using any discriminant model based on singular estimates of covariance matrices must be performed with great care. A discussion of methods to test the concordance of future observations with the training set is given.
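To make the two validation stages concrete, the following is a minimal Python sketch, not the CREDIT algorithm itself. It makes several assumptions that do not come from the abstract. The Moore-Penrose pseudo-inverse (`numpy.linalg.pinv`) stands in for the paper's adjusted generalized inverse. The allocation cutoff is the midpoint between the projected group means. Cross-validation and the permutation test are done by brute-force refitting, with none of the efficiency gains the paper reports. All function names, and the synthetic-data helper `simulate_two_groups`, are hypothetical.

```python
import numpy as np


def fisher_direction(X, y):
    """Fisher-type direction w = pinv(T) @ (mean1 - mean0) plus a midpoint cutoff.

    T is the sample total covariance matrix, which is singular when the number
    of variables is at least the number of observations. The pseudo-inverse is
    an assumption standing in for the paper's adjusted generalized inverse.
    """
    m0 = X[y == 0].mean(axis=0)
    m1 = X[y == 1].mean(axis=0)
    Xc = X - X.mean(axis=0)
    T = Xc.T @ Xc / (len(X) - 1)
    w = np.linalg.pinv(T) @ (m1 - m0)
    cutoff = w @ (m0 + m1) / 2.0  # midpoint rule (assumed, equal priors)
    return w, cutoff


def loo_success_rate(X, y):
    """Stage 1: leave-one-observation-out cross-validated success rate."""
    n = len(y)
    hits = 0
    for i in range(n):
        keep = np.arange(n) != i  # refit without observation i
        w, cutoff = fisher_direction(X[keep], y[keep])
        predicted = int(X[i] @ w > cutoff)
        hits += int(predicted == y[i])
    return hits / n


def permutation_test(X, y, n_perm=99, seed=0):
    """Stage 2: compare the true-label success rate with permuted-label rates."""
    rng = np.random.default_rng(seed)
    observed = loo_success_rate(X, y)
    permuted = np.array(
        [loo_success_rate(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    # Proportion of labellings (including the observed one) doing at least as well.
    p_value = (1 + np.sum(permuted >= observed)) / (n_perm + 1)
    return observed, p_value


def simulate_two_groups(n_per_group=10, p=40, shift=3.0, seed=1):
    """Hypothetical synthetic data with more variables than observations."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(2 * n_per_group, p))
    y = np.repeat([0, 1], n_per_group)
    X[y == 1] += shift  # move group 1 along every variable
    return X, y
```

Calling `permutation_test(*simulate_two_groups())` returns the cross-validated success rate for the true classes and a permutation p-value. With more variables than observations, a rule of this kind will usually separate the training data perfectly, so the resubstitution success rate says little, which is why both validation stages refit the rule for every held-out observation.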