Rights statement: The final publication is available at Springer via http://dx.doi.org/10.1007/s40300-015-0068-1
Accepted author manuscript, 1.9 MB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - A new Bayesian approach for determining the number of components in a finite mixture
AU - Aitkin, Murray
AU - Vu, Duy
AU - Francis, Brian
N1 - (included in attached document) The final publication is available at Springer via http://dx.doi.org/10.1007/s40300-015-0068-1
PY - 2015/8/21
Y1 - 2015/8/21
N2 - This article evaluates a new Bayesian approach to determining the number of components in a finite mixture. We evaluate through simulation studies mixtures of normals and latent class mixtures of Bernoulli responses. For normal mixtures we use a “gold standard” set of population models based on a well-known “testbed” data set – the galaxy recession velocity data set of Roeder (1990). For Bernoulli latent class mixtures we consider models for psychiatric diagnosis (Berkhof, van Mechelen and Gelman 2003). The new approach is based on comparing models with different numbers of components through their posterior deviance distributions, based on non-informative or diffuse priors. Simulations show that even large numbers of closely spaced normal components can be identified with sufficiently large samples, while for latent classes with Bernoulli responses identification is more complex, though it again improves with increasing sample size.
AB - This article evaluates a new Bayesian approach to determining the number of components in a finite mixture. We evaluate through simulation studies mixtures of normals and latent class mixtures of Bernoulli responses. For normal mixtures we use a “gold standard” set of population models based on a well-known “testbed” data set – the galaxy recession velocity data set of Roeder (1990). For Bernoulli latent class mixtures we consider models for psychiatric diagnosis (Berkhof, van Mechelen and Gelman 2003). The new approach is based on comparing models with different numbers of components through their posterior deviance distributions, based on non-informative or diffuse priors. Simulations show that even large numbers of closely spaced normal components can be identified with sufficiently large samples, while for latent classes with Bernoulli responses identification is more complex, though it again improves with increasing sample size.
KW - finite mixture models
KW - number of groups
KW - Bayesian
KW - latent class analysis
KW - Number of components
KW - Deviance distribution
U2 - 10.1007/s40300-015-0068-1
DO - 10.1007/s40300-015-0068-1
M3 - Journal article
VL - 73
SP - 155
EP - 175
JO - Metron
JF - Metron
SN - 0026-1424
IS - 2
ER -