Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions

Hemant Ishwaran, Lancelot F. James, Jiayang Sun

Research output: Contribution to journal › Article

53 Citations (Scopus)

Abstract

We consider the problem of estimating the number of components d and the unknown mixing distribution in a finite mixture model, in which d is bounded by some fixed finite number N. Our approach relies on the use of a prior over the space of mixing distributions with at most N components. By decomposing the resulting marginal density under this prior, we discover a weighted Bayes factor method for consistently estimating d that can be implemented by an i.i.d. generalized weighted Chinese restaurant (GWCR) Monte Carlo algorithm. We also discuss a Gibbs sampling method (the blocked Gibbs sampler) for estimating both d and the mixing distribution. We show that our resulting posterior is consistent and achieves the frequentist optimal O_p(n^{-1/4}) rate of estimation. We compare the performance of the new GWCR model selection procedure with that of the Akaike information criterion and the Bayes information criterion implemented through an EM algorithm. Applications of our methods to five real datasets and simulations are considered.
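To fix ideas, here is a hedged sketch of the setup the abstract describes; the notation is ours and need not match the paper's. With a mixture kernel K(x, θ), a mixing distribution P with at most N support points yields the density

    f(x \mid P) = \int K(x, u)\, P(du) = \sum_{k=1}^{d} w_k\, K(x, \theta_k), \qquad d \le N,

and placing a prior \Pi over such P gives a marginal density that can be decomposed by the number of distinct components,

    m(x_1, \dots, x_n) = \int \prod_{i=1}^{n} f(x_i \mid P)\, \Pi(dP) = \sum_{d=1}^{N} \lambda_d\, m_d(x_1, \dots, x_n),

so that ratios of the weighted terms \lambda_d m_d act as the weighted Bayes factors used to select d.

The blocked Gibbs sampler mentioned in the abstract alternates between updating whole blocks of labels, weights, and component parameters. The Python sketch below is a minimal illustration under assumptions of our own choosing (univariate Gaussian kernel with known unit variance, a symmetric Dirichlet(alpha/N, ..., alpha/N) prior on the weights, conjugate Normal priors on the means); it is not the authors' implementation, and using the posterior count of occupied components to estimate d is a crude stand-in for the paper's weighted Bayes factor method.

    import numpy as np

    def blocked_gibbs(x, N=5, alpha=1.0, mu0=0.0, tau2=10.0, iters=2000, seed=0):
        """Blocked Gibbs sweeps for a Gaussian mixture with at most N components."""
        rng = np.random.default_rng(seed)
        n = len(x)
        mu = rng.normal(mu0, np.sqrt(tau2), size=N)  # component means, drawn from the prior
        w = np.full(N, 1.0 / N)                      # mixing weights
        occupied = []
        for _ in range(iters):
            # Block 1: sample all n labels at once given (w, mu); kernel is N(mu_k, 1).
            logp = np.log(w) - 0.5 * (x[:, None] - mu[None, :]) ** 2
            p = np.exp(logp - logp.max(axis=1, keepdims=True))
            p /= p.sum(axis=1, keepdims=True)
            z = (p.cumsum(axis=1) > rng.random((n, 1))).argmax(axis=1)
            counts = np.bincount(z, minlength=N)
            # Block 2: sample the weight vector from its conjugate Dirichlet posterior.
            w = rng.dirichlet(alpha / N + counts)
            # Block 3: sample each mean from its conjugate Normal posterior
            # (empty components draw from the prior).
            for k in range(N):
                prec = 1.0 / tau2 + counts[k]
                mean = (mu0 / tau2 + x[z == k].sum()) / prec
                mu[k] = rng.normal(mean, np.sqrt(1.0 / prec))
            occupied.append(int((counts > 0).sum()))
        return np.array(occupied)

    # Example: data from a two-component mixture; the posterior mode of the number
    # of occupied components serves as a rough estimate of d.
    rng = np.random.default_rng(42)
    x = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(2.0, 1.0, 150)])
    occ = blocked_gibbs(x, N=5)
    print("estimated d:", np.bincount(occ[500:]).argmax())

The design point of blocking is that the labels, weights, and means are each drawn jointly as a block rather than coordinate-by-coordinate, which typically mixes better than a one-at-a-time Gibbs scan.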

Original language: English
Pages (from-to): 1316-1332
Number of pages: 17
Journal: Journal of the American Statistical Association
Volume: 96
Issue number: 456
State: Published - Dec 1, 2001
Externally published: Yes

Keywords

  • Blocked Gibbs sampler
  • Dirichlet prior
  • Generalized weighted Chinese restaurant
  • Identification
  • Partition
  • Uniformly exponentially consistent test
  • Weighted Bayes factor

ASJC Scopus subject areas

  • Mathematics(all)
  • Statistics and Probability

Cite this

Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions. / Ishwaran, Hemant; James, Lancelot F.; Sun, Jiayang.

In: Journal of the American Statistical Association, Vol. 96, No. 456, 01.12.2001, p. 1316-1332.

@article{59281203a5f64432bdd1a10277984ada,
  title = "Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions",
  abstract = "We consider the problem of estimating the number of components d and the unknown mixing distribution in a finite mixture model, in which d is bounded by some fixed finite number N. Our approach relies on the use of a prior over the space of mixing distributions with at most N components. By decomposing the resulting marginal density under this prior, we discover a weighted Bayes factor method for consistently estimating d that can be implemented by an i.i.d. generalized weighted Chinese restaurant (GWCR) Monte Carlo algorithm. We also discuss a Gibbs sampling method (the blocked Gibbs sampler) for estimating both d and the mixing distribution. We show that our resulting posterior is consistent and achieves the frequentist optimal O_p(n^{-1/4}) rate of estimation. We compare the performance of the new GWCR model selection procedure with that of the Akaike information criterion and the Bayes information criterion implemented through an EM algorithm. Applications of our methods to five real datasets and simulations are considered.",
  keywords = "Blocked Gibbs sampler, Dirichlet prior, Generalized weighted Chinese restaurant, Identification, Partition, Uniformly exponentially consistent test, Weighted Bayes factor",
  author = "Ishwaran, Hemant and James, {Lancelot F.} and Sun, Jiayang",
  year = "2001",
  month = "12",
  day = "1",
  language = "English",
  volume = "96",
  pages = "1316--1332",
  journal = "Journal of the American Statistical Association",
  issn = "0162-1459",
  publisher = "Taylor and Francis Ltd.",
  number = "456",
}
