### Abstract

We consider the problem of estimating the number of components d and the unknown mixing distribution in a finite mixture model, in which d is bounded by some fixed finite number N. Our approach relies on the use of a prior over the space of mixing distributions with at most N components. By decomposing the resulting marginal density under this prior, we discover a weighted Bayes factor method for consistently estimating d that can be implemented by an iid generalized weighted Chinese restaurant (GWCR) Monte Carlo algorithm. We also discuss a Gibbs sampling method (the blocked Gibbs sampler) for estimating d and also the mixing distribution. We show that our resulting posterior is consistent and achieves the frequentist optimal $O_p(n^{-1/4})$ rate of estimation. We compare the performance of the new GWCR model selection procedure with that of the Akaike information criterion and the Bayes information criterion implemented through an EM algorithm. Applications of our methods to five real datasets and simulations are considered.
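The abstract names AIC and BIC, computed from EM fits, as the baselines against which the GWCR procedure is compared. As a rough illustration of that baseline only (it is not the GWCR or blocked Gibbs method of the paper), the sketch below assumes a one-dimensional Gaussian mixture, a component family the abstract does not specify. It fits d = 1, ..., N components by EM and picks the d that minimizes the chosen criterion. All function names, the quantile-based initialization, and the standard-deviation floor are hypothetical choices made for this example.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weights, mus, sigmas):
    """Log-likelihood of the data under a 1-D Gaussian mixture."""
    total = 0.0
    for x in data:
        dens = sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))
        total += math.log(max(dens, 1e-300))  # guard against underflow to zero
    return total


def em_fit(data, d, iters=100):
    """Run EM for a d-component mixture and return the final log-likelihood."""
    n = len(data)
    xs = sorted(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n) or 1.0
    weights = [1.0 / d] * d
    # Assumed initialization: means at evenly spaced sample quantiles.
    mus = [xs[int((j + 0.5) * n / d)] for j in range(d)]
    sigmas = [sd / d] * d
    floor = 0.1 * sd  # illustrative floor against collapsing components
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            norm = max(sum(p), 1e-300)
            resp.append([pj / norm for pj in p])
        # M-step: weighted updates of weights, means, and standard deviations.
        for j in range(d):
            nj = max(sum(r[j] for r in resp), 1e-300)
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), floor)
    return mixture_loglik(data, weights, mus, sigmas)


def n_params(d):
    """Free parameters: (d - 1) weights + d means + d standard deviations."""
    return 3 * d - 1


def aic(loglik, d):
    return -2.0 * loglik + 2.0 * n_params(d)


def bic(loglik, d, n):
    return -2.0 * loglik + n_params(d) * math.log(n)


def select_d(data, max_d, criterion="bic"):
    """Return (best d, scores by d); lower criterion values are better."""
    scores = {}
    for d in range(1, max_d + 1):
        ll = em_fit(data, d)
        scores[d] = bic(ll, d, len(data)) if criterion == "bic" else aic(ll, d)
    best = min(scores, key=scores.get)
    return best, scores


# Simulated data from two well-separated components (illustrative only).
gen = random.Random(1)
data = [gen.gauss(-5.0, 1.0) for _ in range(100)]
data += [gen.gauss(5.0, 1.0) for _ in range(100)]
best_d, scores = select_d(data, max_d=4)
print("selected d:", best_d)
```

BIC charges log(n) per free parameter while AIC charges a fixed 2, so for samples larger than about 7 observations BIC penalizes extra components more heavily than AIC does.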

Original language | English |
---|---|
Pages (from-to) | 1316-1332 |
Number of pages | 17 |
Journal | Journal of the American Statistical Association |
Volume | 96 |
Issue number | 456 |
State | Published - Dec 1 2001 |
Externally published | Yes |

### Keywords

- Blocked Gibbs sampler
- Dirichlet prior
- Generalized weighted Chinese restaurant
- Identification
- Partition
- Uniformly exponentially consistent test
- Weighted Bayes factor

### ASJC Scopus subject areas

- Mathematics(all)
- Statistics and Probability

### Cite this

Ishwaran, H., James, L. F., & Sun, J. (2001). Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions. *Journal of the American Statistical Association*, *96*(456), 1316-1332.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions

AU - Ishwaran, Hemant

AU - James, Lancelot F.

AU - Sun, Jiayang

PY - 2001/12/1

Y1 - 2001/12/1

N2 - We consider the problem of estimating the number of components d and the unknown mixing distribution in a finite mixture model, in which d is bounded by some fixed finite number N. Our approach relies on the use of a prior over the space of mixing distributions with at most N components. By decomposing the resulting marginal density under this prior, we discover a weighted Bayes factor method for consistently estimating d that can be implemented by an iid generalized weighted Chinese restaurant (GWCR) Monte Carlo algorithm. We also discuss a Gibbs sampling method (the blocked Gibbs sampler) for estimating d and also the mixing distribution. We show that our resulting posterior is consistent and achieves the frequentist optimal O_p(n^{-1/4}) rate of estimation. We compare the performance of the new GWCR model selection procedure with that of the Akaike information criterion and the Bayes information criterion implemented through an EM algorithm. Applications of our methods to five real datasets and simulations are considered.

KW - Blocked Gibbs sampler

KW - Dirichlet prior

KW - Generalized weighted Chinese restaurant

KW - Identification

KW - Partition

KW - Uniformly exponentially consistent test

KW - Weighted Bayes factor

UR - http://www.scopus.com/inward/record.url?scp=1542469706&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=1542469706&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:1542469706

VL - 96

SP - 1316

EP - 1332

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 456

ER -