Generalized weighted Chinese restaurant processes for species sampling mixture models

Hemant Ishwaran, Lancelot F. James

Research output: Contribution to journal › Article

97 Citations (Scopus)

Abstract

The class of species sampling mixture models is introduced as an extension of semiparametric models based on the Dirichlet process to models based on the general class of species sampling priors, or equivalently the class of all exchangeable urn distributions. Using Fubini calculus in conjunction with Pitman (1995, 1996), we derive characterizations of the posterior distribution in terms of a posterior partition distribution that extend the results of Lo (1984) for the Dirichlet process. These results provide a better understanding of models and have both theoretical and practical applications. To facilitate the use of our models we generalize the work in Brunner, Chan, James and Lo (2001) by extending their weighted Chinese restaurant (WCR) Monte Carlo procedure, an i.i.d. sequential importance sampling (SIS) procedure for approximating posterior mean functionals based on the Dirichlet process, to the case of approximation of mean functionals and additionally their posterior laws in species sampling mixture models. We also discuss collapsed Gibbs sampling, Pólya urn Gibbs sampling and a Pólya urn SIS scheme. Our framework allows for numerous applications, including multiplicative counting process models subject to weighted gamma processes, as well as nonparametric and semiparametric hierarchical models based on the Dirichlet process, its two-parameter extension, the Pitman-Yor process and finite dimensional Dirichlet priors.
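As an illustration of the "exchangeable urn distributions" mentioned in the abstract, the following is a minimal sketch of the standard two-parameter (Pitman-Yor) Chinese restaurant seating rule. It is not the paper's weighted Chinese restaurant (WCR) procedure, which additionally reweights each seating decision using the data; it only simulates the prior partition. The function name `pitman_yor_seating` and the parameter names `discount` and `strength` are chosen here for illustration and are not taken from the paper. With `k` occupied tables of sizes `n_j` and `n` customers seated, the next customer joins table `j` with probability `(n_j - discount) / (n + strength)` and opens a new table with probability `(strength + k * discount) / (n + strength)`.

```python
import random
from typing import List, Optional


def pitman_yor_seating(n: int, discount: float, strength: float,
                       seed: Optional[int] = None) -> List[int]:
    """Seat n customers by the two-parameter urn rule; return the table sizes.

    Illustrative sketch only: this simulates the prior partition and does
    not include the likelihood weighting used by a WCR sampler.
    """
    if not (0.0 <= discount < 1.0) or strength <= -discount:
        raise ValueError("need 0 <= discount < 1 and strength > -discount")
    rng = random.Random(seed)
    tables: List[int] = []  # tables[j] = number of customers at table j
    for _ in range(n):
        if not tables:
            # The first customer always opens the first table.
            tables.append(1)
            continue
        # Unnormalized weights: an existing table j gets (n_j - discount);
        # a new table gets (strength + k * discount), k = current table count.
        weights = [size - discount for size in tables]
        weights.append(strength + len(tables) * discount)
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append(1)      # open a new table
        else:
            tables[choice] += 1   # join an existing table
    return tables
```

Setting `discount=0` gives the Dirichlet process seating rule, with `strength` as its total mass parameter. For example, `pitman_yor_seating(100, 0.25, 1.0, seed=1)` returns a list of table sizes that sums to 100.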

Original language: English
Pages (from-to): 1211-1235
Number of pages: 25
Journal: Statistica Sinica
Volume: 13
Issue number: 4
State: Published - Oct 1 2003
Externally published: Yes

Keywords

  • Dirichlet process
  • Exchangeable partition
  • Finite dimensional Dirichlet prior
  • Prediction rule
  • Random probability measure
  • Species sampling sequence
  • Two-parameter Poisson-Dirichlet process

ASJC Scopus subject areas

  • Mathematics(all)
  • Statistics and Probability

Cite this

Generalized weighted Chinese restaurant processes for species sampling mixture models. / Ishwaran, Hemant; James, Lancelot F.

In: Statistica Sinica, Vol. 13, No. 4, 01.10.2003, p. 1211-1235.

@article{4193fdb842074fa1be03d33853b2116a,
title = "Generalized weighted Chinese restaurant processes for species sampling mixture models",
keywords = "Dirichlet process, Exchangeable partition, Finite dimensional Dirichlet prior, Prediction rule, Random probability measure, Species sampling sequence, Two-parameter Poisson-Dirichlet process",
author = "Ishwaran, Hemant and James, Lancelot F.",
year = "2003",
month = "10",
day = "1",
language = "English",
volume = "13",
pages = "1211--1235",
journal = "Statistica Sinica",
issn = "1017-0405",
publisher = "Institute of Statistical Science",
number = "4",

}

TY - JOUR

T1 - Generalized weighted Chinese restaurant processes for species sampling mixture models

AU - Ishwaran, Hemant

AU - James, Lancelot F.

PY - 2003/10/1

Y1 - 2003/10/1

KW - Dirichlet process

KW - Exchangeable partition

KW - Finite dimensional Dirichlet prior

KW - Prediction rule

KW - Random probability measure

KW - Species sampling sequence

KW - Two-parameter Poisson-Dirichlet process

UR - http://www.scopus.com/inward/record.url?scp=0346338291&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0346338291&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0346338291

VL - 13

SP - 1211

EP - 1235

JO - Statistica Sinica

JF - Statistica Sinica

SN - 1017-0405

IS - 4

ER -