The E-MS Algorithm: Model Selection With Incomplete Data

Jiming Jiang, Thuan Nguyen, Jonnagadda S Rao

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

We propose a procedure associated with the idea of the E-M algorithm for model selection in the presence of missing data. The idea extends the concept of parameters to include both the model and the parameters under the model, and thus allows the model to be part of the E-M iterations. We develop the procedure, known as the E-MS algorithm, under the assumption that the class of candidate models is finite. Some special cases of the procedure are considered, including E-MS with the generalized information criteria (GIC), and E-MS with the adaptive fence (AF; Jiang et al.). We prove numerical convergence of the E-MS algorithm as well as consistency in model selection of the limiting model of the E-MS convergence, for E-MS with GIC and E-MS with AF. We study the impact on model selection of different missing data mechanisms. Furthermore, we carry out extensive simulation studies on the finite-sample performance of the E-MS with comparisons to other procedures. The methodology is also illustrated on a real data analysis involving QTL mapping for an agricultural study on barley grains. Supplementary materials for this article are available online.
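
The idea in the abstract, treating the model as part of what the E-M iterations update, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a small regression problem with some responses missing (ignorable missingness), two hypothetical candidate models (intercept only, and intercept plus slope), and BIC as one instance of a GIC-type criterion. The function names `fit` and `e_ms` and the data are made up for illustration.

```python
import math

def fit(xs, ys, use_slope):
    """Least-squares fit of y = a (+ b*x); returns (a, b)."""
    n = len(xs)
    my = sum(ys) / n
    if not use_slope:
        return my, 0.0
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def e_ms(xs, ys, max_iter=200, tol=1e-8):
    """Toy E-MS loop: ys may contain None for missing responses."""
    n = len(xs)
    n_mis = sum(y is None for y in ys)
    obs = [y for y in ys if y is not None]
    # Start from the intercept-only model fitted to the observed responses.
    model, a, b = "intercept", sum(obs) / len(obs), 0.0
    sigma2 = sum((y - a) ** 2 for y in obs) / len(obs)
    for _ in range(max_iter):
        # E-step: replace each missing response by its expectation
        # under the current (model, parameters) pair.
        filled = [a + b * x if y is None else y for x, y in zip(xs, ys)]
        # MS-step: fit every candidate to the completed data and keep
        # the one with the smallest BIC (k counts a, optional b, sigma2).
        best = None
        for name, use_slope, k in (("intercept", False, 2), ("slope", True, 3)):
            a_c, b_c = fit(xs, filled, use_slope)
            # Expected complete-data RSS: each missing case adds the
            # current variance on top of its squared mean residual.
            rss = sum((y - a_c - b_c * x) ** 2 for x, y in zip(xs, filled))
            rss += n_mis * sigma2
            bic = n * math.log(rss / n) + k * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, name, a_c, b_c, rss / n)
        _, new_model, a_new, b_new, s2_new = best
        done = (new_model == model and abs(a_new - a) < tol
                and abs(b_new - b) < tol and abs(s2_new - sigma2) < tol)
        model, a, b, sigma2 = new_model, a_new, b_new, s2_new
        if done:
            break
    return model, a, b, sigma2

# Hypothetical data: y = 1 + 2x plus alternating +/-0.5 noise,
# with every fourth response missing.
xs = list(range(20))
ys = [None if i % 4 == 3 else 1 + 2 * x + (0.5 if i % 2 == 0 else -0.5)
      for i, x in enumerate(xs)]
model, a, b, sigma2 = e_ms(xs, ys)
print(model, a, b, sigma2)
```

In this sketch the selected model can change from one iteration to the next, and the loop stops only when both the model and its parameter estimates have stabilized. The adaptive fence variant mentioned in the abstract is not shown here.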

Original language: English (US)
Pages (from-to): 1136-1147
Number of pages: 12
Journal: Journal of the American Statistical Association
Volume: 110
Issue number: 511
DOI: 10.1080/01621459.2014.948545
State: Published - Jul 3 2015

Fingerprint

Incomplete Data
Model Selection
Information Criterion
Missing Data Mechanism
Quantitative Trait Loci
Model
Barley
EM Algorithm
Missing Data
Data analysis
Limiting
Simulation Study
Iteration
Methodology

Keywords

  • Backcross experiments
  • Conditional sampling
  • Consistency
  • Convergence
  • Missing data mechanism
  • Model selection
  • Regression

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

The E-MS Algorithm: Model Selection With Incomplete Data. / Jiang, Jiming; Nguyen, Thuan; Rao, Jonnagadda S.

In: Journal of the American Statistical Association, Vol. 110, No. 511, 03.07.2015, p. 1136-1147.


@article{c19346d0ab0e49dc8f8050756071c944,
title = "The E-MS Algorithm: Model Selection With Incomplete Data",
abstract = "We propose a procedure associated with the idea of the E-M algorithm for model selection in the presence of missing data. The idea extends the concept of parameters to include both the model and the parameters under the model, and thus allows the model to be part of the E-M iterations. We develop the procedure, known as the E-MS algorithm, under the assumption that the class of candidate models is finite. Some special cases of the procedure are considered, including E-MS with the generalized information criteria (GIC), and E-MS with the adaptive fence (AF; Jiang et al.). We prove numerical convergence of the E-MS algorithm as well as consistency in model selection of the limiting model of the E-MS convergence, for E-MS with GIC and E-MS with AF. We study the impact on model selection of different missing data mechanisms. Furthermore, we carry out extensive simulation studies on the finite-sample performance of the E-MS with comparisons to other procedures. The methodology is also illustrated on a real data analysis involving QTL mapping for an agricultural study on barley grains. Supplementary materials for this article are available online.",
keywords = "Backcross experiments, Conditional sampling, Consistency, Convergence, Missing data mechanism, Model selection, Regression",
author = "Jiang, Jiming and Nguyen, Thuan and Rao, {Jonnagadda S}",
year = "2015",
month = "7",
day = "3",
doi = "10.1080/01621459.2014.948545",
language = "English (US)",
volume = "110",
pages = "1136--1147",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "511",
}

TY - JOUR

T1 - The E-MS Algorithm: Model Selection With Incomplete Data

AU - Jiang, Jiming

AU - Nguyen, Thuan

AU - Rao, Jonnagadda S

PY - 2015/7/3

Y1 - 2015/7/3

N2 - We propose a procedure associated with the idea of the E-M algorithm for model selection in the presence of missing data. The idea extends the concept of parameters to include both the model and the parameters under the model, and thus allows the model to be part of the E-M iterations. We develop the procedure, known as the E-MS algorithm, under the assumption that the class of candidate models is finite. Some special cases of the procedure are considered, including E-MS with the generalized information criteria (GIC), and E-MS with the adaptive fence (AF; Jiang et al.). We prove numerical convergence of the E-MS algorithm as well as consistency in model selection of the limiting model of the E-MS convergence, for E-MS with GIC and E-MS with AF. We study the impact on model selection of different missing data mechanisms. Furthermore, we carry out extensive simulation studies on the finite-sample performance of the E-MS with comparisons to other procedures. The methodology is also illustrated on a real data analysis involving QTL mapping for an agricultural study on barley grains. Supplementary materials for this article are available online.

KW - Backcross experiments

KW - Conditional sampling

KW - Consistency

KW - Convergence

KW - Missing data mechanism

KW - Model selection

KW - Regression

UR - http://www.scopus.com/inward/record.url?scp=84946917450&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946917450&partnerID=8YFLogxK

U2 - 10.1080/01621459.2014.948545

DO - 10.1080/01621459.2014.948545

M3 - Article

AN - SCOPUS:84946917450

VL - 110

SP - 1136

EP - 1147

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 511

ER -