Identifying differentially expressed genes in microarray experiments with model-based variance estimation

Xiaodong Cai, Georgios B. Giannakis

Research output: Contribution to journal › Article

8 Scopus citations

Abstract

Statistical tests have been employed to identify genes differentially expressed under different conditions using data from microarray experiments. The variance of gene expression levels is often required in various statistical tests; however, due to the small number of replicates, the variance estimated from the sample variance is not accurate, which causes large false positive and negative errors. More accurate and robust variance estimation is thus highly desirable to improve the performance of statistical tests. In this paper, cluster analysis was performed on the microarray data using a model-based clustering method. The variance for each gene was then estimated from cluster variances. Since cluster variances are estimated from multiple genes whose microarray data have similar variance, the proposed estimation method pools the relevant genes together; this effectively increases the number of samples in variance estimation, thereby improving variance estimation. Using simulated data, it is shown that with the novel variance estimation, the performance of the t-test, regularized t-test, and a variant of the SAM test, which is called the S-test here, can be improved. Using colon microarray data of Alon, it is demonstrated that the proposed method offers better or comparable performance compared with other gene pooling methods. Using the IHF microarray data of Arfin, it is shown that the proposed novel variance estimation decreases the significance of those genes having a small fold change but a high significance score assigned by the t-test using the sample variance, which potentially reduces false positive probability.
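
The pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cluster labels are assumed to be supplied by some earlier clustering step (the model-based clustering itself is not reproduced), the cluster variance is taken as the plain average of the member genes' sample variances, and the function names `cluster_pooled_variance` and `t_statistics` are hypothetical.

```python
import numpy as np


def cluster_pooled_variance(sample_var, labels):
    """Replace each gene's sample variance with the mean variance of its cluster.

    Assumption: averaging the member variances stands in for the
    cluster variance; the paper's estimator may differ.
    """
    sample_var = np.asarray(sample_var, dtype=float)
    labels = np.asarray(labels)
    pooled = np.empty_like(sample_var)
    for k in np.unique(labels):
        members = labels == k
        pooled[members] = sample_var[members].mean()
    return pooled


def t_statistics(x, y, labels=None):
    """Two-sample t-statistic per gene (rows are genes, columns are replicates).

    If `labels` is given (one hypothetical cluster label per gene), the
    per-gene sample variances are replaced by cluster-pooled variances.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[1], y.shape[1]
    var_x = x.var(axis=1, ddof=1)  # unbiased sample variance, condition 1
    var_y = y.var(axis=1, ddof=1)  # unbiased sample variance, condition 2
    if labels is not None:
        var_x = cluster_pooled_variance(var_x, labels)
        var_y = cluster_pooled_variance(var_y, labels)
    return (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(var_x / n1 + var_y / n2)
```

With only three replicates per condition, a gene whose sample variance happens to be small receives a large t-statistic; pooling it with a higher-variance gene in the same cluster shrinks that statistic, which is the effect the abstract describes for genes with a small fold change.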

Original language: English (US)
Pages (from-to): 2418-2426
Number of pages: 9
Journal: IEEE Transactions on Signal Processing
Volume: 54
Issue number: 6 II
State: Published - Jun 1 2006

Keywords

  • Clustering
  • Microarray
  • Mixture model
  • Statistical test
  • Variance estimation

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
