Abstract
Finite mixtures of regressions have been used to analyze data that come from a heterogeneous population. When more than one response is observed, accommodating a multivariate response can be useful. In this article, we go a step further and introduce a multivariate extension that includes a latent overlapping cluster indicator variable that allows for potential overdispersion. We present a generalized mixture of multivariate regressions connected with the proposed model, together with a new EM algorithm for fitting it. In addition, we allow for high-dimensional predictors via shrinkage estimation. The model proves particularly useful for analyzing complex data, as in the search for therapeutic cancer biomarkers. We demonstrate this using the Genomics of Drug Sensitivity in Cancer resource.
Original language | English (US) |
---|---|
Pages (from-to) | 4301-4324 |
Number of pages | 24 |
Journal | Statistics in Medicine |
Volume | 39 |
Issue number | 28 |
State | Published - Dec 10 2020 |
Externally published | Yes |
Keywords
- EM algorithm
- Lasso
- cancer biomarkers
- mixture of multivariate regression models
- overlapping clustering
ASJC Scopus subject areas
- Epidemiology
- Statistics and Probability