Extreme value theory in some statistical analysis of genomic sequences

Lily Wang, Pranab K. Sen

Research output: Contribution to journal › Article

Abstract

Because similarities in biological sequences often suggest similarities in structure and function, profile searches using multiple alignments of families of related biological sequences provide useful starting points for experimental investigations in molecular biology. Strategies are formulated for determining the statistical significance of scores obtained by searching multiple alignment profiles against databanks, while accommodating gaps in the profile. The methodology is validated by deriving the asymptotic distribution of the maximum of profile scores, even under weak dependence conditions. Simulation studies show that the proposed method is adequate for moderate sample sizes. The methodology is illustrated with an example study of an immunoglobulin protein domain.
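The general idea of judging a maximum score against an extreme value distribution can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors' procedure: scores are taken to be i.i.d. standard normal (the paper treats weakly dependent scores and gaps, which this sketch ignores), the limiting law is taken to be Gumbel, and the function names `simulate_maxima`, `fit_gumbel_moments`, and `gumbel_p_value` are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def simulate_maxima(n_positions, n_replicates, rng):
    """Draw n_replicates maxima, each over n_positions i.i.d. N(0, 1) scores.

    Assumption: independent normal scores stand in for real profile scores.
    """
    return [
        max(rng.gauss(0.0, 1.0) for _ in range(n_positions))
        for _ in range(n_replicates)
    ]


def fit_gumbel_moments(maxima):
    """Method-of-moments estimates of the Gumbel location (mu) and scale (beta).

    For Gumbel(mu, beta): mean = mu + gamma * beta, sd = beta * pi / sqrt(6).
    """
    beta = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    mu = statistics.mean(maxima) - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Upper-tail probability P(max >= score) under Gumbel(mu, beta)."""
    # 1 - exp(-y) computed as -expm1(-y) for accuracy when y is small.
    return -math.expm1(-math.exp(-(score - mu) / beta))


rng = random.Random(0)
maxima = simulate_maxima(n_positions=200, n_replicates=1000, rng=rng)
mu, beta = fit_gumbel_moments(maxima)
p_small = gumbel_p_value(3.0, mu, beta)  # a moderately high maximum score
p_large = gumbel_p_value(5.0, mu, beta)  # a much more extreme maximum score
print(f"mu={mu:.3f} beta={beta:.3f} p(3.0)={p_small:.4f} p(5.0)={p_large:.6f}")
```

A higher observed maximum gives a smaller tail probability, which is what makes such a fitted distribution usable as a significance measure. Method of moments is chosen here only to keep the sketch dependency-free; a maximum likelihood fit would be a natural alternative.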

Original language: English (US)
Pages (from-to): 295-310
Number of pages: 16
Journal: Extremes
Volume: 8
Issue number: 4
State: Published - Dec 1 2005
Externally published: Yes

Keywords

  • Maximum profile scores
  • Protein profile
  • Sequence alignment
  • Statistical significance
  • Weakly dependent

ASJC Scopus subject areas

  • Statistics and Probability
  • Engineering (miscellaneous)
  • Economics, Econometrics and Finance (miscellaneous)
