Evaluating the absolute quality of a single protein model using structural features and support vector machines

Zheng Wang, Allison N. Tegge, Jianlin Cheng

Research output: Contribution to journal › Article › peer-review

76 Scopus citations


Knowing the quality of a protein structure model is important for its appropriate usage. We developed a model evaluation method to assess the absolute quality of a single protein model using only structural features with support vector machine regression. The method assigns an absolute quantitative score (i.e. GDT-TS) to a model by comparing its secondary structure, relative solvent accessibility, contact map, and beta sheet structure with their counterparts predicted from its primary sequence. We trained and tested the method on the CASP6 dataset using cross-validation. The correlation between predicted and true scores is 0.82. On the independent CASP7 dataset, the correlation averaged over 95 protein targets is 0.76; the average correlation for template-based and ab initio targets is 0.82 and 0.50, respectively. Furthermore, the predicted absolute quality scores can be used to rank models effectively. The average difference (or loss) between the scores of the top-ranked models and the best models is 5.70 on the CASP7 targets. This method performs favorably when compared with the other methods used on the same dataset. Moreover, the predicted absolute quality scores are comparable across models for different proteins. These features make the method a valuable tool for model quality assurance and ranking.
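The two evaluation measures reported above, the correlation between predicted and true scores and the ranking loss, can be illustrated with a minimal sketch. It assumes the correlation is a Pearson correlation computed per target and that the loss is the true score of the best model minus the true score of the model ranked first by predicted score; the abstract does not spell out either detail. The function names and the score values are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranking_loss(predicted, true):
    """True score of the best model minus the true score of the model
    that has the highest predicted score (assumed definition of loss)."""
    top_ranked = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top_ranked]


# Hypothetical GDT-TS scores for five models of a single target.
predicted = [62.0, 48.5, 55.0, 30.0, 41.0]
true = [58.0, 50.0, 61.0, 28.0, 39.0]

r = pearson(predicted, true)           # per-target correlation
loss = ranking_loss(predicted, true)   # 61.0 - 58.0 = 3.0
```

Averaging `r` and `loss` over all targets would give dataset-level figures of the kind quoted in the abstract.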

Original language: English (US)
Pages (from-to): 638-647
Number of pages: 10
Journal: Proteins: Structure, Function and Bioinformatics
Issue number: 3
State: Published - May 15 2009
Externally published: Yes


Keywords

  • Machine learning
  • Protein model evaluation
  • Protein model quality assurance
  • Protein structure prediction
  • Support vector machine

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology

