SeqRate

Sequence-based protein folding type classification and rates prediction

Guan N. Lin, Zheng Wang, Dong Xu, Jianlin Cheng

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

Background: Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding the protein folding process and for guiding protein design. Most previous methods for predicting protein folding rate require the tertiary structure of a protein as input. Moreover, most methods do not distinguish between the different kinetic natures (two-state folding or multi-state folding) of proteins. Here we developed a method, SeqRate, to predict both the protein folding kinetic type (two-state versus multi-state) and the real-value folding rate with support vector machines, using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from the protein sequence alone.

Results: We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (s⁻¹) on the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification, and its accuracy of folding rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs.

Conclusions: Both the web server and the software for predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html.
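The two evaluation measures quoted in the abstract, and the simplest of its sequence-derived inputs, can be illustrated with a short Python sketch. This is not the SeqRate implementation: it covers only sequence length and amino acid composition (the predicted contact order, contact number, and secondary structure features, and the support vector machines themselves, are left out), and the function names `sequence_features`, `pearson`, `mean_abs_diff`, and `evaluate_log10` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors line up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_features(seq):
    """Return [length, 20 composition fractions] for a protein sequence.

    Assumption: only length and composition are computed here; the other
    features named in the abstract would come from separate predictors.
    """
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)
    return [float(n)] + [counts[a] / n for a in AMINO_ACIDS]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def mean_abs_diff(xs, ys):
    """Mean absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def evaluate_log10(predicted_rates, experimental_rates):
    """Compare folding rates (in s^-1) on the base-10 logarithmic scale.

    Returns (Pearson correlation, mean absolute difference).
    """
    log_pred = [math.log10(r) for r in predicted_rates]
    log_exp = [math.log10(r) for r in experimental_rates]
    return pearson(log_pred, log_exp), mean_abs_diff(log_pred, log_exp)


if __name__ == "__main__":
    # Toy values made up for illustration; they are not data from the paper.
    features = sequence_features("ACAC")
    print(len(features), features[:3])
    r, mad = evaluate_log10([10.0, 100.0, 1000.0], [100.0, 1000.0, 10000.0])
    print(round(r, 6), round(mad, 6))
```

Comparing rates after taking logarithms means a prediction that is off by a factor of ten contributes a difference of 1 regardless of how fast the protein folds, which is why the mean absolute difference is reported on that scale.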

Original language: English (US)
Article number: S1
Journal: BMC Bioinformatics
Volume: 11
Issue number: SUPPL. 3
DOI: 10.1186/1471-2105-11-S3-S1
State: Published - Apr 29 2010
Externally published: Yes

Fingerprint

Protein folding
Folding
Proteins
Protein
Prediction
Multi-state
Kinetics
Contact
Fold
Benchmarking
Pearson Correlation
Web Server
Protein Sequence
Secondary Structure
Tertiary Protein Structure
Correlation coefficient
Support vector machines
Amino Acids

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

SeqRate: Sequence-based protein folding type classification and rates prediction. / Lin, Guan N.; Wang, Zheng; Xu, Dong; Cheng, Jianlin.

In: BMC Bioinformatics, Vol. 11, No. SUPPL. 3, S1, 29.04.2010.

@article{43cec963857d4ed9a590165a668ea1d9,
title = "SeqRate: Sequence-based protein folding type classification and rates prediction",
author = "Lin, {Guan N.} and Zheng Wang and Dong Xu and Jianlin Cheng",
year = "2010",
month = "4",
day = "29",
doi = "10.1186/1471-2105-11-S3-S1",
language = "English (US)",
volume = "11",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "SUPPL. 3",

}

TY - JOUR

T1 - SeqRate

T2 - Sequence-based protein folding type classification and rates prediction

AU - Lin, Guan N.

AU - Wang, Zheng

AU - Xu, Dong

AU - Cheng, Jianlin

PY - 2010/4/29

Y1 - 2010/4/29

UR - http://www.scopus.com/inward/record.url?scp=77952279379&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77952279379&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-11-S3-S1

DO - 10.1186/1471-2105-11-S3-S1

M3 - Article

VL - 11

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - SUPPL. 3

M1 - S1

ER -