Three-dimensional cluster analysis identifies interfaces and functional residue clusters in proteins

Ralf Landgraf, Ioannis Xenarios, David Eisenberg

Research output: Contribution to journal > Article

197 Citations (Scopus)

Abstract

Three-dimensional cluster analysis offers a method for the prediction of functional residue clusters in proteins. This method requires a representative structure and a multiple sequence alignment as input data. Individual residues are represented in terms of regional alignments that reflect both their structural environment and their evolutionary variation, as defined by the alignment of homologous sequences. From the overall (global) and the residue-specific (regional) alignments, we calculate the global and regional similarity matrices, containing scores for all pairwise sequence comparisons in the respective alignments. Comparing the matrices yields two scores for each residue. The regional conservation score (CR(x)) defines the conservation of each residue x and its neighbors in 3D space relative to the protein as a whole. The similarity deviation score (S(x)) detects residue clusters with sequence similarities that deviate from the similarities suggested by the full-length sequences. We evaluated 3D cluster analysis on a set of 35 families of proteins with available cocrystal structures, showing small ligand interfaces, nucleic acid interfaces and two types of protein-protein interfaces (transient and stable). We present two examples in detail: fructose-1,6-bisphosphate aldolase and the mitogen-activated protein kinase ERK2. We found that the regional conservation score (CR(x)) identifies functional residue clusters better than a scoring scheme that does not take 3D information into account. CR(x) is particularly useful for the prediction of poorly conserved, transient protein-protein interfaces. Many of the proteins studied contained residue clusters with elevated similarity deviation scores. These residue clusters correlate with specificity-conferring regions. 3D cluster analysis therefore represents an easily applied method for the prediction of functionally relevant spatial clusters of residues in proteins.
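
The workflow described in the abstract (regional alignments, global and regional similarity matrices, two per-residue scores) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the use of fractional sequence identity as the pairwise score, the 10.0 cutoff radius, the one-coordinate-per-alignment-column input, and the two simplified score formulas are all assumptions made here, and every function name is hypothetical. The published definitions of CR(x) and S(x) are not given in the abstract and may differ.

```python
from itertools import combinations
from math import dist, sqrt


def identity(a, b):
    """Fraction of aligned positions at which two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_matrix(alignment):
    """Pairwise scores for all sequence pairs, keyed by (i, j) with i < j."""
    return {(i, j): identity(alignment[i], alignment[j])
            for i, j in combinations(range(len(alignment)), 2)}


def regional_alignment(alignment, coords, x, radius):
    """Keep only the columns whose residue lies within `radius` of residue x.

    Assumes coords[k] is the 3D position of the residue in alignment column k
    of the representative structure (an assumption of this sketch).
    """
    cols = [k for k, c in enumerate(coords) if dist(c, coords[x]) <= radius]
    return ["".join(seq[k] for k in cols) for seq in alignment]


def residue_scores(alignment, coords, x, radius=10.0):
    """Return (conservation, deviation) for residue x.

    Illustrative stand-ins only, not the published formulas:
      conservation = mean regional similarity minus mean global similarity
      deviation    = root-mean-square difference between the two matrices
    """
    glob = similarity_matrix(alignment)
    reg = similarity_matrix(regional_alignment(alignment, coords, x, radius))
    n = len(glob)
    conservation = sum(reg.values()) / n - sum(glob.values()) / n
    deviation = sqrt(sum((reg[p] - glob[p]) ** 2 for p in glob) / n)
    return conservation, deviation


# Toy input: three aligned sequences, residues 6 units apart along one axis.
alignment = ["ACDE", "ACDF", "ACGH"]
coords = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0), (18.0, 0.0, 0.0)]
for x in range(len(coords)):
    print(x, residue_scores(alignment, coords, x))
```

In this toy input the neighborhood of residue 0 is identical across all three sequences, so its conservation value is positive, while the neighborhood of residue 3 is more variable than the full-length sequences and scores negative. A real application would need a proper substitution-based similarity measure and explicit handling of gaps, both of which are omitted here.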

Original language: English
Pages (from-to): 1487-1502
Number of pages: 16
Journal: Journal of Molecular Biology
Volume: 307
Issue number: 5
DOIs: 10.1006/jmbi.2001.4540
State: Published - Apr 13 2001
Externally published: Yes

Fingerprint

Cluster Analysis
Proteins
Fructose-Bisphosphate Aldolase
Sequence Alignment
Sequence Homology
Mitogen-Activated Protein Kinases
Nucleic Acids
Ligands

Keywords

  • Bioinformatics
  • Evolutionary tracing
  • Phylogeny
  • Protein families
  • Residue patches

ASJC Scopus subject areas

  • Virology

Cite this

Three-dimensional cluster analysis identifies interfaces and functional residue clusters in proteins. / Landgraf, Ralf; Xenarios, Ioannis; Eisenberg, David.

In: Journal of Molecular Biology, Vol. 307, No. 5, 13.04.2001, p. 1487-1502.

@article{257a01a4e2184f6192a9decb95e27c0d,
title = "Three-dimensional cluster analysis identifies interfaces and functional residue clusters in proteins",
abstract = "Three-dimensional cluster analysis offers a method for the prediction of functional residue clusters in proteins. This method requires a representative structure and a multiple sequence alignment as input data. Individual residues are represented in terms of regional alignments that reflect both their structural environment and their evolutionary variation, as defined by the alignment of homologous sequences. From the overall (global) and the residue-specific (regional) alignments, we calculate the global and regional similarity matrices, containing scores for all pairwise sequence comparisons in the respective alignments. Comparing the matrices yields two scores for each residue. The regional conservation score (CR(x)) defines the conservation of each residue x and its neighbors in 3D space relative to the protein as a whole. The similarity deviation score (S(x)) detects residue clusters with sequence similarities that deviate from the similarities suggested by the full-length sequences. We evaluated 3D cluster analysis on a set of 35 families of proteins with available cocrystal structures, showing small ligand interfaces, nucleic acid interfaces and two types of protein-protein interfaces (transient and stable). We present two examples in detail: fructose-1,6-bisphosphate aldolase and the mitogen-activated protein kinase ERK2. We found that the regional conservation score (CR(x)) identifies functional residue clusters better than a scoring scheme that does not take 3D information into account. CR(x) is particularly useful for the prediction of poorly conserved, transient protein-protein interfaces. Many of the proteins studied contained residue clusters with elevated similarity deviation scores. These residue clusters correlate with specificity-conferring regions. 3D cluster analysis therefore represents an easily applied method for the prediction of functionally relevant spatial clusters of residues in proteins.",
keywords = "Bioinformatics, Evolutionary tracing, Phylogeny, Protein families, Residue patches",
author = "Ralf Landgraf and Ioannis Xenarios and David Eisenberg",
year = "2001",
month = "4",
day = "13",
doi = "10.1006/jmbi.2001.4540",
language = "English",
volume = "307",
pages = "1487--1502",
journal = "Journal of Molecular Biology",
issn = "0022-2836",
publisher = "Academic Press Inc.",
number = "5",
}

TY - JOUR

T1 - Three-dimensional cluster analysis identifies interfaces and functional residue clusters in proteins

AU - Landgraf, Ralf

AU - Xenarios, Ioannis

AU - Eisenberg, David

PY - 2001/4/13

Y1 - 2001/4/13

N2 - Three-dimensional cluster analysis offers a method for the prediction of functional residue clusters in proteins. This method requires a representative structure and a multiple sequence alignment as input data. Individual residues are represented in terms of regional alignments that reflect both their structural environment and their evolutionary variation, as defined by the alignment of homologous sequences. From the overall (global) and the residue-specific (regional) alignments, we calculate the global and regional similarity matrices, containing scores for all pairwise sequence comparisons in the respective alignments. Comparing the matrices yields two scores for each residue. The regional conservation score (CR(x)) defines the conservation of each residue x and its neighbors in 3D space relative to the protein as a whole. The similarity deviation score (S(x)) detects residue clusters with sequence similarities that deviate from the similarities suggested by the full-length sequences. We evaluated 3D cluster analysis on a set of 35 families of proteins with available cocrystal structures, showing small ligand interfaces, nucleic acid interfaces and two types of protein-protein interfaces (transient and stable). We present two examples in detail: fructose-1,6-bisphosphate aldolase and the mitogen-activated protein kinase ERK2. We found that the regional conservation score (CR(x)) identifies functional residue clusters better than a scoring scheme that does not take 3D information into account. CR(x) is particularly useful for the prediction of poorly conserved, transient protein-protein interfaces. Many of the proteins studied contained residue clusters with elevated similarity deviation scores. These residue clusters correlate with specificity-conferring regions. 3D cluster analysis therefore represents an easily applied method for the prediction of functionally relevant spatial clusters of residues in proteins.

AB - Three-dimensional cluster analysis offers a method for the prediction of functional residue clusters in proteins. This method requires a representative structure and a multiple sequence alignment as input data. Individual residues are represented in terms of regional alignments that reflect both their structural environment and their evolutionary variation, as defined by the alignment of homologous sequences. From the overall (global) and the residue-specific (regional) alignments, we calculate the global and regional similarity matrices, containing scores for all pairwise sequence comparisons in the respective alignments. Comparing the matrices yields two scores for each residue. The regional conservation score (CR(x)) defines the conservation of each residue x and its neighbors in 3D space relative to the protein as a whole. The similarity deviation score (S(x)) detects residue clusters with sequence similarities that deviate from the similarities suggested by the full-length sequences. We evaluated 3D cluster analysis on a set of 35 families of proteins with available cocrystal structures, showing small ligand interfaces, nucleic acid interfaces and two types of protein-protein interfaces (transient and stable). We present two examples in detail: fructose-1,6-bisphosphate aldolase and the mitogen-activated protein kinase ERK2. We found that the regional conservation score (CR(x)) identifies functional residue clusters better than a scoring scheme that does not take 3D information into account. CR(x) is particularly useful for the prediction of poorly conserved, transient protein-protein interfaces. Many of the proteins studied contained residue clusters with elevated similarity deviation scores. These residue clusters correlate with specificity-conferring regions. 3D cluster analysis therefore represents an easily applied method for the prediction of functionally relevant spatial clusters of residues in proteins.

KW - Bioinformatics

KW - Evolutionary tracing

KW - Phylogeny

KW - Protein families

KW - Residue patches

UR - http://www.scopus.com/inward/record.url?scp=0035853280&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035853280&partnerID=8YFLogxK

U2 - 10.1006/jmbi.2001.4540

DO - 10.1006/jmbi.2001.4540

M3 - Article

C2 - 11292355

AN - SCOPUS:0035853280

VL - 307

SP - 1487

EP - 1502

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

SN - 0022-2836

IS - 5

ER -