Anatomy of Escherichia coli ribosome binding sites

Ryan K. Shultzaberger, R. Elaine Bucheimer, Kenneth E. Rudd, Thomas D. Schneider

Research output: Contribution to journal › Article

93 Citations (Scopus)

Abstract

During translational initiation in prokaryotes, the 3′ end of the 16S rRNA binds to a region just upstream of the initiation codon. The relationship between this Shine-Dalgarno (SD) region and the binding of ribosomes to translation start-points has been well studied, but a unified mathematical connection between the SD, the initiation codon and the spacing between them has been lacking. Using information theory, we constructed a model that treats these three components uniformly by assigning to the SD and the initiation region (IR) conservations in bits of information, and by assigning to the spacing an uncertainty, also in bits. To build the model, we first aligned the SD region by maximizing the information content there. The ease of this process confirmed the existence of the SD pattern within a set of 4122 reviewed and revised Escherichia coli gene starts. This large data set allowed us to show graphically, by sequence logos, that the spacing between the SD and the initiation region affects both the SD site conservation and its pattern. We used the aligned SD, the spacing, and the initiation region to model ribosome binding and to identify gene starts that do not conform to the ribosome binding site model. A total of 569 experimentally proven starts are more conserved (have higher information content) than the full set of revised starts, which probably reflects an experimental bias against the detection of gene products that have inefficient ribosome binding sites. Models were refined cyclically by removing non-conforming weak sites. After this procedure, models derived from either the original or the revised gene start annotation were similar. Therefore, this information theory-based technique provides a method for easily constructing biologically sensible ribosome binding site models. Such models should be useful for refining gene-start predictions of any sequenced bacterial genome.
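The abstract measures conservation of the SD and the initiation region in bits, and the spacing between them as an uncertainty in bits. The abstract itself gives no formulas, so the following is only a minimal sketch under stated assumptions: it uses the plain Shannon formulation (2 bits minus the entropy of each aligned column, and the entropy of the spacing distribution), it omits any small-sample correction, and the sequences and spacings are hypothetical toy values, not data from the paper. The function names are illustrative.

```python
import math
from collections import Counter


def position_information(aligned_sites):
    """Conservation in bits at each aligned position.

    Assumption: information = 2 - H(column) for a 4-letter alphabet,
    with no small-sample correction applied.
    """
    length = len(aligned_sites[0])
    info = []
    for i in range(length):
        counts = Counter(site[i] for site in aligned_sites)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(2.0 - entropy)
    return info


def spacing_uncertainty(spacings):
    """Shannon uncertainty (bits) of the observed spacing distribution."""
    counts = Counter(spacings)
    total = len(spacings)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical toy alignment of SD-like hexamers and SD-to-start spacings.
toy_sd = ["AGGAGG", "AGGAGA", "TGGAGG", "AGGTGG"]
toy_gaps = [7, 7, 8, 9]

per_position = position_information(toy_sd)
print("bits per position:", [round(b, 3) for b in per_position])
print("total SD information:", round(sum(per_position), 3))
print("spacing uncertainty:", round(spacing_uncertainty(toy_gaps), 3))
```

Summing the per-position values gives the total conservation of an aligned region, which is the quantity an alignment procedure of the kind described would try to maximize. Narrowly distributed spacings give a low uncertainty, widely scattered ones a high uncertainty.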

Original language: English
Pages (from-to): 215-228
Number of pages: 14
Journal: Journal of Molecular Biology
Volume: 313
Issue number: 1
DOIs: https://doi.org/10.1006/jmbi.2001.5040
State: Published - Oct 12 2001
Externally published: Yes

Fingerprint

Ribosomes
Anatomy
Binding Sites
Escherichia coli
Information Theory
Initiator Codon
Genes
Position-Specific Scoring Matrices
Bacterial Genomes
Molecular Sequence Annotation
Uncertainty

Keywords

  • Information theory
  • Ribosome
  • Sequence logo
  • Sequence walker
  • Shine-Dalgarno

ASJC Scopus subject areas

  • Virology

Cite this

Shultzaberger, R. K., Bucheimer, R. E., Rudd, K. E., & Schneider, T. D. (2001). Anatomy of Escherichia coli ribosome binding sites. Journal of Molecular Biology, 313(1), 215-228. https://doi.org/10.1006/jmbi.2001.5040

Anatomy of Escherichia coli ribosome binding sites. / Shultzaberger, Ryan K.; Bucheimer, R. Elaine; Rudd, Kenneth E.; Schneider, Thomas D.

In: Journal of Molecular Biology, Vol. 313, No. 1, 12.10.2001, p. 215-228.

Shultzaberger, RK, Bucheimer, RE, Rudd, KE & Schneider, TD 2001, 'Anatomy of Escherichia coli ribosome binding sites', Journal of Molecular Biology, vol. 313, no. 1, pp. 215-228. https://doi.org/10.1006/jmbi.2001.5040
Shultzaberger RK, Bucheimer RE, Rudd KE, Schneider TD. Anatomy of Escherichia coli ribosome binding sites. Journal of Molecular Biology. 2001 Oct 12;313(1):215-228. https://doi.org/10.1006/jmbi.2001.5040
Shultzaberger, Ryan K. ; Bucheimer, R. Elaine ; Rudd, Kenneth E. ; Schneider, Thomas D. / Anatomy of Escherichia coli ribosome binding sites. In: Journal of Molecular Biology. 2001 ; Vol. 313, No. 1. pp. 215-228.
@article{e1f74dcbd8854c83b06c52d1a65a982a,
title = "Anatomy of Escherichia coli ribosome binding sites",
keywords = "Information theory, Ribosome, Sequence logo, Sequence walker, Shine-Dalgarno",
author = "Shultzaberger, {Ryan K.} and Bucheimer, {R. Elaine} and Rudd, {Kenneth E.} and Schneider, {Thomas D.}",
year = "2001",
month = "10",
day = "12",
doi = "10.1006/jmbi.2001.5040",
language = "English",
volume = "313",
pages = "215--228",
journal = "Journal of Molecular Biology",
issn = "0022-2836",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR

T1 - Anatomy of Escherichia coli ribosome binding sites

AU - Shultzaberger, Ryan K.

AU - Bucheimer, R. Elaine

AU - Rudd, Kenneth E.

AU - Schneider, Thomas D.

PY - 2001/10/12

Y1 - 2001/10/12

KW - Information theory

KW - Ribosome

KW - Sequence logo

KW - Sequence walker

KW - Shine-Dalgarno

UR - http://www.scopus.com/inward/record.url?scp=0035850734&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035850734&partnerID=8YFLogxK

U2 - 10.1006/jmbi.2001.5040

DO - 10.1006/jmbi.2001.5040

M3 - Article

C2 - 11601857

AN - SCOPUS:0035850734

VL - 313

SP - 215

EP - 228

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

SN - 0022-2836

IS - 1

ER -