New computer and statistical methods were used to identify significant direct and inverted repeats in the Escherichia coli contig sequence collection, which totals 1.6 × 10^6 base pairs. Eight groups of mostly new structural repeat identities were uncovered. Apart from the high statistical significance of these repeat sequences, there are suggestive relationships among the group matches in terms of neighboring genes, genomic distributions, their texts, and their potential for secondary structure. Four of these groups are relatively numerous, with 11 to 26 members each; one lies in coding sequences and three lie in non-coding sequences. The coding group consists of the ATP-activated transmembrane component of a typical high-affinity protein-binding transport system. One of the non-coding groups consists of a special rho-independent transcription termination signal closely following an operon. The gene neighbors of this group often appear to be involved in some way in processing RNA or DNA. A second non-coding group has, for one or both neighboring genes, a component of a system that responds to stress or to starvation for some nutrient.
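To make the terms "direct repeat" and "inverted repeat" concrete, the following is a minimal sketch that indexes every word of a fixed length and reports position pairs where the same word recurs (direct) or where its reverse complement occurs (inverted). It is not the method used in the study: the function names `reverse_complement` and `find_repeats`, the fixed word length `k`, and the toy sequence are all assumptions made for illustration, and the sketch makes no attempt to assess statistical significance.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_repeats(seq, k):
    """Return (direct, inverted) lists of start-position pairs for words of length k.

    Hypothetical helper for illustration only: exact matching, fixed k,
    no significance test.
    """
    # Index every k-letter word by the positions at which it starts.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    direct = []
    inverted = []
    for word, starts in positions.items():
        # Direct repeat: the same word at two different positions.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                direct.append((starts[a], starts[b]))
        # Inverted repeat: the word's reverse complement at a later position.
        rc = reverse_complement(word)
        if rc in positions:
            for i in starts:
                for j in positions[rc]:
                    if i < j:
                        inverted.append((i, j))
    return sorted(direct), sorted(inverted)


# Toy sequence (made up): "GATTA" occurs twice, and its reverse
# complement "TAATC" occurs once near the end.
toy = "GATTACCCCGATTAGGGTAATC"
direct_pairs, inverted_pairs = find_repeats(toy, 5)
```

A fixed word length keeps the sketch short. Finding repeats that are long enough to be statistically significant, as the abstract describes, would additionally need a null model for the expected length of chance matches, which is outside the scope of this illustration.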
- Escherichia coli, protein binding transport
- rho-independent transcription terminators
- Statistically significantly long common words
- Systems inducible by nutrient starvation