Proteins of Escherichia coli come in sizes that are multiples of 14 kDa

Domain concepts and evolutionary implications

M. A. Savageau

Research output: Contribution to journal › Article

26 Citations (Scopus)

Abstract

Initial attempts to correlate the distribution of gene density (number of gene loci per unit length on the linkage map) with the distribution of lengths of coding sequences have led to the observation that 46% of approximately 1000 sampled proteins in Escherichia coli have molecular masses of n x 14,000 ± 2500 daltons (n = 1, 2, ...). This clustering around multiples of 14,000 contrasts with the 36% one would expect in these ranges if the sizes were uniformly distributed. The entire distribution is well fit by a sum of normal or lognormal distributions located at multiples of 14,000, which suggests that the percentage of E. coli proteins governed by the underlying sizing mechanism is much greater than 50%. Clustering of protein molecular sizes around multiples of a unit size also is suggested by the distribution of well-characterized HeLa cell proteins. The distribution of gene lengths for E. coli suggests regular clustering, which implies that the clustering of protein molecular masses is not an artifact of the molecular mass measurement by gel electrophoresis. These observations suggest the existence of a fundamental structural unit. The rather uniform size of this structural unit (without any apparent sequence homology) suggests that a general principle such as geometrical or physical optimization at the DNA or protein level is responsible. This suggestion is discussed in relation to experimental evidence for the domain structure of proteins and to existing hypotheses that attempt to account for these domains. Microevolution would appear to be accommodated by incremental changes within this fundamental unit, whereas macroevolution would appear to involve 'quantum' changes to the next stable size of protein.
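The 36% baseline quoted in the abstract can be checked with simple arithmetic. Each window of ± 2500 daltons around a multiple of 14,000 is 5000 daltons wide, so under a uniform distribution of sizes the windows would cover about 5000/14,000 of the range. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes the sampled mass range spans a whole number of 14,000-dalton periods.

```python
# Illustrative sketch; function names are hypothetical, not from the paper.
UNIT_DA = 14_000        # unit size quoted in the abstract, in daltons
HALF_WIDTH_DA = 2_500   # half-width of each window quoted in the abstract


def uniform_expected_fraction(unit: float = UNIT_DA,
                              half_width: float = HALF_WIDTH_DA) -> float:
    """Fraction of a uniform size distribution that falls in the windows.

    Assumes the range covers whole periods of length `unit`, so each
    period contributes one window of width 2 * half_width.
    """
    return 2 * half_width / unit


def in_window(mass_da: float,
              unit: float = UNIT_DA,
              half_width: float = HALF_WIDTH_DA) -> bool:
    """True if mass_da lies within n * unit ± half_width for some n >= 1."""
    n = round(mass_da / unit)  # nearest multiple of the unit size
    return n >= 1 and abs(mass_da - n * unit) <= half_width


if __name__ == "__main__":
    print(f"{uniform_expected_fraction():.1%}")  # about 36%, as in the abstract
    print(in_window(30_000))  # 28,000 + 2000 lies inside the n = 2 window
```

Applying `in_window` to a list of measured masses and taking the mean would give the observed fraction to compare against this baseline (46% in the abstract).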

Original language: English (US)
Pages (from-to): 1198-1202
Number of pages: 5
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 83
Issue number: 5
State: Published - 1986
Externally published: Yes

Fingerprint

Escherichia coli Proteins
Cluster Analysis
Proteins
Genes
Sequence Homology
HeLa Cells
Artifacts
Electrophoresis
Gels
Escherichia coli
DNA

ASJC Scopus subject areas

  • General
  • Genetics

Cite this

@article{cade58af757a4bc9a65c4d00ae4f50c1,
title = "Proteins of Escherichia coli come in sizes that are multiples of 14 kDa: Domain concepts and evolutionary implications",
abstract = "Initial attempts to correlate the distribution of gene density (number of gene loci per unit length on the linkage map) with the distribution of lengths of coding sequences have led to the observation that 46{\%} of approximately 1000 sampled proteins in Escherichia coli have molecular masses of n x 14,000 ± 2500 daltons (n = 1, 2, ...). This clustering around multiples of 14,000 contrasts with the 36{\%} one would expect in these ranges if the sizes were uniformly distributed. The entire distribution is well fit by a sum of normal or lognormal distributions located at multiples of 14,000, which suggests that the percentage of E. coli proteins governed by the underlying sizing mechanism is much greater than 50{\%}. Clustering of protein molecular sizes around multiples of a unit size also is suggested by the distribution of well-characterized HeLa cell proteins. The distribution of gene lengths for E. coli suggests regular clustering, which implies that the clustering of protein molecular masses is not an artifact of the molecular mass measurement by gel electrophoresis. These observations suggest the existence of a fundamental structural unit. The rather uniform size of this structural unit (without any apparent sequence homology) suggests that a general principle such as geometrical or physical optimization at the DNA or protein level is responsible. This suggestion is discussed in relation to experimental evidence for the domain structure of proteins and to existing hypotheses that attempt to account for these domains. Microevolution would appear to be accommodated by incremental changes within this fundamental unit, whereas macroevolution would appear to involve 'quantum' changes to the next stable size of protein.",
author = "Savageau, {M. A.}",
year = "1986",
language = "English (US)",
volume = "83",
pages = "1198--1202",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "5",

}

TY - JOUR

T1 - Proteins of Escherichia coli come in sizes that are multiples of 14 kDa

T2 - Domain concepts and evolutionary implications

AU - Savageau, M. A.

PY - 1986

Y1 - 1986

N2 - Initial attempts to correlate the distribution of gene density (number of gene loci per unit length on the linkage map) with the distribution of lengths of coding sequences have led to the observation that 46% of approximately 1000 sampled proteins in Escherichia coli have molecular masses of n x 14,000 ± 2500 daltons (n = 1, 2, ...). This clustering around multiples of 14,000 contrasts with the 36% one would expect in these ranges if the sizes were uniformly distributed. The entire distribution is well fit by a sum of normal or lognormal distributions located at multiples of 14,000, which suggests that the percentage of E. coli proteins governed by the underlying sizing mechanism is much greater than 50%. Clustering of protein molecular sizes around multiples of a unit size also is suggested by the distribution of well-characterized HeLa cell proteins. The distribution of gene lengths for E. coli suggests regular clustering, which implies that the clustering of protein molecular masses is not an artifact of the molecular mass measurement by gel electrophoresis. These observations suggest the existence of a fundamental structural unit. The rather uniform size of this structural unit (without any apparent sequence homology) suggests that a general principle such as geometrical or physical optimization at the DNA or protein level is responsible. This suggestion is discussed in relation to experimental evidence for the domain structure of proteins and to existing hypotheses that attempt to account for these domains. Microevolution would appear to be accommodated by incremental changes within this fundamental unit, whereas macroevolution would appear to involve 'quantum' changes to the next stable size of protein.

UR - http://www.scopus.com/inward/record.url?scp=0040950837&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0040950837&partnerID=8YFLogxK

M3 - Article

VL - 83

SP - 1198

EP - 1202

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 5

ER -