Evolutionary comparisons suggest many novel cAMP response protein binding sites in Escherichia coli

Charles Brown, C. G. Callan

Research output: Contribution to journal (Article)

36 Citations (Scopus)

Abstract

The cAMP response protein (CRP) is a transcription factor known to regulate many genes in Escherichia coli. Computational studies of transcription factor binding to DNA are usually based on a simple matrix model of sequence-dependent binding energy. For CRP, this model predicts many binding sites that are not known to be functional. If they are indeed spurious, the underlying binding model is called into question. We use a species comparison method to assess the functionality of a population of such predicted CRP sites in E. coli. We compare them with orthologous sites in Salmonella typhimurium identified independently by CLUSTALW alignment, and find a dependence of mutation probability on position in the site. This dependence increases with predicted site binding energy. The positions where mutation is most strongly suppressed are those where mutation would have the biggest effect on predicted binding energy. This finding suggests that many of the novel sites are functional, that the matrix model correctly estimates their binding strength, and that calculated CRP binding strength is the quantity that is conserved between species. The analysis also identifies many new E. coli binding sites and genes likely to be functional for CRP.
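The two quantities the abstract relates can be illustrated with a minimal Python sketch. It is not the authors' code: the 4-position matrix, the toy site pairs, and the function names `site_energy`, `mismatch_frequency`, and `position_sensitivity` are all hypothetical, and it assumes that lower energy means stronger binding. Real CRP sites are longer and the real matrix is fitted to known sites. The sketch scores a site with an additive per-position matrix, then compares how often aligned orthologous sites differ at each position with how much a substitution at that position could change the predicted energy.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical 4-position energy matrix (arbitrary units, invented values).
# Assumption: lower energy means stronger predicted binding.
TOY_MATRIX: List[Dict[str, float]] = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 2.0},
    {"A": 0.5, "C": 0.0, "G": 0.5, "T": 0.5},
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
    {"A": 0.5, "C": 0.5, "G": 0.5, "T": 0.0},
]

# Hypothetical aligned (E. coli, S. typhimurium) site pairs, invented for illustration.
TOY_PAIRS: List[Tuple[str, str]] = [
    ("ACGT", "ACGT"),
    ("ACGT", "AGGT"),
    ("ACGT", "ACGA"),
    ("ACGT", "ATGA"),
]


def site_energy(site: str, matrix: Sequence[Dict[str, float]]) -> float:
    """Additive matrix model: sum independent per-position energy contributions."""
    if len(site) != len(matrix):
        raise ValueError("site and matrix must have the same length")
    return sum(column[base] for column, base in zip(matrix, site))


def mismatch_frequency(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Fraction of aligned site pairs whose bases differ at each position."""
    if not pairs:
        raise ValueError("need at least one aligned pair")
    length = len(pairs[0][0])
    counts = [0] * length
    for first, second in pairs:
        if len(first) != length or len(second) != length:
            raise ValueError("all sites must have the same length")
        for i, (x, y) in enumerate(zip(first, second)):
            if x != y:
                counts[i] += 1
    return [count / len(pairs) for count in counts]


def position_sensitivity(matrix: Sequence[Dict[str, float]]) -> List[float]:
    """Largest energy change a single substitution could cause at each position."""
    return [max(column.values()) - min(column.values()) for column in matrix]


if __name__ == "__main__":
    print("energy of ACGT:", site_energy("ACGT", TOY_MATRIX))
    print("mismatch frequency:", mismatch_frequency(TOY_PAIRS))
    print("position sensitivity:", position_sensitivity(TOY_MATRIX))
```

In this toy data the positions with the largest sensitivity (the first and third) are the ones that never differ between the paired sites, which is the qualitative pattern the abstract describes for the predicted CRP sites.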

Original language: English (US)
Pages (from-to): 2404-2409
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 101
Issue number: 8
DOI: 10.1073/pnas.0308628100
State: Published, Feb 22 2004
Externally published: Yes

Fingerprint

Protein Binding
Binding Sites
Escherichia coli
Mutation
Proteins
Transcription Factors
Salmonella typhimurium
Genes
DNA
Population

ASJC Scopus subject areas

  • General

Cite this

@article{9e4b12fe59ad499993c2ffa6e0c2f9ec,
title = "Evolutionary comparisons suggest many novel cAMP response protein binding sites in Escherichia coli",
abstract = "The cAMP response protein (CRP) is a transcription factor known to regulate many genes in Escherichia coli. Computational studies of transcription factor binding to DNA are usually based on a simple matrix model of sequence-dependent binding energy. For CRP, this model predicts many binding sites that are not known to be functional. If they are indeed spurious, the underlying binding model is called into question. We use a species comparison method to assess the functionality of a population of such predicted CRP sites in E. coli. We compare them with orthologous sites in Salmonella typhimurium identified independently by CLUSTALW alignment, and find a dependence of mutation probability on position in the site. This dependence increases with predicted site binding energy. The positions where mutation is most strongly suppressed are those where mutation would have the biggest effect on predicted binding energy. This finding suggests that many of the novel sites are functional, that the matrix model correctly estimates their binding strength, and that calculated CRP binding strength is the quantity that is conserved between species. The analysis also identifies many new E. coli binding sites and genes likely to be functional for CRP.",
author = "Brown, Charles and Callan, {C. G.}",
year = "2004",
month = "2",
day = "22",
doi = "10.1073/pnas.0308628100",
language = "English (US)",
volume = "101",
pages = "2404--2409",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "8",

}

TY - JOUR

T1 - Evolutionary comparisons suggest many novel cAMP response protein binding sites in Escherichia coli

AU - Brown, Charles

AU - Callan, C. G.

PY - 2004/2/22

Y1 - 2004/2/22

N2 - The cAMP response protein (CRP) is a transcription factor known to regulate many genes in Escherichia coli. Computational studies of transcription factor binding to DNA are usually based on a simple matrix model of sequence-dependent binding energy. For CRP, this model predicts many binding sites that are not known to be functional. If they are indeed spurious, the underlying binding model is called into question. We use a species comparison method to assess the functionality of a population of such predicted CRP sites in E. coli. We compare them with orthologous sites in Salmonella typhimurium identified independently by CLUSTALW alignment, and find a dependence of mutation probability on position in the site. This dependence increases with predicted site binding energy. The positions where mutation is most strongly suppressed are those where mutation would have the biggest effect on predicted binding energy. This finding suggests that many of the novel sites are functional, that the matrix model correctly estimates their binding strength, and that calculated CRP binding strength is the quantity that is conserved between species. The analysis also identifies many new E. coli binding sites and genes likely to be functional for CRP.

UR - http://www.scopus.com/inward/record.url?scp=1442306129&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=1442306129&partnerID=8YFLogxK

U2 - 10.1073/pnas.0308628100

DO - 10.1073/pnas.0308628100

M3 - Article

C2 - 14983022

AN - SCOPUS:1442306129

VL - 101

SP - 2404

EP - 2409

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 8

ER -