Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products

Christopher W. Beitel, Lutz Froenicke, Jenna M. Lang, Ian F Korf, Richard W Michelmore, Jonathan A Eisen, Aaron E. Darling

Research output: Contribution to journal › Article

42 Citations (Scopus)

Abstract

Metagenomics is a valuable tool for the study of microbial communities but has been limited by the difficulty of "binning" the resulting sequences into groups corresponding to the individual species and strains that constitute the community. Moreover, there are presently no methods to track the flow of mobile DNA elements such as plasmids through communities or to determine which of these are co-localized within the same cell. We address these limitations by applying Hi-C, a technology originally designed for the study of three-dimensional genome structure in eukaryotes, to measure the cellular co-localization of DNA sequences. We leveraged Hi-C data generated from a simple synthetic metagenome sample to accurately cluster metagenome assembly contigs into groups that contain nearly complete genomes of each species. The Hi-C data also reliably associated plasmids with the chromosomes of their host and with each other. We further demonstrated that Hi-C data provides a long-range signal of strain-specific genotypes, indicating such data may be useful for high-resolution genotyping of microbial populations. Our work demonstrates that Hi-C sequencing data provide valuable information for metagenome analyses that are not currently obtainable by other methods. This metagenomic Hi-C method could facilitate future studies of the fine-scale population structure of microbes, as well as studies of how antibiotic resistance plasmids (or other genetic elements) mobilize in microbial communities. The method is not limited to microbiology; the genetic architecture of other heterogeneous populations of cells could also be studied with this technique.
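The clustering idea in the abstract can be illustrated with a small sketch. The keyword list names Markov clustering, so the example below runs a minimal Markov-clustering loop (expansion, then inflation) over a toy table of Hi-C link counts between contigs. This is not the authors' pipeline: the function names, contig names, link counts, self-loop weighting, and parameter values are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product, adequate for a toy example."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(contigs, links, inflation=2.0, iterations=50, eps=1e-6):
    """Group contigs by Hi-C link counts with a minimal Markov-clustering loop.

    `links` maps (contig_a, contig_b) to a count of Hi-C read pairs joining
    the two contigs. All parameter defaults are illustrative assumptions.
    """
    n = len(contigs)
    idx = {name: i for i, name in enumerate(contigs)}

    # Symmetric weighted adjacency matrix built from the link counts.
    m = [[0.0] * n for _ in range(n)]
    for (a, b), count in links.items():
        i, j = idx[a], idx[b]
        m[i][j] += count
        m[j][i] += count

    # Self-loops weighted by each column's maximum (an assumption of this
    # sketch) keep the random walk from oscillating between two nodes.
    for j in range(n):
        col_max = max(m[i][j] for i in range(n))
        m[j][j] = col_max if col_max > 0 else 1.0

    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)                                   # expansion
        m = [[v ** inflation for v in row] for row in m]   # inflation
        m = normalize_columns(m)

    # Contigs that still share flow after the iterations end up in one
    # cluster; union-find merges them.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > eps:
                parent[find(i)] = find(j)

    groups = defaultdict(set)
    for name, i in idx.items():
        groups[find(i)].add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical toy input: two cells' worth of contigs, one carrying a plasmid.
contigs = ["A_chr", "A_plasmid", "B_chr1", "B_chr2"]
links = {
    ("A_chr", "A_plasmid"): 10,   # plasmid co-localized with its host chromosome
    ("B_chr1", "B_chr2"): 10,     # two contigs from the same genome
    ("A_plasmid", "B_chr1"): 1,   # rare spurious inter-cell ligation
}
clusters = markov_cluster(contigs, links)
print(clusters)
```

In this toy input the single spurious link between the two groups is suppressed by inflation, so the plasmid contig stays with its host chromosome while the other genome's contigs form a separate cluster. Real Hi-C link counts would typically need normalization before clustering, which this sketch omits.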

Original language: English (US)
Article number: e415
Journal: PeerJ
Volume: 2014
Issue number: 1
DOI: 10.7717/peerj.415
State: Published - 2014

Fingerprint

Metagenome
Deconvolution
Ligation
Plasmids
Metagenomics
Genes
Microbiology
DNA sequences
Genome
microbial communities
Chromosomes
Population
Microbial Drug Resistance
Eukaryota
methodology
Cells
Anti-Bacterial Agents

Keywords

  • Genome scaffolding
  • Haplotype phasing
  • Hi-C
  • Markov clustering
  • Metagenome assembly
  • Metagenomics
  • Microbial ecology
  • Plasmids
  • Strain differentiation
  • Synthetic microbial communities

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)
  • Neuroscience(all)

Cite this

Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products. / Beitel, Christopher W.; Froenicke, Lutz; Lang, Jenna M.; Korf, Ian F; Michelmore, Richard W; Eisen, Jonathan A; Darling, Aaron E.

In: PeerJ, Vol. 2014, No. 1, e415, 2014.

Beitel, Christopher W. ; Froenicke, Lutz ; Lang, Jenna M. ; Korf, Ian F ; Michelmore, Richard W ; Eisen, Jonathan A ; Darling, Aaron E. / Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products. In: PeerJ. 2014 ; Vol. 2014, No. 1.
@article{15df3e668f134c3fb6bad106c0c383da,
title = "Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products",
abstract = "Metagenomics is a valuable tool for the study of microbial communities but has been limited by the difficulty of {"}binning{"} the resulting sequences into groups corresponding to the individual species and strains that constitute the community. Moreover, there are presently no methods to track the flow of mobile DNA elements such as plasmids through communities or to determine which of these are co-localized within the same cell. We address these limitations by applying Hi-C, a technology originally designed for the study of three-dimensional genome structure in eukaryotes, to measure the cellular co-localization of DNA sequences. We leveraged Hi-C data generated from a simple synthetic metagenome sample to accurately cluster metagenome assembly contigs into groups that contain nearly complete genomes of each species. The Hi-C data also reliably associated plasmids with the chromosomes of their host and with each other. We further demonstrated that Hi-C data provides a long-range signal of strain-specific genotypes, indicating such data may be useful for high-resolution genotyping of microbial populations. Our work demonstrates that Hi-C sequencing data provide valuable information for metagenome analyses that are not currently obtainable by other methods. This metagenomic Hi-C method could facilitate future studies of the fine-scale population structure of microbes, as well as studies of how antibiotic resistance plasmids (or other genetic elements) mobilize in microbial communities. The method is not limited to microbiology; the genetic architecture of other heterogeneous populations of cells could also be studied with this technique.",
keywords = "Genome scaffolding, Haplotype phasing, Hi-C, Markov clustering, Metagenome assembly, Metagenomics, Microbial ecology, Plasmids, Strain differentiation, Synthetic microbial communities",
author = "Beitel, {Christopher W.} and Lutz Froenicke and Lang, {Jenna M.} and Korf, {Ian F} and Michelmore, {Richard W} and Eisen, {Jonathan A} and Darling, {Aaron E.}",
year = "2014",
doi = "10.7717/peerj.415",
language = "English (US)",
volume = "2014",
journal = "PeerJ",
issn = "2167-8359",
publisher = "PeerJ",
number = "1",

}

TY - JOUR
T1 - Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products
AU - Beitel, Christopher W.
AU - Froenicke, Lutz
AU - Lang, Jenna M.
AU - Korf, Ian F
AU - Michelmore, Richard W
AU - Eisen, Jonathan A
AU - Darling, Aaron E.
PY - 2014
Y1 - 2014

AB - Metagenomics is a valuable tool for the study of microbial communities but has been limited by the difficulty of "binning" the resulting sequences into groups corresponding to the individual species and strains that constitute the community. Moreover, there are presently no methods to track the flow of mobile DNA elements such as plasmids through communities or to determine which of these are co-localized within the same cell. We address these limitations by applying Hi-C, a technology originally designed for the study of three-dimensional genome structure in eukaryotes, to measure the cellular co-localization of DNA sequences. We leveraged Hi-C data generated from a simple synthetic metagenome sample to accurately cluster metagenome assembly contigs into groups that contain nearly complete genomes of each species. The Hi-C data also reliably associated plasmids with the chromosomes of their host and with each other. We further demonstrated that Hi-C data provides a long-range signal of strain-specific genotypes, indicating such data may be useful for high-resolution genotyping of microbial populations. Our work demonstrates that Hi-C sequencing data provide valuable information for metagenome analyses that are not currently obtainable by other methods. This metagenomic Hi-C method could facilitate future studies of the fine-scale population structure of microbes, as well as studies of how antibiotic resistance plasmids (or other genetic elements) mobilize in microbial communities. The method is not limited to microbiology; the genetic architecture of other heterogeneous populations of cells could also be studied with this technique.

KW - Genome scaffolding
KW - Haplotype phasing
KW - Hi-C
KW - Markov clustering
KW - Metagenome assembly
KW - Metagenomics
KW - Microbial ecology
KW - Plasmids
KW - Strain differentiation
KW - Synthetic microbial communities
UR - http://www.scopus.com/inward/record.url?scp=84903835646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903835646&partnerID=8YFLogxK
U2 - 10.7717/peerj.415
DO - 10.7717/peerj.415
M3 - Article
VL - 2014
JO - PeerJ
JF - PeerJ
SN - 2167-8359
IS - 1
M1 - e415
ER -