Introducing W.A.T.E.R.S.

A Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences

Amber L. Hartman, Sean Riddle, Timothy McPhillips, Bertram Ludäscher, Jonathan A Eisen

Research output: Contribution to journal › Article

27 Citations (Scopus)

Abstract

Background: For more than two decades microbiologists have used a highly conserved microbial gene as a phylogenetic marker for bacteria and archaea. The small-subunit ribosomal RNA gene, also known as 16S rRNA, is encoded by ribosomal DNA, 16S rDNA, and has provided a powerful comparative tool to microbial ecologists. Over time, the microbial ecology field has matured from small-scale studies in a select number of environments to massive collections of sequence data that are paired with dozens of corresponding collection variables. As the complexity of data and tool sets has grown, the need for flexible automation and maintenance of the core processes of 16S rDNA sequence analysis has increased correspondingly.

Results: We present WATERS, an integrated approach for 16S rDNA analysis that bundles a suite of publicly available 16S rDNA analysis software tools into a single software package. The "toolkit" includes sequence alignment, chimera removal, OTU determination, taxonomy assignment, and phylogenetic tree construction, as well as a host of ecological analysis and visualization tools. WATERS employs a flexible, collection-oriented 'workflow' approach using the open-source Kepler system as a platform.

Conclusions: By packaging available software tools into a single automated workflow, WATERS simplifies 16S rDNA analyses, especially for those without specialized bioinformatics or programming expertise. In addition, WATERS, like some of the newer comprehensive rRNA analysis tools, allows researchers to minimize the time dedicated to carrying out tedious informatics steps and to focus their attention instead on the biological interpretation of the results. One advantage of WATERS over other comprehensive tools is that the use of the Kepler workflow system facilitates result interpretation and reproducibility via a data provenance sub-system. Furthermore, new "actors" can be added to the workflow as desired, and we see WATERS as an initial seed for a sizeable and growing repository of interoperable, easy-to-combine tools for asking increasingly complex microbial ecology questions.
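The workflow idea in the abstract (a chain of pluggable "actors" with a provenance record of each step) can be illustrated with a minimal Python sketch. This is not the WATERS or Kepler API: the `Actor` and `Workflow` classes and the two toy steps (`drop_short`, `dereplicate`) are hypothetical placeholders, chosen only to show how steps might be chained and how a provenance log could be kept.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Actor:
    """Hypothetical workflow step: a name plus a function over a list of sequences."""
    name: str
    run: Callable[[List[str]], List[str]]


@dataclass
class Workflow:
    """Toy pipeline that runs actors in order and logs what each one did."""
    actors: List[Actor] = field(default_factory=list)
    provenance: List[Dict[str, object]] = field(default_factory=list)

    def add_actor(self, actor: Actor) -> "Workflow":
        # New steps can be appended without touching the existing ones.
        self.actors.append(actor)
        return self

    def execute(self, sequences: List[str]) -> List[str]:
        data = list(sequences)
        for actor in self.actors:
            result = actor.run(data)
            # Record input and output sizes so the run can be inspected later.
            self.provenance.append(
                {"actor": actor.name, "n_in": len(data), "n_out": len(result)}
            )
            data = result
        return data


# Placeholder steps (assumptions, not the real alignment or OTU tools).
def drop_short(seqs: List[str]) -> List[str]:
    return [s for s in seqs if len(s) >= 8]


def dereplicate(seqs: List[str]) -> List[str]:
    return sorted(set(seqs))


wf = Workflow()
wf.add_actor(Actor("drop_short", drop_short)).add_actor(Actor("dereplicate", dereplicate))

reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGT", "TTGACCGTAA"]
result = wf.execute(reads)
for entry in wf.provenance:
    print(entry)
```

Keeping the provenance log separate from the data means a later reader can see which step changed the number of sequences, which is the kind of traceability the abstract attributes to the Kepler provenance sub-system.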

Original language: English (US)
Article number: 317
Journal: BMC Bioinformatics
Volume: 11
DOI: 10.1186/1471-2105-11-317
State: Published - Jun 12 2010

Fingerprint

Workflow
Taxonomies
Ecology
Taxonomy
Ribosomal DNA
Work Flow
Alignment
Kepler
Software
Software Tools
Microbial Genes
Genes
Gene
Informatics
Provenance
Ribosomal RNA
Sequence Alignment
Automation
Archaea
Reproducibility

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Structural Biology
  • Applied Mathematics

Cite this

Introducing W.A.T.E.R.S. A Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences. / Hartman, Amber L.; Riddle, Sean; McPhillips, Timothy; Ludäscher, Bertram; Eisen, Jonathan A.

In: BMC Bioinformatics, Vol. 11, 317, 12.06.2010.

Hartman, Amber L. ; Riddle, Sean ; McPhillips, Timothy ; Ludäscher, Bertram ; Eisen, Jonathan A. / Introducing W.A.T.E.R.S. A Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences. In: BMC Bioinformatics. 2010 ; Vol. 11.
@article{b0249eb28d8e4c119789ffc2f9240ae8,
title = "Introducing W.A.T.E.R.S.: A Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences",
abstract = "Background: For more than two decades microbiologists have used a highly conserved microbial gene as a phylogenetic marker for bacteria and archaea. The small-subunit ribosomal RNA gene, also known as 16S rRNA, is encoded by ribosomal DNA, 16S rDNA, and has provided a powerful comparative tool to microbial ecologists. Over time, the microbial ecology field has matured from small-scale studies in a select number of environments to massive collections of sequence data that are paired with dozens of corresponding collection variables. As the complexity of data and tool sets has grown, the need for flexible automation and maintenance of the core processes of 16S rDNA sequence analysis has increased correspondingly. Results: We present WATERS, an integrated approach for 16S rDNA analysis that bundles a suite of publicly available 16S rDNA analysis software tools into a single software package. The {"}toolkit{"} includes sequence alignment, chimera removal, OTU determination, taxonomy assignment, and phylogenetic tree construction, as well as a host of ecological analysis and visualization tools. WATERS employs a flexible, collection-oriented 'workflow' approach using the open-source Kepler system as a platform. Conclusions: By packaging available software tools into a single automated workflow, WATERS simplifies 16S rDNA analyses, especially for those without specialized bioinformatics or programming expertise. In addition, WATERS, like some of the newer comprehensive rRNA analysis tools, allows researchers to minimize the time dedicated to carrying out tedious informatics steps and to focus their attention instead on the biological interpretation of the results. One advantage of WATERS over other comprehensive tools is that the use of the Kepler workflow system facilitates result interpretation and reproducibility via a data provenance sub-system. Furthermore, new {"}actors{"} can be added to the workflow as desired, and we see WATERS as an initial seed for a sizeable and growing repository of interoperable, easy-to-combine tools for asking increasingly complex microbial ecology questions.",
author = "Hartman, {Amber L.} and Sean Riddle and Timothy McPhillips and Bertram Lud{\"a}scher and Eisen, {Jonathan A}",
year = "2010",
month = "6",
day = "12",
doi = "10.1186/1471-2105-11-317",
language = "English (US)",
volume = "11",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Introducing W.A.T.E.R.S.

T2 - A Workflow for the Alignment, Taxonomy, and Ecology of Ribosomal Sequences

AU - Hartman, Amber L.

AU - Riddle, Sean

AU - McPhillips, Timothy

AU - Ludäscher, Bertram

AU - Eisen, Jonathan A

PY - 2010/6/12

Y1 - 2010/6/12

AB - Background: For more than two decades microbiologists have used a highly conserved microbial gene as a phylogenetic marker for bacteria and archaea. The small-subunit ribosomal RNA gene, also known as 16S rRNA, is encoded by ribosomal DNA, 16S rDNA, and has provided a powerful comparative tool to microbial ecologists. Over time, the microbial ecology field has matured from small-scale studies in a select number of environments to massive collections of sequence data that are paired with dozens of corresponding collection variables. As the complexity of data and tool sets has grown, the need for flexible automation and maintenance of the core processes of 16S rDNA sequence analysis has increased correspondingly. Results: We present WATERS, an integrated approach for 16S rDNA analysis that bundles a suite of publicly available 16S rDNA analysis software tools into a single software package. The "toolkit" includes sequence alignment, chimera removal, OTU determination, taxonomy assignment, and phylogenetic tree construction, as well as a host of ecological analysis and visualization tools. WATERS employs a flexible, collection-oriented 'workflow' approach using the open-source Kepler system as a platform. Conclusions: By packaging available software tools into a single automated workflow, WATERS simplifies 16S rDNA analyses, especially for those without specialized bioinformatics or programming expertise. In addition, WATERS, like some of the newer comprehensive rRNA analysis tools, allows researchers to minimize the time dedicated to carrying out tedious informatics steps and to focus their attention instead on the biological interpretation of the results. One advantage of WATERS over other comprehensive tools is that the use of the Kepler workflow system facilitates result interpretation and reproducibility via a data provenance sub-system. Furthermore, new "actors" can be added to the workflow as desired, and we see WATERS as an initial seed for a sizeable and growing repository of interoperable, easy-to-combine tools for asking increasingly complex microbial ecology questions.

UR - http://www.scopus.com/inward/record.url?scp=77953313654&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77953313654&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-11-317

DO - 10.1186/1471-2105-11-317

M3 - Article

VL - 11

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 317

ER -