Sole-Search

An integrated analysis program for peak detection and functional annotation using ChIP-seq data

Kimberly R. Blahnik, Lei Dou, Henriette O'Geen, Timothy McPhillips, Xiaoqin Xu, Alina R. Cao, Sushma Iyengar, Charles M. Nicolet, Bertram Ludäscher, Ian F. Korf, Peggy J. Farnham

Research output: Contribution to journal (Article)

76 Citations (Scopus)

Abstract

Next-generation sequencing is revolutionizing the identification of transcription factor binding sites throughout the human genome. However, the bioinformatics analysis of large datasets collected using chromatin immunoprecipitation and high-throughput sequencing is often a roadblock that impedes researchers in their attempts to gain biological insights from their experiments. We have developed integrated peak-calling and analysis software (Sole-Search) which is available through a user-friendly interface and (i) converts raw data into a format for visualization on a genome browser, (ii) outputs ranked peak locations using a statistically based method that overcomes the significant problem of false positives, (iii) identifies the gene nearest to each peak, (iv) classifies the location of each peak relative to gene structure, (v) provides information such as the number of binding sites per chromosome and per gene and (vi) allows the user to determine overlap between two different experiments. In addition, the program performs an analysis of amplified and deleted regions of the input genome. This software is web-based and automated, allowing easy and immediate access to all investigators. We demonstrate the utility of our software by collecting, analyzing and comparing ChIP-seq data for six different human transcription factors/cell line combinations.
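To make steps (iii) and (vi) of the abstract concrete, the following is a minimal sketch of nearest-gene assignment and overlap detection between two peak sets. It is not Sole-Search's actual implementation, which this record does not describe. The `Interval` type, the function names, the midpoint-based distance rule, and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical record for a peak or a gene (0-based, end-exclusive)."""
    chrom: str
    start: int
    end: int
    name: str = ""


def nearest_gene(peak: Interval, genes: List[Interval]) -> Optional[Interval]:
    """Return the gene on the peak's chromosome closest to the peak midpoint.

    Assumption: distance is 0 when the midpoint lies inside the gene,
    otherwise the distance to the nearer gene boundary.
    """
    mid = (peak.start + peak.end) // 2
    candidates = [g for g in genes if g.chrom == peak.chrom]
    if not candidates:
        return None

    def distance(g: Interval) -> int:
        if g.start <= mid < g.end:
            return 0
        return min(abs(mid - g.start), abs(mid - (g.end - 1)))

    return min(candidates, key=distance)


def overlapping_peaks(peaks_a: List[Interval], peaks_b: List[Interval]) -> List[Interval]:
    """Return the peaks in peaks_a that overlap at least one peak in peaks_b."""
    return [
        a for a in peaks_a
        if any(a.chrom == b.chrom and a.start < b.end and b.start < a.end
               for b in peaks_b)
    ]


# Toy data (invented for this example, not taken from the paper).
genes = [Interval("chr1", 1000, 5000, "GENE_A"),
         Interval("chr1", 20000, 30000, "GENE_B")]
exp1 = [Interval("chr1", 900, 1100, "p1"),
        Interval("chr1", 18000, 18200, "p2"),
        Interval("chr2", 100, 300, "p3")]
exp2 = [Interval("chr1", 1050, 1250, "q1"),
        Interval("chr2", 500, 700, "q2")]

nearest_names = [g.name if g else None
                 for g in (nearest_gene(p, genes) for p in exp1)]
shared_names = [p.name for p in overlapping_peaks(exp1, exp2)]
```

The pairwise overlap scan is quadratic and only suitable for small inputs; a genome-scale tool would more plausibly sort intervals per chromosome or use an interval tree.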

Original language: English (US)
Article number: gkp1012
Journal: Nucleic Acids Research
Volume: 38
Issue number: 3
DOI: https://doi.org/10.1093/nar/gkp1012
State: Published - Nov 10 2009

Fingerprint

Software
Transcription Factors
Binding Sites
Research Personnel
Genome
Genes
Chromatin Immunoprecipitation
Human Genome
Computational Biology
Chromosomes
Cell Line
Datasets

ASJC Scopus subject areas

  • Genetics

Cite this

Blahnik, K. R., Dou, L., O'Geen, H., McPhillips, T., Xu, X., Cao, A. R., ... Farnham, P. J. (2009). Sole-Search: An integrated analysis program for peak detection and functional annotation using ChIP-seq data. Nucleic Acids Research, 38(3), [gkp1012]. https://doi.org/10.1093/nar/gkp1012

@article{b03a5c1415414d2db682f760b2ba500b,
title = "Sole-Search: An integrated analysis program for peak detection and functional annotation using ChIP-seq data",
abstract = "Next-generation sequencing is revolutionizing the identification of transcription factor binding sites throughout the human genome. However, the bioinformatics analysis of large datasets collected using chromatin immunoprecipitation and high-throughput sequencing is often a roadblock that impedes researchers in their attempts to gain biological insights from their experiments. We have developed integrated peak-calling and analysis software (Sole-Search) which is available through a user-friendly interface and (i) converts raw data into a format for visualization on a genome browser, (ii) outputs ranked peak locations using a statistically based method that overcomes the significant problem of false positives, (iii) identifies the gene nearest to each peak, (iv) classifies the location of each peak relative to gene structure, (v) provides information such as the number of binding sites per chromosome and per gene and (vi) allows the user to determine overlap between two different experiments. In addition, the program performs an analysis of amplified and deleted regions of the input genome. This software is web-based and automated, allowing easy and immediate access to all investigators. We demonstrate the utility of our software by collecting, analyzing and comparing ChIP-seq data for six different human transcription factors/cell line combinations.",
author = "Blahnik, {Kimberly R.} and Lei Dou and Henriette O'Geen and Timothy McPhillips and Xiaoqin Xu and Cao, {Alina R.} and Sushma Iyengar and Nicolet, {Charles M.} and Bertram Lud{\"a}scher and Korf, {Ian F} and Farnham, {Peggy J.}",
year = "2009",
month = "11",
day = "10",
doi = "10.1093/nar/gkp1012",
language = "English (US)",
volume = "38",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "3",

}

TY - JOUR

T1 - Sole-Search

T2 - An integrated analysis program for peak detection and functional annotation using ChIP-seq data

AU - Blahnik, Kimberly R.

AU - Dou, Lei

AU - O'Geen, Henriette

AU - McPhillips, Timothy

AU - Xu, Xiaoqin

AU - Cao, Alina R.

AU - Iyengar, Sushma

AU - Nicolet, Charles M.

AU - Ludäscher, Bertram

AU - Korf, Ian F

AU - Farnham, Peggy J.

PY - 2009/11/10

Y1 - 2009/11/10

AB - Next-generation sequencing is revolutionizing the identification of transcription factor binding sites throughout the human genome. However, the bioinformatics analysis of large datasets collected using chromatin immunoprecipitation and high-throughput sequencing is often a roadblock that impedes researchers in their attempts to gain biological insights from their experiments. We have developed integrated peak-calling and analysis software (Sole-Search) which is available through a user-friendly interface and (i) converts raw data into a format for visualization on a genome browser, (ii) outputs ranked peak locations using a statistically based method that overcomes the significant problem of false positives, (iii) identifies the gene nearest to each peak, (iv) classifies the location of each peak relative to gene structure, (v) provides information such as the number of binding sites per chromosome and per gene and (vi) allows the user to determine overlap between two different experiments. In addition, the program performs an analysis of amplified and deleted regions of the input genome. This software is web-based and automated, allowing easy and immediate access to all investigators. We demonstrate the utility of our software by collecting, analyzing and comparing ChIP-seq data for six different human transcription factors/cell line combinations.

UR - http://www.scopus.com/inward/record.url?scp=77951230100&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77951230100&partnerID=8YFLogxK

U2 - 10.1093/nar/gkp1012

DO - 10.1093/nar/gkp1012

M3 - Article

VL - 38

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 3

M1 - gkp1012

ER -