scBFA: Modeling detection patterns to mitigate technical noise in large-scale single-cell genomics data

Ruoxin Li, Gerald Quon

Research output: Contribution to journal › Article

Abstract

Technical variation in feature measurements, such as gene expression and locus accessibility, is a key challenge of large-scale single-cell genomic datasets. We show that this technical variation in both scRNA-seq and scATAC-seq datasets can be mitigated by analyzing feature detection patterns alone and ignoring feature quantification measurements. This result holds when datasets have low detection noise relative to quantification noise. We demonstrate state-of-the-art performance of detection pattern models using our new framework, scBFA, for both cell type identification and trajectory inference. Performance gains can also be realized in one line of R code in existing pipelines.
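As a minimal sketch of what "analyzing feature detection patterns alone" could look like in practice, the snippet below binarizes a small count matrix (detected vs. not detected) and compares cells on that basis only. The toy counts, the function names `detection_pattern` and `hamming_distance`, and the use of Hamming distance are all assumptions made for illustration; this is not the scBFA model itself, nor the one-line R change the abstract mentions.

```python
def detection_pattern(counts):
    """Binarize a cells x features count matrix: 1 if the feature was detected, else 0."""
    return [[1 if value > 0 else 0 for value in row] for row in counts]


def hamming_distance(a, b):
    """Count the features whose detection status differs between two cells."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy data: 4 cells (rows) x 5 genes (columns) of raw counts.
counts = [
    [0, 3, 0, 12, 1],
    [0, 7, 0, 5, 0],
    [4, 0, 9, 0, 0],
    [1, 0, 2, 0, 0],
]

# Quantification is discarded; only the 0/1 detection pattern is kept.
detected = detection_pattern(counts)

# Pairwise distances between cells computed from detection patterns alone.
distances = [[hamming_distance(a, b) for b in detected] for a in detected]
```

In this toy example the third and fourth cells differ in their counts but share an identical detection pattern, so their distance is zero. A real pipeline would presumably feed the binarized matrix into a dimensionality reduction method rather than a raw distance.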

Original language: English (US)
Article number: 193
Journal: Genome Biology
Volume: 20
Issue number: 1
DOI: 10.1186/s13059-019-1806-0
State: Published - Sep 9 2019

Keywords

  • Cell type identification
  • Dimensionality reduction
  • Gene detection
  • Gene quantification
  • scATAC-seq
  • scRNA-seq
  • Technical noise
  • Trajectory inference
  • Variable gene selection

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Cell Biology

Cite this

scBFA: Modeling detection patterns to mitigate technical noise in large-scale single-cell genomics data. / Li, Ruoxin; Quon, Gerald.

In: Genome Biology, Vol. 20, No. 1, 193, 09.09.2019.

@article{1aa90abcf69243f0bbab022e611b7957,
title = "{scBFA}: Modeling detection patterns to mitigate technical noise in large-scale single-cell genomics data",
abstract = "Technical variation in feature measurements, such as gene expression and locus accessibility, is a key challenge of large-scale single-cell genomic datasets. We show that this technical variation in both scRNA-seq and scATAC-seq datasets can be mitigated by analyzing feature detection patterns alone and ignoring feature quantification measurements. This result holds when datasets have low detection noise relative to quantification noise. We demonstrate state-of-the-art performance of detection pattern models using our new framework, scBFA, for both cell type identification and trajectory inference. Performance gains can also be realized in one line of R code in existing pipelines.",
keywords = "Cell type identification, Dimensionality reduction, Gene detection, Gene quantification, scATAC-seq, scRNA-seq, Technical noise, Trajectory inference, Variable gene selection",
author = "Ruoxin Li and Gerald Quon",
year = "2019",
month = "9",
day = "9",
doi = "10.1186/s13059-019-1806-0",
language = "English (US)",
volume = "20",
journal = "Genome Biology",
issn = "1465-6914",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - scBFA: Modeling detection patterns to mitigate technical noise in large-scale single-cell genomics data

AU - Li, Ruoxin

AU - Quon, Gerald

PY - 2019/9/9

Y1 - 2019/9/9

AB - Technical variation in feature measurements, such as gene expression and locus accessibility, is a key challenge of large-scale single-cell genomic datasets. We show that this technical variation in both scRNA-seq and scATAC-seq datasets can be mitigated by analyzing feature detection patterns alone and ignoring feature quantification measurements. This result holds when datasets have low detection noise relative to quantification noise. We demonstrate state-of-the-art performance of detection pattern models using our new framework, scBFA, for both cell type identification and trajectory inference. Performance gains can also be realized in one line of R code in existing pipelines.

KW - Cell type identification

KW - Dimensionality reduction

KW - Gene detection

KW - Gene quantification

KW - scATAC-seq

KW - scRNA-seq

KW - Technical noise

KW - Trajectory inference

KW - Variable gene selection

UR - http://www.scopus.com/inward/record.url?scp=85071974984&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071974984&partnerID=8YFLogxK

U2 - 10.1186/s13059-019-1806-0

DO - 10.1186/s13059-019-1806-0

M3 - Article

C2 - 31500668

AN - SCOPUS:85071974984

VL - 20

JO - Genome Biology

JF - Genome Biology

SN - 1465-6914

IS - 1

M1 - 193

ER -