scAlign

A tool for alignment, integration, and rare cell identification from scRNA-seq data

Nelson Johansen, Gerald Quon

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

scRNA-seq dataset integration occurs in different contexts, such as the identification of cell type-specific differences in gene expression across conditions or species, or batch effect correction. We present scAlign, an unsupervised deep learning method for data integration that can incorporate partial, overlapping, or a complete set of cell labels, and estimate per-cell differences in gene expression across datasets. scAlign performance is state-of-the-art and robust to cross-dataset variation in cell type-specific expression and cell type composition. We demonstrate that scAlign reveals gene expression programs for rare populations of malaria parasites. Our framework is widely applicable to integration challenges in other domains.

Original language: English (US)
Article number: 166
Journal: Genome Biology
Volume: 20
Issue number: 1
DOI: 10.1186/s13059-019-1766-4
State: Published - Aug 14, 2019

Fingerprint

Small Cytoplasmic RNA
Gene Expression
Malaria
Cells
Parasites
Learning
Population Control
Alignment

Keywords

  • Alignment
  • Batch effects
  • Data harmonization
  • Data integration
  • Deep learning
  • Domain adaptation
  • Neural networks
  • Response to stimulus
  • scRNA-seq

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Genetics
  • Cell Biology

Cite this

scAlign: A tool for alignment, integration, and rare cell identification from scRNA-seq data. / Johansen, Nelson; Quon, Gerald.

In: Genome Biology, Vol. 20, No. 1, 166, 14.08.2019.


@article{a16e760c049740889e213dc6a91845b6,
title = "{scAlign}: A tool for alignment, integration, and rare cell identification from {scRNA-seq} data",
abstract = "scRNA-seq dataset integration occurs in different contexts, such as the identification of cell type-specific differences in gene expression across conditions or species, or batch effect correction. We present scAlign, an unsupervised deep learning method for data integration that can incorporate partial, overlapping, or a complete set of cell labels, and estimate per-cell differences in gene expression across datasets. scAlign performance is state-of-the-art and robust to cross-dataset variation in cell type-specific expression and cell type composition. We demonstrate that scAlign reveals gene expression programs for rare populations of malaria parasites. Our framework is widely applicable to integration challenges in other domains.",
keywords = "Alignment, Batch effects, Data harmonization, Data integration, Deep learning, Domain adaptation, Neural networks, Response to stimulus, scRNA-seq",
author = "Nelson Johansen and Gerald Quon",
year = "2019",
month = "8",
day = "14",
doi = "10.1186/s13059-019-1766-4",
language = "English (US)",
volume = "20",
journal = "Genome Biology",
issn = "1465-6914",
publisher = "BioMed Central",
number = "1",

}

TY  - JOUR
T1  - scAlign
T2  - A tool for alignment, integration, and rare cell identification from scRNA-seq data
AU  - Johansen, Nelson
AU  - Quon, Gerald
PY  - 2019/8/14
Y1  - 2019/8/14
N2  - scRNA-seq dataset integration occurs in different contexts, such as the identification of cell type-specific differences in gene expression across conditions or species, or batch effect correction. We present scAlign, an unsupervised deep learning method for data integration that can incorporate partial, overlapping, or a complete set of cell labels, and estimate per-cell differences in gene expression across datasets. scAlign performance is state-of-the-art and robust to cross-dataset variation in cell type-specific expression and cell type composition. We demonstrate that scAlign reveals gene expression programs for rare populations of malaria parasites. Our framework is widely applicable to integration challenges in other domains.
AB  - scRNA-seq dataset integration occurs in different contexts, such as the identification of cell type-specific differences in gene expression across conditions or species, or batch effect correction. We present scAlign, an unsupervised deep learning method for data integration that can incorporate partial, overlapping, or a complete set of cell labels, and estimate per-cell differences in gene expression across datasets. scAlign performance is state-of-the-art and robust to cross-dataset variation in cell type-specific expression and cell type composition. We demonstrate that scAlign reveals gene expression programs for rare populations of malaria parasites. Our framework is widely applicable to integration challenges in other domains.
KW  - Alignment
KW  - Batch effects
KW  - Data harmonization
KW  - Data integration
KW  - Deep learning
KW  - Domain adaptation
KW  - Neural networks
KW  - Response to stimulus
KW  - scRNA-seq
UR  - http://www.scopus.com/inward/record.url?scp=85071042979&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85071042979&partnerID=8YFLogxK
U2  - 10.1186/s13059-019-1766-4
DO  - 10.1186/s13059-019-1766-4
M3  - Article
VL  - 20
JO  - Genome Biology
JF  - Genome Biology
SN  - 1465-6914
IS  - 1
M1  - 166
ER  - 