A vast amount of public RNA-sequencing datasets have been generated and used widely to study transcriptome mechanisms. These data offer precious opportunity for advancing biological research in transcriptome studies such as alternative splicing. We report the first large-scale integrated analysis of RNA-Seq data of splicing factors for systematically identifying key factors in diseases and biological processes. We analyzed 1,321 RNA-Seq libraries of various mouse tissues and cell lines, comprising more than 6.6 TB sequences from 75 independent studies that experimentally manipulated 56 splicing factors. Using these data, RNA splicing signatures and gene expression signatures were computed, and signature comparison analysis identified a list of key splicing factors in Rett syndrome and cold-induced thermogenesis. We show that cold-induced RNA-binding proteins rescue the neurite outgrowth defects in Rett syndrome using neuronal morphology analysis, and we also reveal that SRSF1 and PTBP1 are required for energy expenditure in adipocytes using metabolic flux analysis. Our study provides an integrated analysis for identifying key factors in diseases and biological processes and highlights the importance of public data resources for identifying hypotheses for experimental testing.
ASJC Scopus subject areas
- Statistics and Probability
- Information Systems
- Computer Science Applications
- Statistics, Probability and Uncertainty
- Library and Information Sciences