Motivation: The diverse functions of RNA can be attributed to its capacity to form complex and varied structures. The recent proliferation of structure probing techniques coupled with high-throughput sequencing has helped RNA studies expand in both scope and depth. Despite differences in technique, most experiments face similar reproducibility challenges owing to the stochastic nature of chemical probing and sequencing. As these protocols expand to transcriptome-wide studies, quality control becomes more daunting. General and efficient methodologies are needed to quantify variability and quality across the wide range of current and emerging structure probing experiments.

Results: We develop metrics to rapidly and quantitatively evaluate data quality from structure probing experiments, and demonstrate their efficacy on both small synthetic libraries and transcriptome-wide datasets. We use a signal-to-noise ratio concept to evaluate replicate agreement, which can identify high-quality data. We also consider and compare two methods for assessing the variability inherent in probing experiments, which we then use to evaluate the coverage adjustments needed to meet a desired quality level. The developed metrics and tools will be useful in summarizing large-scale datasets and will help standardize quality control in the field.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics