Soft windowing application to improve analysis of high-throughput phenotyping data

Hamed Haselimashhadi, Jeremy C. Mason, Violeta Munoz-Fuentes, Federico López-Gómez, Kolawole Babalola, Elif F. Acar, Vivek Kumar, Jacqui White, Ann M. Flenniken, Ruairidh King, Ewan Straiton, John Richard Seavitt, Angelina Gaspero, Arturo Garza, Audrey E. Christianson, Chih Wei Hsu, Corey L. Reynolds, Denise G. Lanza, Isabel Lorenzo, Jennie R. Green, Juan J. Gallegos, Ritu Bohat, Rodney C. Samaco, Surabi Veeraragavan, Jong Kyoung Kim, Gregor Miller, Helmut Fuchs, Lillian Garrett, Lore Becker, Yeon Kyung Kang, David Clary, Soo Young Cho, Masaru Tamura, Nobuhiko Tanaka, Kyung Dong Soo, Alexandr Bezginov, Ghina Bou About, Marie France Champy, Laurent Vasseur, Sophie Leblanc, Hamid Meziane, Mohammed Selloum, Patrick T. Reilly, Nadine Spielmann, Holger Maier, Valerie Gailus-Durner, Tania Sorg, Masuya Hiroshi, Obata Yuichi, Jason D. Heaney, Mary E. Dickinson, Wurst Wolfgang, Glauco P. Tocchini-Valentini, Kevin C. Kent Lloyd, Colin McKerlie, Je Kyung Seong, Herault Yann, Martin Hrabé De Angelis, Steve D.M. Brown, Damian Smedley, Paul Flicek, Ann Marie Mallon, Helen Parkinson, Terrence F. Meehan, Russell Schwartz

Research output: Contribution to journal (Article)

Abstract

Motivation: High-throughput phenomic projects generate complex data from small treatment groups and large control groups. The large control groups increase the power of the analyses but introduce variation over time. A method is needed to utilize a set of temporally local controls that maximizes analytic power while minimizing noise from unspecified environmental factors.

Results: Here we introduce 'soft windowing', a methodological approach that selects a window of time that includes the most appropriate controls for analysis. Using phenotype data from the International Mouse Phenotyping Consortium (IMPC), adaptive windows were applied such that control data collected proximally to mutants were assigned the maximal weight, while data collected earlier or later had less weight. We applied this method to IMPC data and compared the results with those obtained from a standard non-windowed approach. Validation was performed using a resampling approach, in which we demonstrate a 10% reduction of false positives from 2.5 million analyses. We applied the method to our production analysis pipeline, which establishes genotype-phenotype associations by comparing mutant versus control data. From a set of 2082 mutant mouse lines, we report an increase of 30% in significant P-values, as well as linkage via phenotype overlap to 106 disease models with the soft-windowed approach versus 99 with the non-windowed approach. Our method is generalizable and can benefit large-scale human phenomic projects such as the UK Biobank and the All of Us resources.
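The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the logistic fall-off, the function names `soft_window_weights` and `weighted_mean`, and the parameters `half_width` and `sharpness` are all assumptions chosen for illustration. The sketch only shows the general principle that controls collected close in time to the mutants receive weights near the maximum, while earlier or later controls receive smaller weights.

```python
import math


def soft_window_weights(control_days, mutant_day, half_width=30.0, sharpness=0.2):
    """Return one weight per control, based on distance in days from the mutant.

    Hypothetical weighting (an assumption, not the published window function):
    a logistic fall-off that is close to 1 inside the window, exactly 0.5 at
    the window edge (distance == half_width), and close to 0 far outside it.
    """
    weights = []
    for day in control_days:
        distance = abs(day - mutant_day)
        # Clamp the exponent so math.exp cannot overflow for very distant controls.
        exponent = min(sharpness * (distance - half_width), 700.0)
        weights.append(1.0 / (1.0 + math.exp(exponent)))
    return weights


def weighted_mean(values, weights):
    """Weighted average of control measurements using the soft-window weights."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Controls measured on the mutant's day, at the window edge, and long before.
example_weights = soft_window_weights([100, 70, -100], mutant_day=100)
```

In this sketch a larger `sharpness` makes the window edge more abrupt, approaching a hard cut-off, and a larger `half_width` admits controls from a longer time span. In an analysis, weights of this kind would typically be passed to a weighted regression rather than to a simple weighted mean.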

Original language: English (US)
Pages (from-to): 1492-1500
Number of pages: 9
Journal: Bioinformatics
Volume: 36
Issue number: 5
DOI: https://doi.org/10.1093/bioinformatics/btz744
State: Published - Mar 1 2020

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics


  • Cite this

    Haselimashhadi, H., Mason, J. C., Munoz-Fuentes, V., López-Gómez, F., Babalola, K., Acar, E. F., Kumar, V., White, J., Flenniken, A. M., King, R., Straiton, E., Seavitt, J. R., Gaspero, A., Garza, A., Christianson, A. E., Hsu, C. W., Reynolds, C. L., Lanza, D. G., Lorenzo, I., ... Schwartz, R. (2020). Soft windowing application to improve analysis of high-throughput phenotyping data. Bioinformatics, 36(5), 1492-1500. https://doi.org/10.1093/bioinformatics/btz744