Abstract
This paper describes informatics for cross-sample analysis with comprehensive two-dimensional gas chromatography (GCxGC) and high-resolution mass spectrometry (HRMS). GCxGC-HRMS analysis produces large data sets that are rich with information but highly complex. The size of the data and the volume of information require automated processing for comprehensive cross-sample analysis, but the complexity poses a challenge for developing robust methods. The approach developed here analyzes GCxGC-HRMS data from multiple samples to extract a feature template that comprehensively captures the pattern of peaks detected in the retention-times plane. Then, for each sample chromatogram, the template is geometrically transformed to align with the detected peak pattern and to generate a set of feature measurements for cross-sample analyses such as sample classification and biomarker discovery. The approach avoids the intractable problem of comprehensive peak matching by using a few reliable peaks for alignment and peak-based retention-plane windows to define comprehensive features that can be reliably matched for cross-sample analysis. The informatics are demonstrated with a set of 18 samples from breast-cancer tumors, each from a different individual, six each for Grades 1-3. The features allow classification that matches grading by a cancer pathologist with 78% success in leave-one-out cross-validation experiments. The HRMS signatures of the features of interest can be examined to determine elemental compositions and identify compounds.
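The template-alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which geometric transformation is used, so the affine model, the least-squares fit, and every name below (`fit_affine`, `apply_affine`, `transform_window`) are assumptions made for illustration, not the authors' implementation. The idea shown: fit a transformation from a few reliable peaks matched between template and sample, then apply it to the vertices of a feature window in the retention-times plane.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (first-column, second-column) retention times


def _solve3(m: List[List[float]], v: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("alignment peaks are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(template_peaks: Sequence[Point],
               detected_peaks: Sequence[Point]) -> List[List[float]]:
    """Least-squares affine fit mapping template peaks onto detected peaks.

    Returns [[a, b, c], [d, e, f]] with x' = a*x + b*y + c, y' = d*x + e*y + f.
    Assumption: an affine model is adequate for the retention-time shifts.
    """
    if len(template_peaks) != len(detected_peaks) or len(template_peaks) < 3:
        raise ValueError("need at least three matched peak pairs")
    rows = [(x, y, 1.0) for x, y in template_peaks]
    # Normal equations: (A^T A) p = A^T u, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        atu = [sum(r[i] * p[axis] for r, p in zip(rows, detected_peaks))
               for i in range(3)]
        coeffs.append(_solve3(ata, atu))
    return coeffs


def apply_affine(coeffs: List[List[float]], point: Point) -> Point:
    """Map one retention-plane point through the fitted transformation."""
    (a, b, c), (d, e, f) = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def transform_window(coeffs: List[List[float]],
                     vertices: Sequence[Point]) -> List[Point]:
    """Transform the vertices of a (hypothetical) polygonal feature window."""
    return [apply_affine(coeffs, v) for v in vertices]
```

In this sketch only the few matched alignment peaks are needed to fit the transformation; the feature windows are carried along by it, so no peak-by-peak matching across the whole chromatogram is attempted.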
Original language | English (US) |
---|---|
Pages (from-to) | 1279-1288 |
Number of pages | 10 |
Journal | Talanta |
Volume | 83 |
Issue number | 4 |
DOIs | 10.1016/j.talanta.2010.09.057 |
State | Published - 2011 |
Keywords
- Biomarker discovery
- Cheminformatics
- Comprehensive two-dimensional gas chromatography
- High-resolution mass spectrometry
- Metabolomics
- Sample classification
ASJC Scopus subject areas
- Chemistry(all)
Cite this
Informatics for cross-sample analysis with comprehensive two-dimensional gas chromatography and high-resolution mass spectrometry (GCxGC-HRMS). / Reichenbach, Stephen E.; Tian, Xue; Tao, Qingping; Ledford, Edward B.; Wu, Zhanpin; Fiehn, Oliver.
In: Talanta, Vol. 83, No. 4, 2011, p. 1279-1288.
Research output: Contribution to journal › Article
TY - JOUR
T1 - Informatics for cross-sample analysis with comprehensive two-dimensional gas chromatography and high-resolution mass spectrometry (GCxGC-HRMS)
AU - Reichenbach, Stephen E.
AU - Tian, Xue
AU - Tao, Qingping
AU - Ledford, Edward B.
AU - Wu, Zhanpin
AU - Fiehn, Oliver
PY - 2011
Y1 - 2011
KW - Biomarker discovery
KW - Cheminformatics
KW - Comprehensive two-dimensional gas chromatography
KW - High-resolution mass spectrometry
KW - Metabolomics
KW - Sample classification
UR - http://www.scopus.com/inward/record.url?scp=79251597939&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79251597939&partnerID=8YFLogxK
U2 - 10.1016/j.talanta.2010.09.057
DO - 10.1016/j.talanta.2010.09.057
M3 - Article
C2 - 21215864
AN - SCOPUS:79251597939
VL - 83
SP - 1279
EP - 1288
JO - Talanta
JF - Talanta
SN - 0039-9140
IS - 4
ER -