Motivation: Metabolite fingerprinting is a technology that extracts information from spectra of the total metabolite composition. Here, spectral acquisition by microchip-based nanoflow-direct-infusion QTOF mass spectrometry, a simple and high-throughput technique, is tested for its informative power. As a simple test case we use Arabidopsis thaliana crosses. The question is how well metabolite fingerprinting reflects the biological background. In many applications, classical principal component analysis (PCA) is used to detect relevant information. Here we introduce a modern alternative, independent component analysis (ICA). Because of its independence condition, ICA is more suitable for our questions than PCA. However, ICA was not developed for a small number of high-dimensional samples, so a strategy is needed to overcome this limitation. Results: To apply ICA successfully, it is essential first to reduce the high dimension of the dataset by using PCA. The number of principal components significantly affects the quality of the ICA result; we therefore propose a criterion for estimating the optimal dimension automatically. The kurtosis measure is used to rank the extracted components by their relevance to our question. Applied to our A. thaliana data, ICA detects three relevant factors, two biological and one technical, and clearly outperforms PCA.
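The pipeline described under Results (PCA reduction, then ICA, then ranking by kurtosis) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name an ICA algorithm, so a basic symmetric FastICA with a tanh nonlinearity is swapped in; the number of principal components `k` is passed in by hand, because the proposed automatic criterion is not described here; and sorting by ascending kurtosis (most negative first, which tends to flag bimodal, cluster-like components) is also an assumption. All function names are hypothetical.

```python
import numpy as np

def pca_whiten(X, k):
    """Center X (samples x features) and return whitened scores on the top-k PCs."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Scaling the left singular vectors gives zero-mean, unit-variance scores.
    return U[:, :k] * np.sqrt(X.shape[0] - 1)

def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fastica(Z, n_iter=200, tol=1e-6, seed=0):
    """Assumed ICA step: symmetric FastICA (tanh) on whitened data Z (samples x k)."""
    rng = np.random.default_rng(seed)
    n, k = Z.shape
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = Z @ W.T                      # current component estimates
        G = np.tanh(Y)                   # nonlinearity g
        G_prime = 1.0 - G ** 2           # derivative g'
        # Fixed-point update per row: E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = (G.T @ Z) / n - G_prime.mean(axis=0)[:, None] * W
        W_new = sym_decorrelate(W_new)
        # Converged when each new row is (anti)parallel to the old one.
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return Z @ W.T

def excess_kurtosis(Y):
    """Excess kurtosis of each column: E[y^4] / E[y^2]^2 - 3."""
    Yc = Y - Y.mean(axis=0)
    m2 = np.mean(Yc ** 2, axis=0)
    m4 = np.mean(Yc ** 4, axis=0)
    return m4 / m2 ** 2 - 3.0

def ica_by_kurtosis(X, k):
    """PCA-reduce to k dimensions, run ICA, return components sorted by kurtosis."""
    Y = fastica(pca_whiten(X, k))
    kurt = excess_kurtosis(Y)
    order = np.argsort(kurt)             # ascending: most negative first (assumption)
    return Y[:, order], kurt[order]
```

Calling `ica_by_kurtosis(X, k)` on a samples-by-features intensity matrix returns the independent components together with their kurtosis values. In practice one would repeat the call for several values of `k` and compare the results, which is where a criterion such as the one proposed in the abstract would come in.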
ASJC Scopus subject areas
- Clinical Biochemistry
- Computer Science Applications
- Computational Theory and Mathematics