Robust detection and verification of linear relationships to generate metabolic networks using estimates of technical errors

Frank Kose, Jan Budczies, Matthias Holschneider, Oliver Fiehn

Research output: Contribution to journalArticle

6 Citations (Scopus)

Abstract

Background: The size and magnitude of the metabolome, the ratio between individual metabolites and the response of metabolic networks is controlled by multiple cellular factors. A tight control over metabolite ratios will be reflected by a linear relationship of pairs of metabolite due to the flexibility of metabolic pathways. Hence, unbiased detection and validation of linear metabolic variance can be interpreted in terms of biological control. For robust analyses, criteria for rejecting or accepting linearities need to be developed despite technical measurement errors. The entirety of all pair wise linear metabolic relationships then yields insights into the network of cellular regulation. Results: The Bayesian law was applied for detecting linearities that are validated by explaining the residues by the degree of technical measurement errors. Test statistics were developed and the algorithm was tested on simulated data using 3-150 samples and 0-100% technical error. Under the null hypothesis of the existence of a linear relationship, type I errors remained below 5% for data sets consisting of more than four samples, whereas the type II error rate quickly raised with increasing technical errors. Conversely, a filter was developed to balance the error rates in the opposite direction. A minimum of 20 biological replicates is recommended if technical errors remain below 20% relative standard deviation and if thresholds for false error rates are acceptable at less than 5%. The algorithm was proven to be robust against outliers, unlike Pearson's correlations. Conclusion: The algorithm facilitates finding linear relationships in complex datasets, which is radically different from estimating linearity parameters from given linear relationships. Without filter, it provides high sensitivity and fair specificity. If the filter is activated, high specificity but only fair sensitivity is yielded. Total error rates are more favorable with deactivated filters, and hence, metabolomic networks should be generated without the filter. In addition, Bayesian likelihoods facilitate the detection of multiple linear dependencies between two variables. This property of the algorithm enables its use as a discovery tool and to generate novel hypotheses of the existence of otherwise hidden biological factors.

Original languageEnglish (US)
Article number162
JournalBMC Bioinformatics
Volume8
DOIs
StatePublished - May 21 2007

Fingerprint

Metabolic Network
Metabolic Networks and Pathways
Error Rate
Filter
Estimate
Linearity
Metabolites
Metabolomics
Metabolome
Measurement Error
Biological Factors
Specificity
Measurement errors
Biological Control
Type II error
Pearson Correlation
Type I error
Sensitivity and Specificity
Relationships
Null hypothesis

ASJC Scopus subject areas

  • Medicine(all)
  • Structural Biology
  • Applied Mathematics

Cite this

Robust detection and verification of linear relationships to generate metabolic networks using estimates of technical errors. / Kose, Frank; Budczies, Jan; Holschneider, Matthias; Fiehn, Oliver.

In: BMC Bioinformatics, Vol. 8, 162, 21.05.2007.

Research output: Contribution to journalArticle

@article{1cead9b3893e4ca69c4794b9a86e2ee3,
title = "Robust detection and verification of linear relationships to generate metabolic networks using estimates of technical errors",
abstract = "Background: The size and magnitude of the metabolome, the ratio between individual metabolites and the response of metabolic networks is controlled by multiple cellular factors. A tight control over metabolite ratios will be reflected by a linear relationship of pairs of metabolite due to the flexibility of metabolic pathways. Hence, unbiased detection and validation of linear metabolic variance can be interpreted in terms of biological control. For robust analyses, criteria for rejecting or accepting linearities need to be developed despite technical measurement errors. The entirety of all pair wise linear metabolic relationships then yields insights into the network of cellular regulation. Results: The Bayesian law was applied for detecting linearities that are validated by explaining the residues by the degree of technical measurement errors. Test statistics were developed and the algorithm was tested on simulated data using 3-150 samples and 0-100{\%} technical error. Under the null hypothesis of the existence of a linear relationship, type I errors remained below 5{\%} for data sets consisting of more than four samples, whereas the type II error rate quickly raised with increasing technical errors. Conversely, a filter was developed to balance the error rates in the opposite direction. A minimum of 20 biological replicates is recommended if technical errors remain below 20{\%} relative standard deviation and if thresholds for false error rates are acceptable at less than 5{\%}. The algorithm was proven to be robust against outliers, unlike Pearson's correlations. Conclusion: The algorithm facilitates finding linear relationships in complex datasets, which is radically different from estimating linearity parameters from given linear relationships. Without filter, it provides high sensitivity and fair specificity. If the filter is activated, high specificity but only fair sensitivity is yielded. Total error rates are more favorable with deactivated filters, and hence, metabolomic networks should be generated without the filter. In addition, Bayesian likelihoods facilitate the detection of multiple linear dependencies between two variables. This property of the algorithm enables its use as a discovery tool and to generate novel hypotheses of the existence of otherwise hidden biological factors.",
author = "Frank Kose and Jan Budczies and Matthias Holschneider and Oliver Fiehn",
year = "2007",
month = "5",
day = "21",
doi = "10.1186/1471-2105-8-162",
language = "English (US)",
volume = "8",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Robust detection and verification of linear relationships to generate metabolic networks using estimates of technical errors

AU - Kose, Frank

AU - Budczies, Jan

AU - Holschneider, Matthias

AU - Fiehn, Oliver

PY - 2007/5/21

Y1 - 2007/5/21

N2 - Background: The size and magnitude of the metabolome, the ratio between individual metabolites and the response of metabolic networks is controlled by multiple cellular factors. A tight control over metabolite ratios will be reflected by a linear relationship of pairs of metabolite due to the flexibility of metabolic pathways. Hence, unbiased detection and validation of linear metabolic variance can be interpreted in terms of biological control. For robust analyses, criteria for rejecting or accepting linearities need to be developed despite technical measurement errors. The entirety of all pair wise linear metabolic relationships then yields insights into the network of cellular regulation. Results: The Bayesian law was applied for detecting linearities that are validated by explaining the residues by the degree of technical measurement errors. Test statistics were developed and the algorithm was tested on simulated data using 3-150 samples and 0-100% technical error. Under the null hypothesis of the existence of a linear relationship, type I errors remained below 5% for data sets consisting of more than four samples, whereas the type II error rate quickly raised with increasing technical errors. Conversely, a filter was developed to balance the error rates in the opposite direction. A minimum of 20 biological replicates is recommended if technical errors remain below 20% relative standard deviation and if thresholds for false error rates are acceptable at less than 5%. The algorithm was proven to be robust against outliers, unlike Pearson's correlations. Conclusion: The algorithm facilitates finding linear relationships in complex datasets, which is radically different from estimating linearity parameters from given linear relationships. Without filter, it provides high sensitivity and fair specificity. If the filter is activated, high specificity but only fair sensitivity is yielded. Total error rates are more favorable with deactivated filters, and hence, metabolomic networks should be generated without the filter. In addition, Bayesian likelihoods facilitate the detection of multiple linear dependencies between two variables. This property of the algorithm enables its use as a discovery tool and to generate novel hypotheses of the existence of otherwise hidden biological factors.

AB - Background: The size and magnitude of the metabolome, the ratio between individual metabolites and the response of metabolic networks is controlled by multiple cellular factors. A tight control over metabolite ratios will be reflected by a linear relationship of pairs of metabolite due to the flexibility of metabolic pathways. Hence, unbiased detection and validation of linear metabolic variance can be interpreted in terms of biological control. For robust analyses, criteria for rejecting or accepting linearities need to be developed despite technical measurement errors. The entirety of all pair wise linear metabolic relationships then yields insights into the network of cellular regulation. Results: The Bayesian law was applied for detecting linearities that are validated by explaining the residues by the degree of technical measurement errors. Test statistics were developed and the algorithm was tested on simulated data using 3-150 samples and 0-100% technical error. Under the null hypothesis of the existence of a linear relationship, type I errors remained below 5% for data sets consisting of more than four samples, whereas the type II error rate quickly raised with increasing technical errors. Conversely, a filter was developed to balance the error rates in the opposite direction. A minimum of 20 biological replicates is recommended if technical errors remain below 20% relative standard deviation and if thresholds for false error rates are acceptable at less than 5%. The algorithm was proven to be robust against outliers, unlike Pearson's correlations. Conclusion: The algorithm facilitates finding linear relationships in complex datasets, which is radically different from estimating linearity parameters from given linear relationships. Without filter, it provides high sensitivity and fair specificity. If the filter is activated, high specificity but only fair sensitivity is yielded. Total error rates are more favorable with deactivated filters, and hence, metabolomic networks should be generated without the filter. In addition, Bayesian likelihoods facilitate the detection of multiple linear dependencies between two variables. This property of the algorithm enables its use as a discovery tool and to generate novel hypotheses of the existence of otherwise hidden biological factors.

UR - http://www.scopus.com/inward/record.url?scp=34250782109&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34250782109&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-8-162

DO - 10.1186/1471-2105-8-162

M3 - Article

VL - 8

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 162

ER -