### Abstract

Background: The size and magnitude of the metabolome, the ratios between individual metabolites and the response of metabolic networks are controlled by multiple cellular factors. Tight control over metabolite ratios will be reflected by a linear relationship between pairs of metabolites, due to the flexibility of metabolic pathways. Hence, unbiased detection and validation of linear metabolic variance can be interpreted in terms of biological control. For robust analyses, criteria for rejecting or accepting linearities need to be developed despite technical measurement errors. The entirety of all pairwise linear metabolic relationships then yields insights into the network of cellular regulation.

Results: Bayes' law was applied to detect linearities, which are validated by explaining the residuals by the degree of technical measurement error. Test statistics were developed and the algorithm was tested on simulated data using 3-150 samples and 0-100% technical error. Under the null hypothesis that a linear relationship exists, type I errors remained below 5% for data sets consisting of more than four samples, whereas the type II error rate rose quickly with increasing technical error. Conversely, a filter was developed to balance the error rates in the opposite direction. A minimum of 20 biological replicates is recommended if technical errors remain below 20% relative standard deviation and if an error-rate threshold of less than 5% is acceptable. The algorithm was shown to be robust against outliers, unlike Pearson's correlations.

Conclusion: The algorithm facilitates finding linear relationships in complex datasets, which is radically different from estimating linearity parameters from given linear relationships. Without the filter, it provides high sensitivity and fair specificity. With the filter activated, it yields high specificity but only fair sensitivity. Total error rates are more favorable with the filter deactivated, and hence metabolomic networks should be generated without the filter. In addition, Bayesian likelihoods facilitate the detection of multiple linear dependencies between two variables. This property of the algorithm enables its use as a discovery tool and for generating novel hypotheses about the existence of otherwise hidden biological factors.
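The idea of validating a linear relationship by asking whether the residuals can be explained by technical measurement error can be illustrated with a small sketch. The code below is not the Bayesian algorithm of the paper; it is a simplified chi-square check under stated assumptions: an ordinary least-squares line, the same relative standard deviation `rsd` for both metabolites, and the Wilson-Hilferty approximation for the chi-square tail probability. The names `fit_line`, `chi2_sf_approx`, `linearity_check` and `alpha` are hypothetical and chosen only for this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def chi2_sf_approx(chi2, dof):
    """Approximate upper-tail probability of a chi-square variable.

    Uses the Wilson-Hilferty cube-root transform to a standard normal,
    which is an approximation and not an exact chi-square tail.
    """
    if chi2 <= 0:
        return 1.0
    t = 2.0 / (9.0 * dof)
    z = ((chi2 / dof) ** (1.0 / 3.0) - (1.0 - t)) / math.sqrt(t)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def linearity_check(xs, ys, rsd, alpha=0.05):
    """Accept linearity if the residuals are explained by technical error.

    Assumption: both variables carry a technical error with relative
    standard deviation `rsd`, so the expected residual variance at a point
    is (rsd * y)^2 + (slope * rsd * x)^2.
    """
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three paired samples")
    if rsd <= 0:
        raise ValueError("rsd must be positive")
    intercept, slope = fit_line(xs, ys)
    chi2 = 0.0
    for x, y in zip(xs, ys):
        residual = y - (intercept + slope * x)
        variance = (rsd * y) ** 2 + (slope * rsd * x) ** 2
        chi2 += residual ** 2 / variance
    dof = len(xs) - 2  # two fitted parameters
    p_value = chi2_sf_approx(chi2, dof)
    # Null hypothesis: a linear relationship exists; reject only if the
    # residuals are too large to be attributed to technical error.
    return {"slope": slope, "intercept": intercept,
            "chi2": chi2, "p_value": p_value, "linear": p_value >= alpha}


if __name__ == "__main__":
    xs = [float(i) for i in range(1, 21)]
    # Linear pair with a small alternating 2% perturbation.
    linear_ys = [(2.0 * x + 1.0) * (1.0 + 0.02 * (-1) ** i)
                 for i, x in enumerate(xs)]
    # Clearly non-linear pair.
    quadratic_ys = [x ** 2 for x in xs]
    print(linearity_check(xs, linear_ys, rsd=0.10)["linear"])
    print(linearity_check(xs, quadratic_ys, rsd=0.05)["linear"])
```

As in the abstract, the null hypothesis here is that a linear relationship exists, so a larger assumed technical error makes it harder to reject linearity. That is the same trade-off behind the rising type II error rate reported for high technical error.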

Original language | English (US) |
---|---|
Article number | 162 |
Journal | BMC Bioinformatics |
Volume | 8 |
DOIs | https://doi.org/10.1186/1471-2105-8-162 |
State | Published - May 21 2007 |

### ASJC Scopus subject areas

- Medicine(all)
- Structural Biology
- Applied Mathematics

### Cite this

**Robust detection and verification of linear relationships to generate metabolic networks using estimates of technical errors.** / Kose, Frank; Budczies, Jan; Holschneider, Matthias; Fiehn, Oliver. *BMC Bioinformatics*, vol. 8, 162, 2007. https://doi.org/10.1186/1471-2105-8-162

Research output: Contribution to journal › Article

TY - JOUR

T1 - Robust detection and verification of linear relationships to generate metabolic networks using estimates of technical errors

AU - Kose, Frank

AU - Budczies, Jan

AU - Holschneider, Matthias

AU - Fiehn, Oliver

PY - 2007/5/21

Y1 - 2007/5/21

UR - http://www.scopus.com/inward/record.url?scp=34250782109&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34250782109&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-8-162

DO - 10.1186/1471-2105-8-162

M3 - Article

C2 - 17517139

AN - SCOPUS:34250782109

VL - 8

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 162

ER -