The relation between indel length and functional divergence: A formal study

Raheleh Salari, Alexander Schönhuth, Fereydoun Hormozdiari, Artem Cherkasov, S. Cenk Sahinalp

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Although insertions and deletions (indels) are a common type of evolutionary sequence variation, their origins and functional consequences are not yet comprehensively understood. There is evidence, on the one hand, that classical alignment procedures only roughly reflect the evolutionary processes and, on the other hand, that indels cause structural changes in protein surfaces. We first demonstrate how to identify alignment gaps that have been introduced by evolution to a statistically significant degree, by means of a novel, sound statistical framework based on pair hidden Markov models (HMMs). Second, we examine paralogous protein pairs in E. coli, obtained by computing classical global alignments. Distinguishing between indel and non-indel pairs according to our novel statistics revealed that, despite having the same sequence identity, indel pairs are significantly less functionally similar than non-indel pairs, as measured by recently suggested GO-based functional distances. This suggests that indels cause more severe functional changes than other types of sequence variation and that indel statistics should additionally be taken into account when assessing functional similarity between paralogous protein pairs.

Original language: English (US)
Title of host publication: Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings
Pages: 330-341
Number of pages: 12
DOI: https://doi.org/10.1007/978-3-540-87361-7_28
State: Published - Nov 28 2008
Externally published: Yes
Event: 8th International Workshop on Algorithms in Bioinformatics, WABI 2008 - Karlsruhe, Germany
Duration: Sep 15 2008 to Sep 19 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5251 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Workshop on Algorithms in Bioinformatics, WABI 2008
Country: Germany
City: Karlsruhe
Period: 9/15/08 to 9/19/08

Keywords

  • Alignment statistics
  • Deletions
  • GO
  • Insertions
  • Pair Hidden Markov Models

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Salari, R., Schönhuth, A., Hormozdiari, F., Cherkasov, A., & Sahinalp, S. C. (2008). The relation between indel length and functional divergence: A formal study. In Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings (pp. 330-341). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5251 LNBI). https://doi.org/10.1007/978-3-540-87361-7_28

The relation between indel length and functional divergence: A formal study. / Salari, Raheleh; Schönhuth, Alexander; Hormozdiari, Fereydoun; Cherkasov, Artem; Sahinalp, S. Cenk.

Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings. 2008. p. 330-341 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5251 LNBI).

Salari, R, Schönhuth, A, Hormozdiari, F, Cherkasov, A & Sahinalp, SC 2008, The relation between indel length and functional divergence: A formal study. in Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5251 LNBI, pp. 330-341, 8th International Workshop on Algorithms in Bioinformatics, WABI 2008, Karlsruhe, Germany, 9/15/08. https://doi.org/10.1007/978-3-540-87361-7_28

Salari R, Schönhuth A, Hormozdiari F, Cherkasov A, Sahinalp SC. The relation between indel length and functional divergence: A formal study. In Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings. 2008. p. 330-341. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-540-87361-7_28

Salari, Raheleh; Schönhuth, Alexander; Hormozdiari, Fereydoun; Cherkasov, Artem; Sahinalp, S. Cenk. / The relation between indel length and functional divergence: A formal study. Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings. 2008. pp. 330-341 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{70d11978c75f4e548749979e0e831b67,
title = "The relation between indel length and functional divergence: A formal study",
abstract = "Although insertions and deletions (indels) are a common type of evolutionary sequence variation, their origins and functional consequences are not yet comprehensively understood. There is evidence, on the one hand, that classical alignment procedures only roughly reflect the evolutionary processes and, on the other hand, that indels cause structural changes in protein surfaces. We first demonstrate how to identify alignment gaps that have been introduced by evolution to a statistically significant degree, by means of a novel, sound statistical framework based on pair hidden Markov models (HMMs). Second, we examine paralogous protein pairs in E. coli, obtained by computing classical global alignments. Distinguishing between indel and non-indel pairs according to our novel statistics revealed that, despite having the same sequence identity, indel pairs are significantly less functionally similar than non-indel pairs, as measured by recently suggested GO-based functional distances. This suggests that indels cause more severe functional changes than other types of sequence variation and that indel statistics should additionally be taken into account when assessing functional similarity between paralogous protein pairs.",
keywords = "Alignment statistics, Deletions, GO, Insertions, Pair Hidden Markov Models",
author = "Raheleh Salari and Alexander Sch{\"o}nhuth and Fereydoun Hormozdiari and Artem Cherkasov and Sahinalp, {S. Cenk}",
year = "2008",
month = "11",
day = "28",
doi = "10.1007/978-3-540-87361-7_28",
language = "English (US)",
isbn = "3540873600",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "330--341",
booktitle = "Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings",

}

TY - GEN

T1 - The relation between indel length and functional divergence

T2 - A formal study

AU - Salari, Raheleh

AU - Schönhuth, Alexander

AU - Hormozdiari, Fereydoun

AU - Cherkasov, Artem

AU - Sahinalp, S. Cenk

PY - 2008/11/28

Y1 - 2008/11/28

N2 - Although insertions and deletions (indels) are a common type of evolutionary sequence variation, their origins and functional consequences are not yet comprehensively understood. There is evidence, on the one hand, that classical alignment procedures only roughly reflect the evolutionary processes and, on the other hand, that indels cause structural changes in protein surfaces. We first demonstrate how to identify alignment gaps that have been introduced by evolution to a statistically significant degree, by means of a novel, sound statistical framework based on pair hidden Markov models (HMMs). Second, we examine paralogous protein pairs in E. coli, obtained by computing classical global alignments. Distinguishing between indel and non-indel pairs according to our novel statistics revealed that, despite having the same sequence identity, indel pairs are significantly less functionally similar than non-indel pairs, as measured by recently suggested GO-based functional distances. This suggests that indels cause more severe functional changes than other types of sequence variation and that indel statistics should additionally be taken into account when assessing functional similarity between paralogous protein pairs.

AB - Although insertions and deletions (indels) are a common type of evolutionary sequence variation, their origins and functional consequences are not yet comprehensively understood. There is evidence, on the one hand, that classical alignment procedures only roughly reflect the evolutionary processes and, on the other hand, that indels cause structural changes in protein surfaces. We first demonstrate how to identify alignment gaps that have been introduced by evolution to a statistically significant degree, by means of a novel, sound statistical framework based on pair hidden Markov models (HMMs). Second, we examine paralogous protein pairs in E. coli, obtained by computing classical global alignments. Distinguishing between indel and non-indel pairs according to our novel statistics revealed that, despite having the same sequence identity, indel pairs are significantly less functionally similar than non-indel pairs, as measured by recently suggested GO-based functional distances. This suggests that indels cause more severe functional changes than other types of sequence variation and that indel statistics should additionally be taken into account when assessing functional similarity between paralogous protein pairs.

KW - Alignment statistics

KW - Deletions

KW - GO

KW - Insertions

KW - Pair Hidden Markov Models

UR - http://www.scopus.com/inward/record.url?scp=56649117362&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=56649117362&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-87361-7_28

DO - 10.1007/978-3-540-87361-7_28

M3 - Conference contribution

AN - SCOPUS:56649117362

SN - 3540873600

SN - 9783540873600

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 330

EP - 341

BT - Algorithms in Bioinformatics - 8th International Workshop, WABI 2008, Proceedings

ER -