Most computational gene-finding methods in current use are derived from the fields of natural language processing and speech recognition. These fields are concerned with parsing spoken or written language into functional components such as nouns, verbs, and phrases of various types. The parsing task is governed by a set of syntax rules that dictate which linguistic elements may immediately follow each other in well-formed sentences. The problem of gene-finding is similar to linguistic parsing in that we wish to partition a sequence of letters into elements of biological relevance, such as exons, introns, and the intergenic regions separating genes. That is, we wish not only to find the genes but also to predict their internal exon-intron structure so that the encoded protein(s) may be deduced. Figure 5.1 illustrates this internal structure for a typical gene.
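The parsing analogy can be made concrete with a minimal sketch. It assumes a toy "grammar" in which intergenic sequence may be followed only by an exon, an exon by an intron or intergenic sequence, and an intron by an exon. These rules, the one-letter labels (`N`, `E`, `I`), and the function names are illustrative assumptions rather than anything specified in the text, and real gene finders use considerably richer models.

```python
from itertools import groupby

# Hypothetical syntax rules: which element may immediately follow which.
# This table is an illustrative assumption, not taken from the source text.
ALLOWED_NEXT = {
    "intergenic": {"exon"},
    "exon": {"intron", "intergenic"},
    "intron": {"exon"},
}

# Hypothetical one-letter codes for a per-base labeling of the sequence.
LABEL_NAMES = {"N": "intergenic", "E": "exon", "I": "intron"}


def segments(labels):
    """Collapse a per-base label string into (element, start, end) segments."""
    result, position = [], 0
    for symbol, run in groupby(labels):
        length = len(list(run))
        result.append((LABEL_NAMES[symbol], position, position + length))
        position += length
    return result


def is_well_formed(segs):
    """Return True if every adjacent pair of segments obeys ALLOWED_NEXT."""
    return all(nxt[0] in ALLOWED_NEXT[cur[0]] for cur, nxt in zip(segs, segs[1:]))


def exon_sequences(sequence, segs):
    """Extract the substrings of `sequence` that are labeled as exons."""
    return [sequence[start:end] for name, start, end in segs if name == "exon"]


# A made-up 21-base sequence and one candidate parse of it.
sequence = "AAATGCGGTAAAGTTTGACCC"
labels = "NNEEEEEIIIIIIEEEEENNN"

parse = segments(labels)
if is_well_formed(parse):
    # Joining the exons gives the portion from which a protein could be deduced.
    coding = "".join(exon_sequences(sequence, parse))
```

A labeling such as `"NNIIIEEE"` is rejected by `is_well_formed`, because under these assumed rules an intron may not directly follow intergenic sequence. This mirrors how syntax rules exclude ill-formed sentences.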