Abstract
The 3′ ends of mRNAs terminate with a poly(A) tail. This post-transcriptional modification is directed by sequence features present in the 3′-untranslated region (3′-UTR). We have undertaken a computational analysis of 3′ end formation in Caenorhabditis elegans. By aligning cDNAs that diverge from genomic sequence at the poly(A) tract, we accurately identified a large set of true cleavage sites. When there are many transcripts aligned to a particular locus, local variation of the cleavage site over a span of a few bases is frequently observed. We find that in addition to the well-known AAUAAA motif there are several regions with distinct nucleotide compositional biases. We propose a generalized hidden Markov model that describes sequence features in C. elegans 3′-UTRs. We find that a computer program employing this model accurately predicts experimentally observed 3′ ends even when there are multiple AAUAAA motifs and multiple cleavage sites. We have made available a complete set of polyadenylation site predictions for the C. elegans genome, including a subset of 6570 supported by aligned transcripts.
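The idea of locating a cleavage site where a cDNA diverges from genomic sequence into a poly(A) tract can be illustrated with a minimal Python sketch. It is not the authors' pipeline or their generalized hidden Markov model. It assumes an ungapped alignment in which the cDNA and the genomic string start at the same position, works in the DNA alphabet (so AAUAAA appears as AATAAA), and uses hypothetical names and thresholds (`cleavage_site`, `upstream_motifs`, `min_tail`, `window`) chosen only for illustration.

```python
from typing import List, Optional


def cleavage_site(cdna: str, genomic: str, min_tail: int = 8) -> Optional[int]:
    """Return the genomic index of the first base that the cDNA does not match,
    provided the unmatched cDNA remainder is a pure run of A's at least
    min_tail long. Returns None otherwise.

    Assumption: both strings are aligned from index 0 with no gaps.
    """
    cdna = cdna.upper()
    genomic = genomic.upper()

    # Extend the ungapped match as far as the two sequences agree.
    n = 0
    while n < len(cdna) and n < len(genomic) and cdna[n] == genomic[n]:
        n += 1

    # Whatever is left of the cDNA should be the untemplated poly(A) tail.
    tail = cdna[n:]
    if len(tail) >= min_tail and set(tail) == {"A"}:
        return n
    return None


def upstream_motifs(genomic: str, site: int, motif: str = "AATAAA",
                    window: int = 40) -> List[int]:
    """Return start positions of motif lying entirely within the window
    of bases immediately upstream of site (window size is an assumption)."""
    genomic = genomic.upper()
    start = max(0, site - window)
    hits = []
    i = genomic.find(motif, start)
    while i != -1 and i + len(motif) <= site:
        hits.append(i)
        i = genomic.find(motif, i + 1)
    return hits


if __name__ == "__main__":
    # Toy sequences, made up for this example.
    genomic = "GCTTAATAAATGCCTGTTTCATTGGACCTTGCATG"
    cdna = genomic[:24] + "A" * 12
    site = cleavage_site(cdna, genomic)
    print(site, upstream_motifs(genomic, site))
```

When the genomic base just after the true cleavage point is itself an A, the match extends through it, so this sketch reports the rightmost position consistent with the alignment. That ambiguity is separate from the biological variation of a few bases described above.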
Original language | English (US) |
---|---|
Pages (from-to) | 3392-3399 |
Number of pages | 8 |
Journal | Nucleic Acids Research |
Volume | 32 |
Issue number | 11 |
DOIs | |
State | Published - 2004 |
Externally published | Yes |
ASJC Scopus subject areas
- Genetics