GC skew is a measure of the strand asymmetry in the distribution of guanines and cytosines. GC skew favors R-loops, a type of three stranded nucleic acid structures that form upon annealing of an RNA strand to one strand of DNA, creating a persistent RNA:DNA hybrid. Previous studies show that GC skew is prevalent at thousands of human CpG island (CGI) promoters and transcription termination regions, which correspond to hotspots of R-loop formation. Here, we investigated the conservation of GC skew patterns in 60 sequenced chordates genomes. We report that GC skew is a conserved sequence characteristic of the CGI promoter class in vertebrates. Furthermore, we reveal that promoter GC skew peaks at the exon 1/ intron1 junction and that it is highly correlated with gene age and CGI promoter strength. Our data also show that GC skew is predictive of unmethylated CGI promoters in a range of vertebrate species and that it imparts significant DNA hypomethylation for promoters with intermediate CpG densities. Finally, we observed that terminal GC skew is conserved for a subset of vertebrate genes that tend to be located significantly closer to their downstream neighbors, consistent with a role for R-loop formation in transcription termination.
ASJC Scopus subject areas