The entirety of all protein coding sequences is reported to represent a small fraction (~2%) of the mouse and human genomes; the vast majority of the rest of the genome is presumed to be repetitive elements (REs). In this study, the C57BL/6J mouse reference genome was subjected to an unbiased RE mining to establish a whole-genome profile of RE occurrence and arrangement. The C57BL/6J mouse genome was fragmented into an initial set of 5,321 units of 0.5 Mb, and surveyed for REs using unbiased self-alignment and dot-matrix protocols. The survey revealed that individual chromosomes had unique profiles of RE arrangement structures, named RE arrays. The RE populations in certain genomic regions were arranged into various forms of complexly organized structures using combinations of direct and/or inverse repeats. Some of these RE arrays spanned stretches of over 2 Mb, which may contribute to the structural configuration of the respective genomic regions. There were substantial differences in RE density among the 21 chromosomes, with chromosome Y being the most densely populated. In addition, the RE array population in the mouse chromosomes X and Y was substantially different from those of the reference human chromosomes. Conversion of the dot-matrix data pertaining to a tandem 13-repeat structure within the Ch7.032 genome unit into a line map of known REs revealed a repeat unit of ~11.3 Kb as a mosaic of six different RE types. The data obtained from this study allowed for a comprehensive RE profiling, including the establishment of a library of RE arrays, of the reference mouse genome. Some of these RE arrays may participate in a spectrum of normal and disease biology that are specific for mice.
ASJC Scopus subject areas
- Agricultural and Biological Sciences(all)
- Biochemistry, Genetics and Molecular Biology(all)