To better understand genome function and evolution in Mycobacterium tuberculosis, the genomes of 100 epidemiologically well-characterized clinical isolates were interrogated by DNA microarrays and sequencing. We identified 68 different large-sequence polymorphisms (comprising 186,137 bp, or 4.2% of the genome) that are present in H37Rv, but absent from one or more clinical isolates. A total of 224 genes (5.5%), including genes in all major functional categories, were found to be partially or completely deleted. Deletions are not distributed randomly throughout the genome but instead tend to be aggregated. The distinct deletions in some aggregations appear in closely related isolates, suggesting a genomically disruptive process specific to an individual mycobacterial lineage. Other genomic aggregations include distinct deletions that appear in phylogenetically unrelated isolates, suggesting that a genomic region is vulnerable throughout the species. Although the deletions identified here are evidently inessential to the causation of disease (they are found in active clinical cases), their frequency spectrum suggests that most are weakly deleterious to the pathogen. For some deletions, short-term evolutionary pressure due to the host immune system or antibiotics may favor the elimination of genes, whereas longer-term physiological requirements maintain the genes in the population.
|Original language|English (US)|
|Number of pages|6|
|Journal|Proceedings of the National Academy of Sciences of the United States of America|
|State|Published - Apr 6 2004|