Arabidopsis thaliana has emerged as a model system for studies of plant genetics and development, and its genome has been targeted for sequencing by an international consortium (the Arabidopsis Genome Initiative; http://genome-www.stanford.edu/Arabidopsis/agi.html). To support the genome-sequencing effort, we fingerprinted more than 20,000 BACs (ref. 2) from two high-quality publicly available libraries, generating an estimated 17-fold redundant coverage of the genome, and used the fingerprints to nucleate assembly of the data by computer. Subsequent manual revision of the assemblies resulted in the incorporation of 19,661 fingerprinted BACs into 169 ordered sets of overlapping clones ('contigs'), each containing at least 3 clones. These contigs are ideal for parallel selection of BACs for large-scale sequencing and have supported the generation of more than 5.8 Mb of finished genome sequence submitted to GenBank; analysis of the sequence has confirmed the integrity of contigs constructed using these fingerprint data. Placement of contigs onto chromosomes can now be performed, and is being pursued by groups involved in both sequencing and positional cloning studies. To our knowledge, these data provide the first example of whole-genome random BAC fingerprint analysis of a eucaryote, and have provided a model essential to efforts aimed at generating similar databases of fingerprint contigs to support sequencing of other complex genomes, including that of human.