A large fraction of the gene mapping studies performed today have as an ultimate goal the cloning of a phenotypically defined locus based on its chromosomal position. This process of positional cloning (discussed in detail in Section 10.3) is still rather tedious, and it is usually dependent on two experimental tools that exist in the form of panels. The first panel consists of DNA samples obtained from the offspring of a cross set up to uncover recombination events between and among the phenotypically defined locus and nearby marker loci. The types of crosses that can be used and the number of offspring to be analyzed are topics of the following chapter. In all cases, analysis of a large number of offspring is required to have a reasonable chance at identifying recombination breakpoints that are close to the locus of interest.
Identification of the recombination breakpoints that lie closest to the locus of interest is dependent on the availability of a sufficient number of region-specific polymorphic DNA markers. This is the second panel of tools. Ideally, one would like to have at hand a set of markers, such as microsatellites, distributed at average distances of a few hundred kilobases apart. This would provide sufficient resolution for the mapping of recombination sites (Section 7.2.3 and Figure 7.5) as well as for the recovery of overlapping YAC clones (Section 10.3.3).
Before 1994, most regions of the genome were not covered to this degree, and it was nearly always necessary for investigators to pursue special strategies to increase the size of the region-specific marker panel. However, as this section is being written, the average whole-genome density of mapped microsatellite markers has reached one per megabase, and within a year's time, it will be one per 500 kb. Furthermore, contigs of overlapping YAC clones have been developed for two complete human chromosome arms, 21q and the Y (Chumakov et al., 1992; Foote et al., 1992), and it is only a matter of time before additional human chromosomes and mouse chromosomes are added to this list. If an ordered, whole-chromosome library is available, one can go directly to the clones that span the region of interest to derive polymorphic marker loci. This could be readily accomplished, for example, by screening for microsatellites within these clones.
Thus, what follows will soon be of historical interest only for mouse geneticists: approaches that investigators have used in the past to generate region-specific panels of DNA markers. These approaches have been included here for two reasons: first, to enable all readers to appreciate earlier work in this area of mouse molecular genetics, and second, to describe tools that may still be critical for geneticists working on organisms whose genomes are less well characterized than that of the mouse.
All rational approaches to region-specific cloning are based on fractionating the genome such that only a single chromosome or defined subchromosomal region is accessible prior to the recovery of clones that can be tested for use as DNA markers. Genome fractionation protocols fall into several classes with certain advantages and disadvantages. The major classes of genome fractionation methods are described in the following subsections.
The most direct means for genome fractionation is based on "microscopic dissection" (or microdissection as it is commonly called) of the region of interest from spreads of metaphase chromosomes on glass slides. This technique was first developed for the isolation of polytene chromosome bands from Drosophila salivary gland chromosomes (Scalenghe et al., 1981), and was later modified for use with mammalian chromosomes (Röhme et al., 1984). To aid in the identification of the correct chromosome, one can start with cells from mice in which the chromosome is marked karyotypically within the context of a single Robertsonian chromosome (Röhme et al., 1984, see Section 5.2). Microdissection is an extremely tedious protocol that is difficult to master, and it is this difficulty that is its main drawback. However, the most skilled practitioners can circumscribe the region of dissection to a few chromosomal bands. This can represent a 100-fold enrichment from the whole genome, with almost no contamination from unlinked chromosomal regions. Although chromosome microdissection was developed prior to PCR, it is when the two techniques are combined that the power of this approach becomes apparent with the potential for generating thousands of markers from a very well-defined subchromosomal interval (Ludecke et al., 1989; Bohlander et al., 1992). Detailed protocols for performing chromosome microdissection followed by cloning have been described in a monograph by Hagag and Viola (1993).
A less tedious protocol for direct genomic fractionation is based on the utilization of a fluorescence-activated cell sorter (FACS) to separate the metaphase chromosome of interest away from all other chromosomes (Gray et al., 1987). The starting material for this protocol must come from a cell line in which this chromosome is physically distinguishable from all others. Sources of such chromosomes include cells from animals with an appropriate Robertsonian translocation (Bahary et al., 1992) or interspecific somatic cell hybrid lines that contain only the foreign chromosome or subregion of interest (Section 10.2.3). The material obtained from a typical FACS sort is likely to be 50-70% pure, equivalent to an enrichment factor of about ten-fold, with the remaining material due to contaminants from other chromosomes. The resolution of the FACS chromosome fractionation protocol is clearly much less than that possible with microdissection, and this is its main drawback. The main advantage of this protocol is that a greater amount of material can be recovered and used directly to construct chromosome-specific large-insert genomic libraries (Bahary et al., 1992).
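The relationship between purity and enrichment quoted above can be made explicit with a small calculation. The sketch below is illustrative only: the function name is hypothetical, and the assumption that the sorted chromosome represents about 5% of the genome is not stated in the text; it was chosen because it reproduces the "about ten-fold" figure at 50% purity. Under the same ratio, a nearly pure microdissected region that makes up about 1% of the genome (again an assumption) gives the 100-fold enrichment cited for microdissection.

```python
def enrichment_factor(purity: float, genome_fraction: float) -> float:
    """Fold enrichment of a target region relative to unfractionated genomic DNA.

    purity          -- fraction of the recovered DNA that derives from the target (0 to 1)
    genome_fraction -- fraction of the whole genome that the target represents (0 to 1)
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie between 0 and 1")
    if not 0.0 < genome_fraction <= 1.0:
        raise ValueError("genome_fraction must be greater than 0 and at most 1")
    return purity / genome_fraction


# ASSUMPTION (not from the text): the sorted chromosome is about 5% of the genome.
ASSUMED_CHROMOSOME_FRACTION = 0.05

for purity in (0.5, 0.7):
    fold = enrichment_factor(purity, ASSUMED_CHROMOSOME_FRACTION)
    print(f"{purity:.0%} pure sort: roughly {fold:.0f}-fold enrichment")
```

The calculation also shows why purity matters less than target size: halving the size of the fractionated region doubles the enrichment, whereas purity can never contribute more than a factor of one.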
A variety of somatic cell hybrid lines have been generated that contain only one or a few mouse chromosomes on the genetic background of a different species as described in Section 10.2.2. The host genomes used most often to create somatic cell hybrid lines of use to mouse geneticists are Chinese hamster and human. The main advantage of well-characterized somatic cell hybrid lines is the ease with which they can be used, and the unlimited amount of high-quality material that they can provide. The main disadvantage is that mouse genomic material is not alone, but mixed together with the whole genome of another species. Thus, to derive mouse-specific clones for use as markers, one must choose a protocol that allows the discrimination of mouse sequences from these other sequences, be they hamster or human. This can be accomplished by enlisting the highly repetitive element families B1, B2, and L1, which are unique to the mouse genome. The earlier approaches along this line were based on the construction of whole genome libraries from the cell line and then screening for mouse-containing clones with one or more repeat sequences (Kasahara et al., 1987). Of course, once such repeat element clones were obtained, it was imperative to subclone unique flanking sequences for use as DNA markers. More recently, the IRS-PCR technique described in Section 8.3.5 has been used with great success in the rapid recovery of mouse-specific sequences from somatic cell hybrid lines (Simmler et al., 1991; Herman et al., 1992). With IRS-PCR, there is no need to first prepare a whole genome library.
An obvious limitation to the recovery of region-specific probes with IRS-PCR is that amplification will only occur between repetitive elements that are relatively close to each other and in the correct orientation. With the use of just the B2 primer, Herman and her colleagues (1991) were able to amplify approximately one PCR product for each megabase of mouse DNA present in somatic cell lines containing portions of the mouse X chromosome. With the use of other repeat element primers, alone or in combination, additional loci could be amplified (Simmler et al., 1991; Herman et al., 1992). PCR fragments can be readily excised from gels for cloning or for direct use as probes for linkage analysis.
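The constraint on spacing and orientation can be illustrated with a toy in silico model of single-primer IRS-PCR. The sketch below is a simplification under stated assumptions, not a description of any published protocol: repeat elements are reduced to a position and a strand, the primer is assumed to extend rightward from a "+" element and leftward from a "-" element, and the 3,000 bp ceiling on amplifiable product size is a hypothetical value chosen only for illustration. All names are invented for this example.

```python
from typing import List, Tuple

Repeat = Tuple[int, str]  # (position in bp, strand: "+" or "-")


def predicted_products(repeats: List[Repeat],
                       max_product_bp: int = 3000) -> List[Tuple[int, int]]:
    """Return (left, right) position pairs expected to yield a product.

    ASSUMPTIONS (illustrative only): a single repeat-specific primer extends
    rightward from "+" elements and leftward from "-" elements, so a product
    requires a "+" element lying to the left of a "-" element, no farther
    apart than max_product_bp.
    """
    for _, strand in repeats:
        if strand not in ("+", "-"):
            raise ValueError(f"unknown strand: {strand!r}")

    forward = sorted(pos for pos, strand in repeats if strand == "+")
    reverse = sorted(pos for pos, strand in repeats if strand == "-")

    products = []
    for left in forward:
        for right in reverse:
            # Primers must converge, and the interval must be short enough to amplify.
            if 0 < right - left <= max_product_bp:
                products.append((left, right))
    return products


# Hypothetical repeat map: only the first pair is both convergent and close.
example = [(1000, "+"), (2500, "-"), (9000, "+"), (9500, "+"), (50000, "-")]
print(predicted_products(example))
```

In this made-up map, five repeat copies yield a single predicted product, which mirrors the point made above: most repeat copies lie too far apart or in the wrong relative orientation to contribute a fragment.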
Under special circumstances, other approaches can be considered for obtaining an enrichment of sequences from particular subregions of the genome. For example, if the region is contained within a defined NotI restriction fragment (or one derived from another infrequent cutter) that is sufficiently larger than the one megabase average, it would be possible to excise the portion of a pulsed field gel that contained this fragment followed by amplification (IRS-PCR or random sequence) and cloning. This procedure could provide as much as a 10-fold enrichment for sequences within a multiple-megabase region (Michiels et al., 1987).
In another approach, Hardies and colleagues (Rikke et al., 1991; Rikke and Hardies, 1991; Herman et al., 1992) have taken advantage of the concerted evolution of L1 sequences that occurs within a species (see Section 126.96.36.199) to develop specific oligonucleotides that recognize L1 subfamilies that are relatively unique to the genomes of either M. spretus or M. musculus (see Section 5.4.2). These oligonucleotides can be used to probe whole genome libraries made from animals congenic for a chromosomal region of interest from one species within the genetic background of the other species. This protocol has been validated in another laboratory (Himmelbauer and Silver, 1993) and could serve to provide a small number of new markers from those limited cases where the appropriate congenic lines have been constructed. In genetic terms, congenic strains are far superior to somatic cell hybrids because the region of interest can be more greatly circumscribed. As indicated in Figure 3.6, after ten generations of backcrossing, the differential region will have an average length of 20 cM, and after 20 generations of backcrossing, the average differential length will be reduced to 10 cM.
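The two values quoted above (20 cM after ten generations, 10 cM after 20) are consistent with a simple inverse relationship of roughly 200/N cM after N backcross generations. The sketch below encodes that approximation purely as an illustration; the function name is hypothetical, the 200/N rule is inferred here from the two figures in the text rather than taken from Figure 3.6, and it should not be trusted for small N, where the segment length is bounded by the length of the chromosome itself.

```python
def approx_differential_length_cm(generations: int) -> float:
    """Approximate average length (cM) of the differential segment in a congenic strain.

    ASSUMPTION: length is taken as 200 / N cM for N backcross generations, a rule
    of thumb inferred from the two values quoted in the text. It is intended only
    for larger N and ignores the finite length of the chromosome.
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 200.0 / generations


for n in (10, 20, 40):
    print(f"N = {n:2d} backcross generations: about "
          f"{approx_differential_length_cm(n):.0f} cM")
```

Even under this optimistic approximation, the returns diminish quickly: each halving of the differential segment requires a doubling of the number of backcross generations.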
Finally, in theory, one should be able to enrich for a region deleted in one genome, but not another, by subtractive hybridization. This approach has been tried in various formats that are all dependent on the use of a large excess of DNA from the deleted genome (the "driver") to drive hybrid formation with sequences that are also present in the non-deleted or "tester" genome (Kunkel et al., 1985). If the driver sequences are tagged in some way, they can be removed from the completed reaction mixture along with the tester sequences to which they hybridized. "Target sequences" unique to the tester genome (in other words, those that have been deleted from the driver genome) will all be left behind in the solution, ready for analysis or cloning.
In practice, this approach has never worked as well as one would like because the high complexity of the mammalian genome prevents the hybridization reaction from going to completion. Even when subtractive steps are reiterated, the target sequences have only been enriched by a factor of 100-1000 at the very most. Thus, in its original form, this approach has lost favor. More recently, Wigler and his colleagues have built upon the subtractive hybridization approach to develop a PCR-based technique that is much more sensitive and highly resolving (Lisitsyn et al., 1993). This new technique, called representational difference analysis (RDA), can be used to purify to completion sequences that are deleted from one genome but not from another that is otherwise identical. In theory, this same technique could also be used, in a manner analogous to that described for the L1 sequences above, for the identification and cloning of new RFLPs that are present in the differential DNA segment that distinguishes two members of a congenic pair.