The fundamental goal of molecular genetics is to understand, at the molecular level, how genotype is translated into phenotype. To accomplish this goal, investigators everywhere are busy dissecting the genome into its component parts, the genes, and then tracking the pathway from each gene to its product to its role in the overall scheme of life. This contemporary approach to biological understanding can be divided into two parts: first, an investigator must obtain a clone of the gene; second, he or she can use the clone in a large variety of experiments aimed at investigating the function of the gene.
There are two very different pathways to the analysis of gene function. One pathway begins with a mutant or variant phenotype and follows this back to a clone of the guilty gene; this pathway will be discussed in the next Section. The other pathway begins with a clone of a transcription unit whose function is not understood, and proceeds to utilize this clone in various experiments aimed at uncovering gene function. With tens of thousands of uncharacterized transcription units sitting in every cDNA library, how does one go about choosing which ones to study? Often, clones have been chosen in a manner akin to a fishing expedition, in which an investigator recovers clones from a cDNA library and selects a subset that shows a pattern of expression among tissues or developmental stages indicative of a potential role in a particular biological process. However, an ultimate goal of the human genome project is to characterize and understand the function of all genes in the genome (Hochgeschwender, 1992). In a more directed approach toward this goal, it is possible to walk down a cloned chromosomal region and pick up each transcription unit one by one for further analysis of function.
The first step in the analysis of a newly cloned gene is always to determine its sequence and compare it with all other sequences stored in databases such as GenBank. Sequence homologies in and of themselves can often be used to predict characteristics of the polypeptide encoded by the new gene under investigation. In some cases, the new product will contain a "domain" with homology to a specific "peptide motif" that is associated with a particular function in groups of previously characterized polypeptides. For example, one or more peptide motifs have been identified that are characteristic of DNA binding domains, membrane-associated domains, various enzymatic activities, receptor functions, and many others. Peptide motifs are almost always degenerate amino acid sequences; they can vary in length from just three amino acids to over one hundred residues.
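What "degenerate" means in practice can be illustrated with a short sketch. The example below is not taken from the text: it assumes a PROSITE-style pattern notation, uses the widely cited N-glycosylation site pattern N-{P}-[ST]-{P} purely as an illustration, and scans a made-up toy sequence. The function names prosite_to_regex and find_motif are hypothetical helpers, and only a small subset of the notation is handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Handles only a subset of the notation (an assumption of this sketch):
      x        any residue
      [ST]     any one of the listed residues
      {P}      any residue except those listed
      e(n)     element repeated n times, e(n,m) repeated n to m times
    """
    parts = []
    for element in pattern.split("-"):
        # Split off an optional repeat count such as x(2) or x(2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            token = "."
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"
        else:
            token = core  # a single residue or a [..] set
        if repeat:
            token += "{" + repeat + "}"
        parts.append(token)
    return "".join(parts)


def find_motif(sequence: str, pattern: str):
    """Return (1-based position, matched residues) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    toy_protein = "MKNGSAANPSLNATQ"  # invented sequence, for illustration only
    for position, residues in find_motif(toy_protein, "N-{P}-[ST]-{P}"):
        print(position, residues)
```

Because the second and fourth positions accept any of nineteen residues and the third accepts two, a single four-residue motif of this kind stands for many distinct concrete sequences. This is also why short motifs turn up by chance and a match should be read as a clue rather than as proof of function.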
Even when the new gene product does not contain any previously defined peptide motifs, standard search algorithms will sometimes allow the identification of previously cloned genes that are related by descent from a common ancestral sequence. Once again, if the function of the previously characterized gene has been determined, it can be used as a starting point for understanding the function of the new gene under investigation.
Although the sequence can sometimes provide clues to gene function, further experiments will always be required to demonstrate conclusively the role played by a particular gene product in the overall scheme of life. These further experiments can take two different forms: biochemical and genetic. A biochemical investigation is often begun by cloning the open reading frame into an expression vector which is placed back into an appropriate host cell system for the "in vitro" synthesis of large quantities of the gene product. This can then be used to immunize rabbits or mice for the production of polyclonal antisera or monoclonal antibodies. These antibodies can then be used as a tool to investigate the expression and localization of the protein both among cells and within cells, and to purify the native protein from the mouse. The purified native protein can be analyzed for enzymatic activities and for its interactions with other molecules. Biochemical studies can often provide critical insight into the function of a particular polypeptide.
The genetic approach to understanding function flows from the ability to manipulate the expression of the selected gene within the mouse and then follow the phenotypic consequences of this manipulation. The two most powerful approaches to gene manipulation are based on targeted mutagenesis and the insertion of transgene constructs of any conceivable kind into the germline of the mouse; both of these approaches are discussed in depth in Chapter 6. Targeted mutagenesis allows an investigator to produce a null mutation at the locus of interest, and determine how and where the absence of the corresponding gene product affects the animal, its tissues, and its cells. The transgenic technology can be used to produce animals that misexpress the gene and/or its product in the wrong place, at the wrong time, or in the wrong form. The rationale for the use of both targeted mutagenesis and directed transgenesis is that by examining the perturbations in phenotype that occur in response to perturbations of the genotype, one can gain insight by contrast into the true function of the normal wild-type locus.
In some cases, the genetic approach will be the one that uncovers the function of a gene, and in other cases, it will be the biochemical approach. However, these two approaches are entirely complementary and thus together they are likely to provide more information than either one can alone.
The second pathway to deciphering the relationship between genotype and phenotype is based on the initial observation of an interesting new variant that distinguishes one group of individuals from another. Variants may be observed in the context of either deleterious mutations or polymorphic differences in common traits such as growth, life span, disease resistance, or various physiological parameters. In all of these cases, the phenotype will be available for analysis before the causative gene or genes. The process by which one moves from a phenotypic difference to the gene (or genes) responsible is referred to as positional cloning.
There are two stages in the process of positional cloning. The first stage is the focus of a major portion of this book: the use of formal linkage analysis and other genetic approaches as tools to find flanking DNA markers that must lie very close to the gene of interest. With these markers in hand, one can move to the second stage of this pathway: cloning across the region that must contain the gene responsible for the phenotype, and then identifying the gene itself apart from all other genes and non-genic sequences within this region. This second stage will be discussed in Chapter 10.
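The arithmetic behind the first stage can be sketched briefly. The example below is not drawn from the text: it assumes a simple backcross in which each offspring is scored for the allele it inherited from the F1 parent at the phenotypic locus and at one DNA marker, and it uses invented counts. The function name recombination_fraction is a hypothetical helper, and the standard error uses the ordinary binomial approximation.

```python
from math import sqrt


def recombination_fraction(offspring):
    """Estimate the recombination fraction between a trait locus and a marker.

    `offspring` is a list of (trait_allele, marker_allele) pairs, one per
    backcross animal, where each allele is labeled by the parental strain
    it came from (for example "A" or "B"). An animal carrying alleles from
    different strains at the two loci is counted as a recombinant.

    Returns (r, standard_error), using the binomial approximation
    sqrt(r * (1 - r) / n) for the standard error.
    """
    n = len(offspring)
    if n == 0:
        raise ValueError("at least one offspring is required")
    recombinants = sum(1 for trait, marker in offspring if trait != marker)
    r = recombinants / n
    return r, sqrt(r * (1 - r) / n)


if __name__ == "__main__":
    # Invented data: 96 non-recombinant and 4 recombinant animals.
    animals = (
        [("A", "A")] * 48 + [("B", "B")] * 48
        + [("A", "B")] * 2 + [("B", "A")] * 2
    )
    r, se = recombination_fraction(animals)
    # For small r, 100 * r is a rough estimate of the distance in centimorgans.
    print(f"r = {r:.3f} +/- {se:.3f} (about {100 * r:.1f} cM)")
```

A marker that shows few or no recombinants with the phenotype among many offspring is a candidate flanking marker. The width of the standard error shows why large crosses are needed before a marker can be said to lie "very close" to the gene.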
With all of the new approaches to mapping that have been developed over the last few years, it has become possible, for the first time, to follow the segregation of the whole genome from each parent to each offspring in a cross. This, in turn, has allowed investigators to consider the exciting possibility of approaching the genetic basis for quantitative, polygenic, and multifactorial traits. In fact, most common types of phenotypic differences that distinguish one individual from another are due to the interaction of alleles at more than one locus, and expression is often modified by environmental factors as well. The available inbred strains provide a treasure chest of polygenic differences that control characteristics as diverse as size, life span, reproductive performance, aggression, and levels of susceptibility or resistance to particular diseases, both infectious and inherited. The golden age of mammalian genetics beckons: the genetic components of any and all traits that show variation between different mice are now amenable to dissection with classical genetic tools. These tools provide a means for obtaining clones of all of the genes involved, which can, in turn, be used to understand each trait at the molecular level.