
Sources of Subline Divergence and their Relative Importance for Sublines of Six Major Inbred Strains of Mice 1

Donald W. Bailey

The Jackson Laboratory
Bar Harbor, Maine

A number of genetic differences among sublines of various highly inbred strains of mice have been reported (1, 2, 3, 4, 5, 6, 7, 8, 9). Such differences should come as no surprise, for they were predicted by the early geneticists. Nevertheless, with the number of sublines increasing, especially through development of new congenic lines and specific-pathogen-free colonies, encounters with subline differences will grow in number, and their vitiating effects on research will no doubt be felt frequently.

Subline differences arise by the gradual differential fixation of genes at loci that were heterozygous from three possible causes: 1) contamination from outcrossing, 2) incomplete inbreeding, and 3) mutation. In this paper we shall consider the relative importance of these three sources of subline differences, and we shall see how existing sublines of six major inbred strains of mice may be expected to have diverged due to these sources.


Genetic Contamination

Genetic contamination will often become obvious if a full-sib mating system is strictly adhered to, for new recessive as well as dominant coat color genes will soon make themselves evident. However, in a strain in which inbreeding has been relaxed, the presence of contaminating genes may not be so apparent. (This points out one of the hazards of not maintaining a strict mating regimen.) The number of contaminant genes that eventually contribute to fixed gene differences between subsequently derived inbred sublines will be inversely proportional to the effective number of breeders maintained during the generations of random mating between the time of the outcross and the resumption of inbreeding.

Contamination can be identified quite effectively by routine skin graft exchanges or allozyme typing for genetic quality control. Moreover, contamination can be identified as the likely source of subline divergence if the number of gene differences is significantly greater than that possible either from mutation or from residual heterozygosity, or if the new allelic differences specifically match those of the suspected contaminating strain. Contamination may indeed be the source of most existing major subline differences, but because it is based on human error we shall not consider it further here.

Residual Heterozygosity

For present considerations, we define residual heterozygosity as that proportion of loci with genotypic mating combinations that are not yet in the genetically fixed state (both parents homozygous for the same allele) at a specified generation of inbreeding. This definition is more inclusive and more appropriate for the present purpose than the commonly used definition, namely, that proportion of individuals at a given generation of inbreeding that are still heterozygous at a specified locus (10).
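The definition can be made concrete with a single-locus calculation. The sketch below is not Fisher's junction method; it is a simpler tally of mating types under full-sib mating, assuming (hypothetically) that the founding pair carries four distinct alleles, AB x CD, and ignoring mutation and selection. All function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def is_fixed(mating):
    """True when both parents are homozygous for the same allele."""
    parent1, parent2 = mating
    return parent1 == parent2 and parent1[0] == parent1[1]


def next_generation(dist):
    """Advance the distribution of mating types by one generation of full-sib mating."""
    new_dist = defaultdict(Fraction)
    for (parent1, parent2), prob in dist.items():
        # An offspring receives one allele from each parent: 4 equally likely genotypes.
        offspring = [tuple(sorted(pair)) for pair in product(parent1, parent2)]
        # The next mating pair is two independent offspring: 16 equally likely combinations.
        for sib1, sib2 in product(offspring, repeat=2):
            new_dist[tuple(sorted((sib1, sib2)))] += prob / 16
    return dict(new_dist)


def unfixed_by_generation(generations, founders=(("A", "B"), ("C", "D"))):
    """Probability that a locus is not yet fixed, for generations 0..generations."""
    dist = {tuple(sorted(founders)): Fraction(1)}
    unfixed = []
    for _ in range(generations + 1):
        unfixed.append(sum(p for m, p in dist.items() if not is_fixed(m)))
        dist = next_generation(dist)
    return unfixed


if __name__ == "__main__":
    probs = unfixed_by_generation(40)
    for g in (10, 20, 30, 40):
        print(f"generation {g}: unfixed proportion = {float(probs[g]):.4f}")
```

Late in inbreeding the unfixed proportion in this model shrinks by a factor of about 0.81 per generation, the familiar asymptotic rate for full-sib mating.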

Fisher (11), using the concept of "junctions" and the generation matrix method, was able to estimate the number and lengths of heterogenic tracts (chromosomal segments that are still of heterozygous origin and thus potential carriers of residual heterozygosity) at specified generations of inbreeding. From these estimates he was also able to calculate the probability of no heterogenic tract remaining (complete genetic fixation, barring mutation) after progressive amounts of inbreeding (Figure 1). We have used Fisher's method with the chromosome number of 20 and an estimated mouse genome length of 1500 cM (12); Fisher had used the estimate of 2500 cM. From the curve in Figure 1, we see that the probability of being completely free of heterogenic tracts does not attain even the 0.5 level until F36 nor the 0.99 level until F60. Purity is not that easily attained.

The probability of complete purity takes into account only the heterogeneity of the genomes of the original pair of mice but no subsequently arising mutations. The question that more realistically should be asked is: At what generation does residual heterozygosity become no more important than mutation as a source of subline differentiation? We can answer this question by comparing the numbers of fixed gene differences arising from the two sources.

By Fisher's method we can estimate the number and lengths of heterogenic tracts after a specified number of generations of full-sib mating as in columns 2 and 3 of Table 1. Then, if the number of structural genes in the mouse genome is 30,000 (12), the proportionate number of structural genes in the heterogenic tracts after different amounts of inbreeding can be estimated as in column 5 of Table 1. If only 35% of these loci were heterozygous at the time of the strain's origin (13) and 43% of these in turn contribute to subline differences (as based on proportions of the genotypic mating combinations and the probabilities of the alleles in these combinations becoming differentially fixed in two sublines), then the number of fixed gene differences between two sublines that branched at a given generation of inbreeding and were continually maintained by full-sib mating for at least 20 more generations can be estimated as in column 6 of Table 1. [The estimates in column 4 can be calculated directly from the generation matrix without regard to chromosome number or genome length. In these calculations and all others in this paper, the effects of selection are ignored.]
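The chain of multiplications in this paragraph can be sketched as follows. Table 1 is not reproduced here, so the fraction of the genome lying in heterogenic tracts is a hypothetical input rather than a value from the paper; the constant and function names are illustrative.

```python
STRUCTURAL_GENES = 30_000        # structural genes assumed in the text
HETEROZYGOUS_AT_ORIGIN = 0.35    # proportion of loci heterozygous at the strain's origin
DIFFERENTIALLY_FIXED = 0.43      # proportion of those contributing to subline differences


def genes_in_heterogenic_tracts(fraction_of_genome):
    """Structural genes expected to lie within the remaining heterogenic tracts."""
    return STRUCTURAL_GENES * fraction_of_genome


def fixed_differences_from_residual(fraction_of_genome):
    """Expected fixed gene differences between two sublines from residual heterozygosity."""
    genes = genes_in_heterogenic_tracts(fraction_of_genome)
    return genes * HETEROZYGOUS_AT_ORIGIN * DIFFERENTIALLY_FIXED


if __name__ == "__main__":
    # 0.02 is an arbitrary illustrative fraction, not a Table 1 entry.
    print(fixed_differences_from_residual(0.02))
```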

The estimated 35% heterozygosity that we have employed is derived from observations on electrophoretic variants of enzymes and soluble proteins in Mus musculus (13). On one hand, it can be argued that heterozygosity may not have been as high as this in the fanciers' mouse populations from which the strains of laboratory mice arose, because the effective breeding population from time to time was probably quite small. On the other hand, many fanciers, although maintaining small populations, probably exchanged breeders continually and occasionally introduced variant mice from wild populations of different subspecies, and in this way probably would have maintained much of the original genetic heterogeneity. However, even if the heterozygosity were reduced, the estimate of 35% seems within reason and perhaps still on the conservative side, considering the arguments Lewontin (14) gives for such estimates being too low.


Mutation

The expected number of differentially fixed genes (n) arising from new mutation is expressed by the equation:

n = (G1 + G2 - 7) μ,

where G1 and G2 are the numbers of generations that sublines 1 and 2, respectively, have been inbred since branching, μ is the mutation frequency per gamete per specified block of loci, and the number 7 corrects for mutations of recent generations not yet having been fixed. If one wishes to include unfixed loci, instead of subtracting 7 in the above equation, subtract 4 for recessives, and add 20 for dominants. The subject is treated in more detail in a recent paper (9).

If there are 30,000 structural genes (12) and the average mutation frequency per gamete per locus is 10⁻⁵, then the mutation rate for all structural genes would be about 0.3 per gamete, and the number eventually becoming differentially fixed in two sublines would be:

(G1 + G2 - 7) (0.3).
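As a minimal sketch, the equation for n translates directly into code. The function and parameter names are illustrative, and the example values of G1 and G2 are arbitrary.

```python
def mutation_rate_per_gamete(n_genes=30_000, rate_per_locus=1e-5):
    """Mutation rate per gamete for the whole block of structural genes (mu)."""
    return n_genes * rate_per_locus


def fixed_mutational_differences(g1, g2, mu, lag=7):
    """n = (G1 + G2 - 7) * mu: expected differentially fixed genes from new mutation.

    g1, g2: generations each subline has been inbred since branching.
    lag:    correction for recent mutations that have not yet been fixed.
    """
    return (g1 + g2 - lag) * mu


if __name__ == "__main__":
    mu = mutation_rate_per_gamete()  # about 0.3
    print(fixed_mutational_differences(100, 100, mu))
```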

Mutation rate as used here is not a true mutation rate but rather a practicable one that results from a confounding of the probabilities of origination, viability, observability, and reproducibility of the mutant.

The value 10⁻⁵ is generally used as a rough estimate of spontaneous mutation rate in mammals. However, published estimates of mutation rate in mice vary greatly and often are lower than 10⁻⁵ (15). Still it seems reasonable to round off the estimate to this higher value for the present considerations. Most mutation rates in mice are based on loci that determine visible traits such as coat color. The number of viable mutations that actually occur at these loci is probably much greater, because surely not all mutations result in observable phenotypic changes. On the other hand, it can be argued that it is the observable changes that we are truly concerned about in subline divergence because these are what affect experiments, so why consider undetected mutations? However, as experiments deal more and more with traits at the molecular level of gene expression, more mutations with subtle effects will become evident and the apparent rate will rise. Thus, with our present state of knowledge, the rate we have chosen seems reasonable and may indeed prove to be too low.

A Comparison of Sources

From the equation for n and from values in column 6 of Table 1, we have constructed Figure 2 to show the contributions of residual heterozygosity and mutation to subline divergence. The expected number of genes differentially fixed between two sublines, each inbred for 154 generations (an arbitrarily chosen number), is shown in relation to the generation at which branching occurred. Curve A shows the contribution from new mutations alone, while curve B shows the contribution of residual heterozygosity in addition to that of mutation. As an example, two sublines branching at generation F20, and both continuing under full-sib mating for 134 additional generations to F154, will differ at 117 loci due to residual heterozygosity that was present at the time of branching (Table 1) and will differ at [2(154-20)-7](0.3) = 78 more loci due to mutations, most of which have occurred since branching, for a total of 195 loci with fixed gene differences.
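The worked example can be checked with a short calculation. This is a sketch under the paper's stated values (117 loci at F20 from Table 1, μ = 0.3); the function name is illustrative.

```python
def divergence_at(final_gen, branch_gen, residual_at_branch, mu=0.3, lag=7):
    """Return (mutational, total) expected fixed differences between two sublines.

    Both sublines are assumed to be inbred to the same final generation, so
    G1 = G2 = final_gen - branch_gen.
    """
    since_branch = final_gen - branch_gen
    mutational = (2 * since_branch - lag) * mu
    return mutational, residual_at_branch + mutational


if __name__ == "__main__":
    mutational, total = divergence_at(final_gen=154, branch_gen=20, residual_at_branch=117)
    # Rounds to 78 mutational and 195 total differences, as in the text.
    print(round(mutational), round(total))
```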

The chance of any particular researcher encountering a genetic difference between sublines depends of course on the number of genes that determine the trait that he is studying. For example, if there are 330 histocompatibility (H) genes in the mouse, then the expected number of H-gene differences between two sublines would be 330/30,000 = 0.01 of that indicated in Figure 2. In the example given above, two sublines inbred for 154 generations but having branched at generation 20 would be expected to differ at (0.01)(195) = 2 H loci. One of these differences would have originated by mutation.

The above estimate of 330 H loci is an average of estimates obtained by three different approaches: 1) The mutation rate of the block of H loci, averaged from studies conducted at two different laboratories, is 3 × 10⁻³ (8). If we assume the mutation rate per H locus is the same as that for other loci, i.e., 10⁻⁵ (15), then 3 × 10⁻³/10⁻⁵ = 300 H loci. 2) One corollary of Snell's laws of transplantation is that mutations showing a loss of antigen can only be detected at heterozygous loci (in the present case, these would be those loci at which the parental strains, C57BL/6By and BALB/cBy, of the F1 mice being assayed for mutations carry different alleles). So the number of loss-type mutations out of the total number found should be proportionate to the number of H loci by which the two parental strains differ out of the total number of all H loci in the species. This argument has been set forth in more detail (17), where the estimate obtained was 430 H loci. 3) An individual wild mouse has been estimated to be heterozygous on the average at about 11% of its loci (13). Since the BALB/cBy and C57BL/6By strains have been estimated to differ at 29 or more H loci (18), and since each inbred strain theoretically is genetically equivalent to a single gamete drawn at random from a wild population, then the F1 derived from these two strains would be equivalent to a representative individual from a wild population, and because it is heterozygous at 29 or more H loci the number of H loci in the species would be estimated at (100/11)(29) = 264.
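The arithmetic behind the three estimates and their average can be laid out as below. This is only a sketch that restates the numbers given in the text; the variable and function names are illustrative.

```python
from statistics import mean

TOTAL_STRUCTURAL_GENES = 30_000


def h_locus_estimates():
    """The three estimates of the number of H loci described in the text."""
    from_mutation_rate = 3e-3 / 1e-5     # block mutation rate / per-locus rate
    from_loss_mutations = 430            # estimate quoted from reference 17
    from_heterozygosity = 29 / 0.11      # 29 H-locus differences at 11% heterozygosity
    return [from_mutation_rate, from_loss_mutations, from_heterozygosity]


def expected_h_differences(total_differences, n_h_loci):
    """Share of all fixed gene differences expected to fall at H loci."""
    return total_differences * n_h_loci / TOTAL_STRUCTURAL_GENES


if __name__ == "__main__":
    average = mean(h_locus_estimates())
    print(round(average))                              # close to the 330 used in the text
    print(round(expected_h_differences(195, average))) # about 2 H loci
```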

We have constructed an equivalence curve in Figure 3 so that we can compare the relative importance of the two sources of subline divergence. Points in the area above and to the right of the curve indicate mutation to be the more important contributor; points below and to the left indicate residual heterozygosity to be the more important contributor. As an example, if the nomenclature committee were to recommend (as discussed later) that each subline that branched prior to F24 be given a distinctive symbol, then to be consistent the committee should also recommend that a subline inbred for 200/2 = 100 generations after branching be given a distinctive symbol. We assume here that any subline to which this one is compared will also have been inbred for 100 generations after branching, so that G1 + G2 = 200.

Minimally Attainable Genetic Uniformity

Sublines will differ not only by fixed mutations but also by unfixed mutations arising in recent generations and not yet having had a chance to become fixed by inbreeding. The number of unfixed gene differences between two hypothetical sublines is estimated at about 27 μ ( 9), and if μ = 0.3, then (27)(0.3) = 8. On the average, four of these will be segregating in each subline, and because they are segregating, only about 60% of the mice in each subline will be carrying any one of the four mutant genes. For perspective, it would take, on the average, 40 generations of full-sib mating before the number of unfixed genes from residual heterozygosity would equal the number (four) from mutation and 10 additional generations before only the unfixed genes from mutation would be of consequence.
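The count of unfixed differences follows from the 27 μ figure quoted above. The sketch below simply restates that arithmetic; the function name is illustrative, and the even split between the two sublines is the averaging assumption made in the text.

```python
def unfixed_differences(mu=0.3, coefficient=27):
    """Return (between two sublines, per subline) expected unfixed gene differences."""
    between_sublines = coefficient * mu
    per_subline = between_sublines / 2   # on the average, half segregate in each subline
    return between_sublines, per_subline


if __name__ == "__main__":
    between, per_subline = unfixed_differences()
    print(round(between), round(per_subline))  # about 8 and about 4
```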

These unfixed gene differences from mutation become the limit of attainable genetic uniformity within a subline, outside of the routine screening of individuals. We can continually assay for fixed-gene differences between sublines, as discussed below in regard to maintaining parallel lines to assure that no differences exist in the trait of interest, but unfixed gene differences are always potentially present and may be found between members of the same subline. However, the probability that any of these mutations will be affecting the trait of interest depends on what proportion of the genome affects that trait and in most cases this will be extremely small.

Importance to Sublines of Existing Strains

These calculations can be used to evaluate the differences expected between sublines in various existing inbred strains of mice. To do this we have constructed pedigree charts in Figures 4A, 4B, 4C, 4D, 4E, and 4F of six major strains based on information gleaned from past issues of Mouse News Letter and from its companion issue, Inbred Strains of Mice. We have used the recorded year of branching instead of the inbreeding generation of branching, because often only the year was given in the entries and generation counts were begun anew in each laboratory. We have found the number of generations progressed each year to vary among laboratories, but it averages close to 2.5, the value we have used in interpreting these charts.
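Converting a recorded year of branching into generations, and then into an expected number of mutational differences, might look like the sketch below. The years in the example are hypothetical, the function names are illustrative, and the 2.5 generations per year is the average adopted above.

```python
GENERATIONS_PER_YEAR = 2.5   # average rate adopted in the text


def generations_since_branching(branch_year, assessed_year, per_year=GENERATIONS_PER_YEAR):
    """Approximate generations of inbreeding elapsed since the year of branching."""
    return (assessed_year - branch_year) * per_year


def mutational_differences_from_years(branch_year, assessed_year, mu=0.3, lag=7):
    """Apply n = (G1 + G2 - 7) * mu, assuming both sublines progressed at the same rate."""
    g = generations_since_branching(branch_year, assessed_year)
    return (2 * g - lag) * mu


if __name__ == "__main__":
    # Hypothetical example: two sublines separated in 1948 and compared in 1978.
    print(generations_since_branching(1948, 1978))
    print(mutational_differences_from_years(1948, 1978))
```

Residual heterozygosity at the time of branching would add to this figure, as curve B of Figure 2 indicates.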

We have then placed the graph of curve B in Figure 2 (contribution of both residual heterozygosity and mutation) adjacent to each chart so that differences between sublines can be estimated directly from the point of the branching. The graph adjacent to the DBA chart ( Figure 4F) includes a roughly estimated effect of the reported intercrossing of two sublines in 1929. The C57BL/Ks subline in Figure 4E probably is a product of genetic contamination, a conclusion reached because that subline differs from other C57BL/6 sublines at all loci in the H-2 complex as well as at least three other histocompatibility loci ( 19).

The branching in most cases is likely to have occurred even earlier than that indicated in the chart, perhaps three to five generations earlier, because the sender may not always have taken into consideration the branches in his own colony when selecting breeding pairs for shipment.

It will be seen in Figures 4A, 4B, 4C, 4D, 4E, and 4F that a number of early branching sublines will probably be found to differ at many loci due to both residual heterozygosity from the early branching and to mutation through many generations of separate maintenance. There is especially early branching of sublines in the C57 and A strains.

Ways of Controlling Divergence

The problem of subline divergence can be kept under control by several procedures, some more immediately practical than others.

1) Divergence can be avoided by restocking from a common source colony, but this would have to be done by all relevant laboratories. Historically this is a practice not easily established.

2) Divergence can also be slowed down by placing embryos of each strain in frozen storage ( 20), and restocking the experimental colony from time to time from the frozen source. This could be a common international frozen source from which all laboratories restocked their colonies.

3) Divergence can be detected as it occurs by the investigator maintaining two parallel lines of each strain in his colony (Mobraaten, personal communication). Mice from the two lines can be compared in all pertinent experiments.

4) The research worker could be alerted to potentially large differences between sublines by a revised nomenclature system. A current nomenclature rule (21) allows sublines arising in any generation after F8 to bear the same strain name. It would help the research worker considerably if such sublines arising before, say, F24 were given distinctive symbols. In the same vein, it would be helpful for any subline, as well as its branches (sub-sublines), that has been separately maintained for 100 generations or more (see Figure 3) also to be given a distinctive symbol.

5) The latter procedure unfortunately would have to await the action of an international committee. In the meantime, each experimenter can and should publish a key reference (e.g., in Mouse News Letter) which provides the full ancestral information on his sublines for others to evaluate relationships of their sublines with his and which he should always cite in his pertinent papers.

1This work was supported by NIH Research Grants No. GM22878 from the National Institute of General Medical Sciences and No. AI13130 from the National Institute of Allergy and Infectious Diseases.


References

1. Acton, R.T., Blankenhorn, E.P., Douglas, T.C., Owen, R.X., Hilgers, J., Hoffman, H.A., and Boyse, E.A. (1973). Nature New Biol. 245: 8.

2. Ciaranello, R.D., Lipsky, A., and Axelrod, J. (1974). Proc. Natl. Acad. Sci. 71: 3006.

3. Glode, L.M., and Rosenstreich, D.L. (1976). J. Immunol. 117: 2061.

4. Maurer, P.M., Merryman, C.F., and Jones, J. (1974). Immunogenetics 1: 398.

5. Olsson, M., Lindahl, G., and Ruoslahti, E. (1977). J. Exp. Med. 145: 819.

6. Rechcigl, M., Jr., and Heston, W.E. (1963). J. Natl. Cancer Inst. 30: 855.

7. Taniguchi, M., Tada, T., and Tokuhisa, T. (1976). J. Exp. Med. 144: 20.

8. Bailey, D.W. (1976). In Basic Aspects of Freeze Preservation of Mouse Strains (O. Muhlbock, ed.), p. 67. Gustav Fischer Verlag, Stuttgart.

9. Bailey, D.W. (1977). Ciba Found. Symp. 52: 291.

10. Wright, S. (1921). Genetics 6: 167.

11. Fisher, R.A. (1949). The Theory of Inbreeding. Oliver and Boyd, Edinburgh.

12. McKusick, V.A., and Ruddle, F.H. (1977). Science 196: 390.

13. Selander, R.K., and Yang, S.Y. (1969). Genetics 63: 653.

14. Lewontin, R.C. (1973). Ann. Rev. Genet. 7: 1.

15. Schlager, G., and Dickie, M.M. (1967). Genetics 57: 319.

16. Mukai, T., Chigusa, S.I., Mettler, L.E., and Crow, J.F. (1972). Genetics 72: 335.

17. Bailey, D.W. (1968). In Advances in Transplantation (J. Dausset, J. Hamburger, and G. Mathe, eds.), p. 317. Munksgaard, Copenhagen.

18. Bailey, D.W., and Mobraaten, L.E. (1969). Transplantation 7: 394.

19. Graff, R.J. (1970). Transplant. Proc. 2: 15.

20. Whittingham, D.G. (1976). In Basic Aspects of Freeze Preservation of Mouse Strains (O. Muhlbock, ed.), p. 45. Gustav Fischer Verlag, Stuttgart.

21. Staats, J. (1976). Cancer Res. 36: 4333.
