Names and Symbols for Mutant Alleles
Mutant Phenotypes
Genes Known Only by Mutant Phenotype
Where a gene is only known by mutant phenotype, the gene is given the name
and symbol of the first identified mutant.
- Symbols of mutations conferring a recessive phenotype begin with a lowercase
letter; symbols of mutations conferring a dominant or semidominant phenotype begin
with an uppercase letter. For example: coloboma, Cm; bouncy, bc.
- Further (allelic) mutations at the same locus, if they have the same
phenotypes, are given the same name with a lab code preceded by a serial
number (if more than one additional allele from the same lab). In the symbol,
the lab code is added as a superscript. For example, new alleles of
bouncy identified at The Jackson Laboratory: bc2J, bc3J, bc4J.
- If a new allelic mutation of a gene known only by mutant phenotype is
caused by a transgenic insertion, the symbol of this mutation should use
the symbol of the transgene (see below) as superscript. For example:
a transgenic mutant allele of the Purkinje cell degeneration (Agtpbp1) gene
created by Jon W. Gordon is symbolized
Agtpbp1pcd-Tg(Dhfr)1Jwg.
- If the additional allele has a different phenotype, it may be given a different name,
but when symbolized, the new mutant symbol is superscripted to the original mutant symbol.
For example: rsgrc is the "grey coat" allele of rs (recessive spotting).
- Also, if a new mutation is described and named but not shown to be an allele of
an existing gene until later, the original name of the new mutation can be kept even if the
phenotype is apparently identical, but the original symbol is used, with the new mutation
symbol as superscript. For example: haze is an allele of ruby-eye 2
(ru2), and hence is ru2hz.
- A mutation caused by insertion of a transgene, but whose molecular nature is nevertheless
unknown (i.e., it is uncloned) and which has not been shown to be an allele of an already
described mutant gene, should be given a transgene symbol (see below). When the insertion
site is cloned and/or molecularly characterized, the gene should be named and the transgene
symbol becomes superscripted. For example: Tg(AMY1C)1Mm
causes a mutation with a phenotype of vestibular and craniofacial abnormalities and maps to
Chromosome 18, but the gene in which the insertion occurred is unknown. Once that gene is
identified, Tg(AMY1C)1Mm will be a superscripted allele symbol.
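The first two rules in the list above (initial letter case, and serial number plus lab code for further alleles) can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the angle brackets standing in for superscript are an assumed plain-text convention, and the lab code J for The Jackson Laboratory is inferred from the bouncy example.

```python
def inheritance_from_symbol(symbol: str) -> str:
    """Infer the phenotype class from the case of the symbol's first letter."""
    if not symbol or not symbol[0].isalpha():
        raise ValueError(f"symbol must start with a letter: {symbol!r}")
    return "recessive" if symbol[0].islower() else "dominant or semidominant"


def additional_allele(root: str, lab_code: str, serial: int = 1) -> str:
    """Build the symbol for a further allele with the same phenotype.

    The serial number is written only from the second allele of a lab onward.
    Angle brackets mark the superscript (assumed plain-text convention).
    """
    if serial < 1:
        raise ValueError("serial must be >= 1")
    prefix = "" if serial == 1 else str(serial)
    return f"{root}<{prefix}{lab_code}>"


print(inheritance_from_symbol("Cm"))    # coloboma
print(inheritance_from_symbol("bc"))    # bouncy
print(additional_allele("bc", "J", 2))  # second bouncy allele from lab J
```

The case check only looks at the first character, which matches the rule as stated; it says nothing about alleles whose symbol was inherited from an earlier name.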
Phenotypes Due to Mutations in Structural Genes
When a spontaneous or induced mutant phenotype is subsequently found to be a mutation in a structural gene, or the gene in which the mutation has occurred is cloned, the mutation becomes an allele of that gene and the symbol for the mutant allele is formed by adding the original mutant symbol as a superscript to the new gene symbol. (The mutant symbol should retain its initial upper or lower case letter).
For example: the albino mutation of tyrosinase, Tyrc; the dominant white spotting mutation of Kit, KitW.
If the original mutation has multiple alleles, when describing these alleles, their symbols become part of the superscript to the identified structural gene.
For example: viable white spotting, KitW-v; sash, KitW-sh.
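As a rough sketch of how the superscript is assembled once the structural gene is known, the helper below joins the original mutant symbol and an optional allele designation with a hyphen, as in the Kit examples. The function name is hypothetical and the angle brackets are an assumed plain-text stand-in for superscript.

```python
from typing import Optional


def structural_allele(gene: str, mutant: str, allele: Optional[str] = None) -> str:
    """Attach the original mutant symbol to the gene symbol as a superscript.

    The mutant symbol keeps its initial upper or lower case letter. When the
    original mutation had its own alleles, the allele designation follows the
    mutant symbol after a hyphen.
    """
    superscript = mutant if allele is None else f"{mutant}-{allele}"
    return f"{gene}<{superscript}>"


print(structural_allele("Tyr", "c"))        # albino
print(structural_allele("Kit", "W"))        # dominant white spotting
print(structural_allele("Kit", "W", "v"))   # viable white spotting
print(structural_allele("Kit", "W", "sh"))  # sash
```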
Even if the identified gene is novel, we recommend that it nevertheless be given a name and symbol different from the mutant name and symbol. This more readily allows discrimination between mutant and wild type and between gene and phenotype.
Wild Type Alleles and Revertants
The wild-type allele of any gene is indicated by + as superscript to the mutant symbol.
For example: the wild-type allele of the Pax1 gene is Pax1+.
A revertant to wild-type of a mutant phenotype locus should be indicated by the symbol + with the
mutant symbol as superscript. Additional revertants are given a Lab Code, preceded by a serial
number if more than one revertant is found in a lab. For example: revertants to wild-type
of the dilute mutation of myosin 5A (Myo5a) include Myo5ad+ and Myo5ad+2J.
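The wild-type and revertant forms can be sketched as follows, modeled on the Pax1 and Myo5a examples. The function names are hypothetical, angle brackets are an assumed plain-text stand-in for superscript, and the lab code J is taken from the example symbol.

```python
from typing import Optional


def wild_type(gene: str) -> str:
    """Wild-type allele: + as superscript to the symbol."""
    return f"{gene}<+>"


def revertant(gene: str, mutant: str,
              lab_code: Optional[str] = None, serial: int = 1) -> str:
    """Revertant of a mutant allele: the mutant symbol followed by +.

    Additional revertants carry a lab code, preceded by a serial number
    when a lab has found more than one.
    """
    suffix = ""
    if lab_code is not None:
        suffix = (str(serial) if serial > 1 else "") + lab_code
    return f"{gene}<{mutant}+{suffix}>"


print(wild_type("Pax1"))
print(revertant("Myo5a", "d"))
print(revertant("Myo5a", "d", "J", 2))
```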
Any DNA that has been introduced stably into the germline of mice is a transgene. Transgenes can be broken down into two categories:
- Those that occur by random insertion into the genome (usually by means of microinjection) and
- Those that occur as targeted events by methods involving homologous recombination.
Transgenics as Random Insertional Events
The symbol Tg will be used to designate genetically engineered transgenic events that result
from random insertion of DNA into the genome. In most cases, these mice are generated by
microinjection of the construct into the blastocyst. The symbol
consists of four parts, all in Roman typeface, as follows:
Tg(YYY)#Zzz , where
- Tg indicates a transgene insertion by methods of microinjection.
- (YYY) indicates the inserted sequence. It should contain the official nomenclature for
that gene. In the cases where a construct was composed of roughly equal parts of two genes,
i.e., a fusion gene, the symbols are separated by a forward slash (see examples below).
- # is a number from 1 to 99,999 that is uniquely assigned to each germline transmissible
event in a series of events generated using the same construct.
- Zzz is the Laboratory registration code. A Laboratory Registration Code is uniquely
assigned to each laboratory originating transgenic animals, DNA loci or inbred strains.
Laboratories that have already been assigned such a code for other genetically defined mice
and rats should use the same code. The registry of these codes is maintained by the Institute for Laboratory Animal Research (ILAR),
2101 Constitution Avenue, NW, Washington, DC 20418.
- NOTE:
- Information regarding the promoter will not be reflected in the nomenclature but will be
annotated and searchable in MGD to the fullest degree.
- The strain background is independent of gene and allele nomenclature but is important in
strain and stock designations. Strain information will be captured in MGD associated with
each allele.
- Examples:
- Tg(Wnt1)1Hev.
The mouse Wnt1 sequence inserted into the mouse genome. This is the first such transgene
reported using Wnt1 by the laboratory of Harold E. Varmus (Hev).
Whether or not the transgene inserted "benignly" into the genome or whether it disrupted a
locus is irrelevant to the nomenclature. The EXCEPTION is when the disrupted locus is identified.
A transgene becomes an allele of that locus, as in the next example.
- Tg(Hoxa1)1Chm.
A construct containing the mouse homeobox a1 gene was inserted by microinjection as
reported in a publication by Corey H. Mjaatvedt (Chm). This is the first germline transmission
Mjaatvedt reported using this construct.
Subsequently, the locus disrupted by the transgene insertion was identified as the gene
encoding chondroitin sulfate proteoglycan 2 (Cspg2).
The nomenclature now becomes Cspg2Tg(Hoxa1)1Chm.
- Tg(TCF3/HLF)1Mlc.
The human transcription factor 3 gene and the hepatic leukemia
factor gene were inserted and expressed as a fusion chimeric cDNA by Michael L. Cleary's
laboratory (Mlc).
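The four-part transgene symbol described above lends itself to a simple pattern match. The sketch below is illustrative only: the class and function names are hypothetical, and the pattern assumes that a lab code is a run of letters beginning with an uppercase letter, which this document does not spell out.

```python
import re
from dataclasses import dataclass
from typing import List

# Tg(YYY)#Zzz: inserted sequence(s), serial number 1-99,999, lab code.
# Assumption: the lab code starts with an uppercase letter.
TG_PATTERN = re.compile(
    r"^Tg\((?P<insert>[^()]+)\)(?P<serial>[1-9]\d{0,4})(?P<lab>[A-Z][A-Za-z]*)$"
)


@dataclass
class TransgeneSymbol:
    inserts: List[str]  # more than one entry means a fusion construct
    serial: int
    lab_code: str


def parse_transgene(symbol: str) -> TransgeneSymbol:
    """Split a transgene symbol into its parts, or raise ValueError."""
    match = TG_PATTERN.match(symbol)
    if match is None:
        raise ValueError(f"not a well-formed transgene symbol: {symbol!r}")
    return TransgeneSymbol(
        inserts=match.group("insert").split("/"),  # fusion genes use a slash
        serial=int(match.group("serial")),
        lab_code=match.group("lab"),
    )


print(parse_transgene("Tg(Hoxa1)1Chm"))
print(parse_transgene("Tg(TCF3/HLF)1Mlc"))
```

A symbol that lacks the serial number or the lab code is rejected rather than guessed at, since both are needed to identify a germline event uniquely.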
Targeted Mutagenesis
The symbol tm (for targeted mutation) will be used to designate genetically engineered
transgenic events that result from homologous recombination using embryonic stem cell
technology. The symbol consists of four parts, all in Roman typeface, as follows:
YYYtm#Zzz , where
- YYY indicates the official gene symbol of the targeted locus. In cases where one gene
is replaced by another gene (i.e., a "knock-in"), the replacing gene symbol is inserted in
parentheses after the # (see examples).
- tm is used to indicate targeted mutation.
- # is a number from 1 to 99,999 that is uniquely assigned to each targeted mutation of a particular locus generated by that laboratory.
- Zzz is the Laboratory registration code. A Laboratory Registration Code is uniquely assigned to each laboratory originating transgenic animals, DNA loci or inbred strains. Laboratories that have already been assigned such a code for other genetically defined mice and rats should use the same code. The registry of these codes is maintained by ILAR.
- NOTE:
- Information regarding the construct and replacement cassette is not reflected in the
nomenclature but will be annotated and searchable in MGD to the fullest degree.
- The strain background is independent of the nomenclature.
Examples:
- Bmp1tm1Blh
is a targeted mutation generated at the bone morphogenetic protein 1
(Bmp1)
locus in the laboratory of Brigid L. Hogan (Blh).
- En1tm1(Otx2)Wrst,
where the Otx2 gene was knocked into the En1
locus as reported in the originating publication from the laboratory of W. Wurst (Wrst). Note
that this nomenclature form is only used when one gene is replacing another.
Often the E. coli beta-galactosidase gene or the Cre recombinase gene is "knocked in" to a
locus. The same symbol convention is used for these.
- Tfamtm1Lrsn and Tfamtm1.1Lrsn
are the types of designations used when one targeting vector is used to generate multiple
germline transmissible alleles, such as with the Cre-Lox system for recombination. Here
Tfamtm1Lrsn is used to designate a loxP insertion and Tfamtm1.1Lrsn
is used to designate the allele generated after mating with a Cre transgenic mouse.
NOTE: The designation tm1.1 is used only when the presence of Cre generates a germline
transmissible event. For example, disruption of Tfam by Cre-Lox recombination using a
ubiquitously expressed beta-actin Cre transgene produces a mouse in which the disruption of
Tfam is heritable. However, in cases where Cre-Lox recombination produces only a somatic
event, no nomenclature is assigned. An example is the selective disruption of Tfam in heart
and skeletal muscle using Cre driven by a muscle creatine kinase promoter, which is active
from embryonic day 13. In these cases, it is best to represent the
mouse as a product of its genotype:
Tfamtm1Lrsn/Tfamtm1Lrsn ; +/Ckme-cre
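The parts of a targeted mutation symbol, including the optional knock-in gene and the .1 suffix for Cre-derived alleles, can be pulled apart with a pattern like the one below. This is a sketch under stated assumptions: the names are hypothetical, the superscript is written in angle brackets (an assumed plain-text convention, e.g. Bmp1<tm1Blh>), and lab codes are taken to be letters beginning with an uppercase letter.

```python
import re
from dataclasses import dataclass
from typing import Optional

# YYY<tm#Zzz>, optionally tm#.n for derived alleles and tm#(Gene) for knock-ins.
TM_PATTERN = re.compile(
    r"^(?P<gene>[A-Za-z][A-Za-z0-9-]*)"
    r"<tm(?P<serial>[1-9]\d{0,4})"
    r"(?:\.(?P<derivative>\d+))?"
    r"(?:\((?P<knock_in>[^()]+)\))?"
    r"(?P<lab>[A-Z][A-Za-z]*)>$"
)


@dataclass
class TargetedAllele:
    gene: str
    serial: int
    derivative: Optional[int]  # e.g. 1 in tm1.1, None for the original allele
    knock_in: Optional[str]    # replacing gene, None unless a knock-in
    lab_code: str


def parse_targeted(symbol: str) -> TargetedAllele:
    """Split a targeted mutation symbol into its parts, or raise ValueError."""
    match = TM_PATTERN.match(symbol)
    if match is None:
        raise ValueError(f"not a well-formed targeted mutation symbol: {symbol!r}")
    derivative = match.group("derivative")
    return TargetedAllele(
        gene=match.group("gene"),
        serial=int(match.group("serial")),
        derivative=int(derivative) if derivative is not None else None,
        knock_in=match.group("knock_in"),
        lab_code=match.group("lab"),
    )


print(parse_targeted("Bmp1<tm1Blh>"))
print(parse_targeted("En1<tm1(Otx2)Wrst>"))
print(parse_targeted("Tfam<tm1.1Lrsn>"))
```

Writing the superscript explicitly avoids having to guess where the gene symbol ends and "tm" begins, which the flattened form cannot always tell apart.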
Gene Trap Loci and Alleles
Gene trap experiments in embryonic stem (ES) cells produce cell lines in which integration
into a putative gene is selected by virtue of its expression in ES cells. The trapped gene is
usually (although not necessarily) mutated by the integration. The loci of integration of a
series of gene trap lines, once characterized as unique, can be named and symbolized as members of
a series, with an appropriate root. The symbol consists of four parts, all in Roman typeface,
as follows:
Gt(YYY)#Zzz
- Gt is used to indicate a gene trap insertion.
- (YYY) indicates the inserted vector sequence.
- # is a number from 1 to 99,999 that is uniquely assigned to each germline transmittable
event in a series of events generated using the same construct.
- Zzz is the Laboratory registration code. A Laboratory Registration Code is uniquely
assigned to each laboratory originating transgenic animals, DNA loci, or inbred strains.
Laboratories that have already been assigned such a code for other genetically defined mice
and rats should use the same code. The registry of these codes is maintained by ILAR.
Example:
Gt(ROSA)26Sor
is a gene "trapped" by the ROSA vector. It is the 26th germline transmissible event using the
ROSA vector from the laboratory of Philippe Soriano (Sor).
As with transgenes identified only through an insertion, gene trap integrations will become
alleles of the gene, once that gene is identified.
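A gene trap symbol has the same shape as a transgene symbol, and the promotion to an allele once the trapped gene is identified amounts to moving the whole symbol into the superscript. The sketch below illustrates both steps. The function names are hypothetical, GeneX is a placeholder rather than a real gene symbol, angle brackets are an assumed plain-text stand-in for superscript, and lab codes are assumed to begin with an uppercase letter.

```python
import re
from typing import Tuple

# Gt(YYY)#Zzz: vector, serial number 1-99,999, lab code.
GT_PATTERN = re.compile(
    r"^Gt\((?P<vector>[^()]+)\)(?P<serial>[1-9]\d{0,4})(?P<lab>[A-Z][A-Za-z]*)$"
)


def parse_gene_trap(symbol: str) -> Tuple[str, int, str]:
    """Return (vector, serial, lab_code) for a gene trap symbol."""
    match = GT_PATTERN.match(symbol)
    if match is None:
        raise ValueError(f"not a well-formed gene trap symbol: {symbol!r}")
    return match.group("vector"), int(match.group("serial")), match.group("lab")


def as_allele(gene: str, insertion_symbol: str) -> str:
    """Once the disrupted gene is known, the insertion symbol becomes its superscript."""
    return f"{gene}<{insertion_symbol}>"


print(parse_gene_trap("Gt(ROSA)26Sor"))
# GeneX is a hypothetical placeholder for a gene identified later.
print(as_allele("GeneX", "Gt(ROSA)26Sor"))
```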