Quick Guide to Nomenclature for Genes
When symbolizing and naming novel genes and genomic features, it is important that these symbols and names be unique identifiers. The MGD Nomenclature Committee provides advice and assistance in assigning new symbols and names. Please contact us at firstname.lastname@example.org.
Below are some basic rules and guidelines to assist in the symbol and naming process.
- Symbols are typically 3-5 characters long and must not exceed 10 characters.
- Symbols begin with an uppercase letter followed by all lowercase letters except for recessive mutations, which begin with a lowercase letter.
- Symbols should not include tissue specificity or molecular weight designations.
- Symbols are italicized in published articles.
- Use punctuation only to separate two adjacent numbers (for example, Lamb1-2) or to designate related sequences (for example, Emb-rs3) and pseudogenes (for example, Adh5-ps1).
- For orthologous genes in other vertebrate species, use the same symbol as in human, rat, and mouse, whenever possible.
- For homologous genes in invertebrate or prokaryote species:
  - If it is the only mouse homolog, use the approved symbol for that species, but include the word homolog at the end of the name followed by the name of the species in parentheses; for example, symbol: Cdc20; name: cell division cycle 20 homolog (S. cerevisiae).
  - If there is more than one mouse homolog for the invertebrate gene, assign a serial number after the word homolog; for example, symbols: Atoh1 and Atoh2; names: atonal homolog 1 (Drosophila) and atonal homolog 2 (Drosophila), respectively.
  - If the invertebrate/prokaryotic gene is similar to the mouse gene but is not determined to be a homolog, use the letter l to denote a "-like" designation; for example, symbol: Ash2l; name: ash2 (absent, small, or homeotic)-like (Drosophila).
- ESTs and BAC/YAC ends: use the assigned sequence accession identification numbers.
- STSs: use the convention D#Xxx%e, where:
  - Xxx is the lab code; you can obtain approved lab codes from ILAR.
  - Add a trailing "e" only when the DNA segment is expressed.
- Gene names should be brief and specific.
- Gene names should convey the character or function of the gene.
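The mechanically checkable symbol rules in the list above (length, capitalization, and punctuation) can be sketched as a small pre-submission check. This is an illustrative sketch only, not an official MGD tool: the function name `check_symbol`, its `recessive` flag, and the wording of the messages are assumptions made for this example, and rules that need human judgment (uniqueness, tissue specificity, molecular weight designations) are not covered.

```python
import re

MAX_LENGTH = 10  # hard limit from the guide; 3-5 characters is typical

# Assumption: a hyphen may introduce a related-sequence (rs) or
# pseudogene (ps) suffix followed by a serial number, as in Emb-rs3.
SUFFIX = re.compile(r"-(rs|ps)\d+")


def check_symbol(symbol: str, recessive: bool = False) -> list[str]:
    """Return the rule violations found in a proposed symbol (hypothetical helper)."""
    if not symbol:
        return ["symbol is empty"]
    problems = []
    if len(symbol) > MAX_LENGTH:
        problems.append("longer than 10 characters")
    first = symbol[0]
    if recessive and not first.islower():
        problems.append("recessive mutation symbols begin with a lowercase letter")
    if not recessive and not first.isupper():
        problems.append("symbols begin with an uppercase letter")
    if any(ch.isupper() for ch in symbol[1:]):
        problems.append("letters after the first must be lowercase")
    for i, ch in enumerate(symbol):
        if ch.isalnum():
            continue
        # A hyphen is accepted between two digits (Lamb1-2) ...
        between_digits = (
            ch == "-"
            and 0 < i < len(symbol) - 1
            and symbol[i - 1].isdigit()
            and symbol[i + 1].isdigit()
        )
        # ... or when it starts a trailing rs/ps suffix (Emb-rs3, Adh5-ps1).
        starts_suffix = SUFFIX.fullmatch(symbol, i) is not None
        if not (between_digits or starts_suffix):
            problems.append(f"punctuation {ch!r} at position {i} is not allowed")
    return problems


if __name__ == "__main__":
    for candidate in ("Lamb1-2", "Emb-rs3", "Adh5-ps1", "ABC"):
        print(candidate, check_symbol(candidate))
```

An empty list means only that no mechanical problem was found; the symbol still needs to be checked for uniqueness and approved through the usual process.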
Other information when publishing:
- Use an uppercase "C" when referring to a specific mouse chromosome (e.g., Chromosome 15).
- Do not use a period (".") after the abbreviation for chromosome (e.g., write Chr 15, not Chr.15).
Protein designations follow the same rules as gene symbols, with the following two distinctions:
- Protein symbols use all uppercase letters.
- Protein symbols are not italicized.
Gene symbol to use:
To distinguish between mRNA, genomic DNA, and cDNA, write the relevant prefix in parentheses; e.g., (mRNA) Rbp1.
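
The protein and prefix conventions above are simple enough to express as two string helpers. The sketch below is illustrative only: the function names and the `ALLOWED_PREFIXES` tuple are assumptions made for this example, and it assumes the prefix text is simply the molecule type as written in the guide. Italicization is a typesetting matter and is not modeled here.

```python
# Assumption: the prefix is the molecule type exactly as named in the guide.
ALLOWED_PREFIXES = ("mRNA", "genomic DNA", "cDNA")


def protein_symbol(gene_symbol: str) -> str:
    """Derive a protein designation: same symbol, all uppercase (hypothetical helper)."""
    return gene_symbol.upper()


def with_molecule_prefix(gene_symbol: str, molecule: str) -> str:
    """Prepend the molecule type in parentheses, e.g. (mRNA) Rbp1 (hypothetical helper)."""
    if molecule not in ALLOWED_PREFIXES:
        raise ValueError(f"unknown molecule type: {molecule!r}")
    return f"({molecule}) {gene_symbol}"


if __name__ == "__main__":
    print(protein_symbol("Rbp1"))
    print(with_molecule_prefix("Rbp1", "mRNA"))
```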