Genome Feature Type Definitions

Many of the MGI classification terms come from the Sequence Ontology (SO) project. The SO project develops terms and relationships describing features and attributes of biological sequences. When the MGI term definition comes from the SO project, the corresponding SO ID for that term appears.

Feature Type | MGI Definition | Corresponding SO ID
gene | A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions, and/or other functional sequence regions. | SO:0000704
protein coding gene | A gene that produces at least one transcript that is translated into a protein. | SO:0001217
non-coding RNA gene | A gene that produces an RNA transcript that functions as the gene product. | SO:0001263
lncRNA gene | A gene that encodes a non-coding RNA over 200 nucleotides in length. | SO:0001877
antisense lncRNA gene | A gene that encodes a non-coding RNA that is transcribed from the opposite DNA strand compared with other transcripts and overlaps in part with sense RNA. | SO:0002182
lincRNA gene | A gene that encodes large intervening non-coding RNA. | SO:0001641
sense intronic lncRNA gene | A gene that encodes a sense intronic long non-coding RNA. | SO:0002184
sense overlapping lncRNA gene | A gene that encodes a sense overlap long non-coding RNA. | SO:0002183
bidirectional promoter lncRNA gene | A non-coding locus that originates from within the promoter region of a protein-coding gene, with transcription proceeding in the opposite direction on the other strand. | SO:0002185
rRNA gene | A gene that encodes ribosomal RNA. | SO:0001637
tRNA gene | A gene that encodes transfer RNA. | SO:0001272
snRNA gene | A gene that encodes small nuclear RNA. | SO:0001268
snoRNA gene | A gene that encodes small nucleolar RNA. | SO:0001267
miRNA gene | A gene that encodes microRNA. | SO:0001265
scRNA gene | A gene that encodes small cytoplasmic RNA. | SO:0001266
SRP RNA gene | A gene that encodes the signal recognition particle RNA. | SO:0000590
RNase P RNA gene | A gene that encodes RNase P RNA, the RNA component of Ribonuclease P (RNase P). | SO:0001639
RNase MRP RNA gene | A gene that encodes RNase MRP RNA. | SO:0001640
telomerase RNA gene | A non-coding RNA gene, the RNA product of which is a component of telomerase. | SO:0001643
unclassified non-coding RNA gene | A non-coding RNA gene not classified with established ncRNA functional subcategories. |
ribozyme gene | A gene that encodes an RNA with catalytic activity. | SO:0002181
heritable phenotypic marker | A biological region characterized as a single heritable trait in a phenotype screen. The heritable phenotype may be mapped to a chromosome but generally has not been characterized to a specific gene locus. | SO:0001500
gene segment | A gene component region which acts as a recombinational unit of a gene whose functional form is generated through somatic recombination. | SO:3000000
unclassified gene | A region of the genome associated with transcript and/or prediction evidence but where feature classification is imprecise. |
other feature types | MGI markers that are not classified as genes, including pseudogenes, QTL, transgenes, gene clusters, cytogenetic markers, and unclassified genome features. |
QTL | A quantitative trait locus (QTL) is a polymorphic locus which contains alleles that differentially affect the expression of a continuously distributed phenotypic trait. Usually it is a marker described by statistical association to quantitative variation in the particular phenotypic trait that is thought to be controlled by the cumulative action of alleles at multiple loci. | SO:0000771
transgene | A gene that has been transferred naturally or by any of a number of genetic engineering techniques from one organism to another. | SO:0000902
complex/cluster/region | A group of linked markers characterized by related sequence and/or function where the precise location or identity of the individual components is obscure. |
cytogenetic marker | A structure within a chromosome or a chromosomal rearrangement that is visible by microscopic examination. |
chromosomal deletion | An incomplete chromosome. | SO:1000029
insertion | The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence. | SO:0000667
chromosomal inversion | An interchromosomal mutation where a region of the chromosome is inverted with respect to wild type. | SO:1000030
Robertsonian fusion | A non-reciprocal translocation whereby the participating chromosomes break at their centromeres and the long arms fuse to form a single chromosome with a single centromere. | SO:1000043
reciprocal chromosomal translocation | A chromosomal translocation with two breaks; two chromosome segments have simply been exchanged. | SO:1000048
chromosomal translocation | An interchromosomal mutation. Rearrangements that alter the pairing of telomeres are classified as translocations. | SO:1000044
chromosomal duplication | An extra chromosome. | SO:1000037
chromosomal transposition | A chromosome structure variant whereby a region of a chromosome has been transferred to another position. Among interchromosomal rearrangements, the term transposition is reserved for that class in which the telomeres of the chromosomes involved are coupled (that is to say, form the two ends of a single DNA molecule) as in wild-type. | SO:0000453
unclassified cytogenetic marker | A cytogenetic marker not classifiable within current cytogenetic subcategories. |
BAC/YAC end | A region of sequence from the end of a BAC or YAC clone used as a reagent in mapping and genome assembly. |
BAC end | A region of sequence from the end of a BAC clone that may provide a highly specific marker. | SO:0000999
YAC end | A region of sequence from the end of a YAC clone that may provide a highly specific marker. | SO:0001498
PAC end | A region of sequence from the end of a PAC clone that may provide a highly specific marker. | SO:0001480
other genome feature | A region of the genome associated with biological interest (includes regulatory regions, conserved regions and related sequences, repetitive sequences, and viral integrations). |
retrotransposon | A transposable element that is incorporated into a chromosome by a mechanism that requires reverse transcriptase. | SO:0000180
telomere | A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end. | SO:0000624
minisatellite | A repeat region containing tandemly repeated sequences having a unit length of 10 to 40 bp. | SO:0000643
unclassified other genome feature | A genome feature that cannot be classified in any currently recognized genome category. |
endogenous retroviral region | A region derived from viral infection of germ cells that has been stably integrated into the host genome and is passed on from generation to generation. | SO:0000903
mutation defined region | A genomic region, containing multiple genes/genome features, within which a mutation event resulted in complex genomic changes affecting multiple features (e.g. not a simple regional deletion). |
CpG island | Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes. | SO:0000307
promoter | A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the basal transcription machinery. | SO:0000167
TSS region | The region of a gene from the 5'-most TSS (transcription start site) to the 3'-most TSS. | SO:0001240
DNA segment | A region of the genome associated with experimental interest, often used as a reagent for genetic mapping. Includes RFLP and other hybridization probes, sequence-tagged sites (STS), and regions defined by PCR primer pairs (such as microsatellite markers). |
pseudogenic region | A non-functional descendant of a functional entity. | SO:0000462
pseudogene | A sequence that closely resembles a known functional gene, at another locus within the genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their normal paralog, in which case the pseudogene typically lacks introns and includes a poly(A) tail, or from recombination, in which case the pseudogene is typically a tandem duplication of its normal paralog. | SO:0000336
pseudogenic gene segment | A recombinational unit of a gene which, when incorporated by somatic recombination in the final gene transcript, results in a nonfunctional product. | SO:0001741
polymorphic pseudogene | A pseudogene owing to a SNP/DIP, whereas in other individuals/haplotypes/strains the gene is translated. | SO:0001841
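
Because only some definitions carry an SO ID, it can be useful to separate the ID from the definition text when handling these rows programmatically. The following is a minimal Python sketch, not part of any MGI tooling: the function name split_so_id is hypothetical, and the pattern assumes every ID has the form "SO:" followed by seven digits, as all IDs in the table above do.

```python
import re

# Assumption: an SO ID is "SO:" plus seven digits and, when present,
# is the last thing in the text (optionally surrounded by whitespace).
SO_ID = re.compile(r"\s*(SO:\d{7})\s*$")


def split_so_id(text):
    """Return (definition, so_id); so_id is None when the text has no SO ID."""
    match = SO_ID.search(text)
    if match is None:
        return text.strip(), None
    return text[:match.start()].strip(), match.group(1)


# Sample definitions copied from the table, with and without an SO ID.
samples = [
    "A gene that encodes ribosomal RNA.SO:0001637",
    "A gene that encodes an RNA with catalytic activity. SO:0002181",
    "A cytogenetic marker not classifiable within current cytogenetic subcategories. ",
]

for sample in samples:
    definition, so_id = split_so_id(sample)
    print(so_id, "-", definition)
```

Definitions that MGI wrote itself, such as the unclassified categories, come back with None in place of an ID, which matches the blank third column in the table.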