SNP Terms and Concepts

This help document answers the following questions:

What is a SNP?

SNP stands for single nucleotide polymorphism. In MGI, we refer to any polymorphism in dbSNP as a SNP. A key aspect of research in genetics is the association of sequence variation with heritable phenotypes. Sequence variations exist at defined positions within genomes and are responsible for individual phenotypic characteristics, including a person's propensity toward complex disorders such as heart disease and cancer. SNPs are the most common genetic sequence variations.

What is dbSNP?

dbSNP, the Single Nucleotide Polymorphism (SNP) database at the National Center for Biotechnology Information (NCBI), is a public-domain archive for a broad collection of simple genetic polymorphisms for a variety of organisms, including mouse. This collection includes single-base nucleotide substitutions (also known as single nucleotide polymorphisms or SNPs), small-scale multi-base deletions or insertions, called in-dels (also called deletion insertion polymorphisms or DIPs), retroposable element insertions, and microsatellite repeat variations (also called short tandem repeats or STRs). Each dbSNP entry includes the sequence context of the polymorphism (i.e., the surrounding sequence), the occurrence frequency of the polymorphism (by population or individual), and the experimental method(s), protocols, and conditions used to assay the variation.

What are dbSNP function classes?

The dbSNP function classes define the position of a polymorphism with respect to identifiable features of a specific transcript for a gene. Most function classes in gene features are defined by the location of the variation with respect to transcript exon boundaries. In coding regions, however, function classes are assigned to each allele of a variation, because these classes depend on the allele sequence. In dbSNP, all annotations to genes have at least one function class (assigned by dbSNP).

How are the dbSNP function classes defined?

See Understanding the MGI Mouse SNPs Legend.

What is a RefSNP? How does it differ from an ss SNP?

In MGI, the term SNP is synonymous with a RefSNP. A RefSNP (rs) is a reference SNP. dbSNP maps each submitted SNP assay (ss) to the genome and assigns a RefSNP accession ID (rs number) to each submitted SNP assay. Submitted SNPs that map to the same location are clustered into the same RefSNP and have the same rs number.

The exception to this usage is the dbSNP Variation Type SNP, which is strictly defined as a single nucleotide polymorphism: a base substitution involving A, T, C, or G.
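
The clustering of submitted assays into RefSNPs can be pictured with a short sketch. It is only an illustration of the idea that ss assays mapping to the same location share one cluster. The ss IDs, the `(chromosome, coordinate)` location tuples, and the function name `cluster_by_location` are all hypothetical, and dbSNP's actual mapping and clustering process is not described here.

```python
from collections import defaultdict

def cluster_by_location(ss_positions):
    """Group submitted SNP assays (ss) that map to the same genome location.

    ss_positions maps an ss ID to a (chromosome, coordinate) tuple.
    Each resulting group stands in for one RefSNP cluster.
    """
    clusters = defaultdict(list)
    for ss_id, location in ss_positions.items():
        clusters[location].append(ss_id)
    return {loc: sorted(ids) for loc, ids in clusters.items()}

# Hypothetical ss IDs and locations, for illustration only.
ss_positions = {
    "ss11111": ("1", 1000),
    "ss22222": ("1", 1000),
    "ss33333": ("2", 500),
}
clusters = cluster_by_location(ss_positions)
```

In this made-up data, ss11111 and ss22222 map to the same position and would therefore share one rs number, while ss33333 would receive its own.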

What is a submitted SNP assay? a submitted SNP assay with and without strain/alleles? a submitter SNP ID? an exemplar ss assay?

SNP Assay Term | Definition
Submitted SNP Assay (ss) | Each SNP record submitted to dbSNP is assigned a unique ssID (ss#). All SNP assays are submitted with an ss flanking sequence, which is intended to uniquely position the polymorphism in the genome.
Submitted SNP Assay with Strain/Alleles | A SNP assay submitted with defined allele calls for each strain sampled. Occasionally, a poor experimental result is obtained for a given strain in the submitted SNP assay, and the allele value for that strain is submitted as N.
Submitted SNP Assay without Strain/Alleles | A SNP assay submitted without defined allele calls for the strains sampled. If all ss assays for a RefSNP are of this type, the RefSNP will not be loaded into MGI. If at least one ss assay for a RefSNP has specified strain/alleles, then the RefSNP and all of its ss assays will be loaded into MGI.
Submitter SNP ID | SNP Assay ID assigned by the submitter to dbSNP.
Exemplar ss Assay | The representative ss assay of a RefSNP (usually the one with the longest submitted flanking sequence). The rs reference flanking sequence is derived from the submitted flanking sequence for the exemplar ss.
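
The loading rule for RefSNPs whose assays lack strain/alleles can be expressed as a one-line check. This is a minimal sketch, not MGI's load code. The record layout (a dict with a `strain_alleles` mapping) and the function name `refsnp_is_loaded` are assumptions made for illustration.

```python
def refsnp_is_loaded(ss_assays):
    """Return True if at least one ss assay carries strain/allele calls.

    Each assay is a dict with a 'strain_alleles' mapping of strain to allele;
    an empty mapping stands for an assay submitted without strain/alleles.
    """
    return any(assay["strain_alleles"] for assay in ss_assays)

# Hypothetical assays: one with allele calls (including an N call), one without.
with_alleles = {"ss_id": "ss11111", "strain_alleles": {"C57BL/6J": "T", "DBA/2J": "N"}}
without_alleles = {"ss_id": "ss22222", "strain_alleles": {}}
```

Under this reading, a RefSNP with both assays is loaded together with all of its ss assays, while a RefSNP consisting only of `without_alleles` is skipped.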

What is the difference between rs Flanking Sequence and ss Flanking Sequence?

Flanking Sequence | Definition
rs Flanking Sequence | The flanking sequence of the Exemplar ss Assay for the RefSNP. MGI loads only the rs Flanking Sequence from dbSNP for a given RefSNP and displays this sequence on the SNP Detail report. MGI calculates the corresponding IUPAC nucleotide uncertainty code from the rs Allele Summary (strain/alleles only).
ss Flanking Sequence | The 5' and 3' flanking sequences that are intended to uniquely locate the described polymorphism from a given assay in the genome. Occasionally, the ss Flanking Sequence contains sequence that is found in multiple locations in the genome.
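
As an illustration of the IUPAC nucleotide uncertainty code mentioned above, the following sketch maps an allele summary of base substitutions to its standard IUPAC code. The code table is the standard IUPAC one. The function name `iupac_code` is hypothetical, and how MGI treats alleles that are not single bases (for example "-") is not stated in this document, so the sketch simply returns `None` for them.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def iupac_code(allele_summary):
    """Return the IUPAC code for a summary such as 'A/T', or None.

    None is returned when the summary contains anything other than
    single A, C, G, or T alleles (an assumption of this sketch).
    """
    alleles = frozenset(allele_summary.upper().split("/"))
    return IUPAC_CODES.get(alleles)
```

For example, `iupac_code("A/T")` gives `"W"` and `iupac_code("G/T")` gives `"K"`.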

What is the difference between ss orientation and rs orientation?

For both ss and rs orientation, f designates forward and r designates reverse.

Orientation Type | Definition
ss Orientation | The orientation of the ss flanking sequence relative to the rs reference flanking sequence.
rs Orientation | The orientation of the rs reference flanking sequence relative to the mouse genome assembly.
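
Because ss orientation is given relative to the rs flanking sequence, and rs orientation relative to the genome assembly, the two can be combined to place an ss flanking sequence on the genome strand. The sketch below shows one way to do that, assuming that two reversals cancel out. The function names are hypothetical, and this composition is an inference from the definitions above rather than a documented MGI or dbSNP procedure.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def ss_orientation_to_genome(ss_orientation, rs_orientation):
    """Combine ss-to-rs and rs-to-genome orientations ('f' or 'r').

    Equal orientations (f/f or r/r) yield forward; otherwise reverse.
    """
    return "f" if ss_orientation == rs_orientation else "r"

def ss_flank_on_genome_strand(ss_flank, ss_orientation, rs_orientation):
    """Return the ss flanking sequence as read on the genome assembly strand."""
    if ss_orientation_to_genome(ss_orientation, rs_orientation) == "r":
        return reverse_complement(ss_flank)
    return ss_flank
```

With an ss orientation of r and an rs orientation of r, the ss flanking sequence already reads in the same direction as the assembly and is returned unchanged.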

What is a consensus (or majority) strain allele? an rs consensus strain allele set?

Consensus Type | Definition
Consensus Strain Allele | Also called Majority Strain Allele. The MGI consensus allele call for a given strain of a SNP, determined by the majority allele value among all ss-level allele calls for that strain of the SNP. If the assays for a strain conflict but a majority allele is present (the allele observed most often among all assays for the SNP), the allele for that strain is underlined. For comparison queries on the Mouse SNP Query Form, underlining does not affect the allele value. If no allele call has a majority for the strain, the consensus allele for that strain is considered ambiguous (represented by a question mark, "?"). Ambiguous alleles are always considered different from other alleles in queries. (Calculated by MGI)
rs Consensus Strain Allele Set | The set of all strains and their consensus strain alleles for a SNP in MGI. The Mouse SNP Query Form considers the strain allele values of the rs Consensus Strain Allele Set for a SNP. It is possible that no single ss assay for a given SNP includes all of the strains necessary to satisfy a query, yet the SNP satisfies the query when the strains from all of its ss assays are considered. (Calculated by MGI)
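
The consensus rules above can be sketched in a few lines. This is an illustrative reading, not MGI's implementation: "majority" is taken to mean the single most frequent call, a tie yields "?", and every call (including N) is counted as given. The function names, the assay layout, and the example data are hypothetical.

```python
from collections import Counter

def consensus_allele(calls):
    """Return (allele, conflict) for one strain's ss-level allele calls.

    conflict is True when the assays disagree, the case in which the
    displayed allele would be underlined. A tie for the most frequent
    call yields the ambiguous allele "?".
    """
    counts = Counter(calls).most_common()
    if not counts:
        return None, False
    if len(counts) == 1:
        return counts[0][0], False
    if counts[0][1] > counts[1][1]:
        return counts[0][0], True
    return "?", True

def rs_consensus_allele_set(ss_assays):
    """Pool calls from all ss assays of a SNP and take each strain's consensus."""
    calls_by_strain = {}
    for strain_alleles in ss_assays.values():
        for strain, allele in strain_alleles.items():
            calls_by_strain.setdefault(strain, []).append(allele)
    return {s: consensus_allele(c)[0] for s, c in calls_by_strain.items()}

def alleles_differ(a, b):
    """Compare two consensus alleles; an ambiguous allele always differs."""
    return a == "?" or b == "?" or a != b

# Hypothetical assays for one SNP; no single assay covers all three strains.
ss_assays = {
    "ss11111": {"C57BL/6J": "A", "DBA/2J": "G"},
    "ss22222": {"C57BL/6J": "A", "DBA/2J": "A", "129S1/SvImJ": "G"},
    "ss33333": {"C57BL/6J": "G"},
}
consensus = rs_consensus_allele_set(ss_assays)
```

Here C57BL/6J resolves to A despite one conflicting call, DBA/2J is ambiguous because its two calls tie, and 129S1/SvImJ appears in the set even though only one assay sampled it.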

What are the differences between ss level, MGI-rs level, and dbSNP-rs level variation types?

Variation Type | Definition
ss level | The type of polymorphism for a Submitted SNP assay, assigned by dbSNP. dbSNP determines this by considering all submitted alleles for a given ss assay. (Calculated by dbSNP)
MGI-rs level | Also called rs Consensus Variation Type. The variation type determined for a SNP in MGI, considering all ss-level alleles for that SNP. The union of ss-level alleles for a SNP may have a different variation type than the individual ss assays for the SNP, even if all of the ss assays have the same variation type. For example, consider the hypothetical SNP rs12345 with two ss assays, ss11111 (alleles: -/T) and ss22222 (alleles: -/G). Both ss assays have variation type in-del; however, the rs Allele Summary for rs12345 is -/G/T, which has variation type Mixed, because the combined polymorphism can be either variation type SNP (G/T) or in-del (-/G or -/T). (Calculated by MGI)
dbSNP-rs level | The type of polymorphism for a RefSNP in dbSNP. dbSNP calculates this by considering all ss-level alleles for a given RefSNP, with and without strain/allele relationships. MGI does not use the dbSNP Variation Type for a RefSNP. (Calculated by dbSNP)
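
The rs12345 example can be reproduced with a small classifier that looks at every pair of alleles in a set. This is a simplified sketch under stated assumptions: it handles only alleles written as A/C/G/T strings or "-", it ignores the STR and Named types, and the pairwise rule is an interpretation of the example rather than the documented MGI or dbSNP algorithm. All function names are hypothetical.

```python
from itertools import combinations

def pair_type(a, b):
    """Classify the difference between two alleles."""
    if a == "-" or b == "-" or len(a) != len(b):
        return "in-del"
    return "SNP" if len(a) == 1 else "MNP"

def variation_type(alleles):
    """Classify an allele set; differing pairwise types give 'Mixed'."""
    distinct = sorted(set(alleles))
    if len(distinct) < 2:
        return "No-variation"
    types = {pair_type(a, b) for a, b in combinations(distinct, 2)}
    return types.pop() if len(types) == 1 else "Mixed"

# The hypothetical assays from the example above.
ss11111 = {"-", "T"}
ss22222 = {"-", "G"}
rs12345 = ss11111 | ss22222  # union of ss-level alleles: -/G/T
```

Each assay on its own classifies as in-del, but the union contains the pair G/T, which is a SNP, so `variation_type(rs12345)` returns `"Mixed"`.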

What are the dbSNP variation types?

MGI queries display only the dbSNP variation types present in the mouse SNP data.

Variation Type | Definition
Single Nucleotide Polymorphism (SNP) | Strictly defined as single base substitutions involving A, T, C, or G. Sample allele: A/T
Insertion/Deletion Polymorphism (in-del) (also DIP) | An observed insertion of one or more nucleotides in one individual relative to another individual. Since the molecular event that gave rise to this observation cannot be determined from the observation alone (i.e., whether it was an insertion or a deletion), both events are incorporated into the name of this polymorphism type. In dbSNP, in-dels are designated using the full sequence of the insertion as one allele, and either a fully defined string for the variant allele or a "-" character to specify the deleted allele. Sample allele: -/T
Microsatellite or short tandem repeat (STR) | Alleles are designated by providing the repeat motif and the copy number for each allele. Expansion of the allele repeat motif designated in dbSNP into full-length sequence will be only an approximation of the true genomic sequence, because many microsatellite markers are not fully sequenced and are resolved as size variants only. Sample allele: -/T
Named variant (Named) | Applies to insertion/deletion polymorphisms of longer sequence features, such as retroposon dimorphism for Alu or LINE elements. These variations frequently include a deletion "-" indicator for the absent allele. Sample allele: -/(B1)
No-variation (none) | Reports may be submitted for segments of sequence that are assayed and determined to be invariant in the sample. Sample allele: (NoVariation)
Mixed | Assigned to SNP clusters that group submissions from different variation classes. Sample allele: -/C/T
Multi-Nucleotide Polymorphism (MNP) | Assigned to variations that are multi-base variations of a single, common length. Sample allele: ACG/TTC
