A. Summary
B. Nomenclature Changes
C. Literature Search Strategy
D. HMG Families
E. Canonical and Mammalian HMGs
F. Various HMGs
G. Plant HMGs

A.  Summary

  The High Mobility Group (HMG) proteins were originally isolated from mammalian cells, named according to their electrophoretic mobility in polyacrylamide gels, and arbitrarily classed as a specific type of nonhistone protein based on the observations that they are ubiquitous in mammalian cells, share certain physical properties, and are associated with isolated chromatin. These mammalian proteins are considered the canonical HMG proteins and are now subdivided into three families: the HMGB (formerly HMG-1/-2) family, the HMGN (formerly HMG-14/-17) family, and the HMGA (formerly HMG-I/Y/C) family. Each HMG family has a characteristic functional sequence motif. The functional motif of the HMGB family is called the "HMG-box"; that of the HMGN family, the "nucleosomal binding domain"; and that of the HMGA family, the "AT-hook." The functional motifs characteristic of these canonical HMGs are widespread among nuclear proteins in a variety of organisms. Proteins containing any of these functional motifs embedded in their sequence are known as "HMG motif proteins."

B.  Nomenclature Changes

  The new nomenclature establishes a set of rules and a systematic way to name the genes and proteins belonging to the High Mobility Group (HMG) families of nuclear proteins. The HMG proteins are now subdivided into three families named HMGB, HMGA, and HMGN. The names of the genes correspond to the names of the encoded proteins. Within a family, the genes are numbered sequentially (e.g., HMGB1, HMGB2). Splice variants are indicated by lowercase letters (e.g., HMGA1a, HMGA1b, and HMGA1c). The nomenclature was devised to solve certain problems in mammalian genetic nomenclature; however, the rules can be extended to other organisms. The new nomenclature now allows for a rational and complete search of the literature pertaining to HMG nuclear proteins. The interrelationship between the various mammalian HMG proteins is diagrammed below.
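The naming rules above (family root symbol, sequential number, optional lowercase splice-variant letter) can be sketched as a small parser. This is only an illustration: the regular expression, the `HMGName` type, and the `parse_hmg_name` function are assumptions introduced here, not part of the nomenclature itself, and the pattern covers protein names only.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: root symbol (HMGB, HMGN, or HMGA), a sequential number,
# and an optional lowercase letter marking a splice variant.
_NAME = re.compile(r"(HMG[BNA])([0-9]+)([a-z]?)")


class HMGName(NamedTuple):
    family: str             # root symbol, e.g. "HMGA"
    number: int             # sequential number within the family
    variant: Optional[str]  # splice-variant letter, or None


def parse_hmg_name(name: str) -> Optional[HMGName]:
    """Split a new-style name into its parts; return None if it does not fit."""
    match = _NAME.fullmatch(name.strip())
    if match is None:
        return None
    family, number, variant = match.groups()
    return HMGName(family, int(number), variant or None)
```

For example, `parse_hmg_name("HMGA1b")` yields family `HMGA`, number `1`, variant `b`, whereas an old-style name such as `HMG-14` does not match and returns `None`.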

C.  Literature Search Strategy

  Concomitant with the name changes, the search methods for HMG proteins have been modified in PubMed (Entrez), and alterations will be made in the MeSH indexing. The old and new HMG names are both forward and backward search compatible. Thus, a PubMed search with a new term such as "HMGB1" will, now and in the future, give the same result as a search with any of the old terms such as HMG1, HMG-1, or amphoterin. Likewise, a search with HMG-14 will still return all entries containing the text words HMG-14, HMG14, or HMGN1 (see Table of HMG Proteins). A search with HMGB, HMGN, or HMGA will yield entries for all the members of that HMG family. These options were not available prior to the name change, when a search with the term "HMG" would have yielded many hits, more than 70% of them unrelated to the canonical HMG nuclear proteins. Note that the old search strategy is still available and may be useful for certain searches.
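The forward and backward compatibility described above amounts to treating old and new names as synonyms. The following is a minimal sketch of that idea for use with a search tool that does not perform the mapping itself. The synonyms are taken from two rows of the Table of HMG Proteins; the function names and the quoted `OR` query format are assumptions made for illustration, not a description of how PubMed implements the mapping.

```python
# Synonyms copied from the Table of HMG Proteins (two rows only, for brevity).
SYNONYMS = {
    "HMGB1": ["HMG1", "HMG-1", "HMG 1", "amphoterin"],
    "HMGN1": ["HMG-14", "HMG14", "HMG 14"],
}

# Reverse index: any old or new name (lowercased) -> new name.
_TO_NEW = {
    alias.lower(): new
    for new, olds in SYNONYMS.items()
    for alias in [new, *olds]
}


def to_new_name(term):
    """Return the new-nomenclature name for a term, or None if unknown."""
    return _TO_NEW.get(term.strip().lower())


def build_query(term):
    """Build an OR query over the new name and all of its old names."""
    new = to_new_name(term)
    if new is None:
        # Unknown term: search for it unchanged.
        return f'"{term}"'
    return " OR ".join(f'"{t}"' for t in [new, *SYNONYMS[new]])
```

With this table, `build_query("HMG-14")` and `build_query("HMGN1")` produce the same query string, which mirrors the behaviour described for the old and new terms.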

D.  HMG Families

  1. The HMGB Family  (formerly HMG-1/-2)
  2. The HMGN Family  (formerly HMG-14/-17)
  3. The HMGA Family  (formerly HMG-I/Y; HMGI-C)

  I. The HMGB and HMG-Box Proteins. The main identifying functional symbol (root symbol) for canonical proteins of this type is HMGB. Canonical HMGB proteins typically contain two HMG boxes, box A and box B. HMGB-related motifs are found in many mammalian and non-mammalian proteins (e.g., SRY contains an HMG box motif embedded in its primary sequence). These HMG motif proteins are referred to as HMG-Box proteins and not HMGB proteins. The nomenclature for non-canonical HMG-Box proteins is not covered under this revision.
  II. The HMGN Family and HMG-Nucleosome Binding Protein Families. The main identifying functional symbol (root symbol) for canonical proteins of this type is HMGN. Canonical HMGN proteins typically contain a nucleosome binding domain. HMGN-related motifs are also found in other mammalian proteins (e.g., NSBP-1 contains an HMGN box motif embedded in its primary sequence). The nomenclature for non-canonical HMG-Nucleosome Binding proteins is not covered under this revision.
  III. The HMGA Family and HMG-AT-Hook Protein Families. The main identifying functional symbol (root symbol) for canonical proteins of this type is HMGA. Canonical HMGA proteins typically contain two or three AT hooks. HMGA-related motifs are found in many mammalian and non-mammalian proteins. These HMGA motif proteins are referred to as HMG-AT-hook proteins and not HMGA proteins. The nomenclature for non-canonical HMG-AT-hook proteins is not covered under this revision.

E.  Canonical and Mammalian HMGs

Table of HMG Proteins
Protein, New Name | Protein, Old (Alternate) Name | Gene, New Symbol | *Accession #

HMGB Proteins
HMGB1 | HMG1; HMG-1; HMG 1; (amphoterin) | HMGB1; Hmgb1 | H: U51677; M: U00431
HMGB2 | HMG2; HMG-2; HMG 2 | HMGB2; Hmgb2 | H: M83665; M: X67668
HMGB3 | HMG2a; HMG-2a; HMG 2a (HMG-4) | HMGB3; Hmgb3 | H: NM_005342; M: Y10044
HMGB4** | New, no protein characterized yet. | | **127540 (intronless)

HMGN Proteins
HMGN1 | HMG-14; HMG14; HMG 14 | HMGN1; Hmgn1 | H: M21339; M: NM_008251
HMGN2 | HMG-17; HMG17; HMG 17 | HMGN2; Hmgn2 | H: X13546; M: NM_016957
HMGN3 | Trip7 | HMGN3; Hmgn3 | H: L40357
HMGN3a | | HMGN3; Hmgn3 | See manuscript.
HMGN4 | | HMGN4 | H: NM_006353
HMGN5 | NSBP1, NBD-45 | HMGN5; Hmgn5 | H: NM_030763; M: NM_016710

HMGA Proteins
HMGA1a | HMG-I; HMGI; HMG I; HMG-I/Y; a-protein | HMGA1; Hmga1 | H: L17131(1); M: J04179(2)
HMGA1b | HMG-Y; HMGY; HMG Y | HMGA1; Hmga1 | H: M23618(2); M: J04179(2)
HMGA1c | HMG-I/R | HMGA1; Hmga1 | H: AF176039(2)
HMGA2 | HMGI-C | HMGA2; Hmga2 | H: L46353; M: L41617

F. Various HMGs

Examples of other HMG Proteins, OLD nomenclature (incomplete list). These examples are not included with the text search results using the above terms, but many can be identified through sequence search strategies.

Name | Remark | *Accession #
HMG20 | Later identified as ubiquitin. Not an HMG protein. | H: M26880; M: X51703
HMG20A | Recently discovered factor. Not a canonical HMG protein. | H: NM_018200
HMG20B | Recently discovered factor. Not a canonical HMG protein. | M: AL355735
HMG-6 (H6) | HMGN protein from Trout. | T: g70781
HMG-E | HMGB protein originally from erythrocytes. | None
HMG-14a,b | HMGN variants of chicken. |
HMG-D | Drosophila protein related to HMGB. | D: X71138
HMG-T1,2,3 | HMGB proteins from Trout. | T: X02666 and L32859
HMG-X | HMGB type from Xenopus. | X: D30765
HMG-3 | Degradation product of HMGB1. | None
HMG-4 | Recently described protein identical to HMGB3. | H: NP_005333; M: AAC16925
HMG-2a | HMGB variant from chicken, named differently than those from human, mouse, or calf. | C: X63463
HMG-2b | HMGB variant from human with novel 3' UTR. | H: Z17240

G.  Plant HMGs

Protein, New Name | Protein, Old (Alternate) Name | Remark | *Accession #
HMGB1 | HMGa | HMGB type protein from maize (Zea mays) | X58282
HMGB2 | HMGc1 | HMGB type protein from maize (Zea mays) | Y08297
HMGB3 | HMGc2 | HMGB type protein from maize (Zea mays) | Y08298
HMGB4 | HMGd | HMGB type protein from maize (Zea mays) | Y08807
HMGB5 | HMGe | HMGB type protein from maize (Zea mays) | AJ006708
HMGA | HMGI/Y | HMGA type protein from maize (Zea mays) | AJ131371
HMGB1 | HMGa | HMGB type protein from Arabidopsis thaliana | Y14071
HMGB2 | HMGb1 | HMGB type protein from Arabidopsis thaliana | Y14072
HMGB3 | HMGb2 | HMGB type protein from Arabidopsis thaliana | Y14073
HMGB4 | HMGg | HMGB type protein from Arabidopsis thaliana | Y14074
HMGB5 | HMGd | HMGB type protein from Arabidopsis thaliana | Y14075


* The accession is the unique identifier to either the GenBank record containing the gene sequence encoding that protein or to the "reference sequence" (http://www.ncbi.nlm.nih.gov/RefSeq/) as assigned by GenBank. The accession # used does not imply any priority of discovery or scientific significance.
(1) Denotes full length gene coding for the HMGA1 proteins.

(2) Denotes the mRNA coding for the splice variant.

The tables were compiled by several investigators.

This Web site is maintained by the Mouse Gene Nomenclature Committee.

Suggestions, remarks, corrections, or questions can be addressed to bustin@helix.nih.gov.