This help document answers the following questions:
The Gene Expression Database (GXD) is designed to integrate many different types of endogenous gene expression data from the mouse in formats appropriate for comprehensive analysis. Query results, such as the developmental stage and tissue of expression (or non-expression), the genetic origin of the sample, and the numbers and sizes of detected bands, are described together with the molecular probe, the expression assay type, and the experimental conditions used. Expression patterns are described using an extensive, hierarchical dictionary of standardized anatomical terms, making it possible to record expression results from assays with differing spatial resolution in a consistent and integrated manner and to analyze expression patterns at differing levels of detail. Whenever possible, text annotations are complemented by digitized images of the original expression data.
To learn more about GXD and how we acquire data, please see About GXD.
See also the MGI Glossary for unfamiliar terms and acronyms.
The Gene Expression Data Query form Standard Search tab allows you to ask questions that can be basic or complex. Examples of the type of questions you can ask are:
(See Are there examples of query results? for the form field values for these queries.)
If you are interested in doing queries that ask what genes are expressed in some anatomical structures and/or developmental stages but not in others, you should use the Differential Expression Search tab on this form.
|Genes | Genome location | Anatomical structure or stage | Mutant / wild type | Assay types|
|Genes||This section offers a choice of two fields.
Nomenclature search. Use the left hand field to search for expression data for a mouse gene or set of genes with a similar nomenclature.
Vocabulary term search. Use the right hand field to find expression data for genes defined by their function, phenotype or disease model association.
Note that you may not submit a search with values in the Nomenclature search and the Vocabulary Term search at the same time.
|Genome location||You can enter a list of mouse genomic regions (e.g. 2:105668896-108698410, Chr12:28553755..28867491).
|Anatomical structure or stage||
Click to limit your search to assay results where expression was detected in, not detected in, or either. The default value is either.
You can choose to use one or both fields in this section.
Anatomical Structures
The anatomical structures are taken from the Mouse Developmental Anatomy Ontology. Begin typing an anatomy term of interest and an autocomplete list activates on the second character. The list shows all the anatomy terms and their synonyms that match your search characters. The term is followed by the Theiler Stages for which the structure exists (e.g. lung lobe TS20-28). You must choose a term from the autocomplete list (i.e., there is no free text entry in this field).
The ordering on the drop-down list is as follows:
For each developmental stage, the Anatomical Dictionary is organized as a hierarchy of structures. By default, the search includes substructures. This means if you search for gene expression in, for example, brain, your search will also return annotations for all substructures of brain, such as rhombencephalon, tectum and others.
If you are unsure of the most appropriate structural term to use, browse the Mouse Developmental Anatomy Ontology.
Developmental Stages
Use this field to select one or more Theiler stages (TS) to focus your search on a particular stage of embryonic development. Alternatively, you can choose the Use Ages tab and limit your search to the general terms, Embryonic or Postnatal, or choose specific days post conception (dpc).
Note: The Theiler system organizes development into stages defined by the appearance of specific developmental features. Embryos of the same gestational age can vary considerably with respect to development. Consequently, a Theiler stage does not precisely correspond to a particular age. For example, TS 21 applies to embryos between 12.5 and 14.0 days post conception (dpc), while TS 22 applies to embryos between 13.5 and 15.0 dpc.
To select multiple stages or ages:
If you mouse-over a stage, a pop-up will describe a few defining features of that stage, such as "first somites, late head fold" for Stage 12. You can browse Stage descriptions for a list of the defining features for each Theiler stage.
|Mutant / wild type||
Select Specimens mutated in gene to search for expression data obtained from mutant specimens. Use this field to enter the symbol or name of the mutated genetic marker. The system searches for current symbols/names and synonyms of the mutated gene. Use this search field as you would the Nomenclature search in the Genes section at the top of the search form.
Select Wild type specimens only if you wish to exclude mutant alleles from your search. This search will still include knock-in reporter genes used in in situ reporter assays. Homozygous mutant specimens are not included in this search.
Select All specimens, the default, to include all mutant and wild type specimens in your search.
|Assay types||Check list of assay types. Use this section to limit your search to assays of one or more selected types. The default is to exclude RNA-Seq assays from the search. Use the in situ, blot and whole genome assay check boxes to quickly select or deselect types of assays.|
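The Genome location field described above accepts regions written as `2:105668896-108698410` or `Chr12:28553755..28867491`. If you assemble a list of regions in a script before pasting it into the form, a check along the following lines may help catch typos. This is only a sketch: the function name `parse_region` and the regular expression are assumptions made for illustration, they cover just the two formats shown in this help text, and the form itself may accept other variants.

```python
import re

# Hypothetical pattern for the two formats shown in the help text:
# an optional "Chr" prefix, a chromosome name, then start and end
# coordinates separated by "-" or "..".
REGION_RE = re.compile(
    r"^(?:chr)?(?P<chrom>[0-9]+|X|Y|MT)"
    r":(?P<start>\d+)(?:-|\.\.)(?P<end>\d+)$",
    re.IGNORECASE,
)

def parse_region(text):
    """Return (chromosome, start, end) for one region string."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized region: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start is greater than end: {text!r}")
    return match["chrom"].upper(), start, end

# Split a comma-separated list such as the one in the example above.
example = "2:105668896-108698410, Chr12:28553755..28867491"
regions = [parse_region(part) for part in example.split(",")]
print(regions)
```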
The Differential Expression Search is specifically designed to allow you to search for genes expressed in some anatomical structures or developmental stages but not in others. Examples of the questions you can ask are:
RNA-Seq data are not yet included in these searches.
|Anatomical Structure | Developmental Stage|
|Anatomical Structure||The anatomical structures are taken from the Mouse Developmental Anatomy Browser. Begin typing a term of interest and an autocomplete list activates on the second character.
For each stage, the Anatomical Dictionary is organized as a hierarchy of structures. The "Find genes where expression is detected in" search therefore includes substructures. This means that if you search for gene expression in, for example, brain, the system searches not only for brain but also for substructures of brain in the hierarchy, such as rhombencephalon.
If you enter a structure in the upper section of the form, then in the lower section, you must either enter another structure or check the box for "anywhere else." If you check the box for “anywhere else,” your search will return genes that have been shown to be expressed in the specified structure (or its substructures) but nowhere else.
If you are unsure of the most appropriate structural term to use, browse the Mouse Developmental Anatomy Browser.
|Developmental Stages||Use this field to select one or more Theiler stages (TS) to focus your search on a particular stage of embryonic development.
Note: The Theiler system organizes development into stages defined by the appearance of specific developmental features. Embryos of the same gestational age can vary considerably with respect to development. Consequently, a Theiler stage does not precisely correspond to a particular age. For example, TS 21 applies to embryos between 12.5 and 14.0 days post conception (dpc), while TS 22 applies to embryos between 13.5 and 15.0 dpc. Therefore, if you are interested in all available records for embryos of age 14.0 dpc, you must select both TS 21 and TS 22.
You can browse Stage descriptions for a list of the defining features for each Theiler stage.
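Because Theiler stage age ranges overlap, a single age can fall within more than one stage. The following minimal sketch shows the lookup described in the note above, using only the two ranges quoted there. The names `STAGE_RANGES_DPC` and `stages_covering` are hypothetical, and treating the range boundaries as inclusive is an assumption based on the 14.0 dpc example.

```python
# Age ranges (dpc) for the two stages quoted in the note above.
# Other Theiler stages are deliberately left out rather than guessed.
STAGE_RANGES_DPC = {
    "TS 21": (12.5, 14.0),
    "TS 22": (13.5, 15.0),
}

def stages_covering(age_dpc):
    """Return every listed stage whose age range includes age_dpc."""
    return [
        stage
        for stage, (low, high) in STAGE_RANGES_DPC.items()
        if low <= age_dpc <= high  # assumption: boundaries are inclusive
    ]

print(stages_covering(14.0))  # ['TS 21', 'TS 22']
print(stages_covering(12.5))  # ['TS 21']
```

An age of 14.0 dpc returns both stages, which is why both must be selected to retrieve all records for embryos of that age.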
The default tab for differential expression search results is the Tissue x Stage Matrix. This matrix is described in detail below. In brief,
Yes. There are 4 ways you can do this:
Yes, you can.
The Genes, Assays and Assay Results tabs offer progressively detailed views of your expression search results. The 2 matrices provide interactive high-level overviews of temporal and spatial expression patterns and offer the ability to drill down to more detail.
|Genes||The Genes tab provides a list of the genes that match your query along with the options to export your results as text files or to send the list to our Batch Query. Batch Query searches can associate the genes with additional information, such as functional or phenotypic data. See Using the MGI Batch Query.|
|Assays||The Assays tab presents a list of the unique assays that match your search, as well as their assayed genes and references.|
|Assay Results||The Assay Results tab provides the most detailed summary. It lists the structures assayed and whether or not expression was detected, in addition to the information provided in the 2 previous tabs.|
|Images||The Images tab allows you to quickly view images of expression results that match your search parameters.|
|Tissue x Stage Matrix||The Tissue x Stage Matrix tab offers a temporal and spatial overview of gene expression results. Toggles allow you to expand the anatomy hierarchy to reveal greater levels of detail in tissues of interest. This tab also allows you to filter results by tissue and developmental stage.|
|Tissue x Gene Matrix||The Tissue x Gene Matrix tab enables the comparison of expression patterns of different genes. As with the tissue x stage matrix, toggles allow you to expand the anatomy hierarchy to reveal greater levels of detail in tissues of interest. This tab also allows you to filter by gene.|
Your search results are displayed within six tabs, each noting the number of results: Genes, Assays, Assay Results, Images, Tissue x Stage Matrix and Tissue x Gene Matrix. With the Standard search, the Assay Results tab is selected by default. For Differential Expression searches, the Tissue x Stage Matrix tab is the default selection. See What are the advantages of the different results tabs? for a brief guide to choosing results tabs.
If your search returns more than 21 million assay results, only the first page of the Assay Results tab will be rendered. The other tabs will be blank. Such large result sets make our pages slow to load. To get full functionality in the summary, you will need to modify your search or apply filters so that it returns fewer results.
If you select the Genes tab, you can export your results as a text file, or forward the list of genes to the MGI Batch Query or MouseMine, where you can find additional information on the genes. Below the Export options, a table summarizes information on the gene(s). You can use the triangles in the column headers to sort by a column. The query summary contains the following fields:
|MGI ID||MGI accession ID for the gene|
|Gene||Symbol of the analyzed gene, linked to its MGI marker detail record|
|Gene Name||Name of the analyzed gene|
|Type||Marker feature type for the analyzed gene. See Genome Feature Type Definitions.|
|Chr||Chromosome to which the marker is assigned. Aside from a chromosome number, other possible values include MT (mitochondrial), XY (XY pseudoautosomal), and UN (chromosome assignment is not known).|
|Genome Location-NCBI Build #||Genome coordinate range in base pairs and the build number of the C57BL/6J genome|
|cM||Genetic map position in centimorgans|
|Strand||Coding strand for the gene|
If you select the Assays tab, the query summary contains the following fields:
|Gene||Symbol of the analyzed gene for most assay types. For RNA-Seq assays, "Whole Genome", reflecting the nature of the assay.|
|Assay Details||For most assay types, the word data, linked to detailed expression results; the MGI accession ID of the assay (useful for searching purposes) appears in parentheses. For RNA-Seq assays, data set, followed by EBI's Expression Atlas set ID; clicking on the filter icon will filter the summary for that data set.|
|Assay Type||Name of the assay type used (e.g., Immunohistochemistry, Northern blot, RNA in situ, and so on).|
|Reference||For most assay types, J number (MGI reference identifier) and short format citation for this reference, linked to its MGI reference detail report. For RNA-Seq assays, the EBI Expression Atlas set ID and title, linked to the GXD RNA-Seq and Microarray Experiment Summary. There you will find details of the experiment and linkouts to the Expression Atlas (to access curated data and additional display and analysis tools), to ArrayExpress and GEO (to access the source data) and to PubMed (to access the publication).|
If you select Assay Results, you can export your results as a text file or export the RNA-Seq results to a heat map. Below the Export options, a table summarizes information on the assays. Ten columns are shown by default; an additional 4 columns are displayed when you use the “Show Additional Sample Data” toggle. You can use the triangles in the column headers to sort by a column. The query summary contains the following fields:
|Gene||Symbol of the analyzed gene|
|Result Details||The word data, linked to detailed expression results in GXD (for most assay results) or to experiment detail pages at EBI’s Expression Atlas (for RNA-Seq assay results); the MGI accession ID of the assay or the RNA-Seq data set ID appears next in parentheses|
|Assay Type||Name of the assay type used (e.g., immunohistochemistry, Northern blot, RNA in situ)|
|Age||Age of the specimen used in the assay. The abbreviations for age annotations are as follows:
- E: Embryonic day
- P: Postnatal or postnatal day
- P w: Postnatal week
- P m: Postnatal month
- P y: Postnatal year
- P adult: Postnatal adult
- P newborn: Postnatal newborn
|Structure||Structure examined in the assay, given as a Theiler stage (TS) and anatomical structure name.|
|Cell Type||Cell type, if specified. Use of the Cell Type (CL) Ontology for annotation began in 2022, so there are not many annotations using it in the database.|
|Detected?||Was expression detected? Possible values are Yes, No, Ambiguous, and Not Specified.
- No is used for RNA-Seq results when the TPM are less than 0.5.
- Not Specified is used when the authors do not report whether a gel band is present or absent.
- Ambiguous is used when the authors specify it or our curators cannot discern from the authors' description whether expression is present or absent.
|TPM Level (RNA-Seq)||Possible values are:
- Below Cutoff = <0.5 TPM
- Low = 0.5-10 TPM
- Medium = 11-1,000 TPM
- High = >1,000 TPM
These mapping conventions are also used by EBI's Expression Atlas.
|Biological Replicates (RNA-Seq)||Column is visible when the Show Additional Sample Data toggle is on.
The number of biological replicates whose TPM values were averaged to generate the reported TPM.
|Images||Name of the specimen or pane with associated images, linked to the appropriate section of the assay detail. Note: Images are shown only when we have permission from the publishers. There are no images for RNA-Seq results.|
|Mutant Allele(s)||If the specimen is a mutant, the allele pairs describing the mutant genotype appear in this field.|
|Strain||Column is visible when the Show Additional Sample Data toggle is on.
Strain information for the specimen/sample, if specified. Use of involves: in the annotation indicates that the strains listed (separated by asterisks) contributed to the genetic background of the specimen; additional unspecified strains may also be part of the specimen's genetic background. Either: appears when the authors use more than one genetic background in a reference but do not specify which one is the background of the specimen.
|Sex||Column is visible when the Show Additional Sample Data toggle is on.
Possible values: Female, Male, and Pooled. A blank field indicates that the authors did not specify the sex.
|Notes (RNA-Seq)||Column is visible when the Show Additional Sample Data toggle is on.
Provides additional information about RNA-Seq samples, if needed.
|Reference||For most assay types, J number (MGI reference identifier) and short format citation for this reference, linked to its MGI reference detail report. For RNA-Seq assays, the EBI Expression Atlas set ID and title, linked to the GXD RNA-Seq and Microarray Experiment Summary. There you will find details of the experiment and linkouts to the Expression Atlas (to access curated data and additional display and analysis tools), to ArrayExpress and GEO (to access the source data) and to PubMed (to access the publication).|
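If you work with exported RNA-Seq results, the TPM Level bins listed above can be reproduced with a small function. The sketch below is illustrative only: the function name `tpm_level` is hypothetical, and because the help text gives Low as 0.5-10 TPM and Medium as 11-1,000 TPM without saying how values between 10 and 11 are handled, the code assumes they fall into Medium.

```python
def tpm_level(tpm):
    """Map an averaged TPM value to the level label used in the summary."""
    if tpm < 0:
        raise ValueError("TPM cannot be negative")
    if tpm < 0.5:
        # Results under 0.5 TPM are also the ones reported as Detected? = No.
        return "Below Cutoff"
    if tpm <= 10:
        return "Low"
    # Assumption: values between 10 and 11 are binned as Medium; the help
    # text does not state how that gap is treated.
    if tpm <= 1000:
        return "Medium"
    return "High"

for value in (0.2, 0.5, 10, 500, 1000, 2500):
    print(value, tpm_level(value))
```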
If you select the Images tab, the query summary contains the following fields:
|Figure||A thumbnail image, linked to the MGI Image Detail page.|
|Gene||Symbol of the analyzed gene(s).|
|Assay Type||Name of the assay type(s) used (e.g., Immunohistochemistry, Northern blot, RNA in situ, and so on).|
|Specimen Type||Describes whether the sample is a whole mount, section, section from a whole mount, or optical section.|
|Result Details||Links to image and expression assay result details.|
The Tissue x Stage Matrix tab presents the query results in a grid view. The columns show the developmental (Theiler) stages (TS) at which expression was examined, and the rows display the tissues that were examined. Theiler stages are defined using morphological criteria, not gestational age. For example, TS4 represents the blastocyst stage. For simplicity and ease of reference, the column headers also display the gestational age range over which each developmental stage may be found. The Theiler Stages are further defined here. The tissues consist of anatomy terms from The Mouse Developmental Anatomy Ontology. You can expand or collapse the anatomy hierarchy by clicking on the blue ► or ▼.
All expression results filters are applied across the six summary tabs, including those filters that originate in the matrix. On the matrix, check rows and/or columns and click the Filter button to reduce the grid to just the rows and columns you have checked. Selected filters will be displayed above the tab; they can be removed by clicking on them. Any filtering done on the grid will apply to all the other data tabs as well. Individual cells have the ability to filter as well. Clicking on a cell displays the options View All Results and View Images. Selecting one of those options filters the results and takes you to the appropriate summary tab where you can explore those results by visiting detail pages.
The Tissue x Gene Matrix tab presents the query results in a grid view. The columns show genes whose expression was examined and the rows display the tissues that were examined. The tissues consist of anatomy terms from The Mouse Developmental Anatomy Ontology. You can expand or collapse the anatomy hierarchy by clicking on the blue ► or ▼.
In addition to the filters above the tabs, you can also filter by checking rows and/or columns in the grid and clicking on the Filter button to reduce the grid to just the rows and columns you have checked. Selected filters will be displayed above the tab; they can be removed by clicking on them. Any filtering done on the grid will apply to all the other data tabs as well. Individual cells have the ability to filter as well. Clicking on a cell displays the options View All Results and View Images. Selecting one of those options filters the results and takes you to the appropriate summary tab where you can explore those results by visiting detail pages.
The colors in the grid, blues for expression detected and reds for expression not detected, get progressively darker when there are more supporting annotations. The darker colors do not denote higher or lower levels of expression, just more evidence.
Grey bars indicate the only annotations for the tissue are ones for which the researchers examined the tissue but did not definitively state whether expression was present or absent, i.e. their description was ambiguous.
Squares with a red corner indicate that expression has been reported as being both present and absent in that tissue. Although this may be because different labs made conflicting observations, it may also be due to other factors, such as differences in the genotype or sex of the specimen, sensitivity of the assay used, or a difference in the transcript variant assayed.
A square with a gold corner indicates that although there are no annotations where expression was reported as being present in either the tissue or its substructures, there are expression results for substructures of the tissue that were reported as being either absent or ambiguous. See How are expression results propagated through the matrix anatomy hierarchy?
A circle in a square (found only in the Tissue x Stage Matrix) indicates the tissue exists at that Theiler stage but we have no expression data for it, whereas a blank square indicates that the tissue does not exist at that stage.
Clicking in any box in the grid will open a dialogue box. For blue squares, this box will indicate the number of annotations where expression was reported as being present in either the tissue or its substructures. It may also indicate additional expression results annotated to this tissue that were reported as being either absent or ambiguous; absent or ambiguous annotations to substructures will not be reported. See How are expression results propagated through the matrix anatomy hierarchy?
For red squares, this box will indicate the number of annotations where expression was reported as being absent in the tissue; annotations to substructures will not be reported. See How are expression results propagated through the matrix anatomy hierarchy?
For grey bars, this box will indicate the number of annotations to the tissue where the expression levels were reported in an ambiguous fashion. It may also indicate additional expression results annotated to this tissue that were reported as being absent. In neither case will absent and/or ambiguous annotations to substructures be reported. See How are expression results propagated through the matrix anatomy hierarchy?
Since squares with a gold corner indicate that all the expression results are absent or ambiguous annotations to substructures (rather than the structure represented by the square) the number of annotations is not indicated in the dialogue box.
In all cases, clicking "View All Results" will take you to the assay results tab where these results can be reviewed in detail. If there are annotations where expression was reported as being absent or ambiguous in a substructure, they will be shown on this tab (although they will not be included in the counts on the dialogue box). Clicking "View Images" will take you to the Images tab.
The nature of the expression data captured by GXD requires present and absent expression results to be managed differently in the matrix anatomy hierarchy. From the statement, "expression was detected in the heart ventricle," we can infer that "expression is detected (somewhere) in the heart." In contrast, the statement "expression was not detected in the heart ventricle" does not imply that expression is absent from the rest of the heart. Therefore, a positive expression result is displayed in the matrix as a blue square for that anatomical structure, and any parent structure will also display a blue square based on that present result. Absent expression results, by contrast, can only be represented for the annotated structure itself. Parent structures will not display a red square based on an absent result in a child structure.
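The propagation rule described above can be sketched in a few lines of Python. This is not GXD's implementation: the `PARENT` table is a hypothetical fragment of an anatomy hierarchy, the function names are invented for illustration, and the sketch ignores ambiguous results and the corner markers used in the real matrix.

```python
# Hypothetical child -> parent links for a tiny fragment of the anatomy.
PARENT = {
    "heart ventricle": "heart",
    "heart atrium": "heart",
    "heart": "cardiovascular system",
}

def ancestors(structure):
    """Yield each parent of a structure, walking up to the root."""
    while structure in PARENT:
        structure = PARENT[structure]
        yield structure

def matrix_cells(annotations):
    """Build {structure: 'detected' | 'not detected'} from annotations.

    annotations is a list of (structure, detected) pairs, detected a bool.
    """
    present, absent = set(), set()
    for structure, detected in annotations:
        if detected:
            # A positive result also counts for every parent structure.
            present.add(structure)
            present.update(ancestors(structure))
        else:
            # A negative result applies only to the annotated structure.
            absent.add(structure)
    cells = {s: "not detected" for s in absent}
    # Simplification: where both kinds of result exist, show "detected".
    cells.update({s: "detected" for s in present})
    return cells

print(matrix_cells([("heart ventricle", True)]))
print(matrix_cells([("heart ventricle", False)]))
```

The first call marks heart ventricle, heart and cardiovascular system as detected; the second marks only heart ventricle as not detected, leaving its parents without a cell.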
You can filter results by attributes of the assay results or by vocabulary annotations to the genes analyzed. In the banner above your results, below Filter expression data by:, click the button for the desired filter. This will open a box displaying all the filter terms associated with the assay results or genes in your dataset. Check the boxes of interest and click Filter to reduce your results to only those annotated to the checked items. Filters that have been applied are displayed below the buttons. You can remove them by clicking on them individually or by using the “Remove All Filters” button.
Note:
Assay Result Filters (left column)
|Anatomical System||High-level terms to describe tissue analyzed|
|Theiler Stage||Developmental stage analyzed|
|Assay Type||Assay type used for study|
|Detected?||Either: Yes; No; Not Specified; or Ambiguous|
|TPM Level||For RNA-Seq results; either: Below Cutoff; Low; Medium; or High|
|Wild type?||Whether the specimen/sample was wild type or mutant.|
Gene Filters (right column)
|Gene Type||Type of marker, e.g. protein-coding, non-coding RNA gene|
|Molecular Function||High-level GO terms describing the molecular processes the gene product is involved with|
|Biological Process||High-level GO terms describing the biological processes the gene product is involved with|
|Cellular Component||High-level GO terms describing the cellular location of the gene product|
|Phenotype||High-level Mammalian Phenotype (MP) terms describing the phenotype of mice carrying mutant alleles of the gene|
|Disease||High-level Disease Ontology (DO) terms describing the diseases mice carrying mutant alleles of the gene manifest|
Both Matrix tabs offer easy filtering by tissue. The Tissue x Stage Matrix also offers filtering by developmental stage/age, while the Tissue x Gene Matrix permits filtering by gene symbol. Check the rows or columns you wish to retain and then click the Filter button in the grid to apply the filters. These filters are also applied to all tabs.
Yes. The Genes and Assay Results tabs provide export options. Click on the Text File icon. Your web browser will download a file of the results. The Assay Results file will include all columns, regardless of whether or not the "Show Additional Sample Data" toggle is chosen.
Yes. Click on the RNA Seq ► Heat Map button above the table of results in the Assay Results tab. A browser window will open, displaying your selected data in Morpheus, a heat map visualization and analysis tool created at the Broad Institute. A summary of the data exported to Morpheus is at the top of the page. Each column of the heat map is a bioreplicate set; the GXD metadata annotations for the set are included in the column display. Each row in the heat map is a gene; the display includes its symbol and its MGI and Ensembl IDs.
Note: For any experiment, the number of samples displayed in GXD (and the heat map) may be fewer than the number in the Expression Atlas, the data source. The reasons for this are twofold:
Yes. Under the Genes tab you have the option of exporting your results as a text file, or sending the list of genes to the MGI Batch Query or MouseMine. If you click the Batch Query icon, the default Batch Summary will appear for your list of genes. Use the Click to modify search button to expand your search results to include additional information such as Gene Ontology (GO) terms, Mammalian Phenotype (MP) terms or Human Diseases associated with the genes. MouseMine offers flexible querying, numerous predefined query templates, iterative refinement of results, and linking to other model organism Mines.
See also Using the MGI Batch Query.
At the end of the query summary, there may be a link to Gene Expression Literature content records matching the parameters of your detailed expression data search.
Thus, to ensure that your query returns all the references of interest to you, searches of the detailed expression data also include searches of the literature content records (whenever query parameters apply to both these types of data). Filtering your query summary will not affect the number of literature records returned, whereas modifying your query will.
The following detailed expression data query parameters generate an expression literature search:
Because of the limited annotation of content records, the following parameters do not generate an expression literature search:
Note: You can query gene expression literature content records directly using the Gene Expression Literature Query Form.
GXD curators add new annotations from the literature every day. The MGI web site is updated with these and other new data once a week.
The following examples list the field values to set on the Gene Expression Data Standard Query to perform each query. Leave the default values in all other fields.
Genes…Search for expression data for…: en1
This query returns a list of results from assays examining En1 expression. This summary report includes the gene symbol, result details linked to an assay record (click data), the assay type used, the anatomical system assayed, the age of the specimen, the specific structure examined, an indication of whether or not expression was detected, links to images, the mutant allele pairs describing the mutant genotype of the specimen (if applicable), and the reference from which the data were curated, linked to its abstract.
Expression: detected in
Anatomical Structures: heart
Developmental Stages (dpc): Select the Ages (dpc) tab and then E8.5
Mutant / wild type: Wild type specimens only
This query returns a list of assay results for genes expressed in the heart at E8.5. Click the Genes tab to view a summary report that includes the gene symbols (with links to MGI detail pages), gene types and genomic map coordinates. The report does not include any expression results obtained from mutant alleles. Note: Each developmental stage in the example includes coverage of embryonic day 8.5. See Theiler System for details.
Genes…Search for expression data for…: wnt*
Expression: detected in
Developmental Stages (dpc): TS 16
This query returns a list of assay results for Wnt family genes whose expression was detected in Theiler Stage 16.
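The `wnt*` and `en1` examples above rely on wildcard and case-insensitive matching. If you want to preview which symbols in your own gene list a pattern like this would select, the following sketch may help. It is an approximation only: `symbol_matches` is a hypothetical helper, and the actual form also searches gene names and synonyms using rules not described here.

```python
from fnmatch import fnmatchcase

def symbol_matches(pattern, symbol):
    """Case-insensitive match where * stands for any run of characters."""
    return fnmatchcase(symbol.lower(), pattern.lower())

# A few mouse gene symbols, used here only as sample input.
symbols = ["Wnt1", "Wnt3a", "Wnt5a", "En1", "Shh"]

print([s for s in symbols if symbol_matches("wnt*", s)])
# ['Wnt1', 'Wnt3a', 'Wnt5a']
print([s for s in symbols if symbol_matches("en1", s)])
# ['En1']
```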
Genes…A set of genes defined by: axon guidance
This query returns a list of all assay results for all genes known to be involved in mouse axon guidance.
Expression: not detected in
Anatomical Structure(s): diencephalon
This query returns a list of genes found not to be expressed in the diencephalon or in anatomical structures within the diencephalon, such as the hypothalamus, 3rd ventricle, or pituitary gland.
Genes…A set of genes defined by: cell adhesion
Expression: detected in
Anatomical Structures: stomach mesenchyme
This query returns a list of genes involved in cell adhesion found to be expressed in the stomach mesenchyme.