Using the Protein Superfamily Browser

This help document answers common questions about the Protein Superfamily Browser.

What is a protein superfamily?

Protein families are clustered into homeomorphic superfamilies. PIRSF uses this concept to represent protein clusters with 80% sequence similarity. Currently there are about 36,000 PIRSF superfamilies, of which 1,786 are represented in mouse.

What is PIRSF?

PIRSF is a curated protein classification system, based on sequence analysis, that clusters all known protein sequences into a network structure representing domain superfamilies, homeomorphic superfamilies, subfamilies and families. At one level, a computationally derived homeomorphic clustering aims to divide all proteins into groups, called superfamilies, with 80 percent sequence similarity. These homeomorphic superfamilies contain protein sequences that are homologous from end to end and share the same domain architecture.

What is a PIRSF ID?

A PIRSF ID identifies a data record in the PIRSF database. PIRSF IDs have an alphanumeric format of PIRSFnnnnnn; PIRSF005574, for example, is the ID for DNA mismatch repair protein.
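
As an illustration of the PIRSFnnnnnn format, the following minimal Python sketch checks whether a string has that shape. It assumes that each "n" is a decimal digit and that the prefix is upper case; the helper name is_valid_pirsf_id is hypothetical and is not part of MGI or PIRSF.

```python
import re

# Assumed pattern: the literal prefix "PIRSF" followed by exactly six digits.
PIRSF_ID_PATTERN = re.compile(r"PIRSF\d{6}")


def is_valid_pirsf_id(text: str) -> bool:
    """Return True if text has the PIRSFnnnnnn shape (hypothetical helper)."""
    return PIRSF_ID_PATTERN.fullmatch(text.strip()) is not None


print(is_valid_pirsf_id("PIRSF005574"))   # six digits after the prefix
print(is_valid_pirsf_id("PIRSF0000894"))  # seven digits after the prefix
```

A check like this only confirms the shape of an ID; it cannot tell whether the ID exists in the PIRSF database.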

What is the Protein Superfamily Vocabulary?

The Protein Superfamily Vocabulary is a catalog of Protein Information Resource SuperFamily (PIRSF) classification names annotated to MGI genes.

What other terminology is associated with this tool?

See Protein Superfamily Terms and Concepts.

How is the Protein Superfamily Browser organized?

The Vocabulary has a flat structure and is organized alphanumerically (A-Z; 0-9). When you click any letter of the alphabet, a list of all protein superfamilies beginning with that letter appears, arranged alphabetically. When you click 0-9, one list appears, containing all protein superfamilies beginning with a number (e.g., 1 alpha,25-dihydroxyvitamin D3-inducible protein). The number of mouse genes in MGI appears in parentheses next to each protein superfamily name. If this text, e.g., (3 genes), is absent, MGI currently contains no annotations for that superfamily.
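
The alphanumeric organization described above can be pictured with a short Python sketch that sorts superfamily names into A-Z buckets plus a single 0-9 bucket. This is only an illustration, not the Browser's actual implementation: the function names are hypothetical, and the handling of names that start with neither a letter nor a digit is an assumption.

```python
import string
from collections import defaultdict


def index_key(name: str) -> str:
    """Return the bucket label for a superfamily name (hypothetical helper)."""
    first = name.strip()[0]
    if first in string.digits:
        return "0-9"          # all names starting with a number share one list
    return first.upper()      # assumption: any other character is its own bucket


def build_index(names):
    """Group names by bucket and sort each bucket alphabetically."""
    index = defaultdict(list)
    for name in names:
        if name.strip():      # skip empty entries
            index[index_key(name)].append(name)
    return {key: sorted(values, key=str.lower) for key, values in index.items()}


index = build_index([
    "DNA mismatch repair protein",
    "1 alpha,25-dihydroxyvitamin D3-inducible protein",
    "DNA excision repair protein",
])
print(index["0-9"])
print(index["D"])
```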


What is the difference between browsing and searching the vocabulary?

You can either browse or search from the Protein Superfamily Browser entry page.


What can I use the Protein Superfamily Browser to find?

You can use the Browser to search for new gene clusters according to the PIRSF superfamilies they derive from. The tool integrates MGI associations between mouse genes and their representative polypeptide sequences with the PIRSF protein clustering data. MGI genes (mouse, human and rat) appear on the Protein Superfamily Detail page, grouped and classified by structural and functional relationships, based on PIRSF homeomorphic superfamilies combined with MGI orthology data.


What do I need to know about Boolean operators in order to use this tool?

See What Boolean operators are allowed? How do they work? Is there a default operator?.

Can I enter more than one term or PIRSF ID? Can I mix a protein superfamily name with a PIRSF ID?

Yes, you can. However, the tool interprets any spaces in your query as ANDs. Therefore, if you do not use a Boolean operator, make sure that any superfamily names and IDs you enter belong to the same superfamily. See What Boolean operators are allowed? How do they work? Is there a default operator? and Examples for sample entries and results.
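
To show why a space-separated query must describe a single superfamily, here is a minimal Python sketch of the "spaces are ANDs" rule. It is not the tool's actual search code. It assumes that each term is matched case-insensitively as a substring of the record's ID and name, that an explicit "and" is equivalent to a space, and that OR separates alternatives that are evaluated independently; the real tool's operator precedence may differ. The function name matches is hypothetical.

```python
import re


def matches(query: str, pirsf_id: str, name: str) -> bool:
    """Hypothetical matcher: OR separates alternatives; spaces within one act as AND."""
    haystack = f"{pirsf_id} {name}".lower()
    # Assumption: the OR keyword is recognized regardless of case.
    alternatives = re.split(r"\s+or\s+", query.strip(), flags=re.IGNORECASE)
    for alternative in alternatives:
        # An explicit "and" adds nothing, because spaces already mean AND.
        terms = [t.lower() for t in alternative.split() if t.lower() != "and"]
        if terms and all(term in haystack for term in terms):
            return True
    return False


record = ("PIRSF005574", "DNA mismatch repair protein")

# ID and name from the same superfamily: every term is found.
print(matches("PIRSF005574 DNA mismatch repair protein", *record))
# ID from a different superfamily: the implicit AND fails.
print(matches("PIRSF003333 DNA mismatch repair protein", *record))
# The same query with OR: the name alternative matches on its own.
print(matches("PIRSF003333 OR DNA mismatch repair protein", *record))
```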


How do I get the details about protein superfamilies with MGI gene annotations?

There are two ways to use the Protein Superfamily Browser to find protein superfamilies with MGI gene annotations.

  1. Use the alphanumerical list at the top of the Protein Superfamily Browser and click a letter or number. A list of protein superfamilies beginning with that letter or number appears.
  2. Use the Search box to enter term(s) or PIRSF ID(s) and click Search. The Protein Superfamily Browser page lists any matches found. Note: If there are MGI gene annotations for a superfamily, the number of genes appears in parentheses beside the protein superfamily name.

Once a superfamily list appears, click an item to view its Protein Superfamily Detail page. On the Protein Superfamily Detail page, there are links to additional MGI resources. (See Interpreting a Protein Superfamily Detail page.)

Clicking the PIRSF ID number takes you to the PIR iProClass report (for an example, see the iProClass tutorial).


How do I interpret a Protein Superfamily Detail page?

See Interpreting a Protein Superfamily Detail Page.

What should I do if I get no results?

If your query returns no results, a Protein Superfamily Browser summary displays your input text and reports that there were zero matching protein superfamily terms for that term (or ID), e.g.:

You searched for...
Protein Superfamily Vocabulary: contains PIRSF0000894 searching Protein Superfamily vocabulary terms and accessionIDs.
0 matching Protein Superfamily terms

The probable cause is that something in the search box is incorrect. For example, the PIRSF ID in the query above has seven digits, whereas a valid PIRSF ID has six (PIRSFnnnnnn).

See the sample query entries below for additional help.


Are there any examples?

Your entry | # matches* | Why?
PIRSF003333 and DNA excision repair protein | 0 | The PIRSF identifier and the term are not from the same PIRSF superfamily name. Use OR to connect this ID and term (or change the ID to, e.g., PIRSF003332, an excision repair protein superfamily ID).
DNA repair | 13 or more | The tool finds any protein superfamily name containing DNA and repair. Results include DNA excision repair cross-complementing protein ERCC3, DNA mismatch repair protein, yeast DNA repair protein RAD51, and more.
PIRSF005574 OR DNA mismatch repair protein | 4 or more | The tool finds entries containing either the PIRSF ID or DNA. It interprets the additional spaces as ANDs and will return any entries with mismatch and repair and protein.
DNA repair and recombination protein | 2 or more | The tool finds any protein superfamily name containing DNA and repair and recombination and protein.
PIRSF005574 | 1 | Using the PIRSF identifier (in this case, PIRSF005574) yields the most narrow result (i.e., if you are looking specifically for DNA mismatch repair protein).
PIRSF005574 DNA mismatch repair protein | 1 | Entering both the superfamily name and its identifier also yields a narrow result (i.e., if you are looking specifically for DNA mismatch repair protein).

* The number of matches in the examples is representative and may not be the same as what you get when you query. The point is to show how structuring a query in various ways yields different (or zero) results. Sometimes getting zero (0) results just means that MGI has no records for the superfamily or PIRSF ID at this time.

See also Using Full-Text Searches on MGI Query Forms for detailed explanations on using Boolean operators. If your results are not as expected, this document provides additional examples of queries and returns.


Contributing Projects:
Mouse Genome Database (MGD), Gene Expression Database (GXD), Mouse Tumor Biology (MTB), Gene Ontology (GO), MouseCyc
Last database update: 11/20/2009 (MGI_4.31)
The Jackson Laboratory