Symbol report help
Each gene with an approved VGNC symbol has its own Symbol Report that contains our manually curated data and links to many other external biomedical resources. The VGNC "core data" is displayed at the top of the page in a separate box and presents the approved nomenclature, the unique VGNC ID number, synonyms, previous nomenclature, locus type, and chromosomal location. The panels below the VGNC "core data" provide links to external resources such as gene resources and homologs in other species.
† - Placeholder symbol. If you have functional data about this gene or its product(s) please contact us.
Curated - Where links to external resources have been manually curated by a member of the VGNC, a curated tag is displayed after the assertion. Where links have been downloaded from external resources, the curated tag is not shown. Please note that assertions without the curated tag are not subject to our strict manual checking and curation procedures and hence we cannot guarantee the reliability of the data.
The text that follows is a field-by-field guide to the information provided in the Symbol Report.
Core Data fields
The official gene symbol approved by the VGNC, which is a short abbreviated form of the gene name. Symbols are approved in accordance with the Guidelines for Human Gene Nomenclature (please refer to the guidelines page on the HGNC website).
A unique ID provided by the VGNC for each gene with an approved symbol. IDs are of the format VGNC:n, where n is a unique number.
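As an illustration of the VGNC:n format, the following minimal Python sketch checks a string against that pattern and extracts the number. The function name `parse_vgnc_id` is hypothetical, and the rule that n is a positive integer without leading zeros is an assumption; the format description above only says that n is a unique number.

```python
import re

# Assumption: "n" is a positive integer written without leading zeros.
VGNC_ID_PATTERN = re.compile(r"VGNC:([1-9][0-9]*)")


def parse_vgnc_id(text: str) -> int:
    """Return the numeric part of a VGNC ID, or raise ValueError."""
    match = VGNC_ID_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a VGNC ID: {text!r}")
    return int(match.group(1))
```

For example, `parse_vgnc_id("VGNC:1234")` returns the integer 1234 (an illustrative number, not a reference to a particular gene), while a string such as `"HGNC:5"` raises `ValueError`.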
This field displays any symbols that were previously VGNC-approved nomenclature. Many genes will have no data in this field as the approved symbol has never changed.
This field displays any names that were previously VGNC-approved nomenclature. Many genes will have no data in this field as the approved name has never been changed.
Alternative symbols that have been used to refer to the gene. Aliases may be from literature, from other databases or may be added to represent membership of a gene family.
Alternative names for the gene. Aliases may be from literature, from other databases or may be added to represent membership of a gene family.
Specifies the genetic class of each gene entry. All VGNC locus types are listed below:
- gene with protein product - protein-coding genes (the protein may be predicted and of unknown function) (SO:0001217)
- RNA, cluster - region containing a cluster of small non-coding RNA genes
- RNA, long non-coding - non-protein coding genes that encode long non-coding RNAs (lncRNAs) (SO:0001877); these are at least 200 nt in length. Subtypes include intergenic (SO:0001463), intronic (SO:0001903) and antisense (SO:0001904).
- RNA, micro - non-protein coding genes that encode microRNAs (miRNAs) (SO:0001265)
- RNA, ribosomal - non-protein coding genes that encode ribosomal RNAs (rRNAs) (SO:0001637)
- RNA, small nuclear - non-protein coding genes that encode small nuclear RNAs (snRNAs) (SO:0001268)
- RNA, small nucleolar - non-protein coding genes that encode small nucleolar RNAs (snoRNAs) containing C/D or H/ACA box domains (SO:0001267)
- RNA, small cytoplasmic - non-protein coding genes that encode small cytoplasmic RNAs (scRNAs) (SO:0001266)
- RNA, transfer - non-protein coding genes that encode transfer RNAs (tRNAs) (SO:0001272)
- RNA, small misc - non-protein coding genes that encode miscellaneous types of small ncRNAs, such as vault (SO:0000404) and Y (SO:0000405) RNA genes
- phenotype only - mapped phenotypes where the causative gene has not been identified (SO:0001500)
- pseudogene - genomic DNA sequences that are similar to protein-coding genes but do not encode a functional protein (SO:0000336)
- complex locus constituent - transcriptional unit that is part of a named complex locus
- endogenous retrovirus - integrated retroviral elements that are transmitted through the germline (SO:0000100)
- fragile site - a heritable locus on a chromosome that is prone to DNA breakage
- immunoglobulin gene - gene segments that undergo somatic recombination to form heavy or light chain immunoglobulin genes (SO:0000460). Also includes immunoglobulin gene segments with open reading frames that either cannot undergo somatic recombination, or encode a peptide that is not predicted to fold correctly; these are identified by inclusion of the term "non-functional" in the gene name.
- immunoglobulin pseudogene - immunoglobulin gene segments that are inactivated due to frameshift mutations and/or stop codons in the open reading frame
- protocadherin - gene segments that constitute the three clustered protocadherins (alpha, beta and gamma)
- readthrough - a naturally occurring transcript containing coding sequence from two or more genes that can also be transcribed individually
- region - extents of genomic sequence that contain one or more genes, also applied to non-gene areas that do not fall into other types
- T cell receptor gene - gene segments that undergo somatic recombination to form either alpha, beta, gamma or delta chain T cell receptor genes (SO:0000460). Also includes T cell receptor gene segments with open reading frames that either cannot undergo somatic recombination, or encode a peptide that is not predicted to fold correctly; these are identified by inclusion of the term "non-functional" in the gene name.
- T cell receptor pseudogene - T cell receptor gene segments that are inactivated due to frameshift mutations and/or stop codons in the open reading frame
- transposable element - a segment of repetitive DNA that can move, or retrotranspose, to new sites within the genome (SO:0000101)
- unknown - entries where the locus type is currently unknown
- virus integration site - target sequence for the integration of viral DNA into the genome
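For readers who process locus types programmatically, the list above can be captured as a simple lookup table. The sketch below is one possible Python representation; the names `LOCUS_TYPE_TO_SO` and `so_accession` are hypothetical, and the Sequence Ontology accessions are copied from the list above. Locus types listed without a single accession of their own (including "RNA, small misc", which cites accessions only for its vault and Y subtypes) are deliberately left out and return `None`.

```python
from typing import Optional

# Locus type -> Sequence Ontology accession, as given in the list above.
LOCUS_TYPE_TO_SO = {
    "gene with protein product": "SO:0001217",
    "RNA, long non-coding": "SO:0001877",
    "RNA, micro": "SO:0001265",
    "RNA, ribosomal": "SO:0001637",
    "RNA, small nuclear": "SO:0001268",
    "RNA, small nucleolar": "SO:0001267",
    "RNA, small cytoplasmic": "SO:0001266",
    "RNA, transfer": "SO:0001272",
    "phenotype only": "SO:0001500",
    "pseudogene": "SO:0000336",
    "endogenous retrovirus": "SO:0000100",
    "immunoglobulin gene": "SO:0000460",
    "T cell receptor gene": "SO:0000460",
    "transposable element": "SO:0000101",
}


def so_accession(locus_type: str) -> Optional[str]:
    """Return the SO accession for a locus type, or None if none is listed."""
    return LOCUS_TYPE_TO_SO.get(locus_type)
```

Note that the immunoglobulin and T cell receptor gene types share one accession in the list, so the mapping cannot be inverted to recover a unique locus type.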
The taxonomy identifier for the species this gene belongs to. This taxon ID is taken from the NCBI Taxonomy browser.
The GenBank common name for the species this gene belongs to, as assigned by the NCBI.
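The core data fields described above could be modelled as a small record type. The following Python sketch is only an illustration: the class name `CoreData` and all attribute names are hypothetical and are not taken from any VGNC download format or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoreData:
    """Hypothetical container for the core data fields of a Symbol Report."""

    approved_symbol: str
    vgnc_id: str                # format "VGNC:n"
    locus_type: str             # one of the locus types listed in this guide
    taxon_id: int               # NCBI taxonomy identifier
    species_common_name: str    # GenBank common name
    previous_symbols: List[str] = field(default_factory=list)
    previous_names: List[str] = field(default_factory=list)
    alias_symbols: List[str] = field(default_factory=list)
    alias_names: List[str] = field(default_factory=list)

    def has_been_renamed(self) -> bool:
        """True if any previously approved symbol or name is recorded."""
        return bool(self.previous_symbols or self.previous_names)


# Illustrative values only; this is not a real VGNC entry.
example = CoreData(
    approved_symbol="EXAMPLE1",
    vgnc_id="VGNC:1",
    locus_type="gene with protein product",
    taxon_id=9598,
    species_common_name="chimpanzee",
)
```

The list-valued fields default to empty lists, which mirrors the note above that many genes have no previous symbols or names.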
This section contains information on homologs of the gene in other species, presented in a table with species, symbol and database ID as the columns.
Provides links to external pages dedicated to information on the gene and to genome browsers. Links are to the following pages:
- The NCBI Gene page provides curated sequence and descriptive information about genetic loci, including official nomenclature, synonyms, sequence accessions, phenotypes, EC numbers, MIM numbers, UniGene clusters, homology, map locations, and related web sites. There is also a link to the gene annotation in the NCBI Sequence Viewer, the graphical display for the NCBI Nucleotide and Protein databases.
- The Ensembl Gene View displays data associated at the gene level such as orthologs, paralogs, regulatory regions and splice variants. There is a link to the gene annotation at the Ensembl Genome Browser.
Information on proteins encoded by the gene in question. Links are made via UniProt protein accessions. There are four possible links per Symbol Report:
- The UniProt page for the protein product encoded by the gene. The UniProt Protein Knowledgebase is described as a curated protein sequence database that provides a high level of annotation, a minimal level of redundancy and a high level of integration with other databases. Within UniProt, we map only to Swiss-Prot entries, not to TrEMBL entries, because Swiss-Prot entries are manually annotated and reviewed.
- The InterPro page mapped to the displayed UniProt protein accession. InterPro is an integrated database of predictive protein "signatures" used for the classification and automatic annotation of proteins and genomes.
- The PDBe page mapped to the displayed UniProt accession. PDBe is a founding member of the Worldwide Protein Data Bank which collects, organises and disseminates data on biological macromolecular structures.
- The Reactome protein-level page mapped to the displayed UniProt protein accession. Reactome is a manually curated and peer-reviewed pathway database.