Schema for UniProt Variants - UniProt/SwissProt Amino Acid Substitutions
  Database: hg38    Primary Table: spMut    Data last updated: 2024-03-26
Big Bed File Download: /gbdb/hg38/uniprot/unipMut.bb
Item Count: 117,309
The data is stored in the binary BigBed format.

Format description: Browser extensible data (12 fields) plus information about uniProt mutation
field        example                               description
chrom        chr1                                  Chromosome (or contig, scaffold, etc.)
chromStart   166860283                             Start position in chromosome
chromEnd     166860286                             End position in chromosome
name         R198Q                                 Name of item
score        1000                                  Score from 0-1000
strand       -                                     + or -
thickStart   166860283                             Start of where display should be thick (start codon)
thickEnd     166860286                             End of where display should be thick (stop codon)
reserved     0                                     Used as itemRgb as of 2004-11-22
blockCount   1                                     Number of blocks
blockSizes   3                                     Comma separated list of block sizes
chromStarts  0                                     Start positions relative to chromStart
status       Manually reviewed (Swiss-Prot)        Status
varType      Naturally occurring sequence variant  Variant Type
diseases                                           Diseases
mutation     position 198, Arg changed to Gln      Coding seq. mutation
comments                                           Comment
variationId  Q96BN2#VAR_038352|VAR_038352          UniProt variant
dbSnpId      rs2272792                             dbSNP
uniProtId    Q96BN2                                UniProt record
pmids                                              Source articles

Sample Rows
 
chrom	chromStart	chromEnd	name	score	strand	thickStart	thickEnd	reserved	blockCount	blockSizes	chromStarts	status	varType	diseases	mutation	comments	variationId	dbSnpId	uniProtId	pmids
chr1	166860283	166860286	R198Q	1000	-	166860283	166860286	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 198, Arg changed to Gln		Q96BN2#VAR_038352|VAR_038352	rs2272792	Q96BN2	
chr1	166936687	166936690	V202I	1000	-	166936687	166936690	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 202, Val changed to Ile		Q71H61#VAR_049948|VAR_049948	rs33958744	Q71H61	
chr1	166989472	166989475	S41A	1000	+	166989472	166989475	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 41, Ser changed to Ala		Q96JY0#VAR_034103|VAR_034103	rs11578336	Q96JY0	
chr1	167063657	167063660	K165N	1000	-	167063657	167063660	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 165, Lys changed to Asn		Q99795#VAR_049874|VAR_049874	rs2228399	Q99795	
chr1	167073522	167073525	D20N	1000	-	167073522	167073525	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 20, Asp changed to Asn		Q99795#VAR_020079|VAR_020079	rs2274531	Q99795	
chr1	167125923	167125926	E265D	1000	+	167125923	167125926	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 265, Glu changed to Asp		Q5VZP5#VAR_034964|VAR_034964	rs267745	Q5VZP5	15489334
chr1	167126526	167126529	R466H	1000	+	167126526	167126529	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 466, Arg changed to His		Q5VZP5#VAR_034965|VAR_034965	rs6668826	Q5VZP5	
chr1	167126643	167126646	A505T	1000	+	167126643	167126646	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 505, Ala changed to Thr		Q5VZP5#VAR_034966|VAR_034966	rs3795605	Q5VZP5	
chr1	167127693	167127696	K855Q	1000	+	167127693	167127696	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 855, Lys changed to Gln		Q5VZP5#VAR_034967|VAR_034967	rs267746	Q5VZP5	15489334
chr1	167128500	167128503	T1124N	1000	+	167128500	167128503	0	1	3	0	Manually reviewed (Swiss-Prot)	Naturally occurring sequence variant		position 1124, Thr changed to Asn		Q5VZP5#VAR_034968|VAR_034968	rs2281959	Q5VZP5	
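
For scripted analysis, a row of this table can be turned into a record keyed by the field names listed in the format description. The following is a minimal Python sketch, not part of the UCSC tooling. It assumes the row is available as one tab-separated text line with the 21 fields in the order shown above; the names FIELDS, INT_FIELDS and parse_row are hypothetical helpers introduced only for this illustration.

```python
# Field names in the order given by the format description above.
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "reserved", "blockCount", "blockSizes",
    "chromStarts", "status", "varType", "diseases", "mutation",
    "comments", "variationId", "dbSnpId", "uniProtId", "pmids",
]

# Fields converted to integers; blockSizes and chromStarts are
# comma-separated lists and are left as strings here.
INT_FIELDS = {"chromStart", "chromEnd", "score", "thickStart",
              "thickEnd", "blockCount"}


def parse_row(line):
    """Split one tab-separated row into a dict keyed by field name."""
    # Strip only the newline: empty trailing fields (e.g. pmids) must survive.
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    row = dict(zip(FIELDS, values))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    return row


# First sample row from the table above; diseases, comments and pmids are empty.
sample = "\t".join([
    "chr1", "166860283", "166860286", "R198Q", "1000", "-",
    "166860283", "166860286", "0", "1", "3", "0",
    "Manually reviewed (Swiss-Prot)",
    "Naturally occurring sequence variant",
    "", "position 198, Arg changed to Gln", "",
    "Q96BN2#VAR_038352|VAR_038352", "rs2272792", "Q96BN2", "",
])

row = parse_row(sample)
print(row["name"], row["chromEnd"] - row["chromStart"], row["uniProtId"])
# prints: R198Q 3 Q96BN2
```

Empty fields are kept as empty strings rather than dropped, so the positions of later columns such as uniProtId stay stable.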

UniProt Variants (spMut) Track Description
 

Description

NOTE:
This track is intended for use primarily by physicians and other professionals concerned with genetic disorders, by genetics researchers, and by advanced students in science and medicine. While the genome browser database is open to the public, users seeking information about a personal medical or genetic condition are urged to consult with a qualified physician for diagnosis and for answers to personal questions.

This track shows the genomic positions of natural and artificial amino acid variants in the UniProt/SwissProt database. The data has been curated from scientific publications by the UniProt staff.

Display Conventions and Configuration

Genomic locations of UniProt/SwissProt variants are labeled with the amino acid change at a given position and, if known, the abbreviated disease name. A "?" is used if there is no disease annotated at this location, but the protein is described as being linked to only a single disease in UniProt.
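
When labels are processed outside the browser, the amino acid change at the start of a label can be split into its parts. The sketch below is illustrative only. It assumes that a label begins with a single-letter substitution such as R198Q (as in the sample rows) and makes no assumption about how a disease abbreviation or "?" is attached, so any trailing text is ignored; parse_label is a hypothetical helper, not a UCSC function.

```python
import re

# One reference residue letter, a 1-based position, one replacement letter.
# Anchored at the start only; anything after the substitution is ignored.
SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_label(name):
    """Return (reference, position, replacement), or None if no match."""
    match = SUBSTITUTION.match(name)
    if match is None:
        return None
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


print(parse_label("R198Q"))
# prints: ('R', 198, 'Q')
```

Labels that do not start with a simple one-residue substitution return None instead of raising, so the caller can decide how to handle them.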

Mouse over a mutation to see the UniProt comments.

Artificially-introduced mutations are colored green and naturally-occurring variants are colored red. For full information about a particular variant, click the "UniProt variant" linkout. The "UniProt record" linkout lists all variants of a particular protein sequence. The "Source articles" linkout lists the articles in PubMed that originally described the variant(s) and were used as evidence by the UniProt curators.

Methods

UniProt sequences were first aligned to RefSeq sequences with BLAT, and these alignments were then lifted to genome positions with pslMap. UniProt variants were parsed from the UniProt XML file and mapped to the genome through the resulting alignments, again using pslMap. This mapping approach draws heavily on the LS-SNP pipeline by Mark Diekhans. The complete script is part of the kent source tree and is located in src/hg/utils/uniprotMutations.
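
The coordinate arithmetic behind this mapping can be illustrated with a short sketch. It is not the pslMap or uniprotMutations code. It only shows how a 1-based amino acid position corresponds to a 0-based, half-open three-nucleotide interval in the coding sequence, which is consistent with the three-base features in the sample rows. Placing that interval on the genome still requires the alignment, which is not modeled here; codon_offsets is a hypothetical helper name.

```python
def codon_offsets(aa_position):
    """Return the 0-based, half-open coding-sequence interval (start, end)
    covered by the codon of a 1-based amino acid position."""
    if aa_position < 1:
        raise ValueError("amino acid positions are 1-based")
    start = 3 * (aa_position - 1)
    return start, start + 3


# Position 198 (the R198Q sample row) covers coding-sequence bases 591-593.
print(codon_offsets(198))
# prints: (591, 594)
```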

Data Access

The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the genome annotation is stored in a bigBed file that can be downloaded from the download server. The underlying data file for this track is called spMut.bb. Individual regions or the whole genome annotation can be obtained using our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain only features within a given range, for example:
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/bbi/uniprot/spMut.bb -chrom=chr6 -start=0 -end=1000000 stdout
Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.
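
The same range query can be issued from a script. The following Python sketch assumes that the bigBedToBed binary is installed and on the PATH, and it reuses the URL and the -chrom, -start and -end options from the example above. The functions bigbedtobed_command and fetch_region are hypothetical wrappers written for this illustration, not part of the UCSC tools.

```python
import subprocess


def bigbedtobed_command(source, chrom=None, start=None, end=None):
    """Build the bigBedToBed argument list for an optional range query."""
    cmd = ["bigBedToBed", source]
    if chrom is not None:
        cmd.append(f"-chrom={chrom}")
    if start is not None:
        cmd.append(f"-start={start}")
    if end is not None:
        cmd.append(f"-end={end}")
    cmd.append("stdout")  # write the BED rows to standard output
    return cmd


def fetch_region(source, chrom, start, end):
    """Run bigBedToBed and return each output row as a list of fields.

    Requires the bigBedToBed binary; raises CalledProcessError on failure.
    """
    cmd = bigbedtobed_command(source, chrom, start, end)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return [line.split("\t") for line in result.stdout.splitlines()]


URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/bbi/uniprot/spMut.bb"

# Show the command without running it; call fetch_region(URL, "chr6", 0, 1000000)
# to execute the query once the binary is available.
print(" ".join(bigbedtobed_command(URL, "chr6", 0, 1000000)))
```

Building the argument list separately from running it keeps the query easy to inspect and avoids passing the URL through a shell.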

Credits

This track was created by Maximilian Haeussler, with advice from Mark Diekhans and Brian Raney.

References

UniProt Consortium. Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res. 2012 Jan;40(Database issue):D71-5. PMID: 22102590; PMC: PMC3245120

Yip YL, Scheib H, Diemand AV, Gattiker A, Famiglietti LM, Gasteiger E, Bairoch A. The Swiss-Prot variant page and the ModSNP database: a resource for sequence and structure information on human protein variants. Hum Mutat. 2004 May;23(5):464-70. PMID: 15108278