Schema for Modern Derived - Modern Human Derived, Denisova Ancestral
Database: hg19    Primary Table: dhcHumDerDenAncRegFixedDbSnp    Data last updated: 2012-10-02
Big Bed File Download: /gbdb/hg19/dhcHumDerDenAnc/dhcHumDerDenAncRegFixedDbSnp.bb
Item Count: 262
The data is stored in the binary BigBed format.

Format description: Human Derived, Denisova Ancestral variants and functional effect predictions from high-coverage Denisova sequencing project
field         example            description
chrom         chr1               Reference sequence chromosome
chromStart    19835716           Start position in chromosome
chromEnd      19835718           End position in chromosome
name          AG/-:rs35206814    Human allele / Denisova ancestral allele
feature       ENSR00001038584    Ensembl Transcript ID or Regulatory Region ID or ID of TFB profile from JASPAR or TRANSFAC
gene          -                  Ensembl Gene ID
extra         -                  Extra info: for coding genes, Ensembl Protein ID and/or HGNC; for Regulatory Motifs, scores & matrix ID
consequence   REGULATORY_REGION  Variant Effect Predictor (VEP) consequence term
cdnaPosition  -                  Offset in transcript, if applicable
cdsPosition   -                  Offset in coding sequence (CDS), if applicable
protPosition  -                  Offset in protein sequence, if applicable
aminoAcids    -                  Amino acid change, if applicable
codons        -                  Codon change, if applicable
humanAl       AG                 Modern human fixed (or major) allele on positive strand
denAl         -                  Denisova (ancestral) allele
chimpAl       -                  Chimpanzee ancestral allele
gorAl         -                  Gorilla ancestral allele
orangAl       -                  Orangutan ancestral allele
denZyg        HOMO               Denisova zygosity of ancestral allele (homozygous/heterozygous)
dbSNP         rs35206814         dbSNP rs ID, if available
tgpFreq       1.0                1000 Genomes Project frequency of modern human allele
flag          RM                 Flag(s): CpG if in CpG island; RM if in repeat masked region; LowQual if conflicting GATK calls; SysErr if prone to systematic errors
geneStrand    .                  Gene strand: '+' or '-', if applicable; otherwise '.'
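
The field list above can be turned into a small row parser. The following is a minimal sketch, assuming the BigBed file has been exported to tab-separated text (for instance with UCSC's bigBedToBed utility) and that the columns appear in the order listed in the schema. The function name parse_row, the extra "flags" key, and the treatment of "." as "no flags" are illustrative assumptions, not part of the track.

```python
# Field names in the order listed in the schema.
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "feature", "gene", "extra",
    "consequence", "cdnaPosition", "cdsPosition", "protPosition",
    "aminoAcids", "codons", "humanAl", "denAl", "chimpAl", "gorAl",
    "orangAl", "denZyg", "dbSNP", "tgpFreq", "flag", "geneStrand",
]


def parse_row(line: str) -> dict:
    """Split one tab-separated row into a dict keyed by schema field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    row = dict(zip(FIELDS, values))
    # Convert the numeric columns; all other columns stay as strings.
    row["chromStart"] = int(row["chromStart"])
    row["chromEnd"] = int(row["chromEnd"])
    # Assumption: tgpFreq is always numeric, as in the example value 1.0.
    row["tgpFreq"] = float(row["tgpFreq"])
    # Assumption: "." means no flag is set; multiple flags are comma-separated.
    row["flags"] = [] if row["flag"] in (".", "") else row["flag"].split(",")
    return row


# The example values from the schema, joined with tabs.
example = "\t".join([
    "chr1", "19835716", "19835718", "AG/-:rs35206814", "ENSR00001038584",
    "-", "-", "REGULATORY_REGION", "-", "-", "-", "-", "-",
    "AG", "-", "-", "-", "-", "HOMO", "rs35206814", "1.0", "RM", ".",
])
row = parse_row(example)
print(row["name"], row["chromEnd"] - row["chromStart"], row["flags"])
```

Because BED-style coordinates are zero-based and half-open, chromEnd minus chromStart gives the item length in bases (2 for the example row).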

Sample Rows
 
chrom  chromStart  chromEnd   name                feature          gene  extra  consequence        cdnaPosition  cdsPosition  protPosition  aminoAcids  codons  humanAl  denAl  chimpAl  gorAl  orangAl  denZyg  dbSNP        tgpFreq  flag          geneStrand
chr1   19835716    19835718   AG/-:rs35206814     ENSR00001038584  -     -      REGULATORY_REGION  -             -            -             -           -       AG       -      -        -      -        HOMO    rs35206814   1.0      RM            .
chr1   20427529    20427530   C/-:rs11373285      ENSR00000075266  -     -      REGULATORY_REGION  -             -            -             -           -       C        -      -        -      -        HOMO    rs11373285   1.0      .             .
chr1   23866717    23866718   C/-:rs11431369      ENSR00000280263  -     -      REGULATORY_REGION  -             -            -             -           -       C        -      -        -      -        HOMO    rs11431369   1.0      .             .
chr1   28913836    28913837   T/TA:rs2995155      ENSR00001038752  -     -      REGULATORY_REGION  -             -            -             -           -       T        TA     TA       TA     AA       HET     rs2995155    1.0      RM,InDelNear  .
chr1   35301168    35301169   A/-:rs11379016      ENSR00000534376  -     -      REGULATORY_REGION  -             -            -             -           -       A        -      -        -      -        HOMO    rs11379016   1.0      RM            .
chr1   43312493    43312494   G/-:rs68140898      ENSR00000164952  -     -      REGULATORY_REGION  -             -            -             -           -       G        -      -        -      -        HOMO    rs68140898   1.0      CpG           .
chr1   51433821    51433822   C/-:rs35037351      ENSR00000536373  -     -      REGULATORY_REGION  -             -            -             -           -       C        -      -        -      -        HOMO    rs35037351   1.0      InDelNear     .
chr1   53573252    53573256   CTCT/-:rs113215343  ENSR00000536661  -     -      REGULATORY_REGION  -             -            -             -           -       CTCT     -      -        -      -        HOMO    rs113215343  1.0      .             .
chr1   93810678    93810679   T/-:rs67079430      ENSR00000540865  -     -      REGULATORY_REGION  -             -            -             -           -       T        -      -        -      -        HOMO    rs67079430   1.0      .             .
chr1   157137496   157137497  C/-:rs67893607      ENSR00001040166  -     -      REGULATORY_REGION  -             -            -             -           -       C        -      -        -      -        HOMO    rs67893607   1.0      .             .
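
In the sample rows, the name column packs the human allele, the Denisova allele and the dbSNP rs ID into one string of the form human/denisova:rsID. A minimal sketch of how such a value might be unpacked follows. The helper split_name is hypothetical, and treating the ":rsID" suffix as optional is an assumption, since only rows from this dbSNP table are shown here.

```python
def split_name(name: str):
    """Split 'human/denisova:rsID' into (human allele, Denisova allele, rs ID).

    The rs ID is returned as None when no ':rsID' suffix is present
    (an assumption; every sample row in this table carries one).
    """
    alleles, _, rs_id = name.partition(":")
    human, _, denisova = alleles.partition("/")
    return human, denisova, rs_id or None


# Name values copied from the sample rows.
for name in ["AG/-:rs35206814", "T/TA:rs2995155", "CTCT/-:rs113215343"]:
    print(split_name(name))
```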

Modern Derived (dhcHumDerDenAnc) Track Description
 

Description

This track shows mutations in the modern human lineage that rose to fixation or near fixation since the split from the last common ancestor with Denisovans, along with predicted functional effects from Ensembl's Variant Effect Predictor (VEP).

Methods

Methods and analysis are described in detail in Note 19 of the supplementary online materials of Meyer et al. (2012).

Whole genome Enredo-Pecan-Ortheus (EPO) alignments of human, chimpanzee, gorilla and orangutan were combined with modern human genotypes from the 1000 Genomes Project Phase 1 (1000G) to identify sites that are fixed (>99.0% frequency in 1000G) or high frequency (>90.0% frequency in 1000G) derived in modern humans and ancestral in chimpanzee and at least one other great ape (gorilla or orangutan). In order to avoid paralogous regions, human and chimpanzee sequences were required to appear in only one EPO alignment block. Some "fixed" sites are in dbSNP; these were separated out from fixed sites not in dbSNP, so three categories of frequency are displayed: Fixed, Fixed+dbSNP, and High Frequency.
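
The three frequency categories can be expressed as a simple decision rule. The sketch below assumes that frequencies are given as fractions between 0 and 1 (as in the tgpFreq column) and that the thresholds are strict inequalities, as the ">" signs above suggest. The function name frequency_category is illustrative and does not come from the original pipeline.

```python
from typing import Optional


def frequency_category(derived_freq: float, in_dbsnp: bool) -> Optional[str]:
    """Assign a derived allele to one of the three displayed categories.

    derived_freq: 1000G frequency of the derived allele, as a fraction.
    in_dbsnp: whether the site has a dbSNP entry.
    Returns None when the site is below the high-frequency threshold.
    """
    if derived_freq > 0.99:
        # "Fixed" sites are reported separately when they appear in dbSNP.
        return "Fixed+dbSNP" if in_dbsnp else "Fixed"
    if derived_freq > 0.90:
        return "High Frequency"
    return None


print(frequency_category(1.0, True))     # a site like the sample rows above
print(frequency_category(0.995, False))
print(frequency_category(0.95, True))
```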

Several quality filters were applied to Denisova genotypes: a minimum PHRED genotype likelihood of 40 from the Genome Analysis Toolkit (GATK); a minimum RMS map quality score of 30; coverage of at least 14X and at most 66X; and exclusion of sites identified as systematic errors or deemed to be of low quality due to conflicting genotype calls in a second iteration of GATK (Note 6, supplementary online materials of Meyer et al., 2012).
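
These filters amount to a conjunction of threshold checks. The following sketch shows one way to write them, using the thresholds stated above. The DenisovaCall record and its attribute names are hypothetical; the original pipeline worked on GATK output, and its actual data structures are not described here.

```python
from dataclasses import dataclass


@dataclass
class DenisovaCall:
    """Hypothetical per-site summary of a Denisova genotype call."""
    genotype_quality: float   # PHRED genotype likelihood from GATK
    map_quality: float        # RMS map quality score
    coverage: int             # sequencing depth at the site
    systematic_error: bool    # site identified as a systematic error
    low_quality: bool         # conflicting calls in the second GATK iteration


def passes_filters(call: DenisovaCall) -> bool:
    """Return True if the call meets every quality filter listed above."""
    return (
        call.genotype_quality >= 40
        and call.map_quality >= 30
        and 14 <= call.coverage <= 66
        and not call.systematic_error
        and not call.low_quality
    )


good = DenisovaCall(45, 37, 30, False, False)
shallow = DenisovaCall(45, 37, 10, False, False)
print(passes_filters(good), passes_filters(shallow))
```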

The derived-in-modern-human sites were intersected with the high-confidence-in-Denisova sites and annotated using VEP to predict effects on protein structure and transcriptional regulation.
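
The intersection step can be pictured as keeping only those derived sites whose coordinates also appear among the high-confidence Denisova sites. The sketch below uses in-memory sets keyed by (chrom, chromStart, chromEnd); this representation is an assumption for illustration, and the chr2 and chr3 entries are made-up placeholders rather than real data. The VEP annotation step is not shown.

```python
# Derived-in-modern-human sites, mapped to the human allele.
# The two chr1 entries come from the sample rows; the chr2 entry is invented.
derived_sites = {
    ("chr1", 19835716, 19835718): "AG",
    ("chr1", 20427529, 20427530): "C",
    ("chr2", 100, 101): "G",
}

# Sites that passed the Denisova quality filters (chr3 entry is invented).
confident_denisova = {
    ("chr1", 19835716, 19835718),
    ("chr1", 20427529, 20427530),
    ("chr3", 5, 6),
}

# Keep only the derived sites that are also high-confidence in Denisova.
shared = {
    site: allele
    for site, allele in derived_sites.items()
    if site in confident_denisova
}

for site in sorted(shared):
    print(site, shared[site])
```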

Credits

Thanks to the Max Planck Institute for Evolutionary Anthropology for providing the data files used for this track.

References

Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, Mallick S, Schraiber JG, Jay F, Prüfer K, de Filippo C et al. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012 Oct 12;338(6104):222-6. PMID: 22936568; PMC: PMC3617501; supplementary online materials, Note 19