Schema for Modern Derived - Modern Human Derived, Denisova Ancestral
Database: hg19
Primary Table: dhcHumDerDenAncCcdsUtr5Fixed
Data last updated: 2012-10-02
Big Bed File Download: /gbdb/hg19/dhcHumDerDenAnc/dhcHumDerDenAncCcdsUtr5Fixed.bb
Item Count: 88
The data is stored in the binary BigBed format.

Format description: Human Derived, Denisova Ancestral variants and functional effect predictions from high-coverage Denisova sequencing project
field	example	description
chrom	chr1	Reference sequence chromosome
chromStart	26348514	Start position in chromosome
chromEnd	26348515	End position in chromosome
name	G/T	Human allele / Denisova ancestral allele
feature	ENST00000374280	Ensembl Transcript ID or Regulatory Region ID or ID of TFB profile from JASPAR or TRANSFAC
gene	ENSG00000158008	Ensembl Gene ID
extra	ENSP=ENSP00000363398; HGNC=EXTL1	Extra info: for coding genes, Ensembl Protein ID and/or HGNC; for Regulatory Motifs, scores & matrix ID
consequence	5PRIME_UTR	Variant Effect Predictor (VEP) consequence term
cdnaPosition	245	Offset in transcript, if applicable
cdsPosition	-	Offset in coding sequence (CDS), if applicable
protPosition	-	Offset in protein sequence, if applicable
aminoAcids	-	Amino acid change, if applicable
codons	-	Codon change, if applicable
humanAl	G	Modern human fixed (or major) allele on positive strand
denAl	T	Denisova (ancestral) allele
chimpAl	T	Chimpanzee ancestral allele
gorAl	T	Gorilla ancestral allele
orangAl	T	Orangutan ancestral allele
denZyg	HOMO	Denisova zygosity of ancestral allele (homozygous/heterozygous)
dbSNP	.	dbSNP rs ID, if available
tgpFreq	1	1000 Genomes Project frequency of modern human allele
flag	CpG	Flag(s): CpG if in CpG island; RM if in repeat masked region; LowQual if conflicting GATK calls; SysErr if prone to systematic errors
geneStrand	.	Gene strand: '+' or '-', if applicable; otherwise '.'
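
The extra and flag fields each pack several values into one string. The following is a minimal sketch of unpacking them, assuming the "KEY=value; KEY=value" layout and the comma-separated flags seen in the sample rows. The function names are hypothetical, and the layout of extra for regulatory motifs is not shown on this page, so it may need different handling.

```python
from typing import Dict, Set


def parse_extra(extra: str) -> Dict[str, str]:
    """Split a string like 'ENSP=...; HGNC=...' into a dict.

    Assumption: pairs are separated by ';' and written as KEY=value.
    An empty string or '.' is treated as "no extra info".
    """
    result: Dict[str, str] = {}
    if extra in ("", "."):
        return result
    for part in extra.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:  # skip fragments that are not KEY=value
            result[key] = value
    return result


def parse_flags(flag: str) -> Set[str]:
    """Split a flag string like 'CpG,RM' into a set; '.' means no flags."""
    if flag in ("", "."):
        return set()
    return {f.strip() for f in flag.split(",") if f.strip()}
```

For example, parse_extra("ENSP=ENSP00000363398; HGNC=EXTL1") gives a dict with the keys ENSP and HGNC, and parse_flags("CpG,RM") gives the set of the two flags.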

Sample Rows
 
chrom	chromStart	chromEnd	name	feature	gene	extra	consequence	cdnaPosition	cdsPosition	protPosition	aminoAcids	codons	humanAl	denAl	chimpAl	gorAl	orangAl	denZyg	dbSNP	tgpFreq	flag	geneStrand
chr1	26348514	26348515	G/T	ENST00000374280	ENSG00000158008	ENSP=ENSP00000363398; HGNC=EXTL1	5PRIME_UTR	245	-	-	-	-	G	T	T	T	T	HOMO	.	1	CpG	.
chr1	35545022	35545024	GC/-	ENST00000373330	ENSG00000197056	ENSP=ENSP00000362427; HGNC=ZMYM1	5PRIME_UTR	31	-	-	-	-	GC	-	-	-	-	HOMO	.	1.0	CpG	.
chr1	38456544	38456545	T/C	ENST00000373019	ENSG00000183431	ENSP=ENSP00000362110; HGNC=SF3A3	5PRIME_UTR	49	-	-	-	-	T	C	C	C	C	HOMO	.	1	CpG,RM	.
chr1	42801480	42801481	T/A	ENST00000372573	ENSG00000198815	ENSP=ENSP00000361654; HGNC=FOXJ3	5PRIME_UTR	68	-	-	-	-	T	A	A	A	A	HOMO	.	1	.	.
chr1	104108165	104108166	G/A	ENST00000361355	ENSG00000240038	ENSP=ENSP00000354610; HGNC=AMY2B	5PRIME_UTR	520	-	-	-	-	G	A	A	A	A	HOMO	.	1	CpG	.
chr1	144994860	144994861	C/T	ENST00000369356	ENSG00000178104	ENSP=ENSP00000358363; HGNC=PDE4DIP	5PRIME_UTR	162	-	-	-	-	C	T	T	T	T	HOMO	.	1	CpG	.
chr1	169764992	169764992	-/G	ENST00000286031	ENSG00000000460	ENSP=ENSP00000286031; HGNC=C1orf112	5PRIME_UTR	443	-	-	-	-	-	G	G	G	G	HOMO	.	1	.	.
chr1	181057825	181057826	A/G	ENST00000367577	ENSG00000162783	ENSP=ENSP00000356549; HGNC=IER5	5PRIME_UTR	189	-	-	-	-	A	G	G	G	G	HOMO	.	1	CpG	.
chr1	181058037	181058037	-/G	ENST00000367577	ENSG00000162783	ENSP=ENSP00000356549; HGNC=IER5	5PRIME_UTR	400	-	-	-	-	-	G	G	G	G	HOMO	.	1	.	.
chr1	205782235	205782236	A/G	ENST00000367137	ENSG00000133065	ENSP=ENSP00000356105; HGNC=SLC41A1	5PRIME_UTR	69	-	-	-	-	A	G	G	G	G	HOMO	.	1	.	.
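
As an illustration of how rows like these map onto the schema, here is a rough sketch that parses one row into named fields. It assumes the row is available as a tab-separated line in schema field order (for instance, exported from the BigBed file). The names FIELDS, parse_row and variant_type are hypothetical. The insertion/deletion labels are an interpretation relative to the ancestral (Denisova) allele, not terminology from the track itself.

```python
from typing import Any, Dict

# Field names in schema order.
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "feature", "gene", "extra",
    "consequence", "cdnaPosition", "cdsPosition", "protPosition",
    "aminoAcids", "codons", "humanAl", "denAl", "chimpAl", "gorAl",
    "orangAl", "denZyg", "dbSNP", "tgpFreq", "flag", "geneStrand",
]

# First sample row, rebuilt as a tab-separated line.
SAMPLE_LINE = "\t".join([
    "chr1", "26348514", "26348515", "G/T", "ENST00000374280",
    "ENSG00000158008", "ENSP=ENSP00000363398; HGNC=EXTL1", "5PRIME_UTR",
    "245", "-", "-", "-", "-", "G", "T", "T", "T", "T", "HOMO", ".", "1",
    "CpG", ".",
])


def parse_row(line: str) -> Dict[str, Any]:
    """Turn one tab-separated row into a dict keyed by schema field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    row: Dict[str, Any] = dict(zip(FIELDS, values))
    row["chromStart"] = int(row["chromStart"])
    row["chromEnd"] = int(row["chromEnd"])
    # Assumption: a missing frequency, if it ever occurs, is written as '.'.
    row["tgpFreq"] = None if row["tgpFreq"] == "." else float(row["tgpFreq"])
    return row


def variant_type(row: Dict[str, Any]) -> str:
    """Classify the change relative to the ancestral (Denisova) allele."""
    human, denisova = row["name"].split("/")
    if human == "-":
        return "deletion"      # human lacks bases present in Denisova
    if denisova == "-":
        return "insertion"     # human carries bases absent in Denisova
    return "substitution"
```

In the sample rows, the items with a human allele of "-" have chromStart equal to chromEnd, while the GC/- item spans two bases, which is consistent with this reading.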

Modern Derived (dhcHumDerDenAnc) Track Description
 

Description

This track shows mutations in the modern human lineage that rose to fixation or near fixation since the split from the last common ancestor with Denisovans, along with predicted functional effects from Ensembl's Variant Effect Predictor (VEP).

Methods

Methods and analysis are described in detail in Note 19 of the supplementary online materials of Meyer et al. (2012).

Whole-genome Enredo-Pecan-Ortheus (EPO) alignments of human, chimpanzee, gorilla and orangutan were combined with modern human genotypes from the 1000 Genomes Project Phase 1 (1000G) to identify sites where the derived allele is fixed (>99.0% frequency in 1000G) or at high frequency (>90.0% frequency in 1000G) in modern humans, and where chimpanzee and at least one other great ape (gorilla or orangutan) carry the ancestral allele. To avoid paralogous regions, human and chimpanzee sequences were required to appear in only one EPO alignment block. Some "fixed" sites are in dbSNP; these were separated from fixed sites not in dbSNP, so three frequency categories are displayed: Fixed, Fixed+dbSNP, and High Frequency.
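
The site selection described above can be sketched roughly as follows. This is an illustration of the stated thresholds, not the authors' pipeline. It assumes that the frequency is a fraction between 0 and 1 (as in the tgpFreq field), that a missing ape allele is passed as None, and that an ape is "ancestral" when its allele matches chimpanzee and differs from human. All names are hypothetical.

```python
from typing import Optional

FIXED_THRESHOLD = 0.99      # "fixed": >99.0% frequency in 1000G
HIGH_FREQ_THRESHOLD = 0.90  # "high frequency": >90.0% frequency in 1000G


def is_derived_in_humans(human: str,
                         chimp: Optional[str],
                         gorilla: Optional[str],
                         orang: Optional[str]) -> bool:
    """True if human differs from chimp and another great ape agrees with chimp."""
    if chimp is None or chimp == human:
        return False
    return gorilla == chimp or orang == chimp


def frequency_category(freq: float, in_dbsnp: bool) -> Optional[str]:
    """Map a modern human allele frequency to one of the three display categories."""
    if freq > FIXED_THRESHOLD:
        return "Fixed+dbSNP" if in_dbsnp else "Fixed"
    if freq > HIGH_FREQ_THRESHOLD:
        return "High Frequency"
    return None  # below both thresholds: not part of this track
```

For the first sample row (human G, apes T, frequency 1, no dbSNP ID), is_derived_in_humans returns True and frequency_category returns "Fixed", matching the "Fixed" suffix of the table name.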

Several quality filters were applied to Denisova genotypes: a PHRED-scaled genotype likelihood of at least 40 from the Genome Analysis Toolkit (GATK); an RMS map quality score of at least 30; coverage of at least 14X and at most 66X; and exclusion of sites identified as systematic errors or deemed to be of low quality due to conflicting genotype calls in a second iteration of GATK (Note 6, supplementary online materials of Meyer et al., 2012).
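
Expressed as a predicate, these filters might look like the sketch below. The DenisovaCall record and its attribute names are hypothetical stand-ins for whatever per-site values a genotype caller reports. Only the thresholds come from the text above.

```python
from dataclasses import dataclass


@dataclass
class DenisovaCall:
    """Hypothetical per-site summary of a Denisova genotype call."""
    genotype_quality: float   # PHRED-scaled genotype likelihood from GATK
    rms_map_quality: float    # RMS mapping quality of reads at the site
    coverage: int             # read depth at the site
    systematic_error: bool    # site identified as a systematic error
    low_quality: bool         # conflicting calls in a second GATK iteration


def passes_filters(call: DenisovaCall) -> bool:
    """Apply the thresholds listed in the Methods text."""
    return (
        call.genotype_quality >= 40
        and call.rms_map_quality >= 30
        and 14 <= call.coverage <= 66
        and not call.systematic_error
        and not call.low_quality
    )
```

A call with quality 60, map quality 37 and 30X coverage passes; the same call at 80X coverage does not.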

The derived-in-modern-human sites were intersected with the high-confidence-in-Denisova sites and annotated using VEP to predict effects on protein structure and transcriptional regulation.
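
The intersection step can be pictured as a join on genomic position. The sketch below assumes sites are keyed by (chrom, chromStart, chromEnd). The function name and data layout are hypothetical, and the VEP annotation step is not shown.

```python
from typing import Dict, Set, Tuple

SiteKey = Tuple[str, int, int]  # (chrom, chromStart, chromEnd)


def intersect_sites(derived_sites: Dict[SiteKey, dict],
                    confident_denisova_sites: Set[SiteKey]) -> Dict[SiteKey, dict]:
    """Keep derived-in-human sites that also have a high-confidence Denisova call."""
    return {
        key: record
        for key, record in derived_sites.items()
        if key in confident_denisova_sites
    }
```

Only sites present in both inputs are kept for annotation.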

Credits

Thanks to the Max Planck Institute for Evolutionary Anthropology for providing the data files used for this track.

References

Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, Mallick S, Schraiber JG, Jay F, Prüfer K, de Filippo C et al. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012 Oct 12;338(6104):222-6. PMID: 22936568; PMC: PMC3617501; supplementary online materials, Note 19