Schema for Mod Hum Variants - Variant Calls from 11 Modern Human Genome Sequences
  Database: hg19    Primary Table: dhcVcfHGDP01029
VCF File Download: /gbdb/hg19/bbi/HGDP01029.vcf.gz
Format description: The fields of a Variant Call Format data line
field      description
chrom      An identifier from the reference genome
pos        The reference position, with the 1st base having position 1
id         Semicolon-separated list of unique identifiers where available
ref        Reference base(s)
alt        Comma-separated list of alternate non-reference alleles called on at least one of the samples
qual       Phred-scaled quality score for the assertion made in ALT, i.e. -10 log10 prob(call in ALT is wrong)
filter     PASS if this position has passed all filters. Otherwise, a semicolon-separated list of codes for filters that fail
info       Additional information encoded as a semicolon-separated series of short keys with optional comma-separated values
format     If genotype columns are specified in header, a colon-separated list of short keys starting with GT
genotypes  If genotype columns are specified in header, a tab-separated set of genotype column values; each value is a colon-separated list of values corresponding to keys in the format column
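
As a rough illustration of the field layout described above, the following Python sketch splits one VCF data line into named fields and converts a Phred-scaled qual value back into an error probability. The function names (parse_info, parse_vcf_line, phred_to_error_prob) are hypothetical helpers written for this example, not part of any UCSC or GATK tool, and the example line assumes the usual tab-separated form of a VCF data line.

```python
from typing import Dict, Union

# Column names follow the field table above; everything after the ninth
# column is treated as per-sample genotype data.
VCF_COLUMNS = ["chrom", "pos", "id", "ref", "alt",
               "qual", "filter", "info", "format"]


def parse_info(info: str) -> Dict[str, Union[str, bool]]:
    """Split the info column into a dict; keys without '=' become flags.

    If a key occurs more than once, the last value wins.
    """
    result: Dict[str, Union[str, bool]] = {}
    if info == ".":
        return result
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


def parse_vcf_line(line: str) -> dict:
    """Parse one tab-separated VCF data line (not a '#' header line)."""
    parts = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, parts[:9]))
    record["pos"] = int(record["pos"])          # 1-based position
    record["alt"] = record["alt"].split(",")    # comma-separated alleles
    record["qual"] = None if record["qual"] == "." else float(record["qual"])
    record["info"] = parse_info(record["info"])
    record["genotypes"] = parts[9:]             # one entry per sample
    return record


def phred_to_error_prob(qual: float) -> float:
    """Invert qual = -10 * log10(p) to recover p."""
    return 10 ** (-qual / 10)


# Example line modelled on one of the sample rows of this table.
example = ("10\t65878\t.\tC\tG\t5.06\tLowQual\t"
           "AC=1;AF=0.50;AN=2;DP=4;Dels=0.00;HRun=0;"
           "HaplotypeScore=0.0000;MQ=17.17;MQ0=1;QD=1.27\t"
           "GT:DP:GQ:PL:A:C:G:T:IR\t"
           "0/1:4:1.76:33,3,0:0,0:0,0:4,0:0,0:0")
record = parse_vcf_line(example)
print(record["chrom"], record["pos"], record["filter"], record["info"]["DP"])
print(round(phred_to_error_prob(record["qual"]), 2))
```

A qual of 5.06 corresponds to an error probability of roughly 0.31, which is consistent with the LowQual filter on that row.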

Sample Rows
 
chrom  pos  id  ref  alt  qual  filter  info  format  genotypes
10  60969  rs187110906  C  A  .  .  DP=3;HaplotypeScore=0.0000;MQ=0.00;MQ0=3;1000gALT=A;AF1000g=0.04;AMR_AF=0.04;ASN_AF=0.01;ASN_AF=0.03  GT:A:C:G:T:IR  ./.:0,3:0,0:0,0:0,0:0
10  61005  rs192025213  A  G  .  .  DP=2;HaplotypeScore=0.0000;MQ=0.00;MQ0=2;1000gALT=G;AF1000g=0.01;ASN_AF=0.01;ASN_AF=0.01  GT:A:C:G:T:IR  ./.:0,0:0,0:1,1:0,0:0
10  61020  rs115033199  G  C  .  .  DP=2;HaplotypeScore=0.0000;MQ=10.61;MQ0=1;1000gALT=C;AF1000g=0.00  GT:A:C:G:T:IR  ./.:0,0:0,0:1,1:0,0:0
10  61372  .  CA  C  1091.52  .  AC=2;AF=1.00;AN=2;DP=43;FS=0.000;HRun=1;HaplotypeScore=65.0108;MQ=22.28;MQ0=16;QD=25.38  GT:DP:GQ:PL:A:C:G:T:IR  1/1:27:81.27:1134,81,0:0,0:16,25:0,0:0,1:31
10  61373  .  A  T  72.21  .  AC=2;AF=1.00;AN=2;DP=43;Dels=0.74;FS=0.000;HRun=1;HaplotypeScore=0.0000;MQ=22.28;MQ0=16;QD=1.68  GT:DP:GQ:PL:A:C:G:T:IR  1/1:11:12:105,12,0:3,3:0,1:0,0:2,2:0
10  65878  .  C  G  5.06  LowQual  AC=1;AF=0.50;AN=2;DP=4;Dels=0.00;HRun=0;HaplotypeScore=0.0000;MQ=17.17;MQ0=1;QD=1.27  GT:DP:GQ:PL:A:C:G:T:IR  0/1:4:1.76:33,3,0:0,0:0,0:4,0:0,0:0
10  66183  .  G  A  10.13  LowQual  AC=1;AF=0.50;AN=2;BaseQRankSum=-0.456;DP=19;Dels=0.11;HRun=2;HaplotypeScore=25.2565;MQ=24.11;MQ0=3;MQRankSum=1.552;QD=0.53;ReadP ...  GT:DP:GQ:PL:A:C:G:T:IR  0/1:17:39.68:40,0,251:1,1:0,0:6,8:0,0:0
10  66397  .  C  T  326.02  .  AC=2;AF=1.00;AN=2;DP=28;Dels=0.00;FS=0.000;HRun=2;HaplotypeScore=0.0000;MQ=20.25;MQ0=17;QD=11.64  GT:DP:GQ:PL:A:C:G:T:IR  1/1:28:30.08:359,30,0:0,0:7,0:0,0:12,6:0
10  66913  .  G  A  71.13  .  AC=1;AF=0.50;AN=2;BaseQRankSum=0.530;DP=38;Dels=0.00;FS=14.211;HRun=3;HaplotypeScore=0.0000;MQ=19.92;MQ0=3;MQRankSum=1.041;QD=1. ...  GT:DP:GQ:PL:A:C:G:T:IR  0/1:38:99:101,0,392:6,3:0,0:11,16:0,0:0
10  66957  .  C  T  58.64  .  AC=1;AF=0.50;AN=2;BaseQRankSum=-2.060;DP=28;Dels=0.00;FS=15.441;HRun=1;HaplotypeScore=0.0000;MQ=14.97;MQ0=11;MQRankSum=-0.151;QD ...  GT:DP:GQ:PL:A:C:G:T:IR  0/1:28:79.27:89,0,79:0,0:5,7:0,0:4,9:0
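
To show how the format and genotypes columns of the sample rows fit together, here is a small Python sketch that pairs each format key with its value for one sample. The helper names (parse_genotype, genotype_alleles) are hypothetical, and the split between the format and genotype columns is inferred from the sample rows. The meaning of the A, C, G, T and IR keys is defined in the VCF header, which is not shown here, so the sketch leaves those values uninterpreted.

```python
import re
from typing import Dict, List, Optional


def parse_genotype(format_col: str, genotype_col: str) -> Dict[str, List[str]]:
    """Pair colon-separated format keys with one sample's values.

    Each value is further split on commas, so 'PL' -> ['1134', '81', '0'].
    zip() stops at the shorter list, which tolerates dropped trailing fields.
    """
    keys = format_col.split(":")
    values = genotype_col.split(":")
    return {key: value.split(",") for key, value in zip(keys, values)}


def genotype_alleles(gt: str) -> List[Optional[int]]:
    """Turn a GT string into allele indices: '0/1' -> [0, 1].

    0 is the reference allele, 1 the first alt allele, and so on.
    A missing call ('.') is returned as None.
    """
    return [None if allele == "." else int(allele)
            for allele in re.split(r"[/|]", gt)]


# Format and genotype values taken from the sample row at position 61372.
sample = parse_genotype("GT:DP:GQ:PL:A:C:G:T:IR",
                        "1/1:27:81.27:1134,81,0:0,0:16,25:0,0:0,1:31")
print(sample["GT"], sample["DP"], sample["PL"])
print(genotype_alleles(sample["GT"][0]))   # homozygous for the alt allele
print(genotype_alleles("./."))             # no call
```

Values are kept as strings so that the caller can decide how to convert each key; for example, DP is an integer while GQ in these rows can be fractional.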

Mod Hum Variants (dhcVcfModern) Track Description
 

Description

The Modern Human Variants track shows variant calls made from sequence reads of eleven individuals mapped to the human genome. The purpose of this track is to put the divergence of the Denisova genome into perspective with regard to present-day humans.

Methods

DNA was obtained for each of ten individuals from the CEPH-Human Genome Diversity Panel (HGDP):

  • HGDP00456 (Mbuti)
  • HGDP00521 (French)
  • HGDP00542 (Papuan)
  • HGDP00665 (Sardinian)
  • HGDP00778 (Han)
  • HGDP00927 (Yoruba)
  • HGDP00998 (Karitiana)
  • HGDP01029 (San)
  • HGDP01284 (Mandenka)
  • HGDP01307 (Dai)

DNA was also extracted from a Dinka individual from Sudan (DNK02). To minimize biases due to instrument variability, the samples were pooled for sequencing, using four barcoded libraries per sample. The paired-end reads were aligned to the human genome using the Burrows-Wheeler Aligner, and potential PCR duplicates were filtered using Picard.

Genotype calls for single nucleotide variants and small insertions and deletions were made using the Unified Genotyper from the Genome Analysis Toolkit (GATK), with an additional iteration using a modified reference genome in order to reduce reference bias (Note 6, supplementary online materials of Meyer et al., 2012).

Variant Call Format (VCF) files were enhanced by adding information from Ensembl Compara EPO alignments of 6 primates and of 35 Eutherian mammals, phastCons conservation scores generated using EPO alignments, 1000 Genomes Project integrated variant call files, University of Washington background selection scores, ENCODE/Duke Uniqueness of 20mers (see the Mappability track), segmental duplications from the Eichler lab (see the Segmental Dups track), and samtools mpileup summaries of mapped reads.

Credits

Thanks to the Max Planck Institute for Evolutionary Anthropology for providing the variant-only VCF files used for this track.

References

Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, Mallick S, Schraiber JG, Jay F, Prüfer K, de Filippo C et al. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012 Oct 12;338(6104):222-6. PMID: 22936568; PMC: PMC3617501; supplementary online materials, Note 6.