field       example  SQL type              info    description
bin         589      smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
chrom       chr1     varchar(255)          values  Reference sequence chromosome or scaffold
chromStart  569874   int(10) unsigned      range   Start position in chromosome
chromEnd    569954   int(10) unsigned      range   End position in chromosome

bin   chrom  chromStart  chromEnd
589   chr1   569874      569954
1285  chr1   91852761    91853155
1510  chr1   121354274   121354349
1510  chr1   121355719   121355801
1510  chr1   121357977   121358058
1511  chr1   121478613   121478695
1511  chr1   121483235   121483314
1511  chr1   121483334   121483400
1511  chr1   121484109   121485434
1678  chr1   143278758   143278832
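
The bin values in the sample rows are consistent with the standard UCSC binning scheme, in which the smallest bins span 128 kb (2^17 bases) and each coarser level is eight times larger. The sketch below assumes that scheme applies to this table; the function name `bin_from_range` and the constant names are illustrative and do not come from this page. Coordinates are treated as zero-based and half-open (chromStart inclusive, chromEnd exclusive).

```python
# Offsets of the first bin at each level, finest level first
# (512+64+8+1, 64+8+1, 8+1, 1, 0). Assumed standard UCSC scheme.
BIN_OFFSETS = (585, 73, 9, 1, 0)
FIRST_SHIFT = 17  # finest bins cover 2**17 = 131072 bases
NEXT_SHIFT = 3    # each coarser level is 2**3 = 8 times larger


def bin_from_range(start, end):
    """Return the smallest bin that fully contains [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The range straddles two bins at this level; try the next coarser one.
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("range does not fit the standard scheme (max 512 Mb)")


# Check against two of the sample rows shown in the table.
print(bin_from_range(569874, 569954))        # first sample row
print(bin_from_range(143278758, 143278832))  # last sample row
```

Because features in the same neighborhood share a bin, a range query can restrict itself to the handful of bins that could overlap the requested interval before comparing chromStart and chromEnd.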

Description

This track displays regions of the reference genome that have exceptionally high sequence depth, inferred from alignments of short-read sequences from the 1000 Genomes Project. These regions may be caused by collapsed repetitive sequences in the reference genome assembly; they also have high read depth in assays such as ChIP-seq, and may trigger false positive calls from peak-calling algorithms. Excluding these regions from analysis of short-read alignments should reduce such false positive calls.
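
As a rough illustration of excluding these regions, the sketch below drops any peak that overlaps a high-depth region. The peak coordinates and the names `overlaps` and `drop_masked_peaks` are hypothetical; only the masked region is taken from the sample rows of this table. Intervals are assumed to be zero-based and half-open.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def drop_masked_peaks(peaks, masked):
    """Keep only peaks that do not overlap any masked region on the same chromosome."""
    kept = []
    for chrom, start, end in peaks:
        hit = any(
            chrom == m_chrom and overlaps(start, end, m_start, m_end)
            for m_chrom, m_start, m_end in masked
        )
        if not hit:
            kept.append((chrom, start, end))
    return kept


# One masked region from the sample rows; the peaks are made up for illustration.
masked = [("chr1", 569874, 569954)]
peaks = [("chr1", 569900, 570100), ("chr1", 1000000, 1000200)]
print(drop_masked_peaks(peaks, masked))
```

This linear scan is fine for small inputs; for genome-wide peak sets an interval index or a dedicated tool would be a more practical choice.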

Methods

Pickrell et al. downloaded sequencing reads for 57 Yoruba individuals from the 1000 Genomes Project's low-coverage pilot data, mapped them to the Mar. 2006 human genome assembly (NCBI36/hg18), computed the read depth for every base in the genome, and compiled a distribution of read depths. They then identified contiguous regions where read depth exceeded thresholds corresponding to the top 0.001, 0.005, 0.01, 0.05 and 0.1 fractions (that is, the top 0.1% to 10%) of the per-base read depths, merging regions that fall within 50 bases of each other. The regions are available for download from http://eqtl.uchicago.edu/Masking/ (see the readme file).
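
The thresholding and merging steps can be sketched as follows. This is a toy illustration, not the authors' pipeline: the function name `high_depth_regions`, the way the cutoff is read off the sorted depths, and the interpretation of "within 50 bases" as a gap of at most 50 between one region's end and the next region's start are all assumptions.

```python
def high_depth_regions(depths, top_fraction, merge_gap=50):
    """Return half-open (start, end) runs whose depth exceeds the cutoff.

    depths       -- per-base read depth for one sequence
    top_fraction -- e.g. 0.001 for the top 0.1% of per-base depths
    merge_gap    -- runs separated by at most this many bases are merged
    """
    ranked = sorted(depths)
    # Depth value at the (1 - top_fraction) quantile of the distribution.
    cutoff_index = min(len(ranked) - 1, int(len(ranked) * (1 - top_fraction)))
    cutoff = ranked[cutoff_index]

    # Collect contiguous runs of bases strictly above the cutoff.
    runs = []
    run_start = None
    for pos, depth in enumerate(depths):
        if depth > cutoff:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            runs.append((run_start, pos))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(depths)))

    # Merge runs that lie within merge_gap bases of the previous one.
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] <= merge_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Toy data: flat depth of 1 with three spikes, two of them 30 bases apart.
depths = [1] * 1000
depths[100:110] = [50] * 10
depths[140:150] = [50] * 10
depths[500:505] = [50] * 5
print(high_depth_regions(depths, top_fraction=0.1))
```

On a real genome the per-base depths would not fit comfortably in a Python list, so a production version would stream depths or use array libraries; the logic of cutoff, run detection, and merging stays the same.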

Credits

Thanks to Joseph Pickrell at the University of Chicago for these data.

References

Pickrell JK, Gaffney DJ, Gilad Y, Pritchard JK. False positive peaks in ChIP-seq and other sequencing-based functional assays caused by unannotated high copy number regions. Bioinformatics. 2011 Aug 1;27(15):2144-6. Epub 2011 Jun 19.