Schema for TCGA Pan-Cancer - TCGA Pan-Cancer mutations: 33 TCGA Cancer Projects Summary (Pan-Can 33)
Database: hg38
Primary Table: COAD
Data last updated: 2019-05-03
Big Bed File Download: /gbdb/hg38/gdcCancer/COAD.bb
Item Count: 253,759
The data is stored in the binary BigBed format.
Format description: somatic variants converted from MAF files obtained through the NCI GDC
field | example | description |
chrom | chr1 | Chromosome (or contig, scaffold, etc.) |
chromStart | 166070116 | Start position in chromosome |
chromEnd | 166070117 | End position in chromosome |
name | C>A | Name of item |
score | 1 | Score from 0-1000 |
strand | . | + or - |
thickStart | 166070116 | Start of where display should be thick (start codon) |
thickEnd | 166070117 | End of where display should be thick (stop codon) |
reserved | 0,0,0 | Used as itemRgb as of 2004-11-22 |
blockCount | 1 | Number of blocks |
blockSizes | 1 | Comma separated list of block sizes |
chromStarts | 0 | Start positions relative to chromStart |
sampleCount | 1 | Number of samples with this variant |
freq | 0.00250626566416 | Variant frequency |
Hugo_Symbol | FAM78B | Hugo symbol |
Entrez_Gene_Id | 149297 | Entrez Gene Id |
Variant_Classification | 3'UTR | Class of variant |
Variant_Type | SNP | Type of variant |
Reference_Allele | C | Reference allele |
Tumor_Seq_Allele1 | C | Tumor allele 1 |
Tumor_Seq_Allele2 | A | Tumor allele 2 |
dbSNP_RS | novel | dbSNP RS number |
dbSNP_Val_Status | | dbSNP validation status |
days_to_death | -- | Number of days till death |
cigarettes_per_day | -- | Number of cigarettes per day |
weight | 68.0 | Weight |
alcohol_history | -- | Any alcohol consumption? |
alcohol_intensity | -- | Frequency of alcohol consumption |
bmi | 31.0445580716 | Body mass index |
years_smoked | -- | Number of years smoked |
height | 148.0 | Height |
gender | female | Gender |
project_id | TCGA-COAD | TCGA Project id |
ethnicity | not hispanic or latino | Ethnicity |
Tumor_Sample_Barcode | TCGA-AM-5821-01A-01D-1650-10 | Tumor sample barcode |
Matched_Norm_Sample_Barcode | TCGA-AM-5821-10A-01D-1650-10 | Matched normal sample barcode |
case_id | 605baa86-79e3-484d-82d2-4de27d405ba1 | Case ID number |
Sample Rows
chrom | chromStart | chromEnd | name | score | strand | thickStart | thickEnd | reserved | blockCount | blockSizes | chromStarts | sampleCount | freq | Hugo_Symbol | Entrez_Gene_Id | Variant_Classification | Variant_Type | Reference_Allele | Tumor_Seq_Allele1 | Tumor_Seq_Allele2 | dbSNP_RS | dbSNP_Val_Status | days_to_death | cigarettes_per_day | weight | alcohol_history | alcohol_intensity | bmi | years_smoked | height | gender | project_id | ethnicity | Tumor_Sample_Barcode | Matched_Norm_Sample_Barcode | case_id |
chr1 | 166070116 | 166070117 | C>A | 1 | . | 166070116 | 166070117 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | 3'UTR | SNP | C | C | A | novel | | -- | -- | 68.0 | -- | -- | 31.0445580716 | -- | 148.0 | female | TCGA-COAD | not hispanic or latino | TCGA-AM-5821-01A-01D-1650-10 | TCGA-AM-5821-10A-01D-1650-10 | 605baa86-79e3-484d-82d2-4de27d405ba1 |
chr1 | 166070250 | 166070251 | G>T | 1 | . | 166070250 | 166070251 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Missense_Mutation | SNP | G | G | T | novel | | -- | -- | 58.0 | -- | -- | 20.5498866213 | -- | 168.0 | male | TCGA-COAD | not hispanic or latino | TCGA-CA-6717-01A-11D-1835-10 | TCGA-CA-6717-10A-01D-1835-10 | 268c01b3-d2ce-44c0-a2fe-ea846a1253cc |
chr1 | 166070335 | 166070336 | G>A | 1 | . | 166070335 | 166070336 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Missense_Mutation | SNP | G | G | A | | | -- | -- | 59.0 | -- | -- | 23.9360623149 | -- | 157.0 | female | TCGA-COAD | not hispanic or latino | TCGA-CM-6678-01A-11D-1835-10 | TCGA-CM-6678-10A-01D-1835-10 | d7048098-09b3-4763-ad54-9c5c6fe14bc0 |
chr1 | 166070352 | 166070353 | C>T | 1 | . | 166070352 | 166070353 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Missense_Mutation | SNP | C | C | T | rs754839641 | byFrequency | -- | -- | 61.0 | -- | -- | 20.6192536506 | -- | 172.0 | male | TCGA-COAD | not hispanic or latino | TCGA-CA-6719-01A-11D-1835-10 | TCGA-CA-6719-10A-01D-1835-10 | 9cdca5d8-ea77-4df9-b2ba-0d5e8ec37bcd |
chr1 | 166070403 | 166070404 | C>T | 1 | . | 166070403 | 166070404 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Missense_Mutation | SNP | C | C | T | rs775991053 | byFrequency | 2134.0 | -- | -- | -- | -- | -- | -- | -- | female | TCGA-COAD | not hispanic or latino | TCGA-CK-4951-01A-01D-1408-10 | TCGA-CK-4951-10A-01D-2188-10 | 2db3c23d-c591-4ea0-b0a8-ed9c4343c80e |
chr1 | 166070404 | 166070405 | G>A | 1 | . | 166070404 | 166070405 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Missense_Mutation | SNP | G | G | A | rs747327778 | | -- | -- | -- | -- | -- | -- | -- | -- | male | TCGA-COAD | not reported | TCGA-AA-3862-01A-01W-0995-10 | TCGA-AA-3862-10A-01W-0995-10 | 05ff79c5-1127-4cac-8f4c-84089a3f2cde |
chr1 | 166070410 | 166070411 | C>A | 1 | . | 166070410 | 166070411 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Missense_Mutation | SNP | C | C | A | | | -- | -- | -- | -- | -- | -- | -- | -- | female | TCGA-COAD | not hispanic or latino | TCGA-CK-5913-01A-11D-1650-10 | TCGA-CK-5913-10A-01D-1650-10 | 3758f4f7-ac2f-4f9d-bee2-526196f8129a |
chr1 | 166070687 | 166070688 | A>T | 1 | . | 166070687 | 166070688 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Silent | SNP | A | A | T | | | -- | -- | 99.6 | -- | -- | 33.6668469443 | -- | 172.0 | male | TCGA-COAD | not hispanic or latino | TCGA-A6-5657-01A-01D-1650-10 | TCGA-A6-5657-10A-01D-1650-10 | dbbee8f5-d83d-4195-85cf-f89d327da0a9 |
chr1 | 166165997 | 166165998 | C>A | 1 | . | 166165997 | 166165998 | 0,0,0 | 1 | 1 | 0 | 1 | 0.00250626566416 | FAM78B | 149297 | Missense_Mutation | SNP | C | C | A | | | -- | -- | 71.0 | -- | -- | 26.7228725206 | -- | 163.0 | female | TCGA-COAD | not hispanic or latino | TCGA-AD-A5EJ-01A-11D-A28G-10 | TCGA-AD-A5EJ-10A-01D-A28G-10 | 613aa3e8-a70b-45a9-9c08-0c2346c8bf00 |
chr1 | 166166058 | 166166059 | C>T | 2 | . | 166166058 | 166166059 | 0,0,0 | 1 | 1 | 0 | 2 | 0.00501253132832 | FAM78B | 149297 | Missense_Mutation | SNP | C | C | T | | | --,2047.0 | --,-- | 71.2,57.0 | --,-- | --,-- | 27.1300106691,24.5095404461 | --,-- | 162.0,152.5 | female,female | TCGA-COAD,TCGA-COAD | not hispanic or latino,not hispanic or latino | TCGA-CM-6171-01A-11D-1650-10,TCGA-G4-6302-01A-11D-1719-10 | TCGA-CM-6171-10A-01D-1650-10,TCGA-G4-6302-10A-01D-1719-10 | eed7405f-f8f3-41a4-a0e7-242f23087f59,7144a65f-a63f-4b79-8fe2-cfe48696de03 |
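The last sample row has a sampleCount of 2, and its clinical and barcode columns (days_to_death through case_id) each hold two comma-separated values, with "--" where a value is missing. The following Python sketch shows one way to split such a row into per-sample records. It assumes tab-separated input, as written by bigBedToBed, and it assumes that every column from days_to_death onward holds one value per sample; that is inferred from the sample rows, not stated in the schema. The function names are illustrative only.

```python
from typing import Dict, List, Optional

# Column names in file order, copied from the schema table.
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "reserved", "blockCount", "blockSizes",
    "chromStarts", "sampleCount", "freq", "Hugo_Symbol", "Entrez_Gene_Id",
    "Variant_Classification", "Variant_Type", "Reference_Allele",
    "Tumor_Seq_Allele1", "Tumor_Seq_Allele2", "dbSNP_RS", "dbSNP_Val_Status",
    "days_to_death", "cigarettes_per_day", "weight", "alcohol_history",
    "alcohol_intensity", "bmi", "years_smoked", "height", "gender",
    "project_id", "ethnicity", "Tumor_Sample_Barcode",
    "Matched_Norm_Sample_Barcode", "case_id",
]

# Assumption: columns from days_to_death onward carry one value per sample.
PER_SAMPLE_FIELDS = FIELDS[FIELDS.index("days_to_death"):]


def parse_row(line: str) -> Dict[str, str]:
    """Map one tab-separated line to a dict keyed by column name."""
    # Strip only the newline so trailing empty columns are preserved.
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


def expand_samples(row: Dict[str, str]) -> List[Dict[str, Optional[str]]]:
    """Return one record per sample; '--' placeholders become None."""
    count = int(row["sampleCount"])
    samples: List[Dict[str, Optional[str]]] = [{} for _ in range(count)]
    for field in PER_SAMPLE_FIELDS:
        parts = row[field].split(",")
        if len(parts) != count:
            raise ValueError(f"{field}: {len(parts)} values for {count} samples")
        for sample, part in zip(samples, parts):
            sample[field] = None if part == "--" else part
    return samples


# The sampleCount=2 row from the sample rows, rebuilt as a tab-separated line.
EXAMPLE_LINE = "\t".join([
    "chr1", "166166058", "166166059", "C>T", "2", ".", "166166058",
    "166166059", "0,0,0", "1", "1", "0", "2", "0.00501253132832", "FAM78B",
    "149297", "Missense_Mutation", "SNP", "C", "C", "T", "", "",
    "--,2047.0", "--,--", "71.2,57.0", "--,--", "--,--",
    "27.1300106691,24.5095404461", "--,--", "162.0,152.5", "female,female",
    "TCGA-COAD,TCGA-COAD",
    "not hispanic or latino,not hispanic or latino",
    "TCGA-CM-6171-01A-11D-1650-10,TCGA-G4-6302-01A-11D-1719-10",
    "TCGA-CM-6171-10A-01D-1650-10,TCGA-G4-6302-10A-01D-1719-10",
    "eed7405f-f8f3-41a4-a0e7-242f23087f59,7144a65f-a63f-4b79-8fe2-cfe48696de03",
])

EXAMPLE_SAMPLES = expand_samples(parse_row(EXAMPLE_LINE))
```

Here EXAMPLE_SAMPLES holds two records, one per tumor sample, so per-patient values such as weight or days_to_death can be read without further string handling.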
TCGA Pan-Cancer (gdcCancer) Track Description
Description
This track shows the genomic positions of somatic variants found through whole genome sequencing of tumors
as part of The Cancer Genome Atlas (TCGA), a program of the National Cancer Institute. The variants are
made available through the Genomic Data Commons (GDC) Portal. The data shown here is sometimes called
the "Pan-Cancer dataset", a collection of thirty-three TCGA projects processed in a uniform way.
Display Conventions and Configuration
Variants can be filtered by project ID and gender from the track details page. The
"All" button lets the user specify whether every checked value must be true of a
particular variant, or whether a single matching value is enough to satisfy the filter.
The vertical viewing range in full mode can also be used to filter which variants are shown. Variants
with a sampleCount below the minimum or above the maximum specified in the viewing range are
not displayed.
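The two filters described above can be sketched in Python as follows. This illustrates the described behavior and is not the browser's implementation. It assumes that a variant's project_id (or gender) column is a comma-separated list with one entry per sample, that an empty selection means no filtering, and that the viewing-range bounds are inclusive. All function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def matches_checked(values: Set[str], checked: Set[str], require_all: bool) -> bool:
    """Apply the 'All' versus 'any one' rule to a variant's distinct values."""
    if not checked:
        return True  # assumption: nothing checked means no filtering
    if require_all:
        return checked <= values  # every checked value occurs in the variant
    return bool(checked & values)  # at least one checked value occurs


def in_viewing_range(sample_count: int, minimum: int, maximum: int) -> bool:
    """Keep variants whose sampleCount lies within the range (inclusive, assumed)."""
    return minimum <= sample_count <= maximum


def filter_variants(
    variants: Iterable[Dict[str, str]],
    checked_projects: Set[str],
    require_all: bool,
    minimum: int,
    maximum: int,
) -> List[Dict[str, str]]:
    """Return the variants that pass both the project filter and the range filter."""
    kept = []
    for variant in variants:
        projects = set(variant["project_id"].split(","))
        if not matches_checked(projects, checked_projects, require_all):
            continue
        if not in_viewing_range(int(variant["sampleCount"]), minimum, maximum):
            continue
        kept.append(variant)
    return kept
```

With require_all set to True and two projects checked, only variants seen in both projects pass; with it set to False, a variant seen in either project passes.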
Data access
The raw data can be explored interactively with the Table Browser or the Data
Integrator.
For automated download and analysis, the genome annotation for all thirty-three projects is
stored in a bigBed file that can be downloaded from our download server. There are also
bigBed files for each of the thirty-three projects in that directory. Individual regions or
the whole genome annotation can be obtained using our tool bigBedToBed, which can be
compiled from the source code or downloaded as a precompiled binary for your system.
Instructions for downloading source code and binaries can be found here. The tool can also
be used to obtain only features within a given range, e.g.,
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/gdcCancer/gdcCancer.bb -chrom=chr21 -start=0 -end=100000000 stdout
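For scripted use, the same query can be issued from Python. The sketch below only wraps the command shown above. It assumes that the bigBedToBed binary is installed and on the PATH, and the function names are illustrative. Writing to stdout lets the caller read the BED lines directly.

```python
import subprocess
from typing import List

# URL taken from the example command above.
GDC_CANCER_BB = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/gdcCancer/gdcCancer.bb"


def build_command(
    chrom: str,
    start: int,
    end: int,
    source: str = GDC_CANCER_BB,
    output: str = "stdout",
) -> List[str]:
    """Build the bigBedToBed argument list for one genomic range."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    return [
        "bigBedToBed",
        source,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        output,
    ]


def fetch_region(chrom: str, start: int, end: int) -> List[str]:
    """Run bigBedToBed and return its output as a list of BED lines.

    Requires the bigBedToBed binary and network access to the download server.
    """
    result = subprocess.run(
        build_command(chrom, start, end),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

For example, fetch_region("chr21", 0, 100000000) runs the command shown above and returns one string per variant.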
Methods
All MuTect variant calls were downloaded from the GDC portal in January 2019 and reformatted at UCSC
to the bigBed format with a short script, cancerMafToBigBed.
Credits
Thanks to GDC for making the TCGA data available on their web site.