RIKEN CAGE Loc Track Settings
 
RNA Subcellular CAGE Localization from ENCODE/RIKEN

Views: TSS HMM Clusters, Minus Signal, Plus Signal, Alignments
Localizations: Whole Cell, Cytosol, Nucleus, Polysome, Nucleoplasm, Chromatin, Nucleolus
Cell Line
GM12878 (Tier 1) 
H1-hESC (Tier 1) 
K562 (Tier 1) 
A549 (Tier 2) 
B cells CD20+ (Tier 2) 
HeLa-S3 (Tier 2) 
HepG2 (Tier 2) 
HUVEC (Tier 2) 
IMR90 (Tier 2) 
MCF-7 (Tier 2) 
Monocytes CD14+ (Tier 2) 
SK-N-SH (Tier 2) 
AG04450 
BJ 
CD34+ Mobilized 
HAoAF 
HAoEC 
HCH 
HFDPC 
HMEpC 
hMSC-AT 
hMSC-BM 
hMSC-UC 
HOB 
HPC-PL 
HPIEpC 
HSaVEC 
HVMF 
HWP 
NHDF 
NHEK 
NHEM.f M2 
NHEM M2 
Prostate 
SkMC 
SK-N-SH RA 

Subtracks:
Cell Line  Localization  RNA Extract  View              Replicate  Track Name                                                             Restricted Until
K562       Cytosol       PolyA-       TSS HMM Clusters  Pooled     K562 cytosol polyA- CAGE TSS HMM from ENCODE/RIKEN                     2012-11-10
K562       Cytosol       PolyA-       Plus Signal       1st        K562 cytosol polyA- CAGE Plus start sites Rep 1 from ENCODE/RIKEN      2009-09-09
K562       Cytosol       PolyA-       Minus Signal      1st        K562 cytosol polyA- CAGE Minus start sites Rep 1 from ENCODE/RIKEN     2009-09-09
K562       Cytosol       PolyA+       TSS HMM Clusters  Pooled     K562 cytosol polyA+ CAGE TSS HMM from ENCODE/RIKEN                     2012-11-10
K562       Cytosol       PolyA+       Plus Signal       1st        K562 cytosol polyA+ CAGE Plus start sites Rep 1 from ENCODE/RIKEN      2011-06-21
K562       Cytosol       PolyA+       Minus Signal      1st        K562 cytosol polyA+ CAGE Minus start sites Rep 1 from ENCODE/RIKEN     2011-06-21
K562       Nucleus       PolyA-       TSS HMM Clusters  Pooled     K562 nucleus polyA- CAGE TSS HMM from ENCODE/RIKEN                     2012-11-10
K562       Nucleus       PolyA-       Plus Signal       1st        K562 nucleus polyA- CAGE Plus start sites Rep 1 from ENCODE/RIKEN      2009-09-09
K562       Nucleus       PolyA-       Minus Signal      1st        K562 nucleus polyA- CAGE Minus start sites Rep 1 from ENCODE/RIKEN     2009-09-09
K562       Nucleus       PolyA+       TSS HMM Clusters  Pooled     K562 nucleus polyA+ CAGE TSS HMM from ENCODE/RIKEN                     2012-11-10
K562       Nucleus       PolyA+       Plus Signal       1st        K562 nucleus polyA+ CAGE Plus start sites Rep 1 from ENCODE/RIKEN      2011-06-21
K562       Nucleus       PolyA+       Minus Signal      1st        K562 nucleus polyA+ CAGE Minus start sites Rep 1 from ENCODE/RIKEN     2011-06-21
K562       Whole Cell    PolyA+       TSS HMM Clusters  Pooled     K562 whole cell polyA+ CAGE TSS HMM from ENCODE/RIKEN                  2012-11-10
K562       Whole Cell    PolyA+       Plus Signal       1st        K562 whole cell polyA+ CAGE Plus start sites Rep 1 from ENCODE/RIKEN   2011-06-21
K562       Whole Cell    PolyA+       Minus Signal      1st        K562 whole cell polyA+ CAGE Minus start sites Rep 1 from ENCODE/RIKEN  2011-06-21

Assembly: Human Feb. 2009 (GRCh37/hg19)

Description

This track shows 5' cap analysis gene expression (CAGE) tags and clusters in RNA extracts from different sub-cellular localizations in multiple cell lines. A CAGE cluster is a region of overlapping tags with an assigned value that represents the expression level. The data in this track were produced as part of the ENCODE Transcriptome Project.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here. To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide.

This track contains the following views:

TSS HMM
Transcriptional Start Sites based on Hidden Markov Modeling for pooled replicates where two replicates exist.
  • Expression levels are shown in reads per kilobase of exon per million reads mapped (RPKM).
  • The IDR value is the irreproducible discovery rate, a measure of expression variance between genomic replicates in large-scale experiments.
Plus and Minus Signals
These views display signals representing the number of overlapping CAGE reads (clusters) mapped to the forward and reverse genomic strands.
Alignments
The Alignments view shows reads mapped to the genome and indicates where bases may mismatch. Every mapped read is displayed, i.e. uncollapsed. The alignment file follows the standard SAM format of Bowtie output. The custom tag XP can be ignored. See the Bowtie Manual for more information about the SAM Bowtie output (including other tags) and the SAM Format Specification for more information on the SAM/BAM file format. Where mapping quality is not available for this track, a score of 255 is used in accordance with the SAM Format Specification. Also, where the sequence quality scores are not available, all scores are displayed as 40.
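The relationship between the Alignments view and the strand-specific start-site signals can be illustrated with a short example. This is a minimal sketch, assuming plain-text SAM records with the standard field order; the function names and the example records are hypothetical and this is not the pipeline used to produce the track. It counts the 5' position of each mapped read per strand, using FLAG bit 0x10 for the reverse strand and the CIGAR string to locate the 5' end of reverse-strand reads.

```python
import re
from collections import Counter

# CIGAR operations; M, D, N, = and X consume reference bases (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def five_prime_site(sam_line):
    """Return (chrom, strand, 1-based 5' position) for a mapped record, else None."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":  # unmapped read or no alignment
        return None
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    if flag & 0x10:  # reverse strand: the 5' end is the rightmost aligned base
        return chrom, "-", pos + ref_len - 1
    return chrom, "+", pos  # forward strand: the 5' end is POS itself


def start_site_counts(sam_lines):
    """Count reads per (chrom, strand, 5' position), skipping header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        site = five_prime_site(line)
        if site is not None:
            counts[site] += 1
    return counts


# Hypothetical records: MAPQ 255 means "not available"; quality 40 is "I" in Phred+33.
seq, qual = "A" * 27, "I" * 27
example_records = [
    "@HD\tVN:1.0",
    f"r1\t0\tchr1\t100\t255\t27M\t*\t0\t0\t{seq}\t{qual}",
    f"r2\t0\tchr1\t100\t30\t27M\t*\t0\t0\t{seq}\t{qual}",
    f"r3\t16\tchr1\t200\t30\t27M\t*\t0\t0\t{seq}\t{qual}",
]
counts = start_site_counts(example_records)
```

In this example the two forward reads share the start site chr1:100, while the reverse read starting at POS 200 with a 27-base alignment has its 5' end at position 226.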

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

Replicates on the track display page are numbered by rank rather than by their original replicate number, so the first replicate available (shown as "1st") may in fact be replicate number three.

Color differences in subtracks may be set as a visual cue to distinguish between the different cell types or between annotations on the plus and minus strand.

Downloadable Files

TSS GencV7
For some samples, there are download files in a modified GTF format with Transcriptional Start Sites based on GENCODE V7. A complete description of the TSS files is located in the supplemental materials directory.

Methods

Cells were grown according to the approved ENCODE cell culture protocols. RNA molecules longer than 200 nt were isolated from each subcellular compartment and then were fractionated into polyA+ and polyA- fractions as described in these protocols. The CAGE tags were sequenced from the 5' ends of cap-trapped cDNAs produced using RIKEN CAGE technology (Kodzius et al. 2006; Valen et al. 2009). To create the tag, a linker was attached to the 5' end of polyA+ or polyA- reverse-transcribed cDNAs which were selected by cap trapping (Carninci et al. 1996). The first 27 bp of the cDNA were cleaved using class II restriction enzymes. A linker was then attached to the 3' end of the cDNA.

After PCR amplification, the tags were sequenced using Illumina's Genome Analyzer or HiSeq. The read lengths for each sample are specified in the metadata. Tags were mapped to the human genome (hg19) using the program Delve (T. Lassmann manuscript in preparation). Delve is a new probabilistic aligner that prioritizes the best possible alignment of reads to a genome over speed. For each alignment, it calculates the mapping accuracy (the probability that the alignment is correct). There is no set limit on the number of errors allowed, and therefore the mapping rate is commonly 100%. For analysis, however, it is recommended to discard alignments with low mapping qualities.
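Discarding low-quality alignments can be done directly on the SAM records. The following is a minimal sketch, assuming tab-delimited SAM text; the threshold of 20 is an illustrative assumption (the track documentation does not specify a cutoff), and the function names are hypothetical. Because a MAPQ of 255 means the mapping quality is unavailable, such records are handled by an explicit switch rather than being treated as high quality.

```python
def passes_mapq(sam_line, min_mapq=20, keep_unavailable=False):
    """Return True if the record's MAPQ (5th SAM field) meets the threshold.

    min_mapq=20 is an assumed, illustrative cutoff, not a documented value.
    MAPQ 255 means "not available" and is kept only if keep_unavailable is True.
    """
    mapq = int(sam_line.split("\t")[4])
    if mapq == 255:
        return keep_unavailable
    return mapq >= min_mapq


def filter_sam(lines, min_mapq=20, keep_unavailable=False):
    """Yield header lines unchanged and alignment lines that pass the filter."""
    for line in lines:
        if line.startswith("@") or passes_mapq(line, min_mapq, keep_unavailable):
            yield line


# Hypothetical records for illustration only.
example = [
    "@SQ\tSN:chr1\tLN:249250621",
    "good\t0\tchr1\t100\t40\t27M\t*\t0\t0\t*\t*",
    "weak\t0\tchr1\t150\t3\t27M\t*\t0\t0\t*\t*",
    "unknown\t0\tchr1\t200\t255\t27M\t*\t0\t0\t*\t*",
]
kept = [line.split("\t")[0] for line in filter_sam(example)]
kept_with_unknown = [
    line.split("\t")[0] for line in filter_sam(example, keep_unavailable=True)
]
```

Whether to keep records with unavailable mapping quality depends on the analysis; the default here is the conservative choice of dropping them.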

Exceptions to the above protocol are the polyA- RNA samples from K562 cytosol, K562 nucleus, and prostate whole cell, which were sequenced using ABI SOLiD technology. These reads were mapped using Bowtie with its default parameters. Clusters were defined as regions of overlapping CAGE reads. The expression level was computed as the number of reads making up the cluster, divided by the total number of reads sequenced, multiplied by 1 million.
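The cluster definition and expression calculation described above can be sketched as follows. This is a minimal illustration, not the production pipeline: it assumes reads are given as (chromosome, strand, start, end) tuples with 1-based inclusive coordinates, and it assumes clustering is done separately per strand, which the description does not state explicitly. The function names are hypothetical.

```python
def cluster_reads(reads):
    """Merge overlapping reads on the same chromosome and strand into clusters.

    reads: iterable of (chrom, strand, start, end), 1-based inclusive.
    Returns a list of dicts with keys chrom, strand, start, end, count.
    """
    clusters = []
    for chrom, strand, start, end in sorted(reads):
        last = clusters[-1] if clusters else None
        if (last is not None and last["chrom"] == chrom
                and last["strand"] == strand and start <= last["end"]):
            # The read shares at least one base with the current cluster.
            last["end"] = max(last["end"], end)
            last["count"] += 1
        else:
            clusters.append({"chrom": chrom, "strand": strand,
                             "start": start, "end": end, "count": 1})
    return clusters


def expression_level(cluster_count, total_reads):
    """Reads in the cluster / total reads sequenced * 1 million."""
    return cluster_count / total_reads * 1_000_000


# Hypothetical reads for illustration only.
reads = [
    ("chr1", "+", 100, 126),
    ("chr1", "+", 110, 136),
    ("chr1", "+", 200, 226),
    ("chr1", "-", 105, 131),
]
clusters = cluster_reads(reads)
# With an assumed library of 4 million sequenced reads:
levels = [expression_level(c["count"], 4_000_000) for c in clusters]
```

Here the first two forward-strand reads overlap and form one cluster spanning 100 to 136 with a count of 2; the other two reads each form their own cluster.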

Release Notes

This is Release 4 (July 2012) of RIKEN CAGE. Three missing tables of transcription start sites determined by hidden Markov models (TssHmm) have been added (H1-hESC Nucleus, H1-hESC Cytosol, and NHEK Nucleus polyA+ samples).

As with previous releases, the original data from the hg18 version of this track are still included and can be identified in the metadata by a bioRepId that starts with gen0. These older data may be missing some information and do not have replicates. If newer data are available for an older sample, only the newer data are displayed. The older data are still available for download.

Credits

These data were generated and analyzed by Timo Lassmann, Phil Kapranov, Hazuki Takahashi, Yoshihide Hayashizaki, Carrie Davis, Tom Gingeras, and Piero Carninci.

Contact: Piero Carninci at RIKEN Omics Science Center

References

Carninci P, Kvam C, Kitamura A, Ohsumi T, Okazaki Y, Itoh M, Kamiya M, Shibata K, Sasaki N, Izawa M, et al. High-efficiency full-length cDNA cloning by biotinylated CAP trapper. Genomics. 1996 November 1; 37(3):327-336.

Kodzius R, Kojima M, Nishiyori H, Nakamura M, Fukuda S, Tagami M, Sasaki D, Imamura K, Kai C, Harbers M, et al. CAGE: cap analysis of gene expression. Nat Methods. 2006 March 1; 3(3):211-222.

Valen E, Pascarella G, Chalk A, Maeda N, Kojima M, Kawazu C, Murata M, Nishiyori H, Lazarevic D, Motti D, et al. Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE. Genome Res. 2009 February; 19(2):255-265.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.