Description
These tracks were generated by the ENCODE Consortium. They contain information
about mouse RNAs greater than 200 nucleotides in length obtained as short
reads off the Illumina platform. Data are available from biological replicates.
Display Conventions and Configuration
This track is a multi-view composite track that contains multiple data types
(views). For each view, there are multiple subtracks that
display individually on the browser. Instructions for configuring multi-view
tracks are
here.
To show only selected subtracks, uncheck the boxes next to the tracks that
you wish to hide.
Color differences among the views are arbitrary. They provide a
visual cue for
distinguishing between the different cell types and compartments.
- Contigs
- The Contigs represent blocks of overlapping mapped reads from the pooled biological replicates.
- Raw Signals
- The Plus Raw Signal and Minus Raw Signal views show the density of mapped reads on the plus and minus strands (wiggle format), respectively.
- Alignments
- The Alignments view shows individual reads mapped from biological replicates to the genome and indicates where
bases may mismatch. Every mapped read is displayed, i.e. uncollapsed. The alignment file follows the standard SAM format of Bowtie output. See the Bowtie Manual for more information about the SAM Bowtie output (including other tags) and the SAM Format Specification for more information on the SAM/BAM file format.
- Splice Junctions
- The Splice Junctions view shows the subset of aligned reads that cross splice junctions. Specific column specifications can be found in the supplemental directory.
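The relationship between the Alignments, Contigs and Raw Signal views can be illustrated with a small sketch. It is not the pipeline used to build these tracks. It assumes reads are given as hypothetical (chrom, start, end, strand) tuples with 0-based, half-open coordinates, and that two reads belong to the same contig only when they share at least one base on the same strand.

```python
from collections import defaultdict


def build_contigs(reads):
    """Merge overlapping reads into blocks, separately per chromosome and strand.

    Assumption: reads that merely touch (start == previous end) are not merged.
    """
    by_key = defaultdict(list)
    for chrom, start, end, strand in reads:
        by_key[(chrom, strand)].append((start, end))

    contigs = []
    for (chrom, strand), spans in sorted(by_key.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start < cur_end:
                # Read overlaps the current block: extend the block.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the current block and start a new one.
                contigs.append((chrom, cur_start, cur_end, strand))
                cur_start, cur_end = start, end
        contigs.append((chrom, cur_start, cur_end, strand))
    return contigs


def strand_signal(reads, chrom, strand):
    """Return per-base read depth on one strand as {position: depth}."""
    depth = defaultdict(int)
    for c, start, end, s in reads:
        if c == chrom and s == strand:
            for pos in range(start, end):
                depth[pos] += 1
    return dict(depth)


# Hypothetical toy reads, 76 bases long.
reads = [
    ("chr1", 100, 176, "+"),
    ("chr1", 150, 226, "+"),
    ("chr1", 400, 476, "+"),
    ("chr1", 120, 196, "-"),
]
contigs = build_contigs(reads)
plus_signal = strand_signal(reads, "chr1", "+")
```

In this toy input, the first two plus-strand reads overlap and collapse into one contig, while the minus-strand read is kept apart because contigs and signals are computed per strand.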
Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.
Additional views are available on the Downloads page.
Methods
Tissue Samples
Individual tissues were harvested from mouse strain C57BL/6J at different timepoints
according to ENCODE
cell culture protocols.
Whenever possible, biological replicates were obtained from littermates.
Library Preparation
The published
cDNA sequencing protocol was used. This protocol generates directional libraries
and reports the transcripts' strand of origin. Exogenous RNA spike-ins
were added to each endogenous RNA isolate and carried
through library construction and sequencing.
The spike-in sequence and the concentrations are available for download
in the supplemental directory.
Sequencing and Mapping
The libraries were sequenced on the Illumina platform (either GAIIx or
HiSeq) in paired-end fashion (either paired-end 76 or paired-end 101) to an average depth of 100 million
read pairs.
The data were mapped against mm9 using Spliced Transcript Alignment
and Reconstruction (STAR) written by Alex Dobin (CSHL). More
information about STAR, including the parameters used for these data,
is available from the
Gingeras lab.
For each experiment, data files for additional
element data views
are available for download.
These elements were assessed for reproducibility using a nonparametric
irreproducible discovery rate (IDR) script. The IDR values for each element
are included in the files for end-users to use as a threshold. An IDR value of 0.1 means
that the probability of detecting that element in a third experiment equivalent
in depth to the sum of the bioreplicates is 90%. In addition,
expression values for annotated genes, transcripts and exons were computed. Further explanation of these
files is available for download in the
supplemental directory.
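As an illustration of using the IDR values as a threshold, the sketch below keeps only elements at or below a chosen IDR. The five-column, tab-delimited layout (chrom, start, end, strand, IDR) is a hypothetical simplification for this example. The actual column layout of the element files is described in the supplemental directory.

```python
def parse_element(line):
    """Parse one line of a hypothetical 5-column, tab-delimited element file."""
    chrom, start, end, strand, idr = line.rstrip("\n").split("\t")
    return {
        "chrom": chrom,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "idr": float(idr),
    }


def filter_by_idr(elements, max_idr=0.1):
    """Keep elements whose IDR is at or below the threshold."""
    return [e for e in elements if e["idr"] <= max_idr]


# Hypothetical example lines, not taken from the real data files.
lines = [
    "chr1\t100\t226\t+\t0.01",
    "chr1\t400\t476\t+\t0.35",
    "chr2\t50\t130\t-\t0.1",
]
elements = [parse_element(line) for line in lines]
kept = filter_by_idr(elements, max_idr=0.1)
```

With a threshold of 0.1, the second element is dropped. A stricter or looser cutoff can be chosen by changing `max_idr`.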
Verification
FPKM (fragments per kilobase of exon per million fragments mapped) values were calculated
for annotated exons and Spearman correlation coefficients were computed.
In general, Rho values are greater than 0.90 between biological replicates.
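The verification step can be sketched as follows, assuming the usual FPKM definition (fragments, scaled per kilobase of exon length and per million mapped fragments) and a Spearman coefficient computed as the Pearson correlation of average ranks. The exon lengths, fragment counts and library sizes are made-up values, and the function names are illustrative rather than those of the scripts used for these data.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical exon lengths (bp) and fragment counts for two bioreplicates.
exon_lengths = [1000, 500, 2000, 250]
rep1_counts = [100, 10, 400, 5]
rep2_counts = [120, 8, 390, 9]
total_fragments = 1_000_000  # assumed equal library size for both replicates

rep1_fpkm = [fpkm(c, n, total_fragments) for c, n in zip(rep1_counts, exon_lengths)]
rep2_fpkm = [fpkm(c, n, total_fragments) for c, n in zip(rep2_counts, exon_lengths)]
rho = spearman_rho(rep1_fpkm, rep2_fpkm)
```

Because Spearman correlation depends only on ranks, it is insensitive to any monotonic rescaling of the FPKM values, which makes it a convenient check of agreement between replicates.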
Release Notes
This is release 3 (Sept 2012) of this track. It adds data for bladder, cerebellum, CNS, cortex, frontal lobe, limb, liver, placenta, and whole brain. The samples for CNS, liver, limb and whole brain vary over age (developmental stage).
This release also contains replacement BAM files, because the previous ones had the second read reverse complemented.
Credits
These data were generated and analyzed by the transcriptome group at
Cold Spring Harbor Laboratory and the Center for Genomic
Regulation (CRG in Barcelona), who are participants in the ENCODE Transcriptome Group.
Contacts:
Carrie Davis (experimental),
Roderic Guigo and lab (data processing),
Tom Gingeras (primary investigator)
References
Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M, Gingeras TR, Oliver B.
Synthetic spike-in standards for RNA-seq experiments.
Genome Res. 2011 Sep;21(9):1543-51.
PMID: 21816910; PMC: PMC3166838
Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A.
Transcriptome analysis by strand-specific sequencing of complementary DNA.
Nucleic Acids Res. 2009 Oct;37(18):e123.
PMID: 19620212; PMC: PMC2764448
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior
consent, submit publications that use an unpublished ENCODE dataset until
nine months following the release of the dataset. This date is listed in
the Restricted Until column, above. The full data release policy
for ENCODE is available
here.