Schema for Lens Patents - Lens PatSeq Patent Document Sequences
Database: hg19
Primary Table: patBulk
Data last updated: 2016-03-01
Big Bed File Download: /gbdb/hg19/bbi/patBulk.bb
Item Count: 19,048,427
The data is stored in the binary BigBed format.
Format description: Summary information about a patent sequence derived from all documents that reference the sequence
field | example | description |
chrom | chr1 | Chromosome (or contig, scaffold, etc.) |
chromStart | 166167201 | Start position in chromosome |
chromEnd | 166167214 | End position in chromosome |
name | cwEIchF6v8yN0S-Fyq6-NA | Name of item |
score | 0 | Score from 0-1000 |
strand | - | + or - |
thickStart | 166167201 | Start of where display should be thick (start codon) |
thickEnd | 166167214 | End of where display should be thick (stop codon) |
reserved | 227,180,2 | Used as itemRgb as of 2004-11-22 |
blockCount | 1 | Number of blocks |
blockSizes | 13, | Comma separated list of block sizes |
chromStarts | 0, | Start positions relative to chromStart |
docCount | 2 | Number of documents |
claimCount | 0 | Documents with this sequence in the claims |
grantCount | 1 | Granted patents |
org | D. melanogaster | Declared organisms |
dateRange | 21. Jul 2011 - 15. May 2012 | Publication dates (earliest - latest) |
patTitle | Ribonucleic acid interference molecules and binding sites derived by analyzing intergenic and intronic regions of genomes (2) | Patent document titles and document counts (max. 10 titles) |
intDocIds | US_2011_0178283_A1,US_8178503_B2 | Links to documents |
claimGrantSeqIds | | Patents with this sequence in the claims |
grantSeqIds | US_8178503_B2/sequences/view/618240|US_8178503_B2-618240 (744661) | Patents with this sequence |
claimSeqIds | | Applications with this sequence in the claims |
appSeqIds | US_2011_0178283_A1/sequences/view/618240|US_2011_0178283_A1-618240 (744661) | Applications with this sequence |
mouseOver | Ribonucleic acid interference molecules and binding sites derived by analyzing intergenic and intronic regions of genomes (2 documents, 0 in claims, 1 granted) | Mouseover |
fprint | | PatSeq Fingerprint |
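The field list above can be read as a record layout. The following is a minimal Python sketch, assuming the data has been exported as tab-separated text (one item per line) with the 25 columns in the order given in the schema. The names PATSEQ_FIELDS, INT_FIELDS and parse_patseq_line are hypothetical and are not part of any UCSC tool.

```python
from typing import Dict, List, Union

# Column names in the order listed in the schema table (assumed export order).
PATSEQ_FIELDS: List[str] = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "reserved", "blockCount", "blockSizes",
    "chromStarts", "docCount", "claimCount", "grantCount", "org",
    "dateRange", "patTitle", "intDocIds", "claimGrantSeqIds",
    "grantSeqIds", "claimSeqIds", "appSeqIds", "mouseOver", "fprint",
]

# Columns whose example values in the schema are plain integers.
INT_FIELDS = {
    "chromStart", "chromEnd", "score", "thickStart", "thickEnd",
    "blockCount", "docCount", "claimCount", "grantCount",
}


def parse_patseq_line(line: str) -> Dict[str, Union[int, str]]:
    """Split one tab-separated row into a dict keyed by schema field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) < 15:
        raise ValueError("expected at least the 15 leading columns")
    # Pad trailing text columns that may be missing from a short row.
    values += [""] * (len(PATSEQ_FIELDS) - len(values))
    record: Dict[str, Union[int, str]] = {}
    for name, value in zip(PATSEQ_FIELDS, values):
        record[name] = int(value) if name in INT_FIELDS else value
    return record
```

With a record parsed this way, the length of a match is `record["chromEnd"] - record["chromStart"]`, which for the example row in the schema equals the single block size of 13.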
Sample Rows
chrom | chromStart | chromEnd | name | score | strand | thickStart | thickEnd | reserved | blockCount | blockSizes | chromStarts | docCount | claimCount | grantCount | org | dateRange | patTitle | intDocIds | claimGrantSeqIds | grantSeqIds | claimSeqIds | appSeqIds | mouseOver | fprint |
chr1 | 166167201 | 166167214 | cwEIchF6v8yN0S-Fyq6-NA | 0 | - | 166167201 | 166167214 | 227,180,2 | 1 | 13, | 0, | 2 | 0 | 1 | D. melanogaster | 21. Jul 2011 - 15. May 2012 | Ribonucleic acid interference molecules and binding sites derived by analyzing intergenic and intronic regions of genomes (2) | US_2011_0178283_A1,US_8178503_B2 | | US_8178503_B2/sequences/view/618240|US_8178503_B2-618240 (744661) | | US_2011_0178283_A1/sequences/view/618240|US_2011_0178283_A1-618240 (744661) | Ribonucleic acid interference molecules and binding sites derived by analyzing intergenic and intronic regions of genomes (2 doc ... | |
chr1 | 166167204 | 166167223 | nWFf0bY8aW6XKZrGZRVhPw | 1 | - | 166167204 | 166167223 | 10,116,178 | 1 | 19, | 0, | 4 | 1 | 0 | Homo sapiens | 08. Dec 2005 - 03. Mar 2011 | Polynucleotides for causing RNA interference and method for inhibiting gene expression using the same (2); POLYNUCLEOTIDE CAUSIN ... | CA_2566286_A1,WO_2005_116204_A1,US_2008_0113351_A1,US_2011_0054005_A1 | | | US_2008_0113351_A1/sequences/view/10087|US_2008_0113351_A1-10087 (793825) | CA_2566286_A1/sequences/view/10087|CA_2566286_A1-10087 (793815),WO_2005_116204_A1/sequences/view/10087|WO_2005_116204_A1-10087 ( ... | Polynucleotides for causing RNA interference and method for inhibiting gene expression using the same (4 documents, 1 in claims, ... | |
chr1 | 166167987 | 166168289 | MQ5_Jyng_95szdSXmaL2Ug | 0 | + | 166167987 | 166168289 | 20,178,187 | 1 | 302, | 0, | 2 | 0 | 0 | Homo sapiens | 02. Aug 2001 - 24. Apr 2003 | NUCLEIC ACIDS, PROTEINS, AND ANTIBODIES (2) | WO_2001_055320_A2,US_2003_0077808_A1 | | | | WO_2001_055320_A2/sequences/view/6146|WO_2001_055320_A2-6146 (6642),US_2003_0077808_A1/sequences/view/6146|US_2003_0077808_A1-61 ... | NUCLEIC ACIDS, PROTEINS, AND ANTIBODIES (2 documents, 0 in claims, 0 granted) | |
chr1 | 166169239 | 166169251 | uM3Nu9hlDdg7f4jA7Logfg | 0 | + | 166169239 | 166169251 | 20,178,187 | 1 | 12, | 0, | 1 | 0 | 0 | Artificial | 02. Dec 2004 | Detection of single nucleotide polymorphisms (snp's) and cytosine-methylations | US_2004_0241651_A1 | | | | US_2004_0241651_A1/sequences/view/368120|US_2004_0241651_A1-368120 (381998) | Detection of single nucleotide polymorphisms (1 documents, 0 in claims, 0 granted) | |
chr1 | 166169455 | 166169471 | qkIz2oJr8TP3wU3TM0obtw | 0 | - | 166169455 | 166169471 | 227,180,2 | 1 | 16, | 0, | 2 | 0 | 1 | M. musculus | 21. Jul 2011 - 15. May 2012 | Ribonucleic acid interference molecules and binding sites derived by analyzing intergenic and intronic regions of genomes (2) | US_2011_0178283_A1,US_8178503_B2 | | US_8178503_B2/sequences/view/365638|US_8178503_B2-365638 (744661) | | US_2011_0178283_A1/sequences/view/365638|US_2011_0178283_A1-365638 (744661) | Ribonucleic acid interference molecules and binding sites derived by analyzing intergenic and intronic regions of genomes (2 doc ... | |
chr1 | 166171367 | 166171917 | hKyl9IlB_Eo2uRMzal0k9Q | 0 | - | 166171367 | 166171917 | 20,178,187 | 1 | 550, | 0, | 2 | 0 | 0 | Homo sapiens | 13. Oct 2005 - 05. Jun 2007 | Identification and mapping of single nucleotide polymorphisms in the human genome (2) | US_2005_0228172_A9_20051013,US_H002191_H1 | | | | US_2005_0228172_A9_20051013/sequences/view/302153|US_2005_0228172_A9_20051013-302153 (585873),US_H002191_H1/sequences/view/30215 ... | Identification and mapping of single nucleotide polymorphisms in the human genome (2 documents, 0 in claims, 0 granted) | |
chr1 | 166171377 | 166171917 | tBkeUNl3GcrHwrv8407ChQ | 0 | - | 166171377 | 166171917 | 20,178,187 | 1 | 540, | 0, | 3 | 0 | 0 | Homo sapiens | 16. Mar 2006 - 01. Jul 2008 | Identification and mapping of single nucleotide polymorphisms in the human genome (3) | US_2006_0057564_A1,US_2006_0057564_A1,US_H002220_H1 | | | | US_2006_0057564_A1/sequences/view/378417|US_2006_0057564_A1-378417 (794875),US_2006_0057564_A1/sequences/view/991826|US_2006_005 ... | Identification and mapping of single nucleotide polymorphisms in the human genome (3 documents, 0 in claims, 0 granted) | |
chr1 | 166171452 | 166171475 | YcZx1CRdXbs8965Bn-essA | 0 | - | 166171452 | 166171475 | 227,180,2 | 1 | 23, | 0, | 4 | 0 | 1 | Homo sapiens | 17. Aug 2006 - 21. May 2013 | Ribonucleic acid interference molecules (3); RIBONUCLEIC ACID INTERFERERNCE MOLECULES AND METHODS FOR GENERATING PRECURSOR/MATUR ... | CA_2588023_A1,US_2008_0125583_A1,US_2012_0040460_A1,US_8445666_B2 | | US_8445666_B2/sequences/view/156341|US_8445666_B2-156341 (167247) | | CA_2588023_A1/sequences/view/156341|CA_2588023_A1-156341 (167247),US_2008_0125583_A1/sequences/view/156341|US_2008_0125583_A1-15 ... | Ribonucleic acid interference molecules (4 documents, 0 in claims, 1 granted) | |
chr1 | 166171486 | 166171815 | i-UwGtL8r6eMg5eQirBGnw | 0 | + | 166171486 | 166171815 | 227,180,2 | 1 | 329, | 0, | 2 | 0 | 1 | Homo sapiens | 12. Apr 2007 - 29. Jun 2010 | Methods and systems for annotating biomolecular sequences (1); Human thrombospondin polypeptide (1) | US_2007_0083334_A1,US_7745391_B2 | | US_7745391_B2/sequences/view/142059|US_7745391_B2-142059 (649917) | | US_2007_0083334_A1/sequences/view/142059|US_2007_0083334_A1-142059 (649917) | Methods and systems for annotating biomolecular sequences (2 documents, 0 in claims, 1 granted) | |
chr1 | 166171495 | 166172173 | P_khN5Q0hJY8UxLVThSnyA | 0 | + | 166171495 | 166172173 | 227,180,2 | 1 | 678, | 0, | 6 | 0 | 2 | Homo sapiens | 08. Feb 2007 - 06. May 2008 | Human secreted proteins (4); HADDE71 polypeptides (2) | US_2007_0032413_A1,US_2007_0032413_A1,US_2007_0048818_A1,US_2007_0048818_A1,US_7368527_B2,US_7368527_B2 | | US_7368527_B2/sequences/view/12175|US_7368527_B2-12175 (8886),US_7368527_B2/sequences/view/12176|US_7368527_B2-12176 (8886) | | US_2007_0032413_A1/sequences/view/12175|US_2007_0032413_A1-12175 (8886),US_2007_0032413_A1/sequences/view/12176|US_2007_0032413_ ... | Human secreted proteins (6 documents, 0 in claims, 2 granted) | |
Lens Patents (patSeq) Track Description
Description
This track shows genome matches to biomedical sequences submitted with patent application
documents to patent offices around the world. The sequences, their mappings, and selected
patent information were graciously provided by PatSeq, a search tool that is part of The Lens
(Cambia).
This track contains more data than the NCBI GenBank division "Patents" because the
sequences were extracted directly from the patent documents.
Display Convention and Configuration
The data is split into two subtracks: one for sequences that appear only in patents
submitted with more than 100 sequences ("bulk patents"),
and a second for all other sequences ("non-bulk patents").
A sequence can be
part of many patent documents, and some are found in several thousand patents.
This track shows only a single alignment for every sequence, colored according to
its occurrence in the different patent documents, using a color scheme similar to that of The Lens.
Based on the first sequence match, the four different item colors follow this priority ranking in
descending order:
| the sequence is referenced in the claims of a granted patent |
| the sequence is disclosed in a granted patent |
| the sequence is referenced in the claims of a patent application |
| the sequence is disclosed in a patent application |
Sequences referenced in the claims section of a
patent document define the scope of the invention and are important during
litigation. Therefore, they are given priority in the color scheme. Patent
grant documents form the basis of patent protection and are prioritized over
applications.
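The priority ranking above can be sketched in Python as follows. This is an illustration only: it assumes that a non-empty claimGrantSeqIds, grantSeqIds, claimSeqIds or appSeqIds column means the sequence falls into the corresponding category, which the track documentation does not state explicitly. The function name patseq_category is hypothetical, and the sketch returns a category label rather than a color, since the color assigned to each category is not listed here.

```python
from typing import Optional


def patseq_category(claim_grant_seq_ids: str, grant_seq_ids: str,
                    claim_seq_ids: str, app_seq_ids: str) -> Optional[str]:
    """Return the highest-priority category that has at least one entry.

    Assumption: an empty string means "no documents in this category".
    """
    ranked = [
        ("granted and in claims", claim_grant_seq_ids),
        ("granted", grant_seq_ids),
        ("in claims", claim_seq_ids),
        ("applications", app_seq_ids),
    ]
    # Walk the categories in descending priority and stop at the first hit.
    for label, value in ranked:
        if value.strip():
            return label
    return None
```

Under this assumption, a row with entries only in grantSeqIds and appSeqIds would be classified as "granted", because grants outrank applications.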
Hover over a feature with the mouse to see
the total number of documents where the sequence has been referenced, how many
of these documents are granted patents and how often the sequence has been
referenced in the claims. A randomly selected document title is also shown in
the mouseover.
Clicking on a feature will bring up the details page, which contains information about
the sequence and alignment of that feature.
The link at the top of the page opens the PatSeq Analyzer with
the chromosomal region covered by the feature that was clicked. The PatSeq Analyzer
is a specialized genome browser that allows for the viewing and filtering of patent
sequence matches in detail.
The next section of the details page is a list of up to ten patent documents that include this
sequence, with the number of occurrences within each document in parentheses.
This is followed by up to thirty links to patent documents. The patent documents listed in these
sections are displayed in order of the number of sequence occurrences in the document. Shown below
these are the links to the sequence in The Lens, in the format
"patentDocumentIdentifier-SEQIDNO (docSequenceCount)". The "SEQ ID NO"
is an integer number, the unique identifier of a patent sequence in a patent
document. When a protein sequence has been annotated on a nucleotide sequence,
the "SEQ ID NO" contains the reading frame separated by a ".", e.g.
"1.1" would indicate the first frame of SEQIDNO 1.
The total number of sequences submitted with the patent document ("docSequenceCount") is
shown in parentheses after the SEQIDNO. The links to the sequence are separated into the
categories "granted and in claims", "granted", "in claims",
and "applications" (all others). Sequence identifiers link to the respective pages on PatSeq. At most thirty documents
per category are linked from this page, listed in order of the number of sequence occurrences;
please use PatSeq Analyzer to view all matching documents.
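The link format described above can be parsed with a short Python sketch. It assumes, based on the example values in the schema, that each sequence-ID column is a comma-separated list of entries of the form "path|patentDocumentIdentifier-SEQIDNO (docSequenceCount)", with an optional ".frame" suffix on the SEQ ID NO. The names SeqLink and parse_seq_links are hypothetical, and entries that do not match the pattern (for example, truncated ones) are skipped rather than guessed.

```python
import re
from typing import List, NamedTuple, Optional


class SeqLink(NamedTuple):
    path: str                  # text before "|" (assumed link path)
    doc_id: str                # patent document identifier
    seq_id_no: int             # SEQ ID NO within the document
    frame: Optional[int]       # reading frame, if annotated (e.g. "1.1")
    doc_sequence_count: int    # total sequences in the document


_LABEL_RE = re.compile(
    r"^(?P<doc>.+)-(?P<seq>\d+)(?:\.(?P<frame>\d+))? \((?P<count>\d+)\)$"
)


def parse_seq_links(field: str) -> List[SeqLink]:
    """Parse one *SeqIds column into a list of SeqLink tuples."""
    links: List[SeqLink] = []
    for entry in field.split(","):
        path, sep, label = entry.strip().partition("|")
        match = _LABEL_RE.match(label)
        if not sep or match is None:
            continue  # empty, truncated, or unexpected entry
        frame = match.group("frame")
        links.append(SeqLink(
            path=path,
            doc_id=match.group("doc"),
            seq_id_no=int(match.group("seq")),
            frame=int(frame) if frame is not None else None,
            doc_sequence_count=int(match.group("count")),
        ))
    return links
```

For the grantSeqIds example in the schema, this yields one entry with document "US_8178503_B2", SEQ ID NO 618240, no frame, and a document sequence count of 744661.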
The score of the features in this track is the number of documents where the
sequence appears in the claims. For example, setting the score filter to 1 shows only
sequences that have been referenced at least once in the claims.
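The same score filter can be applied offline to exported rows. The sketch below assumes tab-separated lines with the score in the fifth column, as in the schema; the function name filter_by_score is hypothetical.

```python
from typing import Iterable, Iterator


def filter_by_score(lines: Iterable[str], min_score: int = 1) -> Iterator[str]:
    """Yield only rows whose score (fifth column) is at least min_score."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_score:
            yield line
```

With the default threshold of 1, only sequences referenced in the claims of at least one document are kept, mirroring the filter setting described above.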
Methods
More than 96 million patent document files were collected by The Lens. The
ST.25-formatted
sequences were extracted and mapped to genomes with the aligners BLAT and BWA. The minimum
identity of the query over the alignment is 95%. Note that for hg19, no patents are shown
on chrM, because the mitochondrial chromosome used for the mapping was the one from
the Ensembl genome FASTA files, which is a different sequence from the hg19 chrM.
Credits
Thanks to the team behind The Lens,
in particular,
Osmat Jefferson
and Deniz Koellhofer, for making these data available.
Feedback
Send suggestions on the way data in this track is visualized to our support
address
genome@soe.ucsc.edu.
Questions on the data itself are best directed to
support@cambia.org.
Data access
The raw data can be explored interactively with the Table Browser.
For automated download and analysis, the genome annotation is stored in a bigBed file that
can be downloaded from
our download server.
The files for this track are called patNonBulk.bb and patBulk.bb. Individual
regions or the whole genome annotation can be obtained using our tool bigBedToBed,
which can be compiled from the source code or downloaded as a precompiled
binary for your system. Instructions for downloading source code and binaries can be found
here.
The command to obtain the data as a tab-separated table looks like this:
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg19/bbi/patNonBulk.bb -chrom=chr5 -start=1000000 -end=2000000 output.tsv
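For scripted downloads, the command above can be wrapped in Python. This is a minimal sketch that assumes the bigBedToBed binary is installed and on the PATH; the helper names build_bigbedtobed_cmd and fetch_region are hypothetical. Only the -chrom, -start and -end options shown in the command above are used.

```python
import subprocess
from typing import List, Optional


def build_bigbedtobed_cmd(source: str, out_path: str,
                          chrom: Optional[str] = None,
                          start: Optional[int] = None,
                          end: Optional[int] = None) -> List[str]:
    """Assemble the bigBedToBed argument list for an optional region."""
    cmd = ["bigBedToBed", source]
    if chrom is not None:
        cmd.append(f"-chrom={chrom}")
    if start is not None:
        cmd.append(f"-start={start}")
    if end is not None:
        cmd.append(f"-end={end}")
    cmd.append(out_path)
    return cmd


def fetch_region(source: str, out_path: str, chrom: str,
                 start: int, end: int) -> None:
    """Run bigBedToBed; raises CalledProcessError if the tool fails."""
    cmd = build_bigbedtobed_cmd(source, out_path, chrom, start, end)
    subprocess.run(cmd, check=True)
```

Calling `fetch_region("http://hgdownload.soe.ucsc.edu/gbdb/hg19/bbi/patNonBulk.bb", "output.tsv", "chr5", 1000000, 2000000)` would run the same command as shown above. Omitting the region arguments in build_bigbedtobed_cmd produces a command for the whole file.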
A full log of the commands that were used to build this annotation is available
from our database
build description. In this text file, search for "patNonBulk" to find the right section.
References
Editorial: The patent bargain
Nature. 2013 Dec 12;504(7479):187-188.
Patently transparent.
Nat Biotechnol. 2006 May;24(5):474.
PMID: 16680110
Jefferson OA, Köllhofer D, Ehrich TH, Jefferson RA.
Transparency tools in gene patenting for informing policy and practice.
Nat Biotechnol. 2013 Dec;31(12):1086-93.
PMID: 24316644