Tag Cluster Annotation

Revision as of 23:26, 2 March 2011 by Ameynert

Committed names

  • Piero Carninci
  • Laurens Wilming
  • Timo Lassmann
  • Richard Baldarelli
  • Juha Kere
  • Leonard Lipovich (long ncRNA promoters, sense-antisense pair promoters, bidirectional promoters, global human lncRNAome and sense-antisense coordinates)
  • Boris Lenhard (enhancers)
  • Alison Meynert (Ensembl gene models)

Output requirements/formats

Proposal - flat file format

A tab-delimited file in OSCtable format, one per species.

Here is an initial sketch of a possible flat-file format:

##
## Comments and meta-data TBD
## Species: Homo sapiens
## NCBI taxon id: 9606
## FANTOM5 UPDATE_009
## 
Tag_cluster_id Library_id  Annotation_class      Annotation_type      Annotation_id   Cluster_ref_pos_distance
TSC000001      CNhs11772   CORE_PROMOTER         ENSEMBL_TRANSCRIPT   ENST00000000001 -30
TSC000002      .           3_PRIME_UTR           UCSC_TRANSCRIPT      GENE1           .
TSC000003      .           CORE_PROMOTER         LONG_NC_RNA          LEONARD01       -2
TSC000004      CNhs11334   LONG_RANGE_REGULATION VISTA_ENHANCER       VISTA001        .
TSC000005      .           EXTENDED_PROMOTER     ENSEMBL_TRANSCRIPT   ENST00000000002 -503

If the library id is given, the tag cluster is associated with that specific library; otherwise, it is associated with the aggregate of all libraries. It is possible that some annotations will not be required on a per-library basis.
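As a rough illustration of how the sketched format could be read, here is a minimal Python parser. It assumes that `##` lines are comments, that the first non-comment line is the column header, that fields are tab-separated, and that `.` marks an unspecified value. The names `TagClusterAnnotation` and `parse_annotations` are hypothetical and not part of any agreed tooling.

```python
import csv
import io
from typing import NamedTuple, Optional

MISSING = "."  # placeholder used in the sketch for an unspecified value


class TagClusterAnnotation(NamedTuple):
    tag_cluster_id: str
    library_id: Optional[str]  # None = aggregate of all libraries
    annotation_class: str
    annotation_type: str
    annotation_id: str
    distance: Optional[int]    # None = distance not specified


def parse_annotations(handle):
    """Yield one record per data row, skipping '##' comment lines."""
    lines = (line for line in handle if not line.startswith("##"))
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # discard the column-name row
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        cluster, library, a_class, a_type, a_id, dist = row
        yield TagClusterAnnotation(
            cluster,
            None if library == MISSING else library,
            a_class,
            a_type,
            a_id,
            None if dist == MISSING else int(dist),
        )


# Two rows taken from the sketch above, with real tab separators.
example = (
    "##\n"
    "## Species: Homo sapiens\n"
    "Tag_cluster_id\tLibrary_id\tAnnotation_class\tAnnotation_type\t"
    "Annotation_id\tCluster_ref_pos_distance\n"
    "TSC000001\tCNhs11772\tCORE_PROMOTER\tENSEMBL_TRANSCRIPT\tENST00000000001\t-30\n"
    "TSC000002\t.\t3_PRIME_UTR\tUCSC_TRANSCRIPT\tGENE1\t.\n"
)
records = list(parse_annotations(io.StringIO(example)))
```

Mapping `.` to `None` keeps the "aggregate of all libraries" case distinct from any real library id, which may make per-library filtering simpler downstream.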

For some types of annotation (e.g. core promoter, extended promoter), we will want to include the distance from the tag cluster reference position to the annotation position (e.g. the TSS of an annotated protein-coding gene). For other types (e.g. 3' UTR, exonic), it is enough to know that the tag cluster reference position overlaps the annotation, and the distance can be left unspecified.
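The distance rule could be checked per row along the following lines. This is only a sketch: the set of classes that expect a distance is an assumption taken from the two examples named above, and the function name `distance_problem` is hypothetical. The final list of annotation classes has not been agreed.

```python
# Classes for which a distance is wanted. ASSUMPTION: drawn from the two
# examples in the text; the agreed list may differ.
DISTANCE_EXPECTED = {"CORE_PROMOTER", "EXTENDED_PROMOTER"}


def distance_problem(annotation_class, distance_field):
    """Return a message if the distance field looks wrong, else None."""
    if distance_field == ".":
        if annotation_class in DISTANCE_EXPECTED:
            return f"{annotation_class}: distance expected but unspecified"
        return None  # overlap-only classes may leave the distance out
    try:
        int(distance_field)
    except ValueError:
        return f"{annotation_class}: distance {distance_field!r} is not an integer"
    return None
```

For instance, `distance_problem("3_PRIME_UTR", ".")` returns `None`, while a `CORE_PROMOTER` row with `.` in the distance column is reported.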

Milestones

  1. Agreement on annotations to use (Working group notes from Piero)
    • Mailing list request
  2. Annotation of release 009 clusters using agreed strategy
    • Are we waiting on the results of the tag cluster competition or is there a test set of clusters that we can start working on?
  3. Annotation of data freeze 1 - ASAP after freeze