Long noncoding RNA main paper: Difference between revisions

Revision as of 18:03, 31 October 2011

Welcome to the FANTOM5 long noncoding RNA (lncRNA) main paper page. This page will be used to list tasks and discuss ongoing analyses for the paper. For information on ncRNA data resources, see this page. Please keep in mind that this paper has already been the subject of extensive discussion in many forums and that we need to move quickly on it. While we are always interested in exciting new analyses, if you have something new to propose, please do so with the intention of personally carrying out the analysis.

Paper objectives

This paper aims to capture the complete breadth and diversity of long noncoding RNAs (lncRNAs) while leveraging the unique qualities of FANTOM5 to understand their cellular restriction and evolutionary impact on the human genome. For the purpose of this paper, it is important to note that our definition of lncRNAs is broader than others and includes bidirectional/nested/cis-antisense lncRNAs and unspliced single-exon lncRNA genes with hCAGE support. In addition, this paper will use genome organization and context to probe functional properties and provide a comprehensive classification scheme for lncRNAs.

Tasks for the paper

If you are interested in assisting with a task below, please add your name before the task in parentheses. Names have already been added for people who expressed interest or are currently involved in tasks, as discussed at the FANTOM5 Kouyou meeting. There are still tasks with no one assigned; if you are interested, please put your name down. Conceivably, some of these tasks will end up as satellite papers that the main paper will refer to, but we are including them here at present.

Annotation/analysis of the non-redundant lncRNAome across FANTOM5 dataset

Leonard Lipovich's lab has undertaken and completed the Herculean task of assembling and annotating the set of non-redundant, known lncRNAs and supplemented this with the set provided by Gencode. Preliminary viewing of the analysis in ZENBU suggests many lncRNAs are tissue-specific; this is an important point of order for the FANTOM5 data and this main paper.

Specific tasks:

  1. (Leonard) Final tweaks to the FANTOM5 lncRNAome
    1. (Leonard) Inclusion of latest lncRNAs from the Cabili paper (?)
  2. (WP4, Nicolas, Leonard) Obtain the list of hCAGE promoter peaks associating with lncRNAome from the final filtered and normalized clustering values
  3. (Lukasz) Primary-cell specific expression
    1. (Lukasz) Top-expressed lncRNAs in the total dataset and in different tissues (made available on the wiki to sample providers)
    2. (Lukasz) Identification of "cell-type" specific lncRNAs (made available on the wiki to sample providers)
  4. (Lukasz) Time course expression
    1. (Lukasz) Significant differences in lncRNA expression across time points across all time courses
    2. (Lukasz) lncRNA expression shared across multiple time courses
  5. (Lukasz) Analysis of lncRNAs (done in comparison with analysis of coding RNA--i.e. main promoterome paper analysis)
    1. (Lukasz) House-keeping vs tissue-specific lncRNAs (vs. coding RNAs)
    2. (Lukasz) Clustering of primary cells/tissues with respect to their lncRNA expression profiles
    3. (Lukasz) PCA and multidimensional scaling to find tissues with most lncRNA expression differences / similarity (vs. coding RNA)
  6. (Lukasz) All of the above tasks can be repeated to look for differences in cis- and trans-acting lncRNAs (see below)
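
As an illustration of the house-keeping vs. tissue-specific comparison in the list above, here is a minimal sketch that scores each lncRNA with the tau specificity index. Tau is one common choice and is not prescribed by this page; the lncRNA names and expression values are hypothetical.

```python
def tau_specificity(expression):
    """Tau index: 0 for uniform (house-keeping-like) expression,
    1 for expression confined to a single sample."""
    if len(expression) < 2:
        raise ValueError("need at least two samples")
    peak = max(expression)
    if peak <= 0:
        # Not expressed anywhere; treated as non-specific in this sketch.
        return 0.0
    return sum(1 - x / peak for x in expression) / (len(expression) - 1)


# Hypothetical normalized expression values across four samples.
profiles = {
    "lncRNA_A": [10.0, 10.0, 10.0, 10.0],  # uniform across samples
    "lncRNA_B": [0.0, 0.0, 8.0, 0.0],      # restricted to one sample
}
scores = {name: tau_specificity(values) for name, values in profiles.items()}
```

Running the same function on coding RNA profiles would give the lncRNA vs. coding RNA comparison mentioned in task 5; any cutoff separating "house-keeping" from "specific" would still have to be chosen by the analysts.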

General functional classification of lncRNAs

Here we are interested in isolating lncRNAs likely to be acting in "cis" and making general functional predictions for individual lncRNAs based on co-expression. We are also interested in probing to what extent we can identify which lncRNAs may be involved in "trans"-like regulation and which may primarily function as precursors for small RNA biogenesis.

Specific tasks:

  1. (Nicolas, Leonard) Preliminary classification of lncRNAs into likely cis- and trans- acting
    1. (Nicolas) sense-antisense co-expression at all lncRNA-mRNA sense-antisense pairs for latest data updates
    2. (Leonard) annotation of the above into a curated set representing the "Chainome"
    3. (Nicolas, Yulia) identification of potential trans-acting lncRNAs
  2. cis-acting lncRNA analysis (each analysis performed on both the complete set extracted by Nicolas and the chainome curated by Leonard's lab)
    1. (Timo, Robin, Nicolas) linking lncRNA expression to groups of locally-connected genes
    2. (Eivind, Finn, Tom, Nicolas) co-expression analysis to inform function of individual lncRNAs and effects of lncRNAs on chains
    3. (Michiel) MARA analysis to see influence of cis-acting lncRNAs on transcriptional network (see motif enrichment section below)
    4. (Eivind, Helena, Max) overlaying small RNA information with ncRNA found in chains
      1. similar to above, search for potential effects on expression of lncRNA and coding RNA in presence/absence of small RNA and its orientation
      2. (Eivind, Helena, Max, Martin, CBRC (see structure section below)) ncRNA serving as possible small RNA precursors
  3. trans-acting lncRNA analysis
    1. (Eivind, Finn, Tom, Nicolas) co-expression analysis to inform function of individual lncRNAs
    2. (Nicolas) reverse, window-based homology analysis of trans-acting lncRNAs to determine potential sites of activity on the genome
      1. overlay this analysis with co-expression results
    3. (Yulia) direct/inverse co-expression patterns of lncRNAs with known-gene mRNAs with Alu-S in 3'UTRs (based on Gong and Maquat 2011 Nature paper)
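
The co-expression tasks above (sense-antisense pairs, direct/inverse patterns) can be sketched as a correlation between two expression profiles measured over the same samples. This is only an illustrative sketch: Pearson correlation and the 0.5 threshold are assumptions, not choices stated on this page, and the expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.
    Returns None when either profile is constant (correlation undefined)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def classify_pair(r, threshold=0.5):
    """Label a lncRNA-mRNA pair; the threshold is an arbitrary assumption."""
    if r is None:
        return "undetermined"
    if r >= threshold:
        return "direct"
    if r <= -threshold:
        return "inverse"
    return "uncorrelated"


# Hypothetical expression of one lncRNA and two mRNAs over four samples.
lncrna = [1.0, 2.0, 3.0, 4.0]
mrna_up = [2.0, 4.0, 6.0, 8.0]
mrna_down = [8.0, 6.0, 4.0, 2.0]
label_up = classify_pair(pearson(lncrna, mrna_up))
label_down = classify_pair(pearson(lncrna, mrna_down))
```

A rank-based measure such as Spearman correlation may be preferable for skewed CAGE counts; the structure of the comparison would be the same.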

Identification of novel lncRNAs using RNA-seq/CAGE-scan

See the RNA-seq page for details: https://fantom5-collaboration.gsc.riken.jp/wiki/index.php/FANTOM5_RNA-seq

  1. (Max) collection of usable public RNA-seq data
  2. (Max) integration with FANTOM5 RNA-seq
  3. (Nicolas, Max) CAGE-scan integration
  4. (Laurens) annotation of set of novel lncRNAs

Motif enrichment in promoter regions of lncRNAs

  1. (Boris Jankovic) Comparisons of motif enrichment (difference in cis- vs. trans- ?)
  2. Location/orientation of binding motifs within promoters
  3. (Michiel) MARA analysis on lncRNAome to identify possible candidates important to the transcriptional network

lncRNA conservation in matching mouse primary cells

Analyzing the presence/absence of lncRNA peaks in mouse and human under the assumption that lncRNAs play a specific role in shaping the human/primate transcriptome. Many of these analyses could also be extended to aortic smooth muscle cells in rat, dog, and chicken.

Specific tasks:

  1. (?) conservation frequency of human-specific lncRNAs in matching mouse primary cells
    1. (?) relative conservation of cis- and trans- acting
  2. (?) analysis of sequence conservation; promoter regions vs. the length of the transcript.
  3. (Leonard talks to Nicolas) conservation of "chainome"
  4. (Yulia for global analysis, Leonard for annotation) frequency and conservation of Alu-initiated TSS in lncRNA in humans vs. mouse

Network validation

Probing lncRNA function through perturbation in identified networks.

Specific tasks:

  1. (Emily, Leonard) selection of candidate target networks, likely those influencing TF transcription
  2. (coordinated by Haru, WP6) knockdown of lncRNAs, measuring influence of lncRNAs, probing transcriptional network perturbations

Structure features/subclassification of lncRNAs

The intention here is to provide a comprehensive classification of lncRNAs based on structural features with the help of RNA-seq and short RNA data (e.g. splicing architecture, evidence of processed intermediates, translational potential, positioning relative to other genome markers, etc.). However, quite a bit of this was performed in a recent paper by Cabili, so we will have to see if there is scope for something new in FANTOM5.

  1. (Martin Frith & CBRC) structure of lncRNA with overlap with short RNA

eRNA analysis

eRNA (enhancer RNA) is a class of lncRNA of particular interest. Analysis of eRNA is being headed up by Robin Andersson (robin@binf.ku.dk).

Specific tasks:

  1. (Robin) identification/classification, percent lncRNAs that are eRNAs, along with rationale
  2. basic statistics (e.g. length distribution, etc.)
  3. cell specificity
  4. exploring interactions between eRNAs and their associated promoters
    1. expression correlation
    2. mutual information approach
    3. intersection with publicly available spatial genomic organization data
  5. (Miura-san, Robin, Nicolas) validation of eRNA interaction with promoter regions by intersection with existing Hi-C (?) data and/or more targeted validations
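
For the "mutual information approach" listed under task 4, the following is a minimal sketch of one way it could be computed: expression is discretized into on/off states and mutual information is taken over samples. The binary discretization, the cutoff of 2.0, and the expression values are all assumptions made for illustration.

```python
import math
from collections import Counter


def discretize(values, cutoff):
    """Binary on/off call per sample; the cutoff is an assumption."""
    return [1 if v > cutoff else 0 for v in values]


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


# Hypothetical expression of an eRNA and a nearby promoter over four samples.
erna = [0.0, 5.0, 0.0, 7.0]
promoter = [1.0, 9.0, 0.5, 12.0]
mi_bits = mutual_information(discretize(erna, 2.0), discretize(promoter, 2.0))
```

Unlike correlation, mutual information also captures non-linear dependence, but with few samples it is biased upward, so a permutation-based null would likely be needed before interpreting any value.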

lncRNA and human disease overlap

  1. (Kenny, Peter, Juha) overlaying GWAS data with lncRNA
    1. cis-/trans-enrichment, cell-specificity of affected lncRNAs, etc...
  2. (Leonard, Alka) Rett syndrome and cis-chain
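
Overlaying GWAS data with lncRNAs amounts to an interval lookup: which trait-associated SNP positions fall inside lncRNA loci. The sketch below assumes 0-based, half-open coordinates (as in BED files); the locus names, SNP identifiers, and coordinates are hypothetical.

```python
def snps_in_loci(snps, loci):
    """Map each locus name to the SNP ids that fall inside it.

    snps: iterable of (chrom, pos, snp_id), pos 0-based
    loci: iterable of (chrom, start, end, name), half-open [start, end)
    """
    hits = {name: [] for _, _, _, name in loci}
    for chrom, pos, snp_id in snps:
        for locus_chrom, start, end, name in loci:
            if chrom == locus_chrom and start <= pos < end:
                hits[name].append(snp_id)
    return hits


# Hypothetical lncRNA loci and GWAS SNP positions.
loci = [
    ("chr1", 100, 200, "lncRNA_A"),
    ("chr2", 50, 80, "lncRNA_B"),
]
snps = [
    ("chr1", 150, "snp_1"),  # inside lncRNA_A
    ("chr1", 200, "snp_2"),  # at the excluded end coordinate, no hit
    ("chr2", 60, "snp_3"),   # inside lncRNA_B
]
hits = snps_in_loci(snps, loci)
```

The nested loop is fine for small sets; at genome scale a sorted sweep or an interval tree would be the usual replacement. Testing for enrichment would additionally require a background set of SNPs or loci.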

miRNA promoters

Satellite paper based on Eivind and Kawaji-san's work

  1. (Eivind/Kawaji-san) definition of miRNA promoters based on DROSHA-KD, small RNA-seq and upstream hCAGE peaks


Timeline/order of analyses

Instead of wasting time assigning a bunch of meaningless dates to each task, I'll work out some of the dependencies which gives an idea of the prioritization. Then we'll follow up with groups assigned to the tasks as soon as they can be accomplished. The lists below are structured to imply dependency (indented tasks follow non-indented tasks...)

This is all dependent on finalization and normalization of the Kawaji-san promoterome clusters; however, everything listed below can begin using available data. After RNA-seq is used to confirm novel lncRNAs from FANTOM5, we may need to rerun a selected portion of the analyses on this set and possibly on the integrated set.

Network validation

Given the time this will require, we should get moving with what we have currently.

  • Current best targets from Emily and Leonard sent to the OSC (Max and Al).
    • Max and Al--> discussion with WP6.

Annotation/analysis of the non-redundant lncRNAome across FANTOM5 dataset

  • Leonard submits final tweaks to the lncRNAome
    • Lukasz/Nicolas (?) perform listed tasks
      • listed tasks are performed again on the set of predicted cis-acting and trans-acting lncRNAs, looking for differences

General functional classification of lncRNAs

  • Nicolas cis-acting classification
    • Leonard annotation of formal "Chainome"
      • Timo, Robin, Nicolas establishing locally-connected genes with lncRNA chains
      • Eivind, Finn, Tom, Nicolas co-expression analysis on chains and complete set
        • Eivind, Helena, Max overlaying small RNA information on chains and effects of small RNA on expression
  • Eivind, Finn, Tom, Nicolas co-expression analysis on trans-acting lncRNAs
  • Nicolas identifying complete space of physical interaction for trans-acting lncRNAs
    • Nicolas overlaying the above two
  • Yulia Alu-S role in trans-acting lncRNAs

Motif enrichment in promoter regions of lncRNAs

  • Boris motif enrichment
  • Michiel MARA

lncRNA conservation in matching mouse primary cells

  • ? defines determinants of lncRNA conservation
    • ? basic statistics on human/mouse conservation (possibly dependent on cis-/trans- classification)
    • ? conservation in promoter region vs. remaining sequence
  • Leonard defines chain conservation
    • Nicolas looks at genome-wide conservation of chains

Identification of novel lncRNAs using RNA-seq/CAGE-scan

see RNA-seq page.

  • Laurens will receive the complete set of novel lncRNAs for further annotation

Structure features/subclassification of lncRNAs

  • CBRC general structural features of identified classes of lncRNAs
  • take list of lncRNAs with overlapping short RNAs from above
    • CBRC identification of "precursor" structure from RNA-seq and short RNA
    • CBRC secondary structure predictions of lncRNAs in short RNA regions
      • CBRC integration of the two above

eRNA analysis

  • Robin percentage of lncRNAs that are eRNAs and rationale for choosing this
  • Robin basic statistics/cell specificity
  • Robin eRNA and affected promoter analysis
    • Robin/others computational validation with public datasets
  • Miura-san wet lab validation

lncRNA and human disease overlap

  • Kenny/others GWAS overlap with lncRNAome set
    • accompanying analysis
  • Leonard and Alka pursue Rett story