Enhancers


This page serves as a placeholder for data on transcribed enhancers, for sharing within FANTOM5. More information and data can be found at enhancer.binf.ku.dk.

Questions and requests can be sent to Robin Andersson [robin@binf.ku.dk], Claudia Gebhard [Claudia.Gebhard@klinik.uni-regensburg.de], Ilka Hoof [ilka@binf.ku.dk], Michael Rehli [michael.rehli@klinik.uni-regensburg.de] or Albin Sandelin [albin@binf.ku.dk].

Introduction

Active enhancers are weakly transcribed, and in many cases this transcription can be detected by CAGE. Briefly, enhancers are identified as pairs of CAGE tag clusters on different strands (where the minus strand cluster is upstream of the plus strand cluster) that are not more than two nucleosome-deficient regions apart. For details such as cutoffs, see the manuscript below. CAGE-defined enhancers have several attractive features for analysis:

  • The expression of CAGE tags correlates extremely well with enhancer activity across cells. This means that the tags can be used to define cell-specific regulators.
  • It is possible to predict target genes by correlating enhancer and TSS CAGE expression. Word of caution: this is a correlation-based prediction, so the correlation might be due to other effects such as indirect regulation.
  • Compared with just about every other method for finding enhancers, the CAGE tags define a quite narrow region (typically 180 nt) that corresponds very clearly to DNase cleavage sensitivity and the nucleosome boundaries, meaning that the tags are good proxies for the borders of the accessible region. Depending on your analysis, you might want to keep the sizes of these enhancers or extend the boundaries.
  • Many subsets of disease-associated SNPs are highly enriched in these regions, and the expression of those enhancers often makes sense for the disease.
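The pairing step described in the introduction can be sketched roughly as follows. This is only an illustration, not the pipeline used for the FANTOM5 sets: the `Cluster` type, the function name and the `max_gap` value of 400 bp are assumptions made for the example (the real cutoffs are given in the manuscript), and "upstream" is taken here to mean a lower genomic coordinate with no overlap between the two clusters.

```python
from typing import NamedTuple


class Cluster(NamedTuple):
    chrom: str
    start: int   # 0-based start, as in BED
    end: int     # exclusive end, as in BED
    strand: str  # "+" or "-"


def find_bidirectional_pairs(clusters, max_gap=400):
    """Return (minus, plus) cluster pairs that look divergently transcribed.

    A pair is kept when both clusters are on the same chromosome, the
    minus-strand cluster lies at lower coordinates than the plus-strand
    cluster, and the gap between them is at most max_gap bp.
    max_gap is a hypothetical placeholder, not the published cutoff.
    """
    minus = [c for c in clusters if c.strand == "-"]
    plus = [c for c in clusters if c.strand == "+"]
    pairs = []
    for m in minus:
        for p in plus:
            if p.chrom != m.chrom:
                continue
            gap = p.start - m.end  # negative if the plus cluster is not downstream
            if 0 <= gap <= max_gap:
                pairs.append((m, p))
    return pairs


# Invented toy clusters, for illustration only.
clusters = [
    Cluster("chr1", 100, 150, "-"),
    Cluster("chr1", 200, 260, "+"),    # divergent partner of the cluster above
    Cluster("chr1", 5000, 5050, "+"),  # too far away
    Cluster("chr2", 100, 150, "+"),    # different chromosome
]
for m, p in find_bidirectional_pairs(clusters):
    print(m.chrom, m.start, p.end)
```

The nested loop is quadratic and only meant for reading; on genome-scale cluster sets one would sort by position or use an interval tool instead.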

We have made enhancer sets for phase 1 and phase 2 using the same basic algorithm, but the statistics are slightly different since phase 2 has replicates (see below).

Phase1 enhancers (snapshot samples)

Manuscript available here: /webdav/home/albin/enhancerome/.

Enhancers

Enhancers are represented in BED12 file format. Start and end specify the outer regions of DPI tag clusters used to identify the bidirectional loci specifying the enhancers. First and second block denote the reverse strand and forward strand transcribed regions, respectively. Thick start specifies the mid position, between inner boundaries of blocks. Column 4 gives the enhancer id used for reference.
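As a reading aid, the BED12 layout described above could be unpacked like this. It is a minimal sketch that assumes the standard BED12 column order (block sizes in column 11, block starts relative to the start position in column 12); the function name and the example line are invented for illustration and are not taken from the actual files.

```python
def parse_enhancer_bed12(line):
    """Split one BED12 enhancer line into the pieces described on this page."""
    f = line.rstrip("\n").split("\t")
    chrom, start, end, enhancer_id = f[0], int(f[1]), int(f[2]), f[3]
    midpoint = int(f[6])  # thickStart: mid position between the two blocks
    # BED12 allows a trailing comma in the block lists, so strip it first.
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    offsets = [int(x) for x in f[11].rstrip(",").split(",")]
    blocks = [(start + o, start + o + s) for o, s in zip(offsets, sizes)]
    return {
        "id": enhancer_id,
        "chrom": chrom,
        "start": start,
        "end": end,
        "midpoint": midpoint,
        "reverse_block": blocks[0],  # first block: reverse strand transcribed region
        "forward_block": blocks[1],  # second block: forward strand transcribed region
    }


# Invented example line (not a real enhancer).
example = "chr1\t1000\t1400\tchr1:1000-1400\t0\t.\t1200\t1201\t0,0,0\t2\t150,150\t0,250"
enhancer = parse_enhancer_bed12(example)
print(enhancer["id"], enhancer["midpoint"], enhancer["reverse_block"], enhancer["forward_block"])
```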

ZENBU main views with phase1 enhancers

Enhancer usage across FANTOM5 ontology facets

tracks/

Libraries were grouped into mutually exclusive "facets" according to the FANTOM5 sample ontology mapping to UBERON and cell ontologies. Mappings can be found here:

Enhancer selection tool by custom expression constraints (sliders)

enhancer.binf.ku.dk also has most of the above BED files, as well as motif search results from the Rehli group for "facet-specific" enhancers. Password: 9EPefA4e

Phase2 enhancers (time course samples)

Data updated April 12, 2013
Old data (uploaded January 21 2013, email fantom5:01848) are still available from here.

Enhancers

Phase2 enhancers were identified in human and mouse from phase2 DPI tag clusters (webdav/home/kawaji/121001-phase2-DPI/, file tc.decompose_smoothing_merged.bed) using an approach similar to that for phase1 enhancers. In order not to redefine phase1 enhancers, detected bidirectionally transcribed loci were filtered so that they do not overlap phase1 enhancers. The union of the two enhancer sets (phase1 and phase2) was used for further analyses.

Enhancers are represented in BED12 file format. Start and end specify the outer regions of DPI tag clusters used to identify the bidirectional loci specifying the enhancers. First and second block denote the reverse strand and forward strand transcribed regions, respectively. Thick start specifies the mid position, between inner boundaries of blocks. Column 4 gives the enhancer id used for reference.

Enhancer expression data (RLE TPM and count data) across all phase 1 and 2 samples can be found here: Enhancers#Data

ZENBU main views with phase2 enhancers

To be done

FANTOM5 time course data

Data updated April 12, 2013
Old data (uploaded January 21 2013, email fantom5:01848) are still available from here.

Quantification of enhancer expression (raw counts, and TPM and RLE normalized values) has been performed for each sample included in the time course main paper freeze (email fantom5:01916). Pairwise differential expression between time points in each time course was calculated using edgeR. Only enhancers with at least 3 tags supporting their expression were considered for differential expression. For each time course a number of matrices are available:

  • Raw counts expression matrix (tab separated, column names: CNhs IDs, row names: enhancer IDs)
  • RLE and TPM normalized expression matrix (tab separated, column names: CNhs IDs, row names: enhancer IDs)
  • edgeR log2 fold change matrix (tab separated, column names: time point comparisons, row names: enhancer IDs)
  • edgeR FDR matrix (tab separated, column names: time point comparisons, row names: enhancer IDs)
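A rough sketch of how these matrices might be combined is given below. Several things in it are assumptions rather than facts about the files: the function names, the toy column names, the reading of "at least 3 tags" as 3 or more tags in at least one sample, the handling of `NA` cells, and the header layout (both a header with and without a leading cell for the row names is accepted). The default cutoff of 0.05 matches the FDR threshold used for the figures on this page.

```python
import csv
import io
import math


def _num(value):
    """Parse a matrix cell; treat empty or NA cells as NaN (assumed convention)."""
    return math.nan if value in ("", "NA") else float(value)


def read_matrix(handle):
    """Read a tab-separated matrix into (column_names, {row_name: values})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    columns, rows = None, {}
    for rec in reader:
        if not rec:
            continue
        if columns is None:
            # Drop the leading header cell if the header is as wide as the rows.
            columns = header[1:] if len(header) == len(rec) else header
        rows[rec[0]] = [_num(v) for v in rec[1:]]
    return (columns if columns is not None else header), rows


def expressed_enhancers(count_rows, min_tags=3):
    """Enhancer IDs with at least min_tags raw tags in some sample."""
    return {eid for eid, vals in count_rows.items() if max(vals) >= min_tags}


def significant_enhancers(fdr_columns, fdr_rows, comparison, keep, alpha=0.05):
    """Enhancer IDs in keep with FDR <= alpha for one time point comparison."""
    j = fdr_columns.index(comparison)
    return sorted(e for e, vals in fdr_rows.items() if e in keep and vals[j] <= alpha)


# Invented toy matrices; real files use CNhs IDs and time point comparisons.
counts_txt = "\tCNhsA\tCNhsB\ne1\t0\t5\ne2\t1\t2\ne3\t10\t4\n"
fdr_txt = "\tt0_vs_t1\ne1\t0.01\ne2\t0.001\ne3\tNA\n"
_, counts = read_matrix(io.StringIO(counts_txt))
fdr_cols, fdr = read_matrix(io.StringIO(fdr_txt))
print(significant_enhancers(fdr_cols, fdr, "t0_vs_t1", expressed_enhancers(counts)))
```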

For each time course, data are available as gzipped tarballs from here.

Figures depicting edgeR BCV, number of expressed enhancers (tag counts >= 3) and the number of differentially expressed enhancers (FDR <= 0.05) row vs column, along with data can be accessed below:

Human

Mouse

Enhancer-promoter associations

Enhancer-promoter associations were predicted based on expression correlation (Pearson) between all pairs of enhancers and promoters within a distance of 500kb. Associations below have an FDR (Benjamini-Hochberg) < 1e-5 (permissive; further thresholding on distance and Pearson's r can be done). A negative distance indicates that the enhancer is upstream of the promoter. Only robust DPI tag clusters near a 5' end of an annotated transcript were considered.
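The building blocks of such a prediction can be sketched as below. This is an illustration under stated assumptions, not the code used to produce the associations: the function names are invented, the signed distance is assumed to be measured relative to the promoter's strand, and the step that turns each Pearson's r into a p-value is not described on this page, so the Benjamini-Hochberg function simply takes p-values as input.

```python
import math


def signed_distance(enhancer_mid, tss, promoter_strand):
    """Enhancer-to-promoter distance; negative when the enhancer is upstream.

    Assumes "upstream" is relative to the promoter's strand.
    """
    d = enhancer_mid - tss
    return d if promoter_strand == "+" else -d


def within_range(enhancer_mid, tss, max_dist=500_000):
    """True if the pair is within the 500kb window used on this page."""
    return abs(enhancer_mid - tss) <= max_dist


def pearson(x, y):
    """Pearson's r between two equally long expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant vector: correlation undefined, reported as 0 here
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented toy values, for illustration only.
print(signed_distance(900, 1000, "+"), within_range(900, 1000))
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.5]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Pairs would then be kept when the adjusted value falls below the chosen FDR cutoff, optionally with further thresholds on distance and r as noted above.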

Early response enhancers

Work in progress

Enhancer selection tool by custom expression constraints (sliders)

To be done

Data

Tab-separated expression matrices for all UPDATE_022 human and mouse samples with non-zero RLE factors, based on phase 1 and 2 enhancers (Enhancers#Enhancers_2), are available below. The first line is a header.