Satellite submission: Difference between revisions
== Title: |
== Title: The specific transcriptome dynamics of mesenchymal stromal/stem cells from high-grade serous ovarian cancer relate their social context and activity == |
'''ManuscriptID: '''Phase1_036 <br> '''Status: '''Working draft<br> '''Abstract: '''The role of cancer microenvironment is being recognized as one of the critical hallmarks in both cancer progression and metastasis. Mesenchymal Stem/Stromal Cells (MSCs) are the precursors of various cell types that compose both normal and cancer tissue microenvironments. We have isolated MSCs from various High-Grade Serous Ovarian Carcinomas (HG-SOCs), demonstrated their normal genotype, and analyzed their transcriptome using deep-CAGE analysis with respect to similarly derived normal tissues MSCs, all embedded in the large comprehensive FANTOM5 sample dataset.
'''ManuscriptID: '''Phase1_036 <br> '''Status: '''Working draft<br> '''Abstract: '''From the most recent and accumulating evidence, the role of cancer microenvironment is being recognized as one of the most critical hallmarks in both cancer progression and metastasis. Mesenchymal Stem/Stromal Cells (MSCs) are the precursors of various cell types that compose both normal and cancer tissue microenvironments. We have isolated MSCs from various High-Grade Serous Ovarian Carcinomas (HG-SOCs), demonstrated their normal genotype, and analyzed their transcriptome using deep-CAGE analysis with respect to similarly derived normal tissues MSCs and to the comprehensive FANTOM5 sample dataset. The integrative analysis conducted against the extensive panel of primary cells and tissues of the FANTOM5 project allowed us to identify a cell-type specific transcriptional activity associated with the HG-SOC-MSCs. The hierarchical clustering analysis shows that MSCs derived from HG-SOCs co-cluster with other MSCs while retaining distinct transcriptional peculiarities. Their transcriptional activity shows a very strong correlation with that of primary mesothelial cells, which actually represent the embryonic cellular origin of serous ovarian cancer. Most importantly, this analysis has revealed HG-SOC-MSCs specific identity when compared to similarly derived MSCs from normal tissues such as bone marrow, heart and adipose tissues, enforcing the idea that the environment organized by the transformed serous ovarian cancer cells could be responsible for establishing such transcriptional specificity in the resident/mobilized stromal precursor cells. Integrating the identified transcriptional signatures of the HG-SOC-MSCs with the gene expression matrices of the publicly available TCGA HG-SOC dataset, we were able to trace HG-SOC-MSC signature in a fraction of the tumor samples. 
Altogether, the reported analysis supports the hypothesis that HG-SOC-MSCs are bona-fide representatives of the ovarian district, either tracing their specific mesothelial origin or highlighting their epigenetic conditioning by the HG-SOC environment.<br> '''Authors: '''Roberto Verardo, Silvano Piazza, Enio Klaric, Yari Ciani, Antonio Beltrami, Daniela Cesselli, Stefania Marzinotto, RIKEN_OSC_members, Carlo Alberto Beltrami, Claudio Schneider <br> '''Authors contribution statement: '''RV conceived the project, performed some of the analysis and most manuscript writing; SP conceived the project developed, carried out statistical tests and results interpretation and wrote parts of the manuscript; YC implemented part of the software and prepared some figures; EK performed molecular biology assays; SM AB and CAB; CS supervised the study <br> '''Datasets used: '''Helicos CAGE on all of F5freeze1 <br> '''Target journal(s): '''<br> '''Internal submission date: '''October 15th 2012 <br> '''Contact by email: '''[mailto:schneide@lncib.it Claudio Schneider] <br>'''Word document version of manuscript for editors: '''[[Image:Xxx claudio.doc]] <br>'''PDF version for general viewing (including all figs in one PDF): '''[[Image:Claudio.pdf]]
The integrative analysis conducted against the extensive panel of primary cells and tissues of the FANTOM5 project allowed us to identify a cell-type specific transcriptional activity associated with the HG-SOC-MSCs. The hierarchical clustering analysis shows that MSCs derived from HG-SOCs co-cluster with other MSCs while retaining distinct transcriptional peculiarities. Most importantly, this analysis has revealed an HG-SOC-MSC-specific identity when compared to similarly derived MSCs from normal tissues such as bone marrow, heart and adipose tissues. Overall, their transcriptional activity shows a very strong correlation with that of primary mesothelial cells, which actually represent the embryonic cellular origin of serous ovarian cancer.
Moreover, a validated mesothelial gene signature (MGS), composed of genes over-expressed in HG-SOC-MSCs with respect to N-MSCs, shows significant association with cancer outcome when investigated in multiple ovarian cancer microarray datasets.
Altogether, the reported analysis supports the hypothesis that HG-SOC-MSCs are bona-fide representatives of the ovarian district, either tracing their specific origin from local mesothelium or highlighting the epigenetic conditioning of externally recruited MSCs by the HG-SOC cancer cell compartment.
<br> '''Authors: '''Roberto Verardo, Silvano Piazza, Enio Klaric, Yari Ciani, Antonio Beltrami, Daniela Cesselli, Stefania Marzinotto, RIKEN_OSC_members, Carlo Alberto Beltrami, Claudio Schneider <br> '''Authors contribution statement: '''RV conceived the project, performed some of the analysis and most manuscript writing; SP conceived the project developed, carried out statistical tests and results interpretation and wrote parts of the manuscript; YC implemented part of the software and prepared some figures; EK performed molecular biology assays; SM AB and CAB; CS supervised the study <br> '''Datasets used: '''Helicos CAGE on all of F5freeze1 <br> '''Target journal(s): '''<br> '''Internal submission date: '''October 15th 2012 <br> '''Contact by email: '''[mailto:schneide@lncib.it Claudio Schneider] <br>'''Word document version of manuscript for editors: '''[[Image:Xxx claudio.doc]] <br>'''PDF version for general viewing (including all figs in one PDF): '''[[Image:Claudio.pdf]]
Revision as of 04:59, 15 January 2013
Satellite manuscript internal review page
Welcome to the FANTOM5 Satellite review page. As discussed at the Ume and Koyo meetings, all papers will be visible to consortium members. This is to allow everyone to know what is going on, promote collaboration, carry out due process regarding co-authorship and to avoid competition.
Authorship
The author list will basically be selected by the first author and the corresponding author of each satellite paper on the basis of the scientific contribution to the manuscript. Remember to include an authors contribution statement for all authors named in your manuscript (of the form AB carried out the cell isolation, SB carried out the network predictions etc.).
In addition, FANTOM5 headquarters will name RIKEN OSC members who should be co-authors for their input on each manuscript and to the entire FANTOM5 project. Those of you who have participated in previous FANTOMs will be familiar with this process; those new to FANTOM, please look at the author lists on the satellite paper collections for FANTOM2-4. FANTOM5 headquarters is currently discussing the policy for RIKEN OSC co-authorship on the FANTOM5 satellites, but basically satellite papers will be considered on a case-by-case basis, taking into account datasets used, intellectual input and facilitating technologies/analyses for each paper.
At this stage, please name any authors from the OSC that you think should definitely be included as co-authors. In addition, for all satellite submissions, include the term RIKEN_OSC_members as an additional author.
Instructions
Please make a copy of the template below and enter your manuscript details.
If you are not able to edit the wiki yourself, please email the secretariat with the subject line "FANTOM5_satellite", but please understand that such requests will be processed when we can rather than immediately. You must fill in all of the details below and provide both a PDF that contains all figures and a Word doc of the main text for reviewers to mark up directly.
Manuscripts
Title: Analysis of DNA methylation and transcription during granulopoiesis reveals timed methylation changes in low CpG areas and regulation of transcription factor expression and motif activity
ManuscriptID: Phase1_001
Status: Good draft
Abstract: In development epigenetic mechanisms such as DNA methylation have been suggested to provide cellular memory to maintain pluripotency but also stabilize cell fate decisions and direct lineage restriction. In this study we set out to characterize changes in DNA methylation levels and gene expression during granulopoiesis using four distinct cell populations ranging from the oligopotent common myeloid progenitor stage to terminally differentiated neutrophils. We found a general decrease of DNA methylation during granulopoiesis. Methylation levels appear to change at specific differentiation stages and correlate with changes in transcription and motif activity of key hematopoietic transcription factors. Differentially methylated sites (DMSs) are preferentially located in areas distal to CpG islands and shores and are overrepresented in potentially regulatory enhancer elements. Overall this study depicts in detail the epigenetic and transcriptional changes that occur during granulopoiesis and supports the role of DNA methylation as a regulatory mechanism in cell differentiation.
Authors: Michelle Rönnerblad, Tor Olofsson, Sören Lehmann, RIKEN_OSC_members, Karl Ekwall*, Erik Arnér* & Andreas Lennartsson*
Authors contribution statement: MR did most of the practical experiments, the bioinformatic analysis (except CAGE-related) and most manuscript writing; TO isolated the cells from bone marrows; SL gave valuable input to the planning and analysis and critically reviewed the manuscript; KE planned and supervised the study and contributed to the manuscript writing; EA supervised the bioinformatic analysis, performed the CAGE-related analyses and contributed to the manuscript writing; AL initiated, planned and supervised the study, contributed to the manuscript writing and did some experiments.
Datasets used: Helicos CAGE on granulo precursor populations
Target journal(s): Blood
Internal submission date: April 7th 2012
Contact by email: andreas lennartsson, Karl Ekwall, Erik Arner
Word document version of manuscript for editors: File:Rönnerblad.doc
PDF version for general viewing (including all figs in one PDF): File:Rönnerblad Aprl07.pdf
Title: Cell-type specificity and co-expression of regulatory polymorphisms associated with human disease
ManuscriptID: Phase1_002
Status: Good draft
Abstract: Our ability to use genetic associations with disease to develop better treatments has been limited by the difficulty of identifying a biological process, or cell type, on which to focus investigation. Most disease-associated polymorphisms do not lie within protein-coding genes, raising the possibility that variation in regulatory sequence plays a critical role in disease phenotypes. We have used genome-scale 5’RACE (CAGE) to identify the location and usage of transcription start sites in 864 human tissues, primary cells and cell lines, and show here that there is a strong enrichment for disease-associated variants within the sequence immediately adjacent to transcription start sites. Using the expression profiles of known variants associated with disease susceptibility, we identify experimentally-available cell types significantly associated with specific diseases and traits. The expression of genes known to be associated with particular diseases was positively correlated. Such co-expression was used to identify unreported candidate disease-associated regulatory regions within published genome-wide association studies (GWAS). The approach was validated by identifying candidate loci in a 2007 GWAS study that were subsequently confirmed in larger independent datasets. These functional genomics approaches directly inform choices of model system and identify disease- and cell type-specific co-regulated networks for a wide range of common diseases.
Authors: Baillie JK*, Haley CS, Schaefer U, Faulkner GJ, Freeman T, Brown JB, [others...], [Numerous RIKEN authors, order etc. TBC, at least including: Kawaji H, Forrest A, Carninci P]*, Hume DA*
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on Primary Cells
Target journal(s): Nature Genetics
Internal submission date: ...
Contact by email: Kenneth Baillie, David Hume
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: What classes of mammalian promoter are there?
ManuscriptID: Phase1_003
Status: Working draft
Abstract: This study uses the comprehensive FANTOM5 promoter data, and careful methodology, to identify classes of mammalian promoter. In agreement with previous results, we find that promoters fall into two classes with narrow or wide spread of transcription start sites. In stark contrast to previous studies, we find little association between width and either CpG rate or TATA signals. Width correlates with expression level, suggesting that strength of promoter signal is on average proportional to promoter length. The data are consistent with a simple null hypothesis for CpG islands: that they are a passive consequence of expression (and thus cytosine demethylation and reduced CpG mutation) in germ-line cells. Finally, we show that measures of tissue specificity are prone to statistical artifacts, and specificity is not correlated with promoter narrowness, in contrast to previous claims. These results clarify some fundamental properties of mammalian promoters.
Todo: Use Charles's good way of measuring tissue specificity.
Authors: Frith, maybe Drabløs et al., open to others
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: All human Phase1 CTSSs (plan to add mouse)
Target journal(s):
Internal submission date: August 2012?
Contact by email: Martin Frith
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Epigenetic factors regulating Hematopoiesis
ManuscriptID: Phase1_004
Status: Good draft
Abstract: The hematopoietic differentiation pathway is a complex regulatory program for generating different lineages of blood cell types from multipotent, hematopoietic stem cells. The transcriptional program dictating hematopoietic cell fate and differentiation requires an epigenetic memory function consisting of a network of enzymes controlling DNA methylation, histone posttranslational modifications and chromatin structure. Defective interactions between epigenetic enzymes and transcription factors cause perturbations in blood cell differentiation, which often lead to various types of hematopoietic disorders such as leukemia. To elucidate the contribution of different epigenetic factors in human hematopoiesis, high-throughput Cap Analysis of Gene Expression (CAGE) sequencing was used to build comprehensive transcription profiles of 199 epigenetic factors in a wide range of blood cells. These epigenetic factors include proteins that covalently modify DNA/histones or alter chromatin structure dynamics. Our analysis revealed several epigenetic factors to have expression profiles specific for cell type, lineage type and/or leukemic cell lines. In this report the ‘epigenetic transcriptome’ has been systematically studied to predict the potential functions of these factors in the epigenetic regulatory network of human hematopoiesis. The potential of such a comprehensive study is not only to identify putative epigenetic regulators of normal hematopoiesis and postulate their function but also to serve as a resource for the scientific community for further characterization and validation of differentially expressed transcripts.
Authors: Punit Prasad, Michelle Rönnerblad,...FANTOM5, Erik Arner, Karl Ekwall and Andreas Lennartsson
Authors contribution statement: PP and MR have done analysis and written the manuscript. EA has performed the initial CAGE analysis for the epigenetic factors and assisted in writing the manuscript. AL and KE have assisted in writing the manuscript, planned and coordinated the study. The authors declare no conflict of interest.
Datasets used: Helicos CAGE on ...
Target journal(s): Blood or other
Internal submission date: December 06, 2012
Contact by email: Andreas Lennartsson, Erik Arner
Word document version of manuscript for editors: File:Prasad et al amnuscriot Blood.docx, File:Prasad et al Blood Figs.pdf, File:Prasad et al Table S1 .xlsx, File:Prasad et al Table S2.xlsx, File:Prasad Table S3.docx
PDF version for general viewing (including all figs in one PDF): File:Phase1 004.pdf
Title: Ab Initio Prediction of Tissue-Specific Regulatory Modules in the FANTOM5 Project
ManuscriptID: Phase1_005
Status: Final draft
Abstract: One of the major goals of the FANTOM5 project, the broadest TSS-based promoter-level expression atlas of transcriptional regulatory networks, is the identification of coding and non-coding, annotated and novel transcriptional units being transcribed in a cell-specific mode across the different biological states/samples. In this work we analyzed the FANTOM5 dataset using ScanAll, a newly developed software here described, to ab initio predict the presence of conserved elements in the genomic regions surrounding FANTOM5 promoters. Firstly we aimed at identifying motifs that were conserved in a subset of the selected genomic regions and that possibly corresponded to Transcription Factor Binding Sites (TFBS); we then expanded our analysis to pinpoint the existence of more complex, structured regulatory modules, that is groups of conserved motifs co-occurring in the aforementioned (co-expressed) regions within a fixed distance. We confirmed the sample-specificity of our output by showing that the majority of the obtained combinations of modules were able to divide the specimens into sample-specific groups, thus possibly explaining the peculiarities of regulatory events occurring in each tissue. Among these sites it was possible to confirm the presence of TFBS for known regulators already associated to those samples together with an additional and significant portion of motifs remaining unannotated, thus representing putative novel binding elements. In addition we were able to associate the presence of a significant portion of the identified motifs to distinct families of repeated elements, thus confirming a structural/functional feature of mammalian promoters that is currently emerging as one of the most peculiar regulatory aspects associated to mammalian phylogeny. 
Finally, we were able to identify previously uncharacterized aspects of the regulatory networks occurring in early-development samples thus confirming the significant advantage deriving from our modular approach.
Authors: Emiliano Dalla, Yari Ciani, Marco Zantoni, RIKEN_OSC_members, Alberto Policriti, Claudio Schneider, Silvano Piazza
Authors contribution statement: ED conceived the project, developed part of the software, oversaw implementation, performed some of the analysis and most manuscript writing; YC implemented part of the software and prepared some figures; MZ developed and implemented part of the software; AP developed part of the software and contributed to the manuscript writing; CS supervised the study; SP developed and implemented part of the software, carried out statistical tests and results interpretation and wrote parts of the manuscript.
Datasets used: Helicos CAGE on all of F5freeze1
Target journal(s):
Internal submission date: June 1st 2012; Update: December 21st 2012
Contact by email: Emiliano Dalla
Word document version of manuscript for editors: File:FANTOM5 PromoteromeSatelliteLNCIB.doc
PDF version for general viewing (including all figs in one PDF): File:FANTOM5 PromoteromeSatelliteLNCIB wFigures.pdf
Title: Homotypic clusters of transcription factor binding sites in the vicinity of transcription start sites
ManuscriptID: Phase1_006
Status: Finished draft
Abstract:
Background
Transcription factors (TFs) specifically recognizing DNA binding sites (TFBS) play a key role in the regulation of gene expression. Groups of closely localized TFBSs for a particular TF, so-called homotypic TFBS clusters (HCBSs), were originally detected in yeast and extensively studied in fruit fly early development. Recently HCBSs were found to be highly important for several human regulatory systems.
Motivation
It is general practice to estimate the enrichment of binding sites in regulatory sequences. Still, there are no systematic data on whether the presence of HCBSs is common in promoter regions of human genes. The general properties of HCBSs also remain unclear, as does the possible relation between HCBSs and the regulation of tissue-specific expression.
Results
Using data on sample-specific transcription start sites (TSSs) detected in FANTOM5 and high quality binding models for more than 400 TFs from the HOCOMOCO TFBS model collection we have predicted TFBSs and corresponding HCBSs in promoter regions surrounding TSSs. TFBS models for most TFs were shown to form statistically significant HCBSs, often formed by separate distant binding sites. For HCBSs of most TFs we were able to identify samples having significant association between promoters of sample-specific or housekeeping TSSs. Thus for most TFs we predict putative preferences for sample-specific or housekeeping HCBS activity and provide a genome-wide map of HCBSs near FANTOM5-defined TSSs.
Supplementary information
https://fantom5-collaboration.gsc.riken.jp/webdav/home/vigg/homotypicus/
Authors: I.V. Kulakovskiy, Y.A. Medvedeva, M.S. Polishchuk, A.V. Favorov, S. Schmeier, T. Lassman, I.E. Vorontsov, RIKEN_OSC_members, V.J. Makeev
Authors contribution statement: IVK implemented the software and drafted the manuscript. YAM carried out statistical tests and results interpretation. MSP developed the homotypic cluster detection algorithm. AVF selected proper statistical tests. SS provided the housekeeping set of TSS-clusters. TL provided the set of sample-specific TSS-clusters. IEV estimated proper thresholds for PWMs used in the study. VJM coordinated the study. All the authors participated in writing and finalizing the manuscript.
Datasets used: Helicos CAGE - FANTOM5 FREEZE1, "robust" subset
Target journal(s): Nucleic Acids Research, Bioinformatics
Internal submission date: 18 June 2012 / Updated: 12 September 2012 / Minor fixes: 1 December 2012
Contact by email: Vsevolod Makeev, Ivan Kulakovskiy
Word document version of manuscript for editors: File:HOMOTYPICUS-FANTOMsatellitepaper.r1.doc
PDF version for general viewing (including all figs in one PDF): File:HOMOTYPICUS-FANTOMsatellitepaper.r1.pdf
Title: A high resolution spatial-temporal promoterome of the human brain (was Brain CAGE)
ManuscriptID: Phase1_007
Status: Good draft
Abstract:
The human brain is an extremely complex organ that governs our abilities for cognition, reasoning and emotions and is the control center for the body. Its morphology and functionality during development have been well studied, but the molecular mechanisms contributing to its function and maintenance later in life remain poorly understood. Complexity at the transcriptional level is likely to play a major role in defining its morphological and functional characteristics. To investigate this we used single molecule CAGE and created a high resolution atlas of transcription start sites for 15 anatomical regions of the human central nervous system, using post-mortem samples derived from infant and aged adult donors. On the transcriptional level brain is clearly distinguishable from other tissues even if we consider only non-coding genes or expression from genomic regions often described as genomic dark matter. Using these differences we identify a specific set of transcription start sites that characterizes the brain. We show extensive differences in transcription between infant and adult that in some cases can be linked to loci associated with major neurodegenerative diseases. The differential expression across distinct regions correlates well with developmentally and/or functionally related anatomical districts and is reflected by distinct networks of interacting transcription factors, a range of lncRNAs and novel transcripts co-expressed in a regionally biased manner. Overall we provide the scientific community with a powerful expression resource based on post-mortem tissue, particularly highlighting the contribution of non-coding RNAs to the transcriptional complexity of the human central nervous system.
Authors: Margherita Francescatto, Morana Vitezic, Patrizia Rizzu, Javier Simon-Sanchez, Robin Andersson, FANTOM5_RIKEN_OSC_members, Carsten O Daub, Albin Sandelin, Michiel JL de Hoon, Piero Carninci, Alistair RR Forrest, Peter Heutink
Authors contribution statement: MF and MV did the analyses; MF, MV and PH wrote the manuscript, PR selected all samples, evaluated medical and pathological records and isolated RNA, JSS curated the list of disease loci, RA and AS provided the list of enhancers, ARRF, PC and PH designed the study ...
Datasets used: Helicos CAGE on VUMC provided brain samples (adult and newborn); full list of samples presented in Supplementary Table 1
Target journal(s): Genome Research
Internal submission date:
Contact by email: Peter Heutink, Margherita Francescatto, Morana Vitezic
Word document version of manuscript for editors: File:BrainCAGE manuscript presubmission enquiery.doc File:BrainCAGE figures presubmission enquiery.pdf
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Pathogen specific monocyte transcriptional responses
ManuscriptID: Phase1_008
Status: Working draft
Abstract:
Authors: Wells
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on ...
Target journal(s):
Internal submission date:
Contact by email: Christine Wells, Anthony Beckhouse
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Transcriptome profiling of human skin mast cells by deep CAGE identifies unexpected gene activity patterns through direct comparison with multiple cell and tissue subsets
ManuscriptID: Phase1_009
Status: Working draft
Abstract: Despite their haematopoietic origin, mast cells (MCs) mature exclusively in peripheral tissues, hampering research into their developmental and functional programs. Here, we employed deep-CAGE on skin-derived MCs to generate the most comprehensive view of the human MC transcriptome ever reported. A particular advantage is that MCs were embedded in the FANTOM5 project, giving the opportunity to contrast their molecular signature against an extensive panel of human samples. We demonstrate that MCs possess a unique and surprising transcriptional landscape, combining expression of typical haematopoietic genes with those exclusively active in MCs, and genes not previously reported as expressed in MCs. Specifically we found that MCs express functional BMP receptors, which transduce pro-survival and activatory signals. Conversely, several genes frequently studied in MCs were either not or only weakly expressed in direct comparison with other myelocytes. By the parallel use of MCs ex vivo and following culture, we also found that MCs change their transcriptome in in vitro surroundings. Befitting their uniqueness, MCs had no close relative in the haematopoietic network. This rich dataset reveals that our knowledge of human MCs is still fairly limited. It can be anticipated that with this resource novel functional programs of MCs will soon be discovered.
Authors: Efthymios Motakis,1,* Sven Guhl,2,* Yuri Ishizu,1 RIKEN OSC members,1 Torsten Zuberbier,2 Alistair R R Forrest,1¶ Magda Babina2¶
Authors contribution statement: E.M. carried out bioinformatics analyses; S.G. isolated the mast cells and performed most experiments; M.B. performed several experiments, was involved in planning, supervision, and data analysis, and wrote the first draft of the manuscript; E.M., S.G., A.R.R.F. and T.Z. helped with planning, data analysis and manuscript writing.
Datasets used: Helicos CAGE on mast cell samples in comparison to freeze 1 data
Target journal(s): Blood, eBlood
Internal submission date:
Contact by email: Magda Babina, Sven Guhl
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:MC satellite Jan 6 merged.pdf
Title: Effect of cytosine methylation on transcription factor binding sites and regulation of transcription
ManuscriptID: Phase1_010
Status: Good draft
Abstract: Motivation: DNA methylation of gene promoters is strongly linked to gene repression. However, the mechanism of interaction between DNA methylation and gene repression is not fully understood. We cannot strictly state that DNA methylation of gene promoters is a cause of gene repression or, vice versa, that gene repression induced either by chromatin modification or by binding of Polycomb proteins leads to subsequent DNA methylation. A potential mechanism for transcriptional regulation by DNA methylation can be driven by methylation-induced changes in either the accessibility of transcription factor (TF) binding sites (TFBSs) or the affinity of TFs for their TFBSs. This idea is supported by non-systematic evidence. Until now, this hypothesis has not been tested systematically for a wide spectrum of TFs with known TFBS models and across a large number of cell types.
Methods: To estimate DNA methylation in 50 different cell types we used data obtained by reduced representation bisulfite sequencing (RRBS) provided by the ENCODE project. To evaluate genome-wide expression in the corresponding cell types we utilized FANTOM5 data obtained by cap-analysis of gene expression (CAGE). To predict TFBSs we used remote dependency model (RDM), a generalization of a position weight matrix (PWM), which takes into consideration the correlation of remote nucleotides within a binding site and has been shown to effectively decrease false positive rate compared to the widely used PWM approach.
Results and conclusions: In this work we surprisingly show that only 5% of CpG dinucleotides correspond to “traffic lights” genome positions, i.e. they manifest moderate to high negative correlation between their methylation profile and the expression profile of a neighboring TSS across cell samples. A significant share of TFBSs tends to avoid CpG “traffic lights”. This tendency is less pronounced if a binding site is surrounded by a homotypic cluster of TFBSs, suggesting that a loss of function for one TFBS due to methylation can be compensated by closely located weaker TFBSs for the same TF. In a way, this puts into a different perspective the current common perception of the link between methylation and gene expression.
Authors: Medvedeva YA, Khamis A, Ba-Alawi W, Bhuyan MdSI, [potential F5 collaborators], Kulakovskiy IV, Bajic VB
Authors contribution statement: YAM designed the computational experiments, selected and preprocessed the data, produced statistical analysis and wrote the manuscript; AK performed most of the data analysis; WBA and MdSIB contributed RDM models and tools for threshold estimation and mapping; [potential F5 collaborators], IVK performed part of the analysis, contributed to the design of the experiments and writing of the manuscript; VBB contributed to the design of the experiments and writing of the manuscript.
Datasets used: Helicos CAGE on 50 sample types, ENCODE RRBS data for the same samples
Target journal(s):
Internal submission date: December, 16
Contact by email: Yulia Medvedeva
Word document version of manuscript for editors: File:Effect of cytosine methylation on transcription factor binding sites and regulation of transcription.doc
PDF version for general viewing (including all figs in one PDF): File:Effect of cytosine methylation on transcription factor binding sites and regulation of transcription.pdf
Title: Transcription and enhancer profiling in human monocyte subsets
ManuscriptID: Phase1_011
Status: Good draft
Abstract: Human blood monocytes comprise at least three subpopulations that differ in phenotype and function. Here we present the first in-depth regulome analysis of classical (CD14++CD16-), intermediate (CD14+CD16+), and nonclassical (CD14dimCD16+) monocytes. Cap Analysis of Gene Expression (CAGE) adapted to Helicos single molecule sequencing was used to map transcription start sites throughout the genome in all three subsets. In addition, global maps of H3K4me1 and H3K27ac deposition were generated for classical and nonclassical monocytes defining enhanceosomes of the two major subsets. We identify differential regulatory elements (including promoters and putative enhancers) that were associated with subset-specific motif signatures corresponding to different transcription factor activities and exemplarily validate a novel downstream enhancer of the CD14 locus. In addition to known subset specific features, pathway analysis revealed marked differences in metabolic gene signatures. While classical monocytes expressed higher levels of genes involved in carbohydrate metabolism priming them for anaerobic energy production, nonclassical monocytes expressed higher levels of oxidative pathway components and showed a higher routine mitochondrial activity. Our findings describe promoter/enhancer landscapes and provide novel insights into the specific biology of human monocyte subsets.
Authors: Christian Schmidl, Kathrin Renner, Ruediger Eder, Katrin Peter, Petra Hoffmann, Reinhard Andreesen, Marina P. Kreutz, RIKEN_OSC_members, Matthias Edinger, Michael Rehli
Authors contribution statement: CS performed experiments and computational analyses and wrote parts of the manuscript; KR performed experiments and contributed to manuscript writing; RE isolated the cells; KP performed experiments; PH, RA, MK, and ME contributed to planning and supervision; RIKEN_OSC_members organized or performed Helicos sequencing and provided aligned data; MR initiated, planned and supervised the study, performed computational analyses, and wrote the manuscript.
Datasets used: Helicos CAGE on monocyte subsets (Regensburg samples)
Target journal(s): Blood, eBlood, other
Internal submission date: September 1, 2012
Contact by email: Michael Rehli, Christian Schmidl
Word document version of manuscript for editors: File:Schmidl MonoSub.docx
PDF version for general viewing (including all figs in one PDF): File:Schmidl MonoSub.pdf
Title: ...
ManuscriptID: Phase1_012
Status: Gone
Abstract:
Authors: ...
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on ...
Target journal(s):
Internal submission date:
Contact by email: Mr Blobby
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: The Evolution of Human Cells in terms of Protein Innovation
ManuscriptID: Phase1_013
Status: Working draft
Abstract: Humans are complex organisms composed of a great many cell types. Since the genomic DNA of each cell is identical, cell type is determined by what is expressed. We examine the evolutionary history of each human cell type at the molecular level via the collective histories of proteins, the principal product of gene expression. Sequence data from the FANTOM5 consortium are used to provide cell-type specific digital expression of protein-coding genes, and the SUPERFAMILY and dcGO resources provide domain and function annotation respectively. Cross-referencing with the domain annotation of all other completely-sequenced genomes provides the evolutionary context for each protein. We combine all of this to generate a description of cellular evolution at the molecular level.
We present a protein domain view of the evolution of cell type. To achieve this we first identify the most recent common ancestor (MRCA) or ‘creation epoch’ of every protein in the repertoire of the human genome. We are then able to use the protein creation epochs to describe the history of the emergence of each cell type over evolution in terms of the collective histories of the proteins expressed in that cell type. Each cell type has an evolutionary profile consisting of a timeline along the lineage from the ancient cellular ancestor to modern day human. The profile of each cell type shows at which epochs along the timeline the innovations in protein evolution took place that were required to allow the observed expression in that type of cell. By clustering cell types on these profiles, we find groups of cell types that share a parallel protein evolutionary history and thus potentially possess a common progenitor cell type or are evolving in cooperation. A functional enrichment analysis of these clusters reveals key proteins responsible for evolutionary shifts and functional innovations; it also suggests a possible order in which different cells could have emerged during evolution, which we discuss in relation to the human immune system. The structural domain-centric perspective which we employ in this work can also be used as the basis for a comparison of the molecular basis of functional and phenotypic differences between cell types within these evolutionary clusters, exemplified by an inspection of our results on different regions of the brain.
We present a view of the landscape of nature’s innovation of protein structure and architecture required to explain the creation of the different human cell types. This landscape has some important features, such as the possibility that the last universal ancestor of life provided most of the innovation for the innate immune system, whilst brain cells have been making use of novel proteins that first appeared in Opisthokonta (animals and fungi) and continued to do so right up until Homo sapiens. The landscape also lends itself to identifying candidate genes for disease by highlighting those that were important in enabling certain phenotypic shifts at key points in evolution.
Authors: Julian Gough, Owen Rackham, Adam Sardar, Matt Oates + Sample Providers + RIKEN OSC
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on all samples
Target journal(s): Bioinformatics?
Internal submission date:
Contact by email: Julian Gough, Owen Rackham
Word document version of manuscript for editors: Rough draft available on request
PDF version for general viewing (including all figs in one PDF): File:TrapDraftv2.pdf
Title: Transcriptional profiling by deep CAGE of the human fibrillin/LTBP gene family, key regulators of mesenchymal cell functions.
ManuscriptID: Phase1_014
Status: Working draft
Abstract: The fibrillins and latent transforming growth factor binding proteins (LTBPs) form a superfamily of extracellular matrix (ECM) proteins characterized by the presence of a unique domain, the 8-cysteine transforming growth factor beta (TGFβ) binding domain (TB domain). These proteins are involved in both maintaining the extracellular matrix and controlling the bioavailability of TGFβ family members. Genes encoding these proteins show differential expression in mesenchymal cell types which synthesise the extracellular matrix and give rise to connective tissues. We have investigated the promoter regions of the seven gene family members using the FANTOM5 CAGE database for human. Although the protein and nucleotide sequences showed considerable homology (for the protein sequence of fibrillins the maximum sequence homology was 68% between fibrillin1 and fibrillin2; minimum sequence homology was 59% between fibrillin1 and fibrillin3), the promoter regions were quite diverse. The three fibrillin genes had a single predominant promoter cluster, while LTBP1 and LTBP4 showed promoter switching. The depth of the current CAGE study revealed that most of the family members were expressed in a range of mesenchymal and other cell types, often associated with use of alternative promoters or changes in the transcription start site within a compound promoter. FBN3 was the lowest expressed gene, and was expressed only in embryonic and fetal tissues, primarily neurological. There was evidence of enhancer activity in the regions of the genes. Each gene showed a unique pattern of transcription factor motifs or activity. This study highlights the role of alternative transcription start sites in regulating the tissue specificity of closely related genes and suggests that this important class of extracellular matrix genes is subject to subtle regulatory variations that explain the differential roles of members of this gene family.
Authors: Margaret R Davis, RIKEN OSC members, Kim M Summers
Authors contribution statement: MRD performed most of the analysis and contributed to writing the paper, RIKEN OSC did ..., KMS performed the analysis and contributed to writing the paper
Datasets used: Helicos CAGE on ...
Target journal(s):
Internal submission date:
Contact by email: kim.summers@roslin.ed.ac.uk
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Quantifying the informational complexity of transcriptional regulatory programmes
ManuscriptID: Phase1_015
Status: On-hold. Focussing on the biological results Phase1_016 rather than methods. Hope to return to methods later (phase2).
Abstract: The regulation of gene expression defines cellular identity, it is the basis for organism development and it underlies many cellular responses to the environment. Its disruption is implicated in many diseases and changes in gene regulation appear to underlie many adaptations evident between species. Previously, genes have been grouped and interpreted based on their specificity of expression, for example house-keeping genes that are expressed by all cells in all conditions versus highly tissue restricted genes expressed by only one cell type at a particular developmental time. Although such studies have been informative they fail to capture important aspects of how a gene is regulated or account for the heterogeneous relatedness of samples. The expression pattern of a gene is the output of a regulatory program within the cell. A program that must effect many state changes (on, off, up, down) is likely to require more regulatory information (Kolmogorov complexity) than a program effecting fewer state switches. If we can quantify this "regulatory complexity" we can then start to address deeper questions as to where that regulatory information is encoded, how malleable it is through evolution and how susceptible it is to perturbation by mutation. For example, a greater regulatory complexity could correspond to a higher concentration of cis-regulatory sequences around the gene or alternatively a single binding site for a transcription factor that is the output of an extensive intracellular signalling network. To address these questions we have explored a range of possible measures of regulatory complexity including distance weighted entropies, diversity and richness scores. This leads us to introduce a novel measure of regulatory complexity (CR). It is implemented as a hierarchical Bayesian model parametrised through MCMC.
The CR method can be thought of as a relative measure of the number of gene expression state changes occurring over a tree relating all analysed samples. A by-product of this analysis is a probabilistic scoring of gene expression state switches between all analysed gene expression libraries. CR is weighted to account for the genome wide similarity of gene expression between samples but does not depend on the inference of a fixed underlying tree topology. Note - this is intended as essentially a methods paper, see Phase1_016 for the biological insights paper
Authors: Sarah Baker, Martin Taylor
Authors contribution statement: SB developed and implemented methods and performed general analyses; MT conceived the project and oversaw implementation and performed some of the analysis
Datasets used: Helicos CAGE on primary cells from human and mouse.
Target journal(s): Bioinformatics or Genome Research
Internal submission date: ETA July 2013
Contact by email: Martin Taylor, Sarah Baker
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Cis encoding of the master developmental regulatory programme
ManuscriptID: Phase1_016
Status: Working draft, starting dataset being regenerated to incorporate improved method
Abstract: The regulation of gene expression defines cellular identity, it is the basis for organism development and it underlies many cellular responses to the environment. Its disruption is implicated in many diseases and changes in gene regulation appear to underlie many adaptations evident between species.
Authors: Sarah Baker, Martin Taylor
Authors contribution statement: SB developed and implemented methods and performed general analyses; MT conceived the project and oversaw implementation and performed some of the analysis
Datasets used: Helicos CAGE on primary cells from human and mouse. We may also want to use time course data for this paper (does that push it into phase2?).
Target journal(s): PLoS Biology
Internal submission date: ETA March 2013
Contact by email: Martin Taylor, Sarah Baker
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Correspondence between CAGE clusters and chromatin marks
ManuscriptID: Phase1_017
Status: Preliminary Draft (Moved to Phase 2)
Abstract: The paper presents an analysis of the correlation between cell-type-specific CAGE clusters and chromatin marks, using FANTOM CAGE data and ENCODE ChIP-Seq data for the four ENCODE cell lines K562, GM12878, HeLa-S3 and HepG2. It shows that active chromatin marks are present at both expressed and repressed clusters. Chromatin profiles around expressed CAGE clusters have various shapes, and can be grouped into combinatorial subclusters based on their profiles. Repressed clusters with active chromatin marks represent a set of poised CAGE clusters enriched for Pol II and linked to immune response. The latter clusters also have a well-positioned nucleosome at the TSS. The manuscript is only preliminary, and some of the analysis still remains to be performed. The general content of the paper is considerably different from what is described in the report posted here previously. Media:CAGE_cluster_evaluation_Drablos_Rye_13012012.pdf and April Media:CAGE_clusters_and_chromatin_Drablos_Rye_26042012.pdf.)
Authors: Morten Rye, Finn Drablos
Authors contribution statement: MR and FD did data analysis and wrote the paper
Datasets used: Helicos CAGE data, ENCODE chromatin ChIP-Seq and DNase HS data
Target journal(s):
Internal submission date: Most likely February 2013
Contact by email: Finn Drablos, Morten Rye
Word document version of manuscript for editors: File:Preliminary Draft Drablos Rye 11-12-2012.docx
PDF version for general viewing (including all figs in one PDF): File:All draft figs Drablos Rye 11-12-2012.pdf Supplementary figures: File:All supplem figs Drablos Rye 11-12-2012.pdf
Title: Promoter specificity in transcription determines cell lineage choice
ManuscriptID: Phase1_018
Status: Delayed (as of September 12th)
Abstract: This paper will use pathprint (pathway fingerprinting) to develop an overall phylogenetic tree of all samples in F5 freeze1. This tree will be used to determine the relative ancestry of samples and cluster them accordingly. SwitchEngine will be run to find switching in TSS at key junctions in differentiation. We will show that TSS dynamics at these informative sites are associated with lineage commitment.
Authors: Emmanuel Dimont, Gabriel Altschuler, Winston Hide
Authors contribution statement: ED did ..., GA did ..., WH did ...
Datasets used: Helicos CAGE on all of F5freeze1
Target journal(s):
Internal submission date: Most likely July-August 2012
Contact by email: Winston Hide, Emmanuel Dimont, Gabriel Altschuler
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Patterns of expression space change in the F5-CAGE encyclopedia of vertebrate gene expression.
ManuscriptID: Phase1_019
Status: Finished draft
Abstract: The F5-CAGE encyclopedia of expression patterns is arguably the most comprehensive and technologically uniform functional genomics dataset ever generated. F5-CAGE includes 952 human and 396 mouse tissues (T), primary cells (PC) and cancer cell-lines (CCL). Here, we use F5-CAGE to explore expression space change in multiple contexts.
Brain exhibits unique transcriptional features, including clustering into fetal, newborn, and adult samples. All samples group into three distinct categories with respect to expression evolution rate. There is a trend for young genes to be tissue-specific, with the exception of taxon Eutheria. A major divide between leukemias and solid tumors is seen in CCL. Paralog expression pattern divergence suggests global devolution of expression in CCL. We explore global differences between T and CCL samples further, through family analysis and self-organizing maps. As a focused family evolution example, we use the cdc42 family, which features many tissue-specific genes and dramatic expression pattern shifts, correlated with ENCODE TFBSs. PhyloSigs suggest novel hypotheses for animal evolution: CNS and reproductive tract are discussed as two examples.
Most genes have multiple TSSes, with up to 87 for tintin, contributing to multiple isoforms which were previously attributed to alternative splicing alone. TSSes correlate between human and mouse, older genes tend to have more TSSes, and TSS-rich genes are associated with cancer.
Finally, we test the hypothesis of CTCF acting as an insulator between paralogs, and instead show that its function is more likely to be bringing duplicates under the control of the same enhancer. The trend is illustrated with semenogelins and pregnancy-specific glycoproteins.
Authors: Lukasz Huminiecki, Oxana Sachenkova and Core RIKEN Authors
Authors contribution statement:
LH: gathered and prepared the data, planned the study and analyzed the data, wrote the manuscript
OS: wrote the software to analyze the data, performed the analysis, prepared the figures
Datasets used: Helicos CAGE, TreeFam8, ENCODE TFBS ChIP-Seq
Target journal(s): Genome Research
Internal submission date: November 30th
Contact by email: Lukasz Huminiecki, Oxana Sachenkova
Word document version of manuscript for editors: File:The structure of animal expression pattern evolution.doc (only text)
PDF version for general viewing (including all figs in one PDF): File:The structure of animal expression pattern evolution.pdf (this file includes all the figures)
Title: Gene duplication and promoter divergence in mammals.
ManuscriptID: Phase1_020
Status: Delayed
Abstract:
Authors: Lukasz Huminiecki and Core RIKEN Authors
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ... and LH did everything else
Datasets used: Helicos CAGE on ..., F5 promoter and enhancer datasets, TreeFam8
Target journal(s): Genome Research
Internal submission date: September 1st
Contact by email: Lukasz Huminiecki
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Gene duplication and TF/miRNA regulatory network evolution in mammals.
ManuscriptID: Phase1_021
Status: Delayed
Abstract:
Authors: Lukasz Huminiecki and Core RIKEN Authors
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ... and LH did everything else
Datasets used: Helicos CAGE on ... TreeFam8, miRBase, microRNA target predictions
Target journal(s): Genome Research
Internal submission date: December 1st
Contact by email: Lukasz Huminiecki
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Analysis of antisense transcription in loci associated to neurodegenerative diseases
ManuscriptID: Phase1_022
Status: Working draft
Abstract: The FANTOM5 sequencing datasets represent the largest collection of transcriptomes from human cell lines, primary cells and whole tissues of various origin. Transcription start sites are mapped at high resolution by the use of a modified protocol of Cap Analysis of Gene Expression (CAGE) for high-throughput single molecule next-generation sequencing with Helicos (hCAGE). We employed the FANTOM5 collection of data to address the role of antisense transcription in neurodegeneration. We focused our analysis exclusively on tissues and primary cells, to avoid artifacts due to cellular transformation in cultured cell lines. Among the >1261 human hCAGE libraries, we selected those of brain origin. Libraries from total blood and selected blood cell populations were also included in the analysis. A total of 66 tissue- and 244 cell-specific libraries were interrogated for the presence of antisense transcription to well-established loci associated with Alzheimer’s disease, Amyotrophic Lateral Sclerosis, Frontotemporal Dementia, Huntington’s and Parkinson’s disease. Almost all analyzed genes display some degree of antisense transcription, mainly in their 5’ or 3’ UTRs. 5’ head-to-head divergent antisense transcription appears enriched compared to the global distribution of sense/antisense pairs. Identified antisense transcripts may have coding and non-coding capabilities, with lncRNAs being more represented. Expressed transcripts are generally poorly annotated and may contain repetitive elements of the Alu, SINE and LINE families. Antisense transcription was validated for a subset of genes, including amyloid precursor protein, microtubule-associated protein tau, DJ-1, leucine-rich repeat kinase 2 and α-synuclein. The validated transcripts are predicted to have non-coding functions and most of them were not annotated. Quantitative analysis of antisense transcripts in human tissues indicates enrichment in the brain, compatible with FANTOM5 data.
Overall, these results represent the most comprehensive analysis of antisense transcription at loci associated with neurodegeneration and provide evidence for the existence of additional regulation of disease-related genes by previously unannotated long non-coding RNAs.
Authors: Silvia Zucchelli, Paolo Vatta, Stefania Fedele, Raffaella Calligaris, XXXX (from F5 consortium), Al Forrest, Piero Carninci and Stefano Gustincich
Authors contribution statement: SZ designed the experiments, analyzed the data, wrote the manuscript; PV performed the bioinformatics analysis, prepared some figures; SF designed the experiments, performed the experiments and analyzed the data; RC provided reagents, designed the experiments and analyzed the experiments; SG analyzed the data, wrote the manuscript
Datasets used: Helicos CAGE on human brain and blood samples
Target journal(s): Genome Research, Plos Genetics, Human Molecular Genetics
Internal submission date: beginning of June
Contact by email: Stefano Gustincich, Silvia Zucchelli
Word document version of manuscript for editors: File:Zucchelli FANTOM5 satellite 2012 09 14.doc
PDF version for general viewing (including all figs in one PDF): File:Zucchelli Figures.pdf
Title: Higher order chromatin structure and promoter activity
ManuscriptID: Phase1_023
Status: Delayed -> moved to PHASE2
Abstract:
Authors: Semple CA, Prendergast JG, et al
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on ...
Target journal(s):
Internal submission date: October 2012
Contact by email: Colin Semple, James Prendergast
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Building context depending TSS regions from thousands of profiles
ManuscriptID: Phase1_024
Status: Unknown
Abstract: about DPI
Authors: Kawaji H, et al.
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on phase1 freeze
Target journal(s):
Internal submission date:
Contact by email: KAWAJI Hideya
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Gateways to the promoter level mammalian expression atlas covering thousands of biological states in FANTOM5
ManuscriptID: Phase1_025
Status: working draft
Abstract: Monitoring RNA transcribed within a cell is an essential step toward identifying the active information within the genome and, ultimately, toward understanding the entire cellular system. Most previous studies involving the collection of a large set of genome-wide transcription profiles consist of tissues and/or cell lines. In the FANTOM5 (Functional ANnotation Of Mammals 5) project we monitored transcription in more than one thousand mammalian samples, including nearly two hundred primary cell types in human and more than one hundred cell types in mouse. We used a sequencing-based digital counting technology, CAGE (Cap Analysis Gene Expression), which avoids any PCR amplification steps by relying on a single molecule sequencer. This technology quantifies transcription start site (TSS) activities at single base pair resolution across the genomes, and the result is one of the largest sets of expression data available, consisting of a diverse range of samples profiled on a single platform based on state-of-the-art technology.
We assembled the FANTOM5 TSS profiles and subsequent analyses into a centralized data archive and set up various on-line resources available to the scientific community. Researchers in cell biology can easily search samples of interest to inspect active elements within a cell type. Researchers in molecular biology can search genes or transcription factors of interest to inspect in which biological context they are highly activated. Researchers in genome biology and other fields can explore the data within dynamic and interactive graphical user interfaces dedicated to genomic viewing and expression. We based all analysis and database systems on careful annotation of the diverse range of samples, including an application ontology consisting of cell types, anatomy, and diseases. This large set of expression data combined with the extensive and systematic sample annotation enables the scientific community to explore, examine, and slice the data from multiple aspects. Here we introduce the on-line resources and underlying data structure as well as discuss its potential impact in multiple research fields.
Authors: WP4, database providers, and analysis providers
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on phase1 freeze
Target journal(s):
Internal submission date:
Contact by email: KAWAJI Hideya
Word document version of manuscript for editors: File:130111-F5web-resource-main JH HK.docx
PDF version for general viewing (including all figs in one PDF): File:130104-F5web-resource-fig.pdf
Title: Application of Semantic MediaWiki to snapshot of thousands of biological states in transcription
ManuscriptID: Phase1_026
Status: Unknown
Abstract: overview and instruction to the resource browser
Authors: Shimoji H, Kawaji H., WP4
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: Helicos CAGE on phase1 freeze
Target journal(s):
Internal submission date:
Contact by email: KAWAJI Hideya
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Comparison of CAGE and RNA-seq transcriptome profiling using a clonally amplified and single molecule next generation sequencing
ManuscriptID: Phase1_027
Status: finished manuscript
Abstract: CAGE (Cap Analysis Gene Expression) and RNA-seq are two major technologies used for transcript quantification. These protocols measure expression from either the 5’ end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified (Illumina, SOLiD, 454, Ion Torrent) 2nd generation sequencing platforms typically employ PCR pre-amplification prior to clonal amplification, while 3rd generation single molecule sequencers can sequence unamplified libraries. While these protocols individually have been demonstrated to be highly reproducible, no systematic comparison has been carried out between the protocols. Here we compare CAGE using both 2nd and 3rd generation sequencers and RNA-seq using a 2nd generation sequencer based on a panel of RNA mixtures from two human cell lines (THP-1 and HeLa, 100%, 50%, 20%, 10%, 5%, 1% and 0% of HeLa RNAs) to examine the power to discriminate biological states and to detect differentially expressed genes, the linearity of measurements, and quantification reproducibility. Quantification by CAGE with the 2nd and 3rd generation sequencers (Illumina GA-IIx and HeliScope) was consistent at gene level; however, we observed several differences, which can be explained by differences in their protocols and sequencing platforms. These include significant bias in the Illumina library, such as GC biases and over-estimation of transcripts harboring internal EcoP15I sites. A poorer correlation at the level of individual TSS positions, which is likely to be due to the higher indel rate in HeliScope, is also found. We found high consistency between HeliScopeCAGE and RNA-seq (Spearman correlation 0.88).
Differences between CAGE and RNA-seq are explained in most cases by incompleteness of existing gene models, where the 5’-ends of gene models do not reflect the actual transcription start site in the profiled cells, or where RNA polymerase runs through the polyadenylation site, resulting in fusion of neighboring genes.
Authors: WP3
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used:
Target journal(s): Genome Res.
Internal submission date: 23rd Dec, 2012
Contact by email: KAWAJI Hideya
Word document version of manuscript for editors: File:121223-PlatformEval.docx
PDF version for general viewing (including all figs in one PDF): File:121223-PlatformEval.pdf
Title: Identification of miRNA promoters and primary structures
ManuscriptID: Phase1_028
Status: Unknown
Abstract: ...
Authors: Kawaji H.
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: phase1 CAGE peaks
Target journal(s):
Internal submission date:
Contact by email: KAWAJI Hideya
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Differential roles of epigenetic conversion and Foxp3 expression in regulatory T cell-specific transcriptional regulation
ManuscriptID: Phase1_029
Status: good draft
Abstract: Naturally occurring regulatory T (Treg) cells are engaged in the maintenance of immune tolerance and homeostasis. The development of Treg cells requires both the expression of the transcription factor Foxp3 and the establishment of a Treg cell-type DNA hypomethylation pattern. By transcriptional start site (TSS) cluster analysis, we here assessed the possible correlation of the genome-wide DNA methylation pattern or Foxp3-binding pattern with Treg-specific gene expression. We found that Treg cell-specific DNA hypomethylated regions were closely correlated with Treg-upregulated TSS clusters, whereas Foxp3-binding regions had no significant correlation with either up- or down-regulated clusters, in non-activated Treg cells. On the other hand, in activated Treg cells, Foxp3-binding regions showed a strong correlation with down-regulated clusters. In silico search for transcription factor-binding motifs revealed that the motifs enriched in Foxp3-binding or Treg-specific DNA hypomethylated regions were mostly different. These results collectively indicate that Treg cell-specific DNA hypomethylation is conducive to up-regulation in steady-state Treg cells, whereas Foxp3 expression is conducive to down-regulation of its target genes in activated Treg cells. Thus, the combination of the two events is required for the establishment of Treg cell-specific gene expression and function.
(185 words)
Authors: Hiromasa Morikawa1,2, Naganari Ohkura1, Alexis Vandenbon3, RIKEN_OSC_members 4, Daron Standley3, Hiroshi Date2, Shimon Sakaguchi1
1. Department of Experimental Immunology, World Premier International Immunology Frontier Research Center, Osaka University, Suita 565-0871, Japan
2. Department of Thoracic Surgery, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto, 606-8507, Japan
3. Department of Systems Immunology, World Premier International Immunology Frontier Research Center, Osaka University, Suita 565-0871, Japan
4. RIKEN Omics Center, Yokohama, Japan
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: phase1 CAGE peaks
Target journal(s): Genome Research
Internal submission date: 2012/12/18
Contact by email: Hiromasa Morikawa
Word document version of manuscript for editors: manuscript121218.docx
PDF version for general viewing (including all figs in one PDF): manuscript121228.pdf
Title: Automated clustering and quality control pipeline for CAGE technologies
ManuscriptID: Phase1_030
Status: Working draft
Abstract: To understand the manner and mechanisms of transcription initiation by RNA Polymerase II, different strategies for genome-wide detection of transcription start sites (TSSs) have been developed. We propose a clustering and quality control pipeline suitable for Cap Analysis of Gene Expression (CAGE) sequence tags. The new framework uses parametric clustering at multiple scales and adopts the irreproducible discovery rate (IDR) to measure the reproducibility of each cluster between replicates. Our pipeline reveals that genes have complicated structures of transcription initiation events and discovers novel alternative promoters that were not detected by previous approaches.
Authors: Hiroko Ohmiya1, Morana Vitezic1, Martin Frith, Yoshihide Hayashizaki1, Timo Lassmann1 and many more
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used:
Target journal(s):
Internal submission date:
Contact by email: Timo Lassmann
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: Mogrify: Identifying Defined Factors For Direct Reprogramming Using Next-Generation Sequencing Data And Network Analysis.
ManuscriptID: Phase1_31
Status: Working draft -> PHASE2?
Abstract:
We now know that cellular state is a plastic phenomenon which it is possible to control. There is an increasing number of reports in the literature where cells have been made to go from fully differentiated cell types to pluripotency and also from one fully differentiated cell type to another. Each of these experiments has relied heavily on a process of trial and error as well as expert knowledge in order to discover the transcription factors capable of inducing a cell conversion. Here we present a novel network-based technique (Mogrify) that can identify the factors required for cell conversion. The technique integrates next generation sequence data and biological network knowledge in order to identify transcription factors for over-expression and knock-down along with a conversion likelihood score.
We show that we are able to predict the known reprogramming factors for several successful trans-differentiations from the literature (e.g. between fibroblast and cardiomyocyte, neuron and hepatocyte) and then provide evidence for a number of unpublished conversions.
The technique is then run without human intervention on every possible combination of over 1000 libraries in the FANTOM5 set. This information is then used to construct a computational “Waddington landscape”, identifying the best candidate source and target cell types for future cell conversion experiments. This is the first resource of its kind, only made possible by the new FANTOM5 promoterome data, and represents a considerable step forward in regenerative medicine.
Authors: Owen and Julian
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: phase1 CAGE peaks in all samples
Target journal(s):
Internal submission date:
Contact by email: Owen Julian
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:Mogrify.pdf
Title: ADIPOKINES LINK FAT CELLS TO OBESITY-ASSOCIATED CANCER
ManuscriptID: Phase1_32
Status: Good draft
Abstract: Obesity confers an increased risk of developing specific cancer forms. Although the mechanisms are unclear, increased fat cell secretion of specific proteins (adipokines) may promote/facilitate development of malignant tumors in obesity by cross-talk between adipose tissues and the tissues prone to develop cancer among obese subjects. This was investigated using expression data from human adipose tissue of obese and non-obese subjects as well as from a large panel of human cancer cell lines and corresponding primary cells and tissues. We identified three previously described adipokines, SERPINE1, SERPINE2 and C3, sharing a common cognate receptor, LRP1, which was expressed in all cancer cell lines associated with obesity. Expression and secretion of SERPINE1 and C3 were increased in obese adipose tissue and their plasma levels were elevated in obese subjects. We also identified genes enriched in obesity-associated cancer cells compared to cell lines and corresponding healthy tissues or primary cells. We found expression of ceruloplasmin to be the most enriched in obesity-associated cancer cells. This gene was also significantly up-regulated in adipose tissue of obese subjects. Ceruloplasmin is the body’s main copper carrier and is involved in angiogenesis. We demonstrated that ceruloplasmin was a novel adipokine and that obese adipose tissue contributed markedly (22%) to the total protein level. In summary, we have identified several adipokines, which can serve as endocrine signals facilitating growth of obesity-associated cancer tumors. These adipocyte signals are increased in obesity and may be important for development of cancer associated with excess body fat.
Authors: Erik Arner, Alistair Forrest, Anna Ehrlund, Niklas Mejhert, [Additional RIKEN people?], Jurga Laurencikiene, Mikael Rydén, Peter Arner
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: phase1 CAGE peaks
Target journal(s): Cancer Research
Internal submission date:
Contact by email: Erik Arner
Word document version of manuscript for editors: File:Fat cells and cancer draft 120816 EA.docx File:Figs 2012-08-15.ppt
PDF version for general viewing (including all figs in one PDF):
Title: ZENBU
ManuscriptID: Phase1_33
Status: Working draft
Abstract: The world of genome sciences has dramatically changed over the last 5 years. With the advent of next generation sequencers and RNA-expression sequencing, genome science is no longer the domain of a few elite centralized "genome centers" like in the early days of the field. The advance of next-generation sequencers has spurred an ever-growing body of tag-based data allowing the survey of chromatin states and transcriptome dynamics. Visualization of expression levels of genomic regions was achieved by displaying expression levels in various experimental conditions in dedicated tracks allowing investigators a direct comparison of their dynamics. Novel file formats and browser design have allowed for dealing efficiently with the depth of data produced by next-generation sequencer based technologies. Researchers need to interact within global collaborations and need easy ways to process, share and visualize their data in a secured manner prior to publication. To this end we have developed the ZENBU system. ZENBU is a web-based system which is a social networking platform for secured data upload and data sharing with collaborators, a data processing system, and a visualization system. ZENBU provides the infrastructure for working with 100s of terabytes of sequence data in the form of BAM sequence alignment files and genome annotation formats like BED and GFF, to efficiently cross-analyze these datasets using a Map-Reduce/autonomous-agent based parallel processing system, and provide fast efficient web services for user interfaces. The user interface for ZENBU is based on Web2.0 technologies in the form of a new expression-enhanced genome browser, and data manipulation interfaces for data upload, data processing, and data download. ZENBU currently contains the entire FANTOM 3/4/5 datasets, the entire ENCODE datasets, and much of the UCSC genome annotation data.
ZENBU is planned to be a cornerstone in the expanding global network of scientific sharing web systems.
Authors: Jessica Severin*, Marina Lizio, Jayson Harshbarger, Hideya Kawaji, Carsten Daub, The FANTOM5 consortium, Yoshihide Hayashizaki, Nicolas Bertin*, Alistair Forrest*
Authors contribution statement: JMS, ML, JH, HK, CD, YH, NB, AL
- JMS, wrote the software/webservices.
- JMS, NB, planned the study.
- NB supervised the study.
- JMS, NB, contributed to the manuscript writing.
- JMS, NB, gave valuable input to the analysis in the manuscript.
- JMS, NB, critically reviewed the manuscript.
- [addition of any other, clearer or more precise statement is very welcome]
Datasets used: phase1 CAGE peaks
Target journal(s): Nature Biotech/Genome Research
Internal submission date:
Contact by email: Jessica Severin, Nicolas Bertin, Alistair Forrest
Word document version of the most up to date manuscript draft: File:ZENBU manuscript.014 (1).docx
Word document version of manuscript for editors: [[]]
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: The enhancer and promoter landscape of regulatory and conventional T cell subpopulations
ManuscriptID: Phase1_34
Status: almost finished manuscript
Abstract: CD4+CD25+FOXP3+ human regulatory T cells (Treg) are essential for self-tolerance and immune homeostasis. Here, we describe the promoterome of CD4+CD25highCD45RA+ naïve and CD4+CD25highCD45RA– memory Treg and their CD25– conventional T cell (Tconv) counterparts both before and after in vitro expansion by cap analysis of gene expression adapted to single molecule sequencing (HeliscopeCAGE). We performed comprehensive comparative digital gene expression analyses and revealed new orphan transcription start sites, of which several were validated as alternative promoters of known genes including FOXP3 and CTLA4. For all in vitro expanded subsets, we additionally generated genome-wide maps of poised and active enhancer elements marked by histone H3 lysine 4 monomethylation and histone H3 lysine 27 acetylation. Analysis of cell type-specific regulatory elements revealed a specific enrichment of several transcription factor binding motifs. We validated promising candidates by chromatin immunoprecipitation coupled to next generation sequencing and identified STAT5 and FOXP3 as well as RUNX1 and ETS1 as global regulators of Treg- and Tconv-specific enhancers, respectively. In summary, we provide a highly detailed and easily accessible resource of gene expression and regulation in Treg and Tconv subpopulations.
Authors: R
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: phase1 CAGE peaks
Target journal(s): Blood
Internal submission date:
Contact by email: Christian Schmidl, Michael Rehli
Word document version of manuscript for editors: File:121027 FANTOM Treg manuscript.docx
PDF version for general viewing (including all figs in one PDF): File:Schmidl Treg.pdf
Title: Systematic in-vivo characterization of active enhancers across the human body
ManuscriptID: Phase1_35
Status: Almost finished manuscript
Abstract: In higher organisms, cellular development and diversity are highly controlled by enhancers, which regulate the correct temporal and cell type-specific activation of gene expression. Despite their obvious importance for development and disease, the exact locations, target genes and mechanisms of enhancers are still poorly defined. Thus, there is an urgent need not only to identify enhancer locations, but also to elucidate their specific usage across the wide diversity of cells within the human body, their impact on regulation in healthy and diseased individuals, and how enhancers interact with target genes. Here, we use the FANTOM5 panel of tissue and primary cell samples covering the majority of human tissues and cell types to define an atlas of active, in vivo bidirectionally transcribed enhancers across the human body. It enables comparison of regulatory programs between different cells and tissues at unprecedented depth, and makes it possible to define distinct subsets of enhancers, including fetal-specific, cell-specific and ubiquitous enhancers – a novel enhancer subtype with distinct properties. We show that known target genes of enhancers can be recaptured using expression correlations and predict many novel enhancer-TSS associations. We present models confirming the utility of multiple redundant enhancers, which explain TSS expression strength rather than expression patterns. We demonstrate that disease-associated functional single nucleotide polymorphisms are over-represented in enhancers and that such enhancers often have disease-relevant expression patterns. The human enhancer atlas can be accessed through an online database and is a unique resource for studies on tissue/cell-specific enhancers and their gene interactions.
Authors: Robin Andersson1#, Claudia Gebhard2#, Irene Miguel-Escalada3, Ilka Hoof1, Xiaobei Zhao1, Christian Schmidl2, Eivind Valen1,4, Kang Li1, Lucia Schwarzfischer2, Dagmar Glatz2, Johanna Raithel2, Yun Chen1, Berit Lilje1, Nicolas Rapin1,5, Frederik Otzen Bagger1,5, Mette Jørgensen1, Mette Boyd1, Jette Bornholdt1, Kenneth Baillie6, Chris Mungall7, Timo Lassmann8, Hideya Kawaji8, Andreas Lennartsson9, Carsten Daub8,9, David Hume6, Peter Heutink10, Alistair Forrest8, Piero Carninci8, Yoshihide Hayashizaki8, Ferenc Müller3, Michael Rehli2*, Albin Sandelin1*
Authors contribution statement: RA, IH, EV, KL, YC, BL, XZ, MJ, HK, TL, KB, CM, NR, FOB, MR, AS performed the computational analysis. TL, HK, CD, AF, PC, YH prepared, mapped and analyzed CAGE libraries. RA, CG, IH, EV, FM, PC, AF, AK, MB, JBL, AL, CD, DH, PH, MR, AS interpreted results. CG, CS, ME, MR performed the blood cell ChIP experiments, methylation assays and in vitro blood cell validations. IME, FM performed zebrafish in vivo validations and interpretations. RA, CG, IH, FM, MR, AS wrote the paper.
Datasets used: phase1 CAGE peaks and raw CAGE mapped data from human, internal ChIP and other validation data
Target journal(s): To be decided
Internal submission date:
Contact by email: Michael Rehli, Albin Sandelin (robin@binf.ku.dk, michael.rehli@klinik.uni-regensburg.de, albin@binf.ku.dk)
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf File:Enhancerome full.pdf
Title: The specific transcriptome dynamics of mesenchymal stromal/stem cells from high-grade serous ovarian cancer relate their social context and activity
ManuscriptID: Phase1_036
Status: Working draft
Abstract: The role of the cancer microenvironment is being recognized as one of the critical hallmarks in both cancer progression and metastasis. Mesenchymal Stem/Stromal Cells (MSCs) are the precursors of various cell types that compose both normal and cancer tissue microenvironments. We have isolated MSCs from various High-Grade Serous Ovarian Carcinomas (HG-SOCs), demonstrated their normal genotype, and analyzed their transcriptome using deep-CAGE analysis with respect to similarly derived normal tissue MSCs, all embedded in the large comprehensive FANTOM5 sample dataset.
The integrative analysis conducted against the extensive panel of primary cells and tissues of the FANTOM5 project allowed us to identify a cell-type specific transcriptional activity associated with the HG-SOC-MSCs. The hierarchical clustering analysis shows that MSCs derived from HG-SOCs co-cluster with other MSCs while retaining distinct transcriptional peculiarities. Most importantly, this analysis has revealed an HG-SOC-MSC-specific identity when compared to similarly derived MSCs from normal tissues such as bone marrow, heart and adipose tissues. Overall, their transcriptional activity shows a very strong correlation with that of primary mesothelial cells, which actually represent the embryonic cellular origin of serous ovarian cancer.
Moreover, a validated mesothelial gene signature (MGS), composed of genes over-expressed in HG-SOC-MSCs with respect to N-MSCs, shows significant association with cancer outcome when investigated in multiple ovarian cancer microarray datasets.
Altogether, the reported analysis supports the hypothesis that HG-SOC-MSCs are bona fide representatives of the ovarian district, either tracing their specific origin from local mesothelium or highlighting the epigenetic conditioning of externally recruited MSCs by the HG-SOC cancer cell compartment.
Authors: Roberto Verardo, Silvano Piazza, Enio Klaric, Yari Ciani, Antonio Beltrami, Daniela Cesselli, Stefania Marzinotto, RIKEN_OSC_members, Carlo Alberto Beltrami, Claudio Schneider
Authors contribution statement: RV conceived the project, performed some of the analysis and most of the manuscript writing; SP conceived the project, developed and carried out statistical tests and interpretation of results, and wrote parts of the manuscript; YC implemented part of the software and prepared some figures; EK performed molecular biology assays; SM, AB and CAB; CS supervised the study
Datasets used: Helicos CAGE on all of F5freeze1
Target journal(s):
Internal submission date: October 15th 2012
Contact by email: Claudio Schneider
Word document version of manuscript for editors: File:Xxx claudio.doc
PDF version for general viewing (including all figs in one PDF): File:Claudio.pdf
Title: Investigating tissue-specificity of cancer-causing mutations
ManuscriptID: Phase1_037
Status: Working draft
Abstract: Over the past 10 years an increasing number of mutated genes have been associated with familial predisposition to cancer. Interestingly, for more than half of these genes their involvement in cancer is restricted to only a few cancer types (e.g. BRCA1 mutations in breast and ovarian cancers). Even more interestingly, some of these genes are expressed in all cell types, so we might expect them to cause many more types of cancer, but they do not. This paper will examine how these mutations are tolerated in most cell types but not in others by considering the network of genes expressed in different cell types and how that determines whether they are susceptible or resistant.
Authors: Jessica Mar, Daniel Carbajo, RIKEN_OSC_members, Alistair Forrest
Authors contribution statement: JM and AF conceived the project, DC conducted the analyses.
Datasets used: phase1 CAGE peaks
Target journal(s):
Internal submission date:
Contact by email: Jessica Mar
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Title: FANTOM5 reveals the genomic architecture of the genes implicated in Rett Syndrome
ManuscriptID: Phase1_038
Status: Working draft
Abstract: Mutations in the MECP2, FOXG1 and CDKL5 genes cause Rett Syndrome, a neuro-developmental disorder of the grey matter of the brain that almost exclusively affects females. We analyzed the RNA expression data from the FANTOM5 project in both human and mouse to investigate the genomic architecture of the three genes involved in Rett Syndrome. Data from FANTOM5 provide an unprecedented opportunity to study the expression profile, identify transcription start sites and, in conjunction with the recently released ENCODE dataset, identify the regulatory regions and transcription regulators of the three genes implicated in Rett Syndrome. Even though MECP2 and CDKL5 are expressed ubiquitously, mutations in these genes cause a brain-specific phenotype, suggesting that their role in the brain is distinct from their function in other tissues.
Authors: Morana Vitezic, Leonard Lipovich, Alistair RR Forrest, Piero Carninci, Alka Saxena
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: phase1 CAGE peaks
Target journal(s): NAR
Internal submission date: December 2012
Contact by email: Morana Vitezic Alka Saxena
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf
Manuscript template
NOTE: Make a copy of the format below, paste it above and then edit with your details
Title:COPY THEN EDIT THIS TEMPLATE
ManuscriptID: Phase1_00x (INCREMENT THIS)
Abstract: ...
Authors: R
Authors contribution statement: MR did ..., TO did ..., KE did ..., EA did ..., AL did ...
Datasets used: phase1 CAGE peaks
Target journal(s):
Internal submission date:
Contact by email: CHANGETHIScorresponding1 CHANGETHIScorresponding2
Word document version of manuscript for editors: File:XXXYOUR.doc
PDF version for general viewing (including all figs in one PDF): File:XXXYOUR.pdf