Working Group 4 - Sample annotation + additional samples

We will use ontologies to annotate cell types, tissues and disease states.

  • Establishment of a user group.
  • Mapping of ontology in several stages
    • Mapping of terms to ontologies (create terms if necessary)
    • CL
    • UBERON
    • CELL LINE ONTOLOGY

(END OF MARCH)

Tap into knowledge of sample providers

  • Addition of commonly used markers for cell type or tissue

(END OF APRIL?)

  • Disease associations: OMIM, ICD9 codes
  • Capture of information relating to sample to be formalized later
    • biopsy
    • culturing
    • time-course

Enhancement of developmental lineage for CL
(OCTOBER)

Began discussion on sampling.

Kim's notes follow...

Sample annotation and additional samples.

Ontology: representing knowledge through structured vocabulary. Cell ontology. 200 primary cells and cell lines. Put in order/network and see what’s missing.

OK for cross species – chicken plus mammals – can use the same ontology.

Experimental conditions: what is the minimum we need to know about each sample?

Can help with putting tissues into ontology. Cell lines are difficult: U Mich is about to release a cell line ontology (900 cell lines), which will be linked to the cell ontology and mapped to primary cell type (although not the same).

Riken cancer cell lines, relatively well characterised and publicly available.

What about karyotypes of cell lines?

Cell ontology will include major labs with the cells, eg HeLa, because of drift in karyotype etc over time.

What does annotation mean? Group survey.

  1. Essential information you need to understand: sex, basic culture conditions, tissue sample.
  2. For each sample: cell type/cell line ID, sex.
  3. Atmosphere, liquid, buffer.
  4. Cell ontology and tissue ontology are most important at this stage; sex, age, genotype and culture conditions are already available.
  5. Marker information that is used to identify the cell type (Terry can add marker information to the cell ontology).
  6. User interface: experimental information doesn’t have to be in formalised annotation as long as it is linkable/retrievable; focus on cell type, perturbations, abstract, and pull all related cells by various keywords.
  7. Primary cells vs cell lines: primary means not immortalised; phenotype is lost, whereas early-passage primaries (passage 3 and below) retain characteristics. Matrix matters (being in a tissue).
  8. Relationships between cell types; pull out similar/related cells.
  9. Being able to relate cell types; search on a particular cell phenotype.
  10. Age of donor, eg haematopoietic stem cells, fetal vs adult.
  11. Culture conditions, eg low oxygen tension might be important; whether cells grow attached; serum vs serum-free plus growth factors etc.
  12. Context in as fine a detail as possible, knowing exactly where things lie on a tree of cell types; relationships between cells are not always known, so the ontology represents a consensus view, but genetic data will inform this: is a marker as specific as you think?
  13. Tree from literature vs tree from expression data; there is a current move to molecular characterisation, so this is the time to do it right.
  14. How different are two cell types; distance from other cell types, molecular or literature.
  15. Disease samples: disease ontology; use ICD9 + OMIM; use as many terms and ontologies as known, and the annotation team can sort them out.
  16. Heterogeneous samples don’t always give a simple pattern; even with single cells we don’t know which type of cells (within neural cells); markers important; molecular curation.
  17. Timecourse data: transition states. May need separate ontologies to capture this information.
  18. For brain, the most complicated tissue: need to sort different regions, define neuronal cell types to cover all species, then specific for human.
  19. Sample variation: sex, age, experimental description (investigator, sample, assay) can be addressed; how to identify the specific variant of the sample, which can also vary for the same subline, same investigator, different experiment.
  20. Different ways to expand lymphocytes; need some way to specify this.
  21. Other information: experimental assay.
  22. Markers.
  23. Fresh or expanded in culture; gene expression patterns; relationship among cells.
  24. Markers: traditional markers for specific cell types (eg haemoglobin for rbc); markers can turn on and off so can’t be exclusionary; differentiated and cancer cells can change markers.
  25. Consistency of culture conditions, eg oxygen state, serum, confluency.
  26. Harvesting method, eg fresh tissue, time after wounding, scrape or enzyme.
  27. Data must be searchable by the community, so need an ontology to hold searchable details; how complete is our sample set, so need a tree and derive a wish list; known markers are positive controls, but this study is looking at characteristics of cells, so will be developing new signatures to expand or improve, so need a framework.
  28. How was the material treated from the beginning, eg “fresh”, but gene expression can change rapidly after removal.
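
Several survey points ask for a way to pull out related cell types and to say how far apart two cell types are on a tree. A minimal sketch of that idea, assuming a toy is_a hierarchy: the term names and parent links below are illustrative placeholders only, not taken from the Cell Ontology, and the function names are hypothetical.

```python
# Hypothetical toy hierarchy: child term -> parent term (is_a).
# Illustrative only; these links are NOT taken from the Cell Ontology.
IS_A = {
    "haematopoietic stem cell": "stem cell",
    "satellite cell": "stem cell",
    "stem cell": "cell",
    "keratinocyte": "epithelial cell",
    "epithelial cell": "cell",
    "melanocyte": "cell",
    "fibroblast": "cell",
}


def ancestors(term):
    """Return the path from a term up to the root, starting with the term itself."""
    path = [term]
    while path[-1] in IS_A:
        path.append(IS_A[path[-1]])
    return path


def tree_distance(a, b):
    """Count is_a edges between two terms via their closest shared ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {term: i for i, term in enumerate(path_b)}
    for i, term in enumerate(path_a):
        if term in depth_in_b:
            return i + depth_in_b[term]
    raise ValueError("terms do not share an ancestor")


def related_terms(term, max_distance):
    """List all other terms within max_distance edges of the given term."""
    all_terms = set(IS_A) | set(IS_A.values())
    return sorted(
        t for t in all_terms
        if t != term and tree_distance(term, t) <= max_distance
    )
```

This is a literature-style tree distance only; a molecular distance derived from expression data (points 13 and 14) would need a different measure and is not sketched here.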


Terry will take e-mail addresses, do the ontology and check with providers.

RE markers, good to link marker and cell type – will be contacting people re markers.

Primary cells can be (a) ex vivo (no culture) vs (b) cells that need a few passages.

Some of this information not known for commercial cells.

What is the minimal information for each sample needed to make meaningful interpretations, in a data-compliant format? Without this, the sample is useless. What is the most complete information we can provide about any sample?
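
One way to picture a minimal-information check is a per-sample record with a short list of required fields. The sketch below is only an assumption for discussion: the field names, the class name and the choice of which fields count as the minimum are hypothetical, drawn loosely from the survey items (cell type/cell line ID, sex, age, tissue, culture conditions, passage, markers), and are not an agreed template.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SampleAnnotation:
    """Hypothetical per-sample record; field names are illustrative."""
    sample_id: str
    cell_type_or_line: str
    sex: Optional[str] = None
    donor_age: Optional[str] = None
    tissue: Optional[str] = None
    culture_conditions: Optional[str] = None
    passage_number: Optional[int] = None
    markers: List[str] = field(default_factory=list)


# Assumed minimum set; the working group has not fixed this list.
REQUIRED = ("sample_id", "cell_type_or_line", "sex", "culture_conditions")


def missing_required(sample):
    """Return the names of required fields that are empty or unset."""
    return [name for name in REQUIRED if getattr(sample, name) in (None, "")]


# Usage: a record that lacks culture conditions is flagged as incomplete.
example = SampleAnnotation("S1", "fibroblast", sex="female")
print(missing_required(example))
```

Optional fields such as markers or passage number can stay empty without failing the check, which matches the point that some information is not known for commercial cells.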

Three levels: (1) annotate on what we think it is; (2) do the gene expression profile; (3) go back to cell type and develop better annotation.

Missing samples

With respect to additional samples, these can be submitted but will take time.

  1. Early developmental time points: for mouse starts at E10; could get these for mouse, but for human this would be difficult from primaries because of very low quantities of RNA when starting with single cells; biggest amounts from 50 cells. What cell types might give this information? Fibroblasts keep memory of embryological programming. Muscle cell types that contribute to regeneration (satellite cells already in the collection) can be easily sorted.
  2. Brain cell types.
  3. Skin and skin appendages: hair follicles. Originally ordered commercial RNA from various parts of the hair follicle; keratinocytes, melanocytes (in vitro differentiation of neural crest cells). What about fresh keratinocytes with zero culturing?
  4. Additional time courses:
    a. Heat shock
    b. Time post operation/wounding for skin
    c. Diabetes related: chronic culture in high glucose
    d. Co-culture experiments
    e. Mixed lymphocyte culture
    f. Transdifferentiation
  5. For every single major organ, differentiated cells and stem cell types (liver, heart, kidney, pancreas).

Need phase 1 samples: data freeze mid April; samples by mid March. Time courses by end of August.

Primary disease samples are difficult: a known mutation or homogeneous disease is easier, but finding a novel marker for a heterogeneous disease needs many samples.

Summary

  1. Mailing list
  2. Get samples into cell ontology
  3. Terry to develop template for submission of information on cells
  4. Send Terry information available about the cell type and culture conditions etc.


List of perturbations: chronic hypoxia etc, prioritise samples.