SampleClassification: Difference between revisions

Revision as of 03:22, 1 April 2011

The link above will take you to a separate page expanding on all groups' analysis done so far, including:

1) Win

input data is gene expression
phylogenetic algorithm is neighbour-joining (Manhattan distance)

2) Robin

input data is level 2 promoter expression
phylogenetic technique is neighbour-joining (KL divergence distance)

3) Owen

input data is gene expression (presence/absence using a threshold)
phylogenetic technique is maximum likelihood

4) Owen --preliminary result--

input data is presence/absence of TF network edges based on Motif activity (same as FANTOM4 Nature Genetics paper)
phylogenetic technique is maximum likelihood

5) Hai

input data is domain-level expression (converted from gene-level)
phylogenetic technique is average linkage clustering

6) Kawaji

input data is gene expression
phylogenetic technique is average linkage clustering (Pearson correlation)
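The three distance measures named in the list above can be sketched as follows. This is only a minimal illustration, not any group's actual code: the symmetrised form of the KL divergence, the pseudocount, and the use of 1 - r as the Pearson-based distance are all assumptions, since the list does not say how each group defined them.

```python
import math

def manhattan(x, y):
    """Manhattan (L1) distance between two expression profiles."""
    return sum(abs(a - b) for a, b in zip(x, y))

def _normalise(x, pseudocount):
    # Turn raw expression values into a probability distribution.
    # The pseudocount (an assumption) avoids log(0) for unexpressed features.
    shifted = [v + pseudocount for v in x]
    total = sum(shifted)
    return [v / total for v in shifted]

def symmetric_kl(x, y, pseudocount=1e-9):
    """Symmetrised KL divergence, KL(p||q) + KL(q||p) (assumed form)."""
    p = _normalise(x, pseudocount)
    q = _normalise(y, pseudocount)
    kl_pq = sum(a * math.log(a / b) for a, b in zip(p, q))
    kl_qp = sum(b * math.log(b / a) for a, b in zip(p, q))
    return kl_pq + kl_qp

def pearson_distance(x, y):
    """1 - Pearson correlation, one common way to turn r into a distance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)
```

Note that two profiles differing only by a scale factor are far apart under the Manhattan distance but have a Pearson distance of about zero, which is one reason trees built from the same data can still differ.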

Comparing these trees (attached and numbered), we can see that 1 and 6 use the same input data, and 2 is very similar (adding in non-coding). 3 takes the same data as 1 and 6 but applies a threshold to convert it to binary (discarding information), and 4 uses edge information rather than node information, and only for transcription factors. 5 uses the same data as 1 and 6, but converted from genes to domains.
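To illustrate what "discarding information" means for tree 3, here is a small sketch of thresholding an expression profile into presence/absence. The function name, the threshold value and the sample values are hypothetical and are not taken from Owen's analysis.

```python
def to_presence_absence(profile, threshold):
    """Convert expression values to 1 (present) or 0 (absent).

    The threshold is a free parameter; the value used below is illustrative.
    """
    return [1 if value >= threshold else 0 for value in profile]

# Two hypothetical samples with clearly different expression levels...
sample_a = [0.5, 12.0, 300.0]
sample_b = [0.5, 11.0, 15.0]

# ...become indistinguishable once thresholded.
binary_a = to_presence_absence(sample_a, threshold=10.0)
binary_b = to_presence_absence(sample_b, threshold=10.0)
```

Here both samples map to the same binary vector, so any distance or likelihood computed on the binary data treats them as identical.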

1 and 2 use the same phylogenetic algorithm, 3 and 4 use a different one, and 5 and 6 use a third. The information contained in the input data should make more difference than the phylogenetic algorithm applied to it.
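As a rough sketch of the third family of methods, average linkage clustering can be written in a few lines of plain Python. The function name, the sample labels and the distances are made up for illustration, and the tree is returned as nested tuples without branch lengths, which is a simplification.

```python
from itertools import combinations

def average_linkage(labels, dist):
    """Agglomerative clustering with average linkage.

    labels: list of sample names.
    dist:   dict mapping frozenset({a, b}) to the distance between a and b.
    Returns the tree as nested tuples of sample names.
    """
    # Each cluster is keyed by its subtree and stores its leaf members.
    clusters = {label: [label] for label in labels}
    while len(clusters) > 1:
        best = None
        for a, b in combinations(list(clusters), 2):
            # Average of all pairwise distances between the two clusters.
            total = sum(dist[frozenset((i, j))]
                        for i in clusters[a] for j in clusters[b])
            mean = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or mean < best[0]:
                best = (mean, a, b)
        _, a, b = best
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[(a, b)] = merged
    return next(iter(clusters))

# Hypothetical distances: samples of the same type are close together.
labels = ["macrophage_1", "macrophage_2", "brain_1", "brain_2"]
pairs = {
    ("macrophage_1", "macrophage_2"): 1.0,
    ("brain_1", "brain_2"): 1.5,
    ("macrophage_1", "brain_1"): 8.0,
    ("macrophage_1", "brain_2"): 9.0,
    ("macrophage_2", "brain_1"): 8.5,
    ("macrophage_2", "brain_2"): 9.5,
}
dist = {frozenset(pair): d for pair, d in pairs.items()}
tree = average_linkage(labels, dist)
```

Swapping in a different distance only changes the `dist` table, which makes it easy to see how much of a tree's shape comes from the input data and how much from the clustering step.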

If the trees are compared, the results are in general very similar in 1, 2, 5 and 6. They all group obvious clades such as macrophage, brain and blood. The trees in 3 and 4 fail to separate primary cells from tissues, although they do separate the obvious clades much as the others do. The tree in 5 is unique in that it separates organisms perfectly, as well as separating the sample types and the obvious clades within organisms (where available).
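Whether a tree "groups" a clade or "separates" two sample types can be checked mechanically. The sketch below assumes a rooted tree stored as nested tuples of sample names, which is a simplification (neighbour-joining trees are unrooted), and the sample names are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, tuple):
        found = set()
        for child in tree:
            found |= leaves(child)
        return found
    return {tree}

def forms_clade(tree, group):
    """True if some subtree contains exactly the samples in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, tuple):
        return any(forms_clade(child, group) for child in tree)
    return False

# Hypothetical tree: macrophage samples cluster together, but one blood
# sample sits inside the brain subtree.
example_tree = (("macrophage_1", "macrophage_2"),
                (("brain_1", "blood_1"), ("brain_2", "blood_2")))

macrophage_ok = forms_clade(example_tree, ["macrophage_1", "macrophage_2"])
brain_ok = forms_clade(example_tree, ["brain_1", "brain_2"])
```

Running the same check for primary cells versus tissues, or for each organism, on all six trees would turn the qualitative comparison above into a simple table.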

@@ Line 1: / Line 1: @@
+The link above will take you to a separate page expanding on all groups' analysis done so far, including:
-1) Win
+*1) Win
 input data is gene expression
+<br>
 phylogenetic algorithm is neighbour-joining (Manhattan distance)
-2) Robin
+*2) Robin
 input data is level 2 promotor expression
+<br>
 phylogenetic technique is neighbour-joining (KL divergence distance)
-3) Owen
+*3) Owen
 input data is gene expression (presence/absence using a threshold)
+<br>
 phylogenetic technique is maximum likelihood
-4) Owen  --preliminary result--
+*4) Owen  --preliminary result--
 input data is presence/absence of TF network edges based on Motif activity (same as FANTOM4 Nature Genetics paper)
+<br>
 phylogenetic technique is maximum likelihood
-5) Hai
+*5) Hai
 input data is domain-level expression (converted from gene-level)
+<br>
 phylogenetic technique is average linkage clustering
-6) Kawaji-san
+*6) Kawaji
 input data is gene expression
+<br>
 phylogenetic technique is average linkage clustering (Pearson correlation)
 Comparing these trees (attached and numbered), we can see that 1 and 6 use the same input data, and 2 is very similar (adding in non-coding). 3 takes the same data from 1 and 6 but applies a threshold to convert it to binary (discarding information), and 4 is edge information instead of node information only for transcription factors. 5 is the same data from 1 and 6, but converted to to domains from genes.
+<p>
 1 and 2 use the same phylogenetic algorithm, 3 and 4 use a different one, and 5 and 6 use a third. The information contained in the input data should make more difference than the phylogenetic algorithm applied to it.
+<p>
 If the trees are compared, the results are in general very similar in 1,2, 6 and 5. They all group obvious clades such as macrophage, brain and blood. The trees in 3 and 4 fail to separate primary cells from tissues although they do separate the obvious clades similar to the others. The tree in 5 is unique in that it separates organisms perfectly as well as separating the sample type and obvious clades within organisms (where available).