SampleClassification: Difference between revisions
Revision as of 03:22, 1 April 2011
The link above will take you to a separate page expanding on all groups' analysis done so far, including:
- 1) Win
input data is gene expression
phylogenetic algorithm is neighbour-joining (Manhattan distance)
- 2) Robin
input data is level 2 promoter expression
phylogenetic technique is neighbour-joining (KL divergence distance)
- 3) Owen
input data is gene expression (presence/absence using a threshold)
phylogenetic technique is maximum likelihood
- 4) Owen --preliminary result--
input data is presence/absence of TF network edges based on Motif activity (same as FANTOM4 Nature Genetics paper)
phylogenetic technique is maximum likelihood
- 5) Hai
input data is domain-level expression (converted from gene-level)
phylogenetic technique is average linkage clustering
- 6) Kawaji
input data is gene expression
phylogenetic technique is average linkage clustering (Pearson correlation)
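The three distance measures named in the list above (Manhattan, KL divergence, Pearson correlation) can be sketched as follows. This is a minimal illustration, not any group's actual code: the function names, the pseudocount, and the choice to symmetrise the KL divergence so that it behaves like a distance are all assumptions.

```python
import math


def manhattan(a, b):
    """Sum of absolute differences between two expression profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson_distance(a, b):
    """1 - Pearson correlation; 0 means perfectly correlated profiles.

    Assumes neither profile is constant (otherwise the variance is zero).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def _normalise(profile, pseudocount):
    """Turn a non-negative profile into a probability distribution."""
    shifted = [value + pseudocount for value in profile]
    total = sum(shifted)
    return [value / total for value in shifted]


def symmetric_kl(a, b, pseudocount=1e-9):
    """KL(p||q) + KL(q||p) on normalised profiles.

    Symmetrising and adding a pseudocount are assumptions made here so
    that the measure is usable as a distance and zeros do not break log().
    """
    p = _normalise(a, pseudocount)
    q = _normalise(b, pseudocount)
    return sum(
        pi * math.log(pi / qi) + qi * math.log(qi / pi)
        for pi, qi in zip(p, q)
    )
```

Note that Pearson distance ignores scale (a profile and its doubled copy are at distance 0), while Manhattan distance does not, so the choice of measure can matter even when the input data are identical.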
Comparing these trees (attached and numbered), we can see that 1 and 6 use the same input data, and 2 is very similar (adding in non-coding). 3 takes the same data as 1 and 6 but applies a threshold to convert it to binary presence/absence, which discards information. 4 uses edge information rather than node information, and only for transcription factors. 5 uses the same data as 1 and 6, converted from genes to domains.
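To illustrate why thresholding to presence/absence discards information, here is a small sketch. The threshold value and the sample profiles are made up for illustration and are not taken from any of the analyses.

```python
def binarise(profile, threshold):
    """Convert expression values to presence (1) / absence (0)."""
    return [1 if value >= threshold else 0 for value in profile]


# Hypothetical expression profiles for two samples (made-up numbers).
sample_a = [0.2, 5.0, 120.0]
sample_b = [0.9, 8.0, 15.0]

binary_a = binarise(sample_a, threshold=1.0)
binary_b = binarise(sample_b, threshold=1.0)

# The quantitative profiles differ, but the binary ones do not.
print(binary_a, binary_b, binary_a == binary_b)
```

After thresholding, the two samples become indistinguishable even though their expression levels differ considerably on the third gene.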
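1 and 2 use the same phylogenetic algorithm (neighbour-joining, though with different distance measures), 3 and 4 use a different one (maximum likelihood), and 5 and 6 use a third (average linkage clustering). The information contained in the input data should make more difference than the phylogenetic algorithm applied to it.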
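For reference, average linkage clustering can be sketched in a few lines. This is a naive illustration under stated assumptions, not the code used for trees 5 or 6: the function name, the nested-tuple tree representation, the sample names, and the toy profiles are all hypothetical, and Manhattan distance is used here only to keep the example short.

```python
from itertools import combinations


def average_linkage(labels, dist):
    """Build a tree (nested tuples) by average linkage clustering.

    labels: list of sample names
    dist:   function (name_a, name_b) -> distance between two samples
    """
    # Each cluster is (subtree, list of member sample names).
    clusters = [(label, [label]) for label in labels]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            members_i = clusters[i][1]
            members_j = clusters[j][1]
            # Average of all pairwise distances between the two clusters.
            total = sum(dist(a, b) for a in members_i for b in members_j)
            average = total / (len(members_i) * len(members_j))
            if best is None or average < best[0]:
                best = (average, i, j)
        _, i, j = best
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Toy profiles (made-up numbers, hypothetical sample names).
profiles = {
    "macrophage_1": [10, 1, 0],
    "macrophage_2": [9, 2, 0],
    "brain_1": [0, 1, 10],
    "brain_2": [1, 0, 9],
}


def toy_distance(a, b):
    """Manhattan distance between two toy profiles."""
    return sum(abs(x - y) for x, y in zip(profiles[a], profiles[b]))


tree = average_linkage(list(profiles), toy_distance)
print(tree)
```

Swapping `toy_distance` for a different measure, or feeding in a different representation of the same samples, is one way to check how much the tree depends on the input data versus the algorithm.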
When the trees are compared, the results are in general very similar in 1, 2, 5 and 6. They all group obvious clades such as macrophage, brain and blood. The trees in 3 and 4 fail to separate primary cells from tissues, although they do separate the obvious clades much as the others do. The tree in 5 is unique in that it separates organisms perfectly, as well as separating the sample type and the obvious clades within organisms (where available).