SampleClassification: Difference between revisions

Revision as of 03:22, 1 April 2011

The link above leads to a separate page that expands on the analyses done so far by all groups, including:

  • 1) Win

input data is gene expression
phylogenetic algorithm is neighbour-joining (Manhattan distance)

  • 2) Robin

input data is level 2 promoter expression
phylogenetic technique is neighbour-joining (KL divergence distance)

  • 3) Owen

input data is gene expression (presence/absence using a threshold)
phylogenetic technique is maximum likelihood

  • 4) Owen --preliminary result--

input data is presence/absence of TF network edges based on motif activity (same as the FANTOM4 Nature Genetics paper)
phylogenetic technique is maximum likelihood

  • 5) Hai

input data is domain-level expression (converted from gene-level)
phylogenetic technique is average linkage clustering

  • 6) Kawaji

input data is gene expression
phylogenetic technique is average linkage clustering (Pearson correlation)
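
The three distance measures named in the list above can be sketched as follows. This is only an illustration, not the code any group actually ran: the sample names and expression values are made up, and because KL divergence is not symmetric, the symmetrised form and the pseudocount used here are assumptions (the list does not say how tree 2 turns it into a distance).

```python
import math

# Hypothetical toy expression profiles (one value per gene); not real data.
profiles = {
    "macrophage_a": [10, 0, 5, 1],
    "macrophage_b": [9, 1, 6, 1],
    "brain_a": [0, 12, 1, 8],
}

def manhattan(a, b):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def pearson_distance(a, b):
    """1 - Pearson correlation. Undefined for a constant profile."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)

def symmetric_kl(a, b, pseudocount=1.0):
    """KL(p||q) + KL(q||p) after normalising each profile to sum to 1.

    Assumption: the symmetrisation and the pseudocount (which avoids
    log(0) for unexpressed genes) are choices made for this sketch.
    """
    def normalise(values):
        shifted = [v + pseudocount for v in values]
        total = sum(shifted)
        return [v / total for v in shifted]

    p, q = normalise(a), normalise(b)
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
               for pi, qi in zip(p, q))

if __name__ == "__main__":
    ref = profiles["macrophage_a"]
    for name, profile in profiles.items():
        print(name, manhattan(ref, profile),
              round(pearson_distance(ref, profile), 3),
              round(symmetric_kl(ref, profile), 3))
```

In this toy example all three measures place the two macrophage profiles closer to each other than to the brain profile, although they weight the differences differently.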


Comparing these trees (attached and numbered), we can see that 1 and 6 use the same input data, and 2 is very similar (adding in non-coding). Tree 3 takes the same data as 1 and 6 but applies a threshold to convert it to binary presence/absence, which discards information. Tree 4 uses edge information instead of node information, and only for transcription factors. Tree 5 uses the same data as 1 and 6, but converted from genes to domains.
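
To make the point about discarded information concrete, here is a minimal sketch of thresholding expression values to presence/absence. The threshold value, the function names and the two profiles are hypothetical; the page does not state which threshold was used for tree 3.

```python
def binarise(profile, threshold):
    """Convert expression values to presence (1) / absence (0)."""
    return [1 if value >= threshold else 0 for value in profile]

def manhattan(a, b):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Two hypothetical samples with clearly different expression levels.
sample_x = [10, 0, 5, 1]
sample_y = [3, 0, 50, 1]

binary_x = binarise(sample_x, threshold=2)
binary_y = binarise(sample_y, threshold=2)

# On the quantitative data the samples differ; after thresholding they
# are indistinguishable, so that difference cannot influence the tree.
print(manhattan(sample_x, sample_y), manhattan(binary_x, binary_y))
```

Any two samples that express the same set of genes above the threshold collapse to the same binary profile, whatever their expression levels were.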

Trees 1 and 2 use the same phylogenetic algorithm (neighbour-joining), 3 and 4 use a different one (maximum likelihood), and 5 and 6 use a third (average linkage clustering). The information contained in the input data should make more difference than the phylogenetic algorithm applied to it.
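
As an illustration of the simplest of these algorithms, average linkage clustering can be sketched in a few lines. This is a naive version written for readability, not the implementation used for trees 5 and 6; the profiles are made up, the choice of Manhattan distance is arbitrary, and the nested-tuple tree representation is an assumption of this sketch.

```python
from itertools import combinations

# Hypothetical toy expression profiles; not real data.
profiles = {
    "macrophage_a": [10, 0, 5, 1],
    "macrophage_b": [9, 1, 6, 1],
    "brain_a": [0, 12, 1, 8],
    "brain_b": [1, 11, 0, 9],
}

def manhattan(a, b):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def average_linkage(names, leaf_distance):
    """Repeatedly merge the two clusters with the smallest mean
    pairwise leaf distance; return the tree as nested tuples."""
    # Each cluster is (subtree, list of leaf names in that subtree).
    clusters = [(name, [name]) for name in names]

    def mean_distance(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(leaf_distance(a, b) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: mean_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

tree = average_linkage(
    list(profiles),
    lambda a, b: manhattan(profiles[a], profiles[b]),
)
print(tree)
```

Swapping the distance function (for example to a correlation-based one) changes only the second argument, which makes it easy to separate the effect of the distance from the effect of the input data.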

Comparing the trees themselves, the results are in general very similar for 1, 2, 5 and 6: they all group obvious clades such as macrophage, brain and blood. The trees in 3 and 4 fail to separate primary cells from tissues, although they do recover the same obvious clades as the others. The tree in 5 is unique in that it separates organisms perfectly, as well as separating the sample types and the obvious clades within organisms (where available).
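
A check such as "does this tree group all macrophage samples together" can also be made mechanically instead of by eye. The sketch below assumes a rooted tree stored as nested tuples of sample names; that representation, the sample names and the function names are all hypothetical and not taken from the attached trees.

```python
def leaves(tree):
    """Return the set of sample names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))

def forms_clade(tree, group):
    """True if some subtree contains exactly the samples in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(forms_clade(child, group) for child in tree)

# Hypothetical rooted tree: macrophages cluster cleanly, but a blood
# sample sits between the two brain samples.
example_tree = (
    ("macrophage_a", "macrophage_b"),
    ("brain_a", ("brain_b", "blood_a")),
)

print(forms_clade(example_tree, ["macrophage_a", "macrophage_b"]))
print(forms_clade(example_tree, ["brain_a", "brain_b"]))
```

Running the same check on each of the six trees, with groups defined by sample type or organism, would give a simple table of which clades each method recovers.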