User:Tlassmann/rRNA filtering

Revision as of 12:41, 17 January 2011 by Tlassmann

Purpose

Remove reads corresponding to rRNA from Helicos CAGE datasets.

Method

Because Helicos reads have a high error rate, including many insertion and deletion errors, the only viable option was to match reads against the rRNA reference sequences (U13369.1) using a non-heuristic alignment algorithm. Because of the amount of data, an SSE-parallelized version of Myers' bit-parallel algorithm was implemented.

All reads that match the reference rRNA sequences with up to 2 errors (mismatches, insertions, or deletions) are discarded at this step.
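
To illustrate the matching step, here is a minimal Python sketch of Myers' bit-parallel approximate matching. It is not the SSE implementation described above: it relies on Python's arbitrary-size integers instead of machine words or SSE registers, it processes one read at a time, and it only checks the forward strand. The function names `min_edit_distance` and `is_rrna` are hypothetical and chosen for this example only.

```python
def min_edit_distance(read, reference):
    """Smallest edit distance between `read` and any substring of `reference`.

    Uses Myers' bit-vector recurrence for approximate string matching,
    with the read as the pattern and the reference as the text.
    """
    m = len(read)
    if m == 0:
        return 0

    # One bit mask per character: bit i is set if read[i] == character.
    peq = {}
    for i, ch in enumerate(read):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    mask = (1 << m) - 1   # keep every vector m bits wide
    high = 1 << (m - 1)   # bit for the last row of the DP column
    pv, mv = mask, 0      # positive / negative vertical deltas
    score = m             # distance of the whole read in the current column
    best = m

    for ch in reference:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask   # positive horizontal deltas
        mh = pv & xh                    # negative horizontal deltas
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        # Shift without setting the low bit: a match may start anywhere.
        ph = (ph << 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        if score < best:
            best = score
    return best


def is_rrna(read, reference, max_errors=2):
    """True if the read aligns somewhere in the reference with <= max_errors edits."""
    return min_edit_distance(read, reference) <= max_errors
```

For example, `is_rrna("ACGT", "TTAGGTTT", max_errors=1)` is true because the read matches `AGGT` with a single substitution. A production version would also have to decide how to handle the reverse strand and ambiguous bases, which this page does not specify.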

Input

Helicos reads in FASTA format.

Output

Reads not matching rRNA.
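
The input and output described above could be wired together roughly as follows. This is a sketch under assumptions, not the actual tool: `read_fasta` and `filter_fasta` are hypothetical helpers, and the rRNA test is passed in as a callable (`is_rrna`) so that any matcher, such as a Myers-based one with a 2-error threshold, can be plugged in.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_fasta(in_handle, out_handle, is_rrna):
    """Write reads for which is_rrna(sequence) is false; return (kept, discarded)."""
    kept = discarded = 0
    for header, seq in read_fasta(in_handle):
        if is_rrna(seq):
            discarded += 1
        else:
            out_handle.write(f">{header}\n{seq}\n")
            kept += 1
    return kept, discarded


# Toy usage with an exact-substring test standing in for the real matcher.
reads = io.StringIO(">r1\nACGT\n>r2\nGGGG\n")
kept_reads = io.StringIO()
counts = filter_fasta(reads, kept_reads, lambda seq: "GGGG" in seq)
```

In the toy usage, `r2` is treated as rRNA and dropped, so only `r1` is written to `kept_reads`.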