CAGEscan mapping protocol: Difference between revisions

m (Cosmetic.)
(Be more verbose on each step of the CAGEscan pipeline (to be continued).)
Line 1: Line 1:
== Sample splitting and linker removal ==
== Input ==


5' and 3' paired-end fastq files
The input is a pair of FASTQ files (5′ and 3′ reads) from an Illumina sequencer.


=== 5′ ===
Line 11: Line 11:
The first 6 bases of the 3′ reads are trimmed because they derive from the random part (N6) of the reverse-transcription primer and therefore may not reflect the RNA sequence accurately: the reverse transcriptase tolerates mismatches even on the last two bases. See [http://pubmed.gov/9973624 Mizuno et al., 1999] for an example of priming over mismatches.


== Output ==
=== Output ===

Pairs of FASTQ files (5′ and 3′), where all the reads originate from the same RNA sample and all the linkers have been trimmed.

== Final Output ==


Mapped paired-end tags in BAM format

Revision as of 11:00, 31 March 2011

Sample splitting and linker removal

The input is a pair of FASTQ files (5′ and 3′ reads) from an Illumina sequencer.

5′

The first 9 bases of the 5′ reads are trimmed: the first 6 are the index sequence (“barcode”) and the next 3 are the linker (GGG).
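
The 5′ trimming can be sketched as follows. This is only an illustration, not the code used by the CAGEscan pipeline: the function name `trim_five_prime` and the (header, sequence, quality) record layout are assumptions. The barcode is returned separately so that it can be used to assign the read to a sample, and a linker mismatch is only reported, because the text above does not say how such reads are handled.

```python
BARCODE_LEN = 6   # index sequence ("barcode")
LINKER = "GGG"    # linker that follows the barcode


def trim_five_prime(header, seq, qual):
    """Split a 5' read into barcode, linker check, and trimmed record.

    Hypothetical helper: the name and record layout are illustrative.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    cut = BARCODE_LEN + len(LINKER)             # 9 bases in total
    barcode = seq[:BARCODE_LEN]
    linker_ok = seq[BARCODE_LEN:cut] == LINKER  # report, do not filter
    # Trim the quality string by the same amount so the record stays valid.
    return barcode, linker_ok, (header, seq[cut:], qual[cut:])
```

For example, a read starting with `ACGTACGGG` would yield the barcode `ACGTAC`, a matching linker, and a record that begins at the tenth base.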

3′

The first 6 bases of the 3′ reads are trimmed because they derive from the random part (N6) of the reverse-transcription primer and therefore may not reflect the RNA sequence accurately: the reverse transcriptase tolerates mismatches even on the last two bases. See Mizuno et al., 1999 for an example of priming over mismatches.
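
A minimal sketch of the 3′ trimming, under the same assumptions as for the 5′ reads (a record is a (header, sequence, quality) tuple; `trim_three_prime` is a hypothetical name, not a pipeline function):

```python
N6_LEN = 6  # random part (N6) of the reverse-transcription primer


def trim_three_prime(header, seq, qual):
    """Drop the first 6 bases of a 3' read, which derive from the primer.

    Hypothetical helper: the name and record layout are illustrative.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Sequence and quality are trimmed together to keep them aligned.
    return header, seq[N6_LEN:], qual[N6_LEN:]
```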

Output

Pairs of FASTQ files (5′ and 3′), where all the reads originate from the same RNA sample and all the linkers have been trimmed.
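
How the two trimming steps and the sample splitting could fit together is sketched below. It assumes exact barcode matching and a caller-supplied `barcode_to_sample` table, neither of which is specified above; pairs with an unknown barcode are kept under the key `None` rather than silently dropped.

```python
from collections import defaultdict


def split_pairs_by_sample(pairs, barcode_to_sample):
    """Group trimmed read pairs by sample, keyed on the 5' barcode.

    pairs: iterable of ((h5, s5, q5), (h3, s3, q3)) record tuples.
    barcode_to_sample: dict mapping 6-base barcodes to sample names
    (a hypothetical table, matched exactly).
    """
    by_sample = defaultdict(list)
    for (h5, s5, q5), (h3, s3, q3) in pairs:
        sample = barcode_to_sample.get(s5[:6])  # None if barcode unknown
        five = (h5, s5[9:], q5[9:])    # drop barcode (6) + linker (3)
        three = (h3, s3[6:], q3[6:])   # drop the N6-derived bases
        by_sample[sample].append((five, three))
    return dict(by_sample)
```

Each value in the returned dictionary corresponds to the contents of one pair of output FASTQ files.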

Final Output

Mapped paired-end tags in BAM format