NanoCAGE
Summary
nanoCAGE is a CAGE method for low quantities of starting material, based on template switching instead of CAP trapper. If the cDNAs are random-primed and their 3′ ends are also sequenced (paired-end sequencing), the method is called CAGEscan.
Currently, only CAGEscan libraries are produced for FANTOM5 (no single-end).
- Protocol summary: OP-SOLEXA-nanoCAGE-Direct-v1.4.
- Ume blossom meeting presentation: nanoCAGE.pdf
nanoCAGE and intron / exon painting
The cap specificity of the nanoCAGE libraries derives from a two-step mechanism:
- The reverse transcriptase reverse-transcribes the 5′ cap, thereby adding a couple of Cs at the 3′ end of the first-strand cDNAs.
- The template-switching oligonucleotide, which has three ribo-Gs at its 3′ end, hybridises with the first-strand cDNA and provides a linker that is added to the cDNA by reverse transcription.
It was discovered that when the end of the template-switching oligonucleotide is changed to three ribo-Cs, the resulting library shows mostly exon painting and 5′ ends of snoRNAs.
Aligning nanoCAGE data
The cap specificity of nanoCAGE is lower than that of CAGE protocols using CAP trapper. We usually use rRNAdust to remove all reads that match the ribosomal DNA unit before aligning to the genome. nanoCAGE data can be aligned using BWA with default options. The following commands align all the FASTQ files in a folder against a genome after rRNA removal, then sort and index the alignments in BAM format.
This is a simple example that is not optimised for production use (no threads, no sanity check to see if the filesystem is local or not, …).
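RDNA=/analysisdata/genomes/rDNA/human_rDNA_U13369.1.fa
GENOME=/analysisdata/genomes/hg19_male.fa
for FASTQ in *.fq
do
  LIB=$(basename "$FASTQ" .fq)
  # Remove reads matching the ribosomal DNA reference.
  rRNAdust "$RDNA" "$LIB.fq" > "$LIB.rRNAdust.fq"
  # Align the filtered reads (not the raw ones) with BWA default options.
  bwa aln -f "$LIB.sai" "$GENOME" "$LIB.rRNAdust.fq"
  # Convert to BAM and sort. "samtools sort - $LIB" is the legacy syntax;
  # with current samtools releases use "samtools sort -o $LIB.bam -" instead.
  bwa samse "$GENOME" "$LIB.sai" "$LIB.rRNAdust.fq" |
    samtools view -uS - |
    samtools sort - "$LIB"
  samtools index "$LIB.bam"
done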
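Since the example above has no sanity checks, the following is a minimal sketch of two that could be added around it: checking that the required programs are on the PATH, and reporting what percentage of reads rRNAdust removed from a library. The function names (require_tools, count_reads, percent_removed) are hypothetical helpers, not part of rRNAdust, BWA or samtools, and the read count assumes uncompressed FASTQ files with exactly four lines per record.

```shell
# Return non-zero if any of the named programs is missing from PATH.
require_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Missing required tool: $tool" >&2
            return 1
        fi
    done
}

# Count reads in an uncompressed FASTQ file (assumes 4 lines per record).
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Print the percentage of reads removed between a raw FASTQ file ($1)
# and its filtered counterpart ($2), with one decimal place.
percent_removed() {
    before=$(count_reads "$1")
    after=$(count_reads "$2")
    if [ "$before" -le 0 ]; then
        echo "No reads in $1" >&2
        return 1
    fi
    awk -v b="$before" -v a="$after" \
        'BEGIN { printf "%.1f\n", 100 * (b - a) / b }'
}

# Possible use inside the alignment loop (file names as in the example above):
#   require_tools rRNAdust bwa samtools || exit 1
#   percent_removed "$LIB.fq" "$LIB.rRNAdust.fq"
```

An unusually high percentage of removed reads may be worth checking before alignment, given that the cap specificity of nanoCAGE is lower than that of CAP trapper protocols.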