Grep beats FusionMap, FusionFinder and ChimeraScan

This is amazing!

I have never used FusionMap, FusionFinder or ChimeraScan myself, so I don’t know whether they belong in the class of fancily named methods with only marginal improvements that I have been known to rant about (on and on). But kudos to Panagopoulos et al. for showing the amazing power of grep:

The “Grep” Command But Not FusionMap, FusionFinder or ChimeraScan Captures the CIC-DUX4 Fusion Gene from Whole Transcriptome Sequencing Data on a Small Round Cell Tumor with t(4;19)(q35;q13).
Panagopoulos I, Gorunova L, Bjerkehagen B, Heim S.
PLoS One. 2014 Jun 20;9(6):e99439. doi: 10.1371/journal.pone.0099439. eCollection 2014.

Here is their abstract (with my emphasis):

Whole transcriptome sequencing was used to study a small round cell tumor in which a t(4;19)(q35;q13) was part of the complex karyotype but where the initial reverse transcriptase PCR (RT-PCR) examination did not detect a CIC-DUX4 fusion transcript previously described as the crucial gene-level outcome of this specific translocation. The RNA sequencing data were analysed using the FusionMap, FusionFinder, and ChimeraScan programs which are specifically designed to identify fusion genes. FusionMap, FusionFinder, and ChimeraScan identified 1017, 102, and 101 fusion transcripts, respectively, but CIC-DUX4 was not among them.

Since the RNA sequencing data are in the fastq text-based format, we searched the files using the “grep” command-line utility. The “grep” command searches the text for specific expressions and displays, by default, the lines where matches occur.

The “specific expression” was a sequence of 20 nucleotides from the coding part of the last exon 20 of CIC (Reference Sequence: NM_015125.3) chosen since all the so far reported CIC breakpoints have occurred here. Fifteen chimeric CIC-DUX4 cDNA sequences were captured and the fusion between the CIC and DUX4 genes was mapped precisely. New primer combinations were constructed based on these findings and were used together with a polymerase suitable for amplification of GC-rich DNA templates to amplify CIC-DUX4 cDNA fragments which had the same fusion point found with “grep”.

In conclusion, FusionMap, FusionFinder, and ChimeraScan generated a plethora of fusion transcripts but did not detect the biologically important CIC-DUX4 chimeric transcript; they are generally useful but evidently suffer from imperfect both sensitivity and specificity.

The “grep” command is an excellent tool to capture chimeric transcripts from RNA sequencing data when the pathological and/or cytogenetic information strongly indicates the presence of a specific fusion gene.
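
To make the idea concrete, here is a minimal Python sketch of the same kind of search: scan a FASTQ file record by record and keep the reads that contain a short probe sequence. It is an illustration only, not the authors' actual command. The probe and the three reads are made-up toy data (the real 20-nucleotide CIC sequence is not given in the abstract), the function names are my own, and the extra check for the reverse complement is my addition rather than something the abstract describes. It also assumes the common four-lines-per-record FASTQ layout.

```python
import io

# Map each base to its complement; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def grep_fastq(handle, probe):
    """Yield (read_id, sequence) for reads containing the probe on either strand.

    Assumes four lines per record: header, sequence, '+', qualities.
    """
    probe = probe.upper()
    probe_rc = reverse_complement(probe)
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline().strip().upper()
        handle.readline()  # '+' separator line, ignored
        handle.readline()  # quality line, ignored
        if probe in seq or probe_rc in seq:
            yield header.strip(), seq


# Hypothetical probe and toy reads, invented for this example.
PROBE = "GATTACAGGC"
TOY_FASTQ = """@read1
TTTGATTACAGGCAAA
+
IIIIIIIIIIIIIIII
@read2
AAGCCTGTAATCTT
+
IIIIIIIIIIIIII
@read3
ACGTACGTACGTACGT
+
IIIIIIIIIIIIIIII
"""

hits = list(grep_fastq(io.StringIO(TOY_FASTQ), PROBE))
for read_id, seq in hits:
    print(read_id, seq)
```

Here read1 carries the probe as written, read2 carries its reverse complement, and read3 carries neither. For a real file you would pass an open file handle instead of the `io.StringIO` wrapper. Like grep, this only finds exact matches, so a read with a sequencing error inside the probe region would be missed.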



