Grep beats FusionMap, FusionFinder and ChimeraScan

This is amazing!

I have never used FusionMap, FusionFinder or ChimeraScan myself, so I don’t know if they belong to the class of fancily named methods with only marginal improvements that I have been known to rant about (on and on) – but kudos to Panagopoulos et al. for showing the amazing power of grep:

The “Grep” Command But Not FusionMap, FusionFinder or ChimeraScan Captures the CIC-DUX4 Fusion Gene from Whole Transcriptome Sequencing Data on a Small Round Cell Tumor with t(4;19)(q35;q13).
Panagopoulos I, Gorunova L, Bjerkehagen B, Heim S.
PLoS One. 2014 Jun 20;9(6):e99439. doi: 10.1371/journal.pone.0099439. eCollection 2014.



Methods vs Insights #4: The four stages of a project (and the fifth you should avoid)

Methods vs Insights is back. Today with a discussion of general research practice.

Most projects in my lab take years from start to finish. So it is important for me to manage the expectations my students and postdocs may have. Here is a plot I have developed to discuss the different stages of a scientific project with them and to prepare them for what’s ahead.

The four stages of a scientific project: Explore! Dig! Refine! Sell! And the stage you want to avoid: Waste! Plus the prevalent emotion in each stage and the key skill you will need to successfully navigate it. (The x-axis is time, the y-axis is the work you’ve put in.)



Methods vs Insights #3: the data don’t fall from the sky

Methods vs Insights is back! In the first post on this topic, I distinguished between COMPUTATIONAL biologists and computational BIOLOGISTS. The boundaries between the two groups are blurred and my own group has people with computational and biological backgrounds working on very similar problems.

Having The Talk

But when a new team member joins us fresh out of a computer science or statistics degree, I need to have The Talk with them. The Talk about how our work at a biomedical research institute differs from the work in a computer science department. The Talk about how to get into journals with an impact factor bigger than 5.

I generally start by sketching a plot on my whiteboard, which looks like this (yes, that’s true, my hand-drawn plots look as if they came fresh out of Illustrator):

The difference between biomedical research and methodological research.



An embargo on marginal improvements

There is an excellent discussion of an excellent post on marginal improvements over at biomickwatson. He calls for an embargo on short read alignment tools, because there are already 70+ of them out there.

I can’t help but say – I’m sorry, but isn’t this a waste of time, both yours and mine?

I have nothing against the authors of the new tool, whom I am sure are excellent scientists. […] But still, rather than write another tool, why not contribute to the codebase of an existing tool? If BWA is not accurate enough for you, then branch the code and make it so; if Stampy is too slow, speed it up. *

That reminds me of my own rant against marginal improvements: ‘Spare me your method, show me your finding’. What is true for aligners is true for the rest of bioinformatics too. I have seen thousands of microarray clustering, network reconstruction and enrichment analysis papers – all of them rephrasing the same small number of ideas.

You need to read the comments thread under Mick’s post. Well-argued and diverse opinions. I particularly liked this comment:

It is a false and wasteful economy to value a paper over the software it describes, but will the madness never end?



Methods vs Insights #2: how the Pros do it

Welcome back! It’s Methods versus Insights – part 2.

In the last post in this series we discussed that there are too many methods in computational biology. Or to be more precise, that there are too many marginal improvements on existing methods.

Don’t get me wrong, methodological research can be great. Take this example: In 2002 Trey Ideker proposed a method to find active modules in networks. As so often, Trey was way ahead of everyone else in identifying this problem, and, as so often, the method he proposed was a crude heuristic, without any guarantees on how close you come to the best possible solution. Some time later, Gunnar Klau and Tobias Müller together with their teams proposed an exact solution and implemented it in a software package called BioNet. They got a prize at ISMB for it and my group is using their approach in many applications. Substituting a heuristic with an exact solution is not a marginal improvement but solving a problem: computational biology as it should be.

But most methodological advances go unnoticed (unless you are at the Broad, get them into Science, and produce a video for them). Thus, the question for today is: If you are a computational biologist like me, how do you get papers published in the journals we like best?



Spare me your method; show me your finding!


Developing statistical and computational methods is fun…

Figuring out the mathematics, making the algorithms more efficient, using every programming trick in the book: it’s like producing a piece of art. The rush of adrenaline when you discover a bug; the satisfaction when it all comes together and does what it is supposed to. Faster and more accurate, hopefully, than all competitors. Victory!

I can completely understand why so many people in computational biology are busy developing more and more methods. I even have some of my own. It’s fun and an intellectual challenge – the scientific equivalent of crossword puzzles.

… but computational biology has too many methods already

But more and more I find myself wondering if I should indulge in this fun exercise. Computational biology has too many methods already. Only a tiny number make an impact. Most are just marginal improvements to existing methods. Not what you would call a game-changer.
