Methods vs Insights #2: how the Pros do it

Welcome back! It’s Methods versus Insights – part 2.

In the last post in this series I argued that there are too many methods in computational biology. Or to be more precise, that there are too many marginal improvements on existing methods.

Don’t get me wrong, methodological research can be great. Take this example: In 2002 Trey Ideker proposed a method to find active modules in networks. As so often, Trey was way ahead of everyone else in identifying this problem, and, as so often, the method he proposed was a crude heuristic, without any guarantees on how close you come to the best possible solution. Some time later, Gunnar Klau and Tobias Müller together with their teams proposed an exact solution and implemented it in a software package called BioNet. They got a prize at ISMB for it and my group is using their approach in many applications. Replacing a heuristic with an exact solution is not a marginal improvement; it solves the problem. That is computational biology as it should be.
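To make the difference between a heuristic and an exact solution concrete, here is a toy sketch in Python. It is neither Ideker's heuristic nor the algorithm behind BioNet; the three-node network, the node scores and the function names are all made up for illustration. The task is to find the connected set of nodes with the highest total score. A simple greedy search gets stuck, while exhaustive enumeration (feasible only on tiny networks) finds the optimum.

```python
from itertools import combinations

# Hypothetical toy network: a path a - b - c with made-up node scores.
weights = {"a": 5.0, "b": -1.0, "c": 4.0}
edges = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}


def score(nodes):
    """Total score of a set of nodes."""
    return sum(weights[n] for n in nodes)


def is_connected(nodes):
    """Check that the nodes form one connected piece of the network."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbour in edges[current] & nodes:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def greedy_module():
    """Heuristic: start at the best node, keep adding the best-scoring
    neighbour, and stop as soon as no neighbour improves the total."""
    module = {max(weights, key=weights.get)}
    while True:
        frontier = {nb for n in module for nb in edges[n]} - module
        best = max(frontier, key=weights.get, default=None)
        if best is None or weights[best] <= 0:
            return module
        module.add(best)


def exact_module():
    """Exact: enumerate every connected node set and keep the best one."""
    candidates = (
        set(combo)
        for size in range(1, len(weights) + 1)
        for combo in combinations(weights, size)
        if is_connected(combo)
    )
    return max(candidates, key=score)


print(sorted(greedy_module()), score(greedy_module()))
print(sorted(exact_module()), score(exact_module()))
```

The greedy search never accepts the negative-scoring node b, so it never reaches c on the other side. The exhaustive search accepts the small loss to connect two high-scoring nodes. Enumeration like this explodes on real networks, which is why an exact method that still scales is a real contribution.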

But most methodological advances go unnoticed (unless you are at the Broad, get them into Science, and produce a video for them). Thus, the question for today is: If you are a computational biologist like me, how do you get papers published in the journals we like best?

Of course, if I had a really good recipe, my own publication list would look much better than it does. Luckily, there are success stories out there. Here are three from my field, cancer genomics, and maybe if we look at them hard enough we can find out how they did it:

#1: A genomic strategy to elucidate modules of oncogenic pathway signaling networks

My first example is A genomic strategy to elucidate modules of oncogenic pathway signaling networks by Joe Nevins and Mike West from Duke. Using a statistical factor and regression model (called ‘BFRM’) they had published earlier, Nevins’ and West’s teams constructed 20 factors as signatures for transcriptional activity downstream of the RAS pathway, one of the key cancer pathways. The main part of the paper is about making sense of the 20 signatures using the NCI60 cell line collection and using them to predict clinical response.

What I took away from this paper is how much (computational, experimental, intellectual) work it is to interpret the results of a sophisticated statistical method, and that even the leaders in the field only succeed with a small number of them (only a handful of the 20 signatures do something useful). It’s all too easy to write up the method and conclude that it produces ‘promising results’ without going all the way and actually testing them out like Nevins and West did, and that’s what got them into Molecular Cell.

#2: The transcriptional network for mesenchymal transformation of brain tumours

The second example is The transcriptional network for mesenchymal transformation of brain tumours from Andrea Califano’s group. In it they make use of their (previously published) ARACNe approach for co-expression networks together with their (also previously published) Master Regulator Analysis to identify transcription factors driving transcriptional changes (based on gene set enrichment analysis) and a bit of step-wise linear regression. These are powerful methods for sure, but nothing too fancy. Taken together, though, they pinpoint two transcription factors (TFs) as synergistic initiators of mesenchymal transformation in brain tumors. To validate their results they did ChIP and gain/loss of function experiments, and they observed morphological changes in neural stem cells after introducing these two TFs and changes in tumor progression after silencing them. As in the last example, the statistical methods are just that: methods, and not the central message of the paper.

#3: An integrated approach to uncover drivers of cancer

My last example is Dana Pe’er’s integrated approach to uncover drivers of cancer. Her approach is based on module networks, a method that already has a long history (long at least for my field) and was originally developed in yeast in 2003. Here, it is applied to copy-number and expression data from melanoma. The authors identify two new cancer ‘drivers’ and show in experimental follow-up that they are indeed needed for proliferation.

What lessons did I learn?

I took several things out of these papers:

  1. In none of the cases were the methods novel. The methodological key ideas were usually published a while ago.
  2. What was novel were the biological insights generated by those methods. All these papers contributed to a big question, something that got people excited in different fields: biology, medicine, statistics, …
  3. Experimental follow-up is really hard work. Really, really hard! But it’s just not enough to say ‘Someone should now please follow up on my computational predictions’. If you want to have impact you need to do it yourself, or team up with somebody who can do it for you.

I will explore this more in the next post in the series, which will be titled ‘Big questions lead to big contributions’.

Find the other posts in the series under the tag methods versus insights.



  1. Chang et al. A genomic strategy to elucidate modules of oncogenic pathway signaling networks. Mol Cell. 2009 Apr 10;34(1):104-14.
  2. Carro et al. The transcriptional network for mesenchymal transformation of brain tumours. Nature. 2010 Jan 21;463(7279):318-25.
  3. Akavia et al. An integrated approach to uncover drivers of cancer. Cell. 2010 Dec 10;143(6):1005-17.
