The machine learners I know are nice people. Pleasantly geeky. With a certain stare-at-your-own-shoes phenotype. But, mild-mannered as they generally are, at this year’s NIPS some MLers really got worked up.
What happens if my pragmatic approach to data analysis gets reviewed by a card-carrying statistician? “The absence of competent statistical guidance in this MS is becoming painfully obvious.” Oh my!
It’s been almost a year since Trinh et al., “Practical and Robust Identification of Molecular Subtypes in Colorectal Cancer by Immunohistochemistry”, came out in Clinical Cancer Research. Let me tell you a bit about the reviews we got back in summer 2016.
“How reproducible is cancer biology?” stood in bold letters on a poster I had designed to advertise a talk by Tim Errington, one of the leaders of the Reproducibility Project: Cancer Biology (RP:CB), in Cambridge a few weeks ago. And I had told everyone “Come and learn who’s good, who’s bad and who’s ugly in cancer research!”
The RP:CB results are collected at eLife and the splash they made was big enough to be covered by Nature and Science. So it was great to finally meet somebody leading this project to learn first-hand which ideas are guiding their work.
Tim’s talk had the rather technical title “Improving Openness and Reproducibility of Scientific Research.” You can easily see why I felt the need to spice things up. To get a lecture theatre full of people you need to promise them blood, not balanced and nuanced views (limitations I luckily have never been accused of myself).
With all the effort I had put into advertising the talk, the people at my institute knew what to expect: A witch hunt by a posse of replication vigilantes, who abuse money diverted from real science to name and shame the actually successful researchers! Hang them higher! Yihaah!
When Tim arrived, he took one look at the way I had advertised the talk, gently shook his head, and said “Well, you can of course do this, but I wouldn’t. It’s not really important which study reproduces and which doesn’t.”
In a recent paper in Nature Reviews Cancer, Maley et al. set out to define a consensus framework for classifying neoplasms. The paper’s premise is that such a theoretical framework is a necessary first step for developing new quantitative approaches. I disagree. I argue that the paper highlights the limited practical relevance of a purely intellectual exercise. Solid classification frameworks of clinical relevance need more detail and need to be grounded in applicability to real data in clinical practice.
For those of you in a hurry, let me sum up my claims:
- This is a very good review of the field. Its particular strength is combining cancer evolution with the tissue microenvironment. You should definitely read it.
- However, the review poses as something it is not: a classification scheme of clinical relevance.
- The proposed classification scheme fails because (a) there is no practical way to classify patients with it, and (b) evidence of clinical impact is circumstantial and anecdotal.
- The authors recognise all these problems, but dismiss them as areas of future research, rather than testing prototypes of their scheme on real data.
- Methodological and measurement innovations are happening as we speak – no one needed this framework to kick-start innovation.
- Consensus on specific approaches will be much harder, much more interesting and much more useful, than consensus on lofty ideas.
20 researchers, hundreds of opinions, no PowerPoint – yes, it’s been that time of year again: Systems Genetics of Cancer rocked London.
When I started my PI career, my first cover letter to a glamour journal emphatically pointed out that my cutting-edge, ground-breaking work was the first and firstest to do X.
Feedback from senior colleagues was: “Drop that blech! Better say what your insight into X actually is, and in what way it is profound.” — Good advice. Because novelty is overrated, insight rules.
How should novelty be valued in science? Not exclusively.
So I wasn’t too surprised by how Barak Cohen answered the question “How should novelty be valued in science?” in the last issue of eLife. I would never put a question mark in a title if the answer were so clear:
Laborjournal.de just published a German translation of my opinion piece “All biology is computational biology”, which appeared in PLoS Biology earlier this year.
Have a look at it here: http://www.laborjournal.de/rubric/essays/essays2017/e17_10.lasso
Luckily I didn’t have to translate it myself. My Deutsch has been getting pretty schlecht lately.
And it reads quite well, don’t you think?
Superheroes like Mr Fantastic are used to being watched, and bioinformaticians better get used to it, too. Like superheroes, bioinformaticians are adored by the public for their powers as well as their dress sense. And while superheroes have their own Superbeing watching from the moon (Uatu the Watcher), bioinformaticians have their own tribe of sociologists stalking them, as a recent insider report has revealed.
When I opinionated on and on about All Biology being Computational Biology, I was aware that these were not really novel ideas. After all Hallam Stevens had written a whole book about it and my friends inside my intellectual bubble kept on asking why I had spent so much time on writing up something so glaringly obvious.
But what I had missed is that some of my points had already been made very clearly in an excellent piece by Pavel Pevzner and Ron Shamir in Science in 2009 titled “Computing Has Changed Biology—Biology Education Must Catch Up”.
The Human Cell Atlas preprint came out some days ago on bioRxiv. It describes a project to collect all the cell types in the human body in one big reference map.
Our mission: To create comprehensive reference maps of all human cells—the fundamental units of life—as a basis for both understanding human health and diagnosing, monitoring, and treating disease. [from humancellatlas.org]
The contributors to the project are a who’s who of the leaders in single-cell genomics, and this will be a fantastic data set when it comes out. Because in-depth analysis of resources like this provides the foundation of all biology, as you know.
I enjoyed reading the preprint. It puts the project into a historical perspective and discusses promises as well as limitations. It even references Borges’ “On Rigor in Science”. (I love well-read scientists!) And even if all that means nothing to you, it is still worth reading as a comprehensive summary of the current state-of-the-single-cell-art.
But I kept wondering, with a project like this, how do you know whether it is a success or not? How do you know that your reference map is really comprehensive and covers all (most?) of what it is supposed to find?
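One classical way to make that question concrete – my own illustration, not something proposed in the preprint – is Good-Turing estimation: the probability mass of cell types you have never seen can be estimated from the fraction of types you have seen exactly once. If singletons keep turning up, your map is probably not comprehensive yet. A minimal sketch, with a made-up toy sample of cell-type labels:

```python
from collections import Counter

def good_turing_unseen_mass(observations):
    """Estimate the probability mass of types NOT yet observed.

    Good-Turing estimate: (# types seen exactly once) / (total observations).
    A value near 0 suggests the sample has nearly saturated the type space.
    """
    counts = Counter(observations)
    n = len(observations)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n

# Toy sample of 10 cells: "NK" and "mono" are each seen only once,
# so an estimated 20% of future cells belong to types not in our map yet.
sample = ["T", "T", "T", "B", "B", "NK", "mono", "T", "B", "T"]
print(good_turing_unseen_mass(sample))  # 0.2
```

In practice one would apply this to clusters rather than known labels, and rare cell types sampled unevenly across tissues complicate the picture – but it shows that “is the atlas complete?” can at least be phrased as an estimable quantity rather than a matter of taste.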
Have a look at this excellent editorial in Nature: Integrity starts with the health of research groups – Funders should force universities to support laboratories’ research health.
I really like the term ‘research health’, which encompasses both technical aspects of doing research right as well as the well-being of researchers.
Almost a month has passed since I published an opinion piece called “All biology is computational biology” in PLoS Biology.
In my paper, I envisioned a biology that explicitly and clearly acknowledges how much it has changed over the last 20 years, how much its questions have changed, and how much the practice of doing biology has changed. I envisioned a biology that gives credit broadly and fairly to everybody who contributed to key insights – regardless of what tools they used.
As intended, my paper provoked many responses from the community, and in the following you find my thoughts on some particularly interesting comments.
Check out Amber Dance’s comparison of single cell tumor phylogeny methods at The Scientist.
Almost makes it look like Niko and I know what we are doing in this field.
In this series, we ask leading scientists in their respective fields to explain clearly and engagingly for a lay audience why the research carried out in their laboratories – and those of their collaborators and their colleagues – matters.
It wasn’t immediately clear to me what I should write about. I tend to label myself a cancer researcher nowadays, but cancer research does not need any explanation of why it matters – unfortunate as that is.
At the same time, I am a computational biologist – and here, I thought, was a much bigger need to explain why it matters. The question is not so much why computational biology and bioinformatics are useful (nobody seems to question that it’s handy to have the geeks around) but why it is biological research, rather than just a support and service activity.