The machine learners I know are nice people. Pleasantly geeky. With a certain stare-at-your-own-shoes phenotype. But, mild-mannered as they generally are, at this year’s NIPS some MLers really got worked up.

Lots of excitement was triggered by Ali Rahimi, the winner of the NIPS test-of-time award (what a nice idea to have such an award). Here is his talk.

You will agree: His talk is very clear, very engaging. And you might have been as amazed as I was to learn that ‘random kitchen sinks’ are a thing.

But what did he say that was so controversial?

Machine learning has become alchemy.

Alchemy is not bad. There is a place for alchemy.

Alchemy “worked”.

But …

For the physics and chemistry of the 17th century to usher in the sea change in the understanding of the universe that we are now experiencing, scientists had to dismantle alchemy.

AI should be built on rigorous knowledge, not on alchemy.

That does not sound too unreasonable …

… however, another leading ML* person, Yann LeCun, was not impressed, which spawned a lively discussion:

Sticking to a set of methods just because you can do theory about it, while ignoring a set of methods that empirically work better just because you don’t (yet) understand them theoretically is akin to looking for your lost car keys under the street light knowing you lost them someplace else.

Yes, we need better understanding of our methods. But the correct attitude is to attempt to fix the situation, not to insult a whole community for not having succeeded in fixing it yet.

In his talk, Rahimi had namechecked some of his friends who in the past had apparently formed a ‘rigor police’ to make sure all claims at NIPS were backed up by theory. LeCun answered: “I choose elbow grease over rigor police”.

As I said: lots of excitement.

Rahimi had definitely put his finger into an open wound.

All biology is alchemy**

I think this NIPS anecdote highlights a general rift between theory (rigor police) and practice (elbow grease) in science that applies to many more disciplines than just ML – let’s for example think of biology.

Biology does not have strong theoretical foundations. When your ER-positive cell line doesn’t respond to estrogen anymore, you will not get an explanation from anybody, just a shoulder shrug: “Maybe try that other batch we still have in the fridge.” It seems like the basic workhorses of many labs are poorly understood. Alchemy!

My main exemplar for studying the interplay between theory and practice in biology, between conceptual and technological advances, is the Human Cell Atlas and its efforts to identify all cell types in the human body.

In the single cell community, lots of elbow grease goes into technical developments. The genomics world is abuzz with advances to sequence more cells better and cheaper. And every five minutes or so someone proposes yet another improved algorithm to analyse single cell data – so without any doubt the technological side is progressing really well.

How are rigorous conceptual advances keeping up? Well, this is where alchemy comes in again. Some think those technological successes are overrated and don’t expand our conceptual understanding.

For example, one of my Cambridge colleagues wrote as an answer to the question “What Is Your Conceptual Definition of Cell Type”:

[B]iologists are suckers for techniques that we exploit to death. These days the arrival of single-cell transcriptomics has created a fad and, unconsciously, the thought that a cell can be defined in terms of the genes it expresses and by what people call their ‘‘epigenomes’’ is spreading.

So what is a cell type? His short response:

We don’t have the concepts, yet.

Eh .. ok, he sounds a lot like the biological rigor police to me.

I am critical of gung-ho sequence-first think-later approaches (because only concepts make data talk), but how will you do conceptual work without data? What will your new concepts explain, if not new observations? People can sometimes be carried away by new technologies, I agree, but just ignoring them looks equally silly to me.

Be like Newton

So here is the situation:

On one hand too much elbow grease: “You insult me by asking for more theory!”

On the other hand too much rigor police: “Ignore the data until we have better concepts!”

These extreme positions are illuminating, because they set the boundaries for any useful science, which -you guessed it- needs to combine elbow grease with rigor police.

I propose a solution: Be like Newton.

Write rigorous papers and explore occult alchemy at the same time.

Theory and practice, elbow grease and rigor police, need to go hand in hand.



* Somebody please explain to me the difference between AI and ML.

** yes, I know what you think. But, see, all biology can be many things.


One thought on “The alchemy of science: elbow grease versus rigor police”

  1. Great oversight! As Ali pointed out in his talk, ML is taking on a variety of inference roles critical to various societal infrastructures ranging from healthcare to traffic mapping. What scares me is the combination of ML alchemy with biological data alchemy; in other words trying to turn piss into piss and sell it to the general public as gold. What biological imperative supports the characterisation of single cells using transcriptomics? Even then, are our classification schema based on rigorous statistical frameworks or another algorithmic abomination cleverly disguised behind the facade of a Nature paper? Call me cynical, but we need to ensure that our fields stay true to their “translational science” roots for the sake of healthcare recipients, otherwise we may risk cancer genomics becoming another arm of holistic medicine.

