The machine learners I know are nice people. Pleasantly geeky. With a certain stare-at-your-own-shoes phenotype. But, mild-mannered as they generally are, at this year’s NIPS some MLers really got worked up.
The Human Cell Atlas preprint came out some days ago on bioRxiv. It describes a project to collect all the cell types in the human body in one big reference map.
Our mission: To create comprehensive reference maps of all human cells—the fundamental units of life—as a basis for both understanding human health and diagnosing, monitoring, and treating disease. [from humancellatlas.org]
The contributors to the project are a who’s who of the leaders in single-cell genomics, and this will be a fantastic data set when it comes out. Because in-depth analysis of resources like this provides the foundation of all biology, as you know.
I enjoyed reading the preprint. It puts the project into a historical perspective and discusses promises as well as limitations. It even references Borges’ ‘On Rigor in Science’. (I love well-read scientists!) And even if all that means nothing to you, it is still worth reading as a comprehensive summary of the current state-of-the-single-cell-art.
But I kept wondering, with a project like this, how do you know whether it is a success or not? How do you know that your reference map is really comprehensive and covers all (most?) of what it is supposed to find?