
Life out of sequence – Hallam Stevens’ data-driven history of bioinformatics


“How do people like you ever get last-author papers?” The person who asked me this question in 2008, during the interview for my current job, was (and still is) a well-known stem cell biologist with decades of experience in science. But she still didn’t really know what to think of ‘people like me’: bioinformaticians and computational biologists. Aren’t bioinformaticians just service providers? Handy to have, but without any real scientific vision or contribution? She clearly worried about my ability to do independent research.

And she wasn’t alone. A couple of years later I interviewed for an EMBO fellowship, which I didn’t get because the panel (mostly cell biologists, no one computational or from genomics or medicine) thought my group was a “mathematical service unit” and my research was “overly driven by my collaborators”. I’m still not sure what a ‘mathematical service unit’ could be (proving theorems on demand, maybe?), but their comments showed me how far removed their research practice was from my own.

Even though bioinformatics is by now an established field, these personal experiences show that ‘old school’ biologists, who form the scientific establishment and direct mainstream research, are still very uncomfortable with ‘people like me’: people who were trained in other disciplines, pursue biological questions different from their own, and use approaches not covered in classical biological training.


Hallam Stevens’ book Life Out Of Sequence: A Data-Driven History of Bioinformatics starts with the tension between old and new biology that ‘people like me’ experience every day, and describes how biology has been, and is still being, changed by computational methods.

Stevens defines bioinformatics very broadly, from databases and web services via robotics and LIMS (laboratory information management systems) to data analysis and algorithms. The breadth of that definition shows how saturated biology is with informatics.

Bioinformatics […] is a set of attempts to work out how to use computers to do and make biology. This is a pressing issue precisely because computers bring with them commitments to practices and values of Big Science: money, large institutions, interdisciplinarity. (p6)

[Biologists’ mixed feelings about bioinformatics] hide deeper controversies about the institutional and professional forms that biology will take in the coming decades. Who gets money, who gets credit, who gets promoted, how will students be trained–these are high-stakes issues for individual biologists as well as for the future of biology as a whole. (p44)

Watching the bioinformaticians

Stevens is assistant professor at Nanyang Technological University in Singapore. He combines historical and ethnographic methods to describe how and why biology has changed due to computing. His results are based on field studies at MIT, the Broad, the EBI, and interviews with dozens of researchers.

The motivation for his study is that a change in biological practice is also a change in our basic understanding of life:

What is it like to do biology when the indispensable scientific instrument has become the computer […], and when the manipulation of data replaces the manipulation of organisms and their parts? (back cover)

The kinds of knowledge and practice [in bioinformatics] are already moving to the forefront of our understanding of life. It is crucial that we reflect on the consequences of this change–what difference does it make that we can now examine, manipulate, and understand life with and through computers? (p11)

I liked Stevens’ approach to “follow the data”:

Following the data helps us cut through the information metaphors and get closer to understanding the perplexity of things going on inside the machine. (p7)

“Data” is still a very abstract concept, but much less so than “information”. Data is the stuff on hard drives, that goes into analysis pipelines and gets visualized in plots. Data is the stuff bioinformaticians live with.

I also liked how Stevens characterizes bioinformatic work on databases and ontologies as an intellectual contribution to theoretical biology, the organization and systematization of our understanding of life — even though most biologists, experimental ones and computational ones including me, wouldn’t think it very exciting at all.

We shape our tools and then our tools shape us

Computers caused a revolution in biology, quite literally: bioinformatics constitutes a paradigm shift because it has redrawn the map of what counts as a valid and feasible research question in biology.

The central thesis of Stevens’ book can be summarized by a famous quote often (incorrectly) attributed to Marshall McLuhan: “We shape our tools and then our tools shape us.”

Computational methods of data storage and analysis had been shaped in physics and were then adopted in biological research, in particular in the Human Genome Project. Stevens argues that computers did not change to become useful to biology. On the contrary:

Biology adapted itself to the computer, not the computer to biology. (p41)

Computers do not just scale up the old biology. They bring with them new tools and questions, like statistics, simulation, and data management, that have reshaped the way biological research is done.

Computers became plausible tools for doing biology because they changed the questions that biologists were asking. They brought with them new forms of knowledge production (..) that were explicitly suited to reducing and managing large data sets and large volumes of information. (p39)

Politics and Power in Biology

Bioinformatics is at the center of major discussions about the self-image of biology. These discussions started when biologists, confronted with high-dimensional statistics, scale-free networks, and other tools of systems biology, fought back by stressing the importance of focussed hypotheses and small-scale experimental validation:

[T]he terms “data-driven” and “hypothesis-free” have become focal points of debates about the legitimacy of bioinformatics techniques and methods. (…)

Indeed, the sharpness of these epistemological disagreements is further evidence that bioinformatics entails a significant challenge to older ways of investigating and knowing life. (p66)

Stevens clearly describes how these discussions are not only scientific; they are about power. The power to define biology:

The pronouncement that “bioinformatics” occupies an under-laborer status within biology is designed to suppress the importance of certain new kinds of biological practices. (…) What is at stake is what biological practice should look like. (p78)

Well, I certainly felt suppressed and insulted by that EMBO panel. I will know that EMBO has started to keep up with modern times when they select the first member with a computer science background.

Until then, Hallam Stevens’ book is a great read for anybody interested in the future of biology.


Update 9.7.2014: I just saw there is a YouTube teaser video for the book, maybe even narrated by the author himself: