Computational neuroscientist, University of California, San Francisco
Can you provide a high level description of what you do in your lab?
It’s been an incredibly exciting time at the lab the past few months. We’re making progress in identifying molecular signatures of human neural stem cells that come from the activity of specific genes. It’s the first time anyone has seen what many of those genes actually do in the brain. Within the next few years we’ll be able to compare the patterns we’re seeing to what happens in other primates, which I hope will allow us to finally identify what makes a human brain human. I feel like we’re closing in, but at the same time I know it’s a long road ahead.
How are you using data analytics?
We analyze very large data sets generated from human brain tissue to quantify the simultaneous activity of all of the genes in the human genome. There are about 22,000 genes in the genome, but each one comes in different flavors, so when you add up all the flavors, you get potentially millions of data points. To determine their similarity, we’re building matrices that can be a million by a million data points. We use software that I’ve developed in the lab to identify patterns of gene activity and then try to relate those patterns to distinct cell types. So it’s very much a data-driven enterprise, which is in contrast to the long-dominant hypothesis-driven approach.
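As a rough illustration of what a similarity matrix over gene activity can look like, here is a minimal Python sketch. It is not the lab's software: the interview does not say which similarity measure is used, so Pearson correlation is an assumption, and the names `pearson`, `similarity_matrix` and the toy `profiles` data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length activity profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def similarity_matrix(profiles):
    """Return a nested dict holding the correlation of every pair of profiles."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}


# Hypothetical toy data: activity of three features measured in four samples.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],  # rises together with gene_a
    "gene_c": [4.0, 3.0, 2.0, 1.0],  # falls as gene_a rises
}

sim = similarity_matrix(profiles)
for name, row in sim.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

With three features the matrix has nine entries. With a million features it would have a trillion, which is why matrices at the scale described above call for purpose-built software rather than a nested loop like this one.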
What can your model accomplish that the age-old model might not be able to—and do your peers consider you a threat?
Biology for the last 50 years has been dominated by a reductionist mode of thinking and the use of qualitative techniques. That approach has been very successful, so it’s not going away any time soon. But we have entered a new era with quantitative techniques that capture so much data that we can begin to see patterns emerge. There’s a lot to be learned by taking a step back and letting the data tell its own story. As for resistance, sure, there is some resistance from those who think science should be strictly hypothesis-driven. That’s not unique to biology. I suspect there are those in all manner of enterprises who feel the same way. But I feel like data analysis ultimately sells itself.
How do you reveal conclusions to your peers, academia or even potential patrons? It must be difficult considering the novelty of your approach.
There’s no denying that it’s hard. A lot of data sets are so multidimensional that they’re very hard to summarize. Rule number one is: Know your target audience and know what language they speak. That means I don’t throw up equations in a room full of non-scientists, for example. Instead, I try to explain things intuitively. Another way to do this is visually, through pithy, meaningful graphics and tables that distill large data sets into forms that people of all backgrounds can digest. There’s a world almost unto itself of designers and artists who specialize in that sort of thing.
You were an adman in a previous life. Do you see any common ground between those two disciplines—or between neuroscience and any other discipline, like finance?
I think there’s a lot of fertile ground for cross-pollination across many disciplines. One of the things that you realize pretty quickly when you start analyzing big data sets is that you can build networks out of anything. Once you get the data into a matrix, it’s the same set of tools; you can run the same algorithms on it. Different domains may have different types of data, but they all face the same low-level problems, such as figuring out how to account for missing data, noisy data or artifacts in the data. This is true in biology, finance or marketing. What makes for a really great analyst in biology is the same for any discipline. If you handed somebody like me marketing data and asked me to analyze it, I would approach it the way I do a biological problem, and I might have insights that would not be obvious to someone coming from a purely mathematical or business mindset. But you can’t forget the importance of the individual doing the analysis. I’ve found there’s a huge element of creativity involved. Without that, most things go nowhere.
How does creativity come into play?
In different ways, but I guess the creativity really comes in knowing how to ask questions and how to find answers. It’s important to have a conceptual landscape of the field you’re studying, of course. In my case that’s my knowledge of the underlying biological reality in the brain. This is really the basis for my intuition, which guides my feeling for how best to tackle a problem. Intuition definitely still has a role when dealing with data. You still have to make a prediction in your mind, which you can then test by studying relationships in the data.
Have you had trouble building a team to follow in your footsteps, or are younger people naturally more open to data-centric research?
Again, I don’t think science is any different from other enterprises in this regard. A new generation comes along that is more familiar and comfortable with the new techniques, and its members quickly become evangelists for the new way of doing things. Our lab is still a relatively small operation, but so far we’ve been able to attract people who want the opportunity to train using the tools that we use and the approach to data sets that we take. What we’re doing is very new and exciting, which helps attract people. Not everybody embraces it with the same level of enthusiasm, of course, and that’s okay. You have to play to individual strengths. But my expectation is that the power of the analytical approach will become increasingly self-evident.
Given the size of the data sets you’re dealing with, you must be highly dependent on your IT department. Does the technical team inform you about how new tools can make your job easier?
I hope that relationship will emerge over time. It’s not quite there at this point. The winning combination would be someone in that role who speaks a bit of my language and someone in my role who speaks a bit of theirs. In science there’s no business analyst to help you, so you have to play that role yourself. But I’d love to see that type of function emerge over the next few years where we have someone who understands what we’re trying to do and who could tailor a computing infrastructure to optimally meet our needs.