By Nathan Collins
Scott Linderman always kind of knew he wanted to work at the intersection of computer science, statistics and neuroscience. He wrote about the interface between computation and biology for his college application essays back in high school. After college, he worked as an engineer at Microsoft, but he maintained his interest in the brain, reading neuroscience textbooks on the bus to and from work. A class he took on machine learning and the brain at the University of Washington eventually motivated him to take the plunge and apply to graduate school. Not long after that, he started his PhD at Harvard University.
Today, Linderman is an assistant professor of statistics, a field that on the surface might not seem to have much to do with neuroscience. In fact, he said, statistics and computer science have become essential components of brain science in the last decade. A member of the Wu Tsai Neurosciences Institute who arrived on campus just two months ago, Linderman develops statistical models and computer algorithms that try to make sense of a flood of data on the activity of neurons in animal brains. Here, he explains why neuroscience needs statistics and computer science and what he hopes to accomplish at that intersection.
When you started graduate school a little under a decade ago, neuroscience was undergoing a sort of data revolution. What was that about, and what did computer science and statistics have to do with it?
One thing is that recording techniques were really blossoming into what they’ve become today, which is a powerful set of tools for interrogating neural circuits at scales that we really haven’t had access to in the history of neuroscience. We already had techniques to study populations of neurons, but there’s been an exponential growth in the numbers of neurons that can be recorded simultaneously, and not only recorded but also perturbed using optogenetics.
These new recording techniques let us study the brain with unprecedented precision, but they also demanded new computational techniques and statistical techniques for analyzing the types of data that were being collected. Fortunately, the field of machine learning was taking off at the same time, offering new possibilities for modeling large, complex datasets.
Why did that data demand new techniques? Was it just that there was a lot more data?
The first challenge is that the sheer scale of the data is orders of magnitude larger than what we’ve been dealing with in the past, and that presents computational challenges. A single experiment can generate hundreds of gigabytes of data per hour. You need new computational infrastructure and new techniques just for data wrangling. Small differences in an algorithm can make a big difference in your ability to rapidly iterate on new ideas versus having to kick off a job and come back a week later to see what happened.
But scale is just scratching the surface. It’s also that as we record more and more neurons, we realize that the complexity of the data is greater. When you look at the computations that are being performed by neural circuits, they’re manifested in highly nonlinear dynamical systems, and so as we collect richer and richer datasets, we’re also trying to push the flexibility of the probabilistic and statistical models that we’re applying to those data in order to get a clearer picture of what’s going on in these neural circuits, how they’re operating and computing.
The third challenge is that it’s not just that we’ve recorded more neurons for a longer period of time from a single animal. We’re collecting very heterogeneous datasets that are stitched together from many individual subjects from different trials and different experimental conditions. In order to answer the scientific questions, we have to be thinking about how to put these things together in a principled way in order to make the most of the information we have.
That’s interesting – there’s this flood of data, but at the same time it’s got this complex structure and you have these complex hypotheses, many of them based in machine learning and artificial intelligence, that you’re trying to test. It’s an awesome challenge.
Yes, and that’s why I think it’s fun to be working in this space right now. There are incredible opportunities but still very clear challenges that need to be addressed in order to make progress.
My work has been about taking current models to the next level, endowing them with greater flexibility to capture more complex dynamics while still retaining some interpretability. Our models need to be able to make predictions with high accuracy – this is critical for many applications, like building brain-machine interfaces – but to advance neuroscience, we also want to understand the principles that guide those predictions. As our models become more sophisticated, it becomes harder to understand how they work. So I’m trying to thread the needle, balancing flexibility and interpretability.
We’ve been talking about this in sort of abstract terms, but as a practical matter, you’re studying relatively simple animals including zebrafish and mice. One animal you’re interested in, a tiny worm, C. elegans, has just 302 neurons, compared to something like 100 billion in the human brain. Can we learn anything about our own brains from studying something so simple?
C. elegans is one of the few animals in which we can really go from recordings of the neural circuit in action to detailed measurements of behavior and sensory inputs, so we can see the worm’s neural computational system unfolding in real time. That’s only recently become possible, and so I’ve found it to be a really exciting model to be thinking about as we try to tie theories of neural computation to actual measurements of neural activity.
To your question about how studying model organisms ultimately translates to more complex nervous systems like ours, I think there is good reason to believe that certain principles of neural computation may be maintained from one species to the next. And in any case, the techniques that we need to develop in order to analyze this data and start to ask questions are going to persist as we move up the stack to more complex organisms. I think the types of lessons we’re learning as we look at simple organisms will translate to more complex organisms as well.
Now that you’re here at Stanford, what are your goals for the immediate future?
I’m really excited about the potential for new collaborations here. Not only are we conceptually at the intersection of many disciplines, we’re physically located within about a five-minute walk of world-class departments that are all contributing to this scientific endeavor. I’m looking for people who are working on problems in neuroscience and looking for new computational collaborators to help analyze and understand data, but I also think this is a great opportunity to get the word out to people on the computational and statistical side. Maybe you’re curious about neuroscience but haven’t done anything there. That’s not a problem from my perspective. I think those are really ideal candidates to lure into this type of interdisciplinary work. There are a lot of really interesting challenges going on right now in neuroscience that could benefit from new thinking and new ideas.