Research Supported by the NITMB
Research supported by the NSF-Simons National Institute for Theory and Mathematics in Biology focuses on developing mathematical frameworks that illuminate emergent capabilities of biological systems. We are developing the theory and mathematics needed to highlight the fundamental roles of physical, chemical, and biological constraints as organizing principles for understanding biological mechanisms. NITMB will focus on fields of mathematics where the constraints of biological systems show promise for novel developments, including geometry, topology, optimization theory, dynamical systems, high-dimensional statistics, mathematical machine learning, inverse problems, statistical inference, and stochastic processes. Understanding constraints from mathematical and biological perspectives provides a unique opportunity for interdisciplinary work, with mathematical research that will advance our knowledge of biology and biology research that will catalyze new mathematics.
Explore detailed project highlights for NITMB-supported research on the Research Highlights page.
Emergence of Bistability in Circadian Clock Temperature Response
Michael Rust
(University of Chicago)
Principal Investigator
Aaron Dinner
(University of Chicago)
Collaborator
Abstract: A defining feature of biological time-keeping is temperature compensation, in which the amplitude of an oscillation varies such that the frequency is nearly invariant to temperature. Temperature compensation enables circadian rhythms to anticipate the length of the day correctly even as temperature varies. Because individual enzymatic reactions are typically sensitive to temperature, temperature compensation is thought to be an emergent property of a reaction network. What features of a reaction network can enable temperature compensation is an open mathematical question. In cyanobacteria, interactions of the proteins KaiA, KaiB, and KaiC give rise to near-24-hour oscillations in phosphorylation of KaiC. KaiA binding to KaiC promotes phosphorylation of KaiC, while KaiB binding to KaiC promotes dephosphorylation. Preliminary work indicates that, as temperature decreases, KaiC phosphorylation can occur in a KaiA-independent manner, which can be viewed as modulating the feedback in the network. Furthermore, multiple steady states that depend on initial conditions appear at low temperatures. A computational approach based on algebraic geometry will be used to map the fixed points and bifurcations of models of the Kai system at various temperatures and, in turn, to elucidate general principles about the role of feedback in temperature compensation. The computational results will be tested experimentally using mutants of the Kai proteins that alter the KaiA-KaiC interaction. By revealing how feedback in a reaction network is related to the fixed-point and bifurcation structure elucidated by the approach based on algebraic geometry, the project will advance dynamical systems theory.
Quantifying Natural Movement Variation in the Brain and Behavior
Stephanie Palmer
(University of Chicago)
Principal Investigator
Jason MacLean
(University of Chicago)
Collaborator
Abstract: The ultimate goal of neural processing is to drive reliable behaviors that maximize fitness in an animal's complex natural environment. Our hypothesis posits a causal, evolved relationship between the complexity and structure of the brain and behavior and requires new mathematical approaches to both quantifying and recapitulating this matching between natural and neural state space. To deepen our understanding of motor encoding and control, we will integrate behavioral recording in freely moving mice executing a seed reach-to-grasp task with extensive, longitudinal tracking of neuronal activity across various layers in the motor cortex. This involves pairing detailed behavioral observations with comprehensive neuronal population recordings to characterize the mapping between the brain's control space and the resulting movement space exhibited by the arm and paw during the challenging task. To establish a direct connection between behavioral and neural data, we will utilize machine learning tools such as variational autoencoders (VAEs) and U-Nets to quantify the latent space of both datasets. Our primary objective with these advanced machine learning approaches is to identify interpretable features within the representations, and to develop new mathematics to define trajectories in this feature space. The goodness of fit will be assessed by training models on behavioral data and evaluating their ability to generate realistic limb and paw trajectories, with the constraints that these are differentiable and low-dimensional. Throughout our investigation, our specific focus will be on uncovering the features of the neural response that drive variable yet successful reach movements. By examining how the brain's code aligns or deviates from behavioral complexity, our goal is to reveal new principles of motor encoding and control that operate over both evolutionary and organismal timescales.
Uncovering the Genetic Fitness Landscape Behind Bacterial Motility in Complex Environments
Jasmine Nirody
(University of Chicago)
Principal Investigator
István Kovács
(Northwestern University)
Collaborator
Abstract: Bacterial motility is a complex phenomenon that plays a fundamental role in widespread biological processes including pathogenesis and bioremediation. Motile bacteria perform chemotaxis – migration under the influence of a chemical gradient – to find conditions optimal for their fitness and survival. This movement can be either towards (positive chemotaxis) or away from (negative chemotaxis) a chemical stimulant. One important feature of the chemotaxis signaling network is its ability to adapt to changes in the environment, allowing cells to maintain a high sensitivity to their environment over a wide range of chemical backgrounds. In natural environments, this sensory process also takes place in a cluttered and noisy mechanical background, as cells are constantly exposed to heterogeneous, variable physical cues. Despite this, the vast majority of studies into bacterial motility and chemotaxis have been performed in unconfined liquid media or along flat surfaces. In this project, we aim to develop both mechanistic and evolutionary insights into bacterial motility and adaptation under various environmental conditions. A key effect of the environment is to limit the free path length of the bacteria. We will characterize the corresponding chord-length statistics and develop a modeling framework that takes into account geometric constraints. We will also revise the current theoretical models on chord-length statistics (Levitz & Tchoubar, 1992), as they rely on assumptions that are not valid in the planned experiments, leading to qualitative differences. We will also develop a hyper-network framework to infer fitness consequences of changes in the chemotactic gene regulatory network.
Combining biophysical experiments using a novel microfluidic setup and modeling with predictive hyper-network analysis, we outline an investigation to characterize how the chemotactic sensing pathway adapts over multiple timescales to improve bacterial performance and fitness in a range of complex, naturalistic environments.
Learning Rules of Epithelial Tissue Dynamics
Margaret Gardel
(University of Chicago)
Principal Investigator
Cara J. Gottardi
(Northwestern University)
Collaborator
Vincenzo Vitelli
(University of Chicago)
Collaborator
DNA as a Phosphate Reservoir: Spatiotemporal Modeling of Extracellular DNA (eDNA) Dynamics in Biofilms
Arthur Prindle
(Northwestern University)
Principal Investigator
David Chopp
(Northwestern University)
Collaborator
Abstract: DNA is the genetic code found inside all living cells and its molecular stability can also be utilized outside the cell. While extracellular DNA (eDNA) has been identified as a structural polymer in bacterial biofilms, whether it persists stably or can be reclaimed for further cellular activity remains unknown. Here, by imaging eDNA dynamics within undomesticated Bacillus subtilis biofilms, we propose to test the hypothesis that DNA acts as a temporary structural scaffold that is later metabolized for cell growth. Specifically, we found that eDNA is produced throughout biofilm development before being degraded in a spatiotemporally coordinated pulse. We identified YhcR, a secreted Ca2+-dependent nuclease, as responsible for DNA degradation. As predicted by a preliminary mathematical model, biofilms lacking this nuclease fail to reclaim DNA for its phosphate contents, thereby decreasing biofilm fitness. Our results identify a secreted nuclease that is crucial for reclaiming eDNA during biofilm development, expanding our knowledge of DNA and suggesting new targets for biofilm control.
Natural Motion and Optimal Prediction in a Complete Retinal Population
Gregory W. Schwartz
(Northwestern University)
Principal Investigator
Stephanie Palmer
(University of Chicago)
Collaborator
Abstract: Signals in the natural world are often characterized by a mixture of amplitude scales, like the quiet and loud segments of a musical recording. This property is manifested in the form of non-Gaussian, heavy-tailed distributions and nonlinear dependencies, both over time and across signal components. This is in strong contrast to the Gaussian and linear features typically assumed when modeling input signals. In the context of neuroscience, natural signals pose a serious challenge for sensory systems, which must adapt on the fly in order to efficiently encode them. Our recent work has demonstrated that the motion of objects in natural scenes also contains a mixture of scales, with a locally averaged velocity amplitude that fluctuates significantly on sub-second timescales. We have shown that this behavior can be modeled using an autoregressive Gaussian scale-mixture (ARGSM) model, which captures the temporal correlation structure of both the velocity and the fluctuating scale. Retinal responses to object motion have been characterized previously using carefully controlled artificial stimuli with Gaussian and linear statistics, revealing an efficient predictive code through the information bottleneck method. Here, we will extend this analysis to more naturalistic stimuli by incorporating a fluctuating scale variable matched to the statistics of natural scenes. We will bring together new experimental access to full retinal ganglion cell (RGC) populations and our new theory about predictive coding and natural motion statistics. We will quantify predictive information about 1D and 2D motion trajectories in complete populations and sub-populations of mouse RGCs. Theoretically, this will require new calculations of information bottleneck-optimal representations under the ARGSM model. These will allow us to assess the performance of the retinal code using state-of-the-art recordings of mouse RGCs.
Of particular interest are the contributions of the great diversity of RGC subtypes to the neural coding of these dynamically rich, naturalistic stimuli.
Understanding Synaptic Wiring Rules in the C. elegans Brain
István Kovács
(Northwestern University)
Principal Investigator
Engin Özkan
(University of Chicago)
Collaborator
Abstract: Ongoing advances in brain imaging and single-cell RNA sequencing have produced a massive amount of data on the genetic identity of neurons and their synaptic connections. However, these advances set up a complexity bottleneck: We need guiding frameworks to integrate and conceptualize this data, distill the key emergent patterns and aid new biological discovery. Addressing this knowledge gap, we will focus on the following critical questions: i) What are the connection rules of neural networks, governing the emergent network structure and wiring mechanisms? ii) How can we integrate the existing connectomics, proteomics and transcriptomics datasets into a coherent and predictive theoretical framework? To start, we need to decode the genetic programs behind synapse formation and maintenance to make sense of the data and gain insight into the network organization and functional circuitry of the brain. As a solution, we propose a scalable modeling framework building upon our recently pioneered Spatial Connectome Model (SCM). The central hypothesis of the SCM is that synapses emerge due to an underlying wiring rule network that connects pre-synaptic and post-synaptic neuron features. First, we will develop scalable solutions to the SCM, using two alternative approaches, a Bayesian framework and an expectation maximization route. While the original model was linear, the underlying biological rules are highly non-linear and we aim to introduce and solve the non-linear SCM. In addition, we will introduce and solve a local SCM, allowing for the wiring rules to vary over different parts of the network. We will also develop a novel mathematical framework to debias experimental data for protein-protein interactions, extending the maximum entropy framework. Our research strategy focuses on C. elegans as a model organism and combines tools from neuroscience, molecular biology, network science, and statistical physics to capture complex wiring mechanisms as well as key biological constraints. We will provide a series of falsifiable predictions, starting with i) neuron wiring rules, and ii) inferring missing synapses from the input data, as well as iii) inferring changes in the connectome upon genetic perturbations. We will then experimentally validate the key predictions of our computational framework using in vivo and in vitro approaches in C. elegans.
Modeling and Analysis of Synchronous Behavior in Biological Systems
Daniel Abrams
(Northwestern University)
Principal Investigator
Guy Amichay
(Northwestern University)
Collaborator
Abstract: How does collective synchronous behavior emerge in living systems? We focus on two readily observable and malleable systems: firefly swarms (flashing in unison) and groups of fiddler crabs (waving their large claws in sync). In both examples, these groups are composed of males attempting to woo females through such collective displays. Our research will use four complementary approaches: (i) fieldwork to compile unparalleled new datasets (including the development and employment of novel experimental paradigms, perturbing animals with artificial conspecifics in the wild), (ii) development of new mathematical models for understanding the mechanism underlying biological sync (with potential implications for broader evolutionary theory), (iii) monitoring of a firefly population for conservation efforts, and (iv) outreach and dissemination of our models and results to the public via podcast, visual arts, and print journalism. On the mathematical side, we are particularly interested in novel coupled oscillator models that can exhibit both phase synchrony and "breathing" chimera states that have been glimpsed in preliminary data. These models include oscillators coupled not just through phase but also through amplitude and/or frequency. We will explore such models on coupling networks of increasing realism, and we also plan to study the model selection problem in the context of oversampled dynamical data, where conventional approaches may need to be modified.
Keeping Growing Clocks in Sync
Rosemary Braun
(Northwestern University)
Principal Investigator
Michael Rust
(University of Chicago)
Collaborator
Abstract: The circadian clock, an endogenous near-24 h rhythm that can be entrained to environmental time cues, is a ubiquitous feature of life on Earth. An ancient mechanism for circadian oscillation is found in cyanobacteria, photosynthetic microbes found across the globe. Oscillations are based on cyclic phosphorylation of KaiC molecules, which are coupled together to create coherent bulk oscillations, a phenomenon which can be reconstituted using purified proteins (KaiA, KaiB, KaiC). However, in some cyanobacteria, the growth rate can be as much as 10x the oscillator frequency. In such a situation, the vast majority of protein at the end of a circadian cycle will be new protein that was not present at the beginning of the cycle. We will answer fundamental questions in this system using a combination of theoretical approaches and in vivo and in vitro measurements: How are new KaiC molecules “brought up to speed” without disturbing the frequency? Is this achievable by Kai proteins in isolation or does it require in vivo mechanisms? Is there a fundamental upper limit to the growth rate that is still compatible with oscillation? In addition to studying specific chemical reaction models of the Kai system, we will study the generic effects of growth on coupled oscillator systems by introducing a birth-death process into Kuramoto-like models of coupled phase oscillators to study the transition between phase-locked rhythms and desynchrony.
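To make the last idea concrete, here is a minimal illustrative sketch of a mean-field Kuramoto model with turnover. It is not the project's model. It assumes a fixed population size, a hypothetical `turnover` rate at which an oscillator is replaced by a "newborn" with a uniformly random phase (a crude stand-in for dilution by growth), and arbitrary parameter values chosen only for illustration.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1]; r near 1 means phase-locked."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def simulate_kuramoto(n=100, coupling=2.0, turnover=0.0, dt=0.01,
                      steps=3000, seed=0):
    """Euler-integrate mean-field Kuramoto oscillators with random replacement.

    turnover is a hypothetical per-oscillator replacement rate (per unit
    time). A replaced oscillator keeps its natural frequency but restarts
    at a uniformly random phase. All values here are illustrative.
    """
    rng = random.Random(seed)
    omegas = [rng.gauss(1.0, 0.1) for _ in range(n)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        # Mean field: z = r * exp(i * psi)
        z = sum(cmath.exp(1j * p) for p in phases) / n
        r, psi = abs(z), cmath.phase(z)
        # d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i)
        phases = [p + dt * (w + coupling * r * math.sin(psi - p))
                  for p, w in zip(phases, omegas)]
        # Birth-death step: each oscillator is replaced with probability
        # turnover * dt per time step.
        for i in range(n):
            if rng.random() < turnover * dt:
                phases[i] = rng.uniform(0.0, 2.0 * math.pi)
    return order_parameter(phases)
```

Calling `simulate_kuramoto(turnover=0.0)` and then `simulate_kuramoto(turnover=5.0)` compares the final order parameter without and with fast replacement. In this toy setting, replacement that is fast relative to the coupling strength prevents the population from phase-locking.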
What’s the Place for Planning?
Malcolm MacIver
(Northwestern University)
Principal Investigator
Daniel Dombeck
(Northwestern University)
Collaborator
Matthew Kaufman
(University of Chicago)
Collaborator
Bradly Stadie
(Northwestern University)
Collaborator
Abstract: The crux of our proposed work revolves around the contrast between the diminishing role of planning in artificial intelligence (AI) and its significance in biological contexts. AI has shifted from an emphasis on planning to deep reinforcement learning (RL), particularly in environments where it thrives, like fully observable and reversible situations. However, RL struggles in irreversible scenarios, such as those involving fatal risks or mechanical constraints in robotics, because its trial-and-error approach is unsuitable. These issues reveal a gap in our understanding of problems that are better addressed by planning rather than RL. To bridge this gap, the study introduces a combined theoretical and empirical approach. Theoretically, it proposes a new framework called Q-Zero, to explore the extent to which agents can plan in complex settings, aiming to uncover advantages of planning in domains identified by systems neuroscience and evolutionary neurobiology. Empirically, the project will examine planning in predator-prey interactions and in limb movement planning. This research intends to fundamentally advance our grasp of the interplay between planning and learning in AI and biology.
A Theory of Models for Complex Ecology
Seppe Kuehn
(University of Chicago)
Principal Investigator
Madhav Mani
(Northwestern University)
Collaborator
Abstract: This proposal aims to develop new mathematical techniques to gain a deeper understanding of the "Theory of Models" in complex living systems. We will employ this formalism to explore how dynamic variations in the soil microbiome contribute to metabolic stability in the face of environmental changes. Effective models are essential for interpreting complex ecosystem functions in response to environmental disturbances, allowing us to quantify data and propose underlying mechanisms. Ecosystems offer an ideal setting to enhance our understanding of the mathematical constraints associated with the Physics of Models. We will develop new algorithms to facilitate data-driven model discovery and the identification of collective variables. Building upon groundbreaking research related to "Sloppy Models" (Machta et al., 2013), our approach will harness the power of massively parallel experiments on soil microcosms, precise quantification of metabolite dynamics, controlled perturbations, and quantitative sequencing data. Our robust mathematical framework will enable us to develop ecological models that are easily interpretable.
Inferring Models for Microbial Dynamics
Stefano Allesina
(University of Chicago)
Principal Investigator
Niall Mangan
(Northwestern University)
Collaborator
Mary Silber
(University of Chicago)
Collaborator
Rebecca Willett
(University of Chicago)
Collaborator
Abstract: Microbial communities are widespread from the human gut to the deep ocean and influence systems including animal development, host health, and biogeochemical cycles. Characterization of complex communities is challenging, as morphology, physiology, evolution, and sensitivity to the environment all influence microbial interactions, a richness not captured in commonly used Lotka-Volterra-style models originally developed for macro-scale ecological systems. High throughput sequencing has enabled high-resolution quantification of populations within natural and synthetic communities, which could aid in the development of novel mathematical models to explain complex interactions such as diauxic shifts, cross-feeding, biofilm formation, and pH modification. Data-driven model development presents several mathematical challenges: 1) usually measurements of relative but not absolute abundances are available, 2) unmeasured dynamic variables such as nutrient levels can strongly impact populations, and 3) evaluation of all possible models and interactions is costly due to the combinatorial complexity of possible interactions. Challenges 1 and 2 manifest mathematically as identifiability issues; multiple models and parameter sets can produce the same trends in the data. To identify ensembles of possible models, we will perform parameter estimation across tens of thousands of possible models capturing the range of interaction mechanisms. Informed by commonalities of structure and behavior in the ensemble and statistical analysis of fluctuations, we will develop identifiability-informed model sampling techniques to accelerate future screens and infer absolute abundance dynamics from relative abundance data.
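The identifiability issue in Challenge 1 can be illustrated with a small sketch. It uses a generalized Lotka-Volterra model as a stand-in for the Lotka-Volterra-style models mentioned above, not the models the project will develop. The function names and all parameter values are hypothetical. Scaling every abundance by a constant `c` while dividing the interaction matrix by `c` gives a different model with exactly the same relative abundances.

```python
def glv_step(x, r, A, dt):
    """One Euler step of dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    return [x[i] + dt * x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(n)))
            for i in range(n)]


def simulate_glv(x0, r, A, dt=0.01, steps=2000):
    """Return the list of absolute-abundance states along the trajectory."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(glv_step(traj[-1], r, A, dt))
    return traj


def relative(x):
    """Convert absolute abundances to relative abundances (fractions)."""
    total = sum(x)
    return [xi / total for xi in x]


def max_relative_gap(traj_a, traj_b):
    """Largest difference in relative abundance between two trajectories."""
    return max(abs(p - q)
               for xa, xb in zip(traj_a, traj_b)
               for p, q in zip(relative(xa), relative(xb)))


# Illustrative two-species competitive community (made-up numbers).
r = [1.0, 0.7]
A = [[-1.0, -0.5], [-0.3, -1.0]]
x0 = [0.1, 0.2]

# A second model: abundances scaled by c, interactions scaled by 1/c.
c = 10.0
A_scaled = [[a / c for a in row] for row in A]
x0_scaled = [c * xi for xi in x0]

traj = simulate_glv(x0, r, A)
traj_scaled = simulate_glv(x0_scaled, r, A_scaled)
gap = max_relative_gap(traj, traj_scaled)  # zero up to floating-point error
```

Here `gap` vanishes up to rounding even though the two communities differ tenfold in absolute abundance. Relative abundance data alone therefore cannot separate the two parameter sets.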
Topological Analysis of Biological Data
Samantha Riesenfeld
(University of Chicago)
Principal Investigator
Shmuel Weinberger
(University of Chicago)
Collaborator
Richard Carthew
(Northwestern University)
Collaborator
Abstract: Understanding the underlying geometry and topology of biological data is a challenging problem that is key to improving inference. This new collaboration will adapt topological data analysis (TDA) tools to high-dimensional biological data, focusing on two datasets with distinct, complementary features: (i) images of morphological variation across closely related species, and (ii) transcriptomic samples of gene expression variation across related tissue samples. These very different data sets will give us a chance to study two different aspects of the usual TDA pipeline. The first is the possibility of defining “testable” invariants (along the lines of property testability), i.e. ones that do not require full consideration of all the data to be approximated, and seeing whether they have utility for biological applications. The second is whether the space underlying a data set is actually Euclidean, or whether its embedding in Euclidean space (induced by the use of a number of measurements of each datum) induces metric distortion. We hope to approach this using the theory of metric distortion, and hope that this initiates a new large-scale approach to understanding the geometry of data.
Inverse problem of inferring adaptive strategies from the statistics of rare events
Arvind Murugan
(University of Chicago)
Principal Investigator
Yogesh Goyal
(Northwestern University)
Collaborator
Abstract: Biology exhibits diverse adaptation strategies for dealing with changing environments, ranging from Darwinian multi-generational processes that play out over millions of years to within-a-lifetime learning. The underlying mechanistic basis of these strategies is highly varied and context dependent. The traditional, time-consuming approach has been to distinguish these strategies with mechanistic experimental approaches. Here we propose building a mathematical framework to guide high throughput experiments that will use rare event sampling to reveal learning and adaptation strategies. We will apply our mathematical framework to experiments on drug resistance in cancer cells and in microbes. The proposed work will (a) solve the inverse problem of inferring a broad class of adaptation strategies with finite heritability from the shape of rare-event distributions, (b) tailor the proposed mathematics to specific regimes accessible in current high-throughput experiments, and (c) develop novel experimental workflows for studying drug resistance in cancer cells and microbes.
Developing predictive frameworks for the control of non-equilibrium cellular membrane dynamics
Suriyanarayanan Vaikuntanathan
(University of Chicago)
Principal Investigator
Petia Vlahovska
(Northwestern University)
Collaborator
Abstract: The material properties of biological membranes control a vast array of molecular processes. Biological lipid membranes behave like fluids in plane and exhibit elastic fluctuations out of plane. While the basic driving forces for describing membrane biophysics are easy to formulate, the emergent properties and morphologies they can elicit remain important open questions. In this project, we seek to leverage modern advances in non-equilibrium statistical mechanics along with ideas from representation learning and AI to identify low-dimensional physical laws for the non-equilibrium dynamics of biomimetic membranes. We will use a combination of experiments by Petia Vlahovska and co-workers, and theory from Petia Vlahovska, Suri Vaikuntanathan and co-workers. Briefly, data from experiments studying the fluctuations of model lipid membranes in electric fields mimicking polarized cellular membranes will, to the best of our knowledge for the first time, be analyzed using dimensionality reduction techniques to infer physical laws and constraints in low-dimensional spaces. These constraints will then be related to modern non-equilibrium thermodynamic bounds. If successful, this integration of theoretical approaches based on thermodynamics and AI and experiments with biomimetic membranes will compactly reveal how biological lipid membrane dynamics can be described, controlled, and leveraged. The project will establish a new collaboration of faculty from University of Chicago and Northwestern University with complementary expertise in statistical and continuum mechanics modeling, and experimental biomimetic membrane systems.
Towards the Molecular Basis of Graft Compatibility
Adilson Motter
(Northwestern University)
Principal Investigator
Nyree Zerega
(Northwestern University)
Collaborator
Linking gene expression profiles to firing properties in the Drosophila thermosensory circuit
William Kath
(Northwestern University)
Principal Investigator
Marco Gallio
(Northwestern University)
Collaborator
Abstract: We will use neurons that are part of the Drosophila thermosensory circuit as models to explore the extent to which molecular profiling data obtained through single-cell patch-sequencing can be used to predict a neuron’s firing properties. We will develop new information theory-based data filtering and similarity methods to compare patch-seq and single-cell data and obtain improved estimates of ion channel expression in these neurons. We will then produce anatomically realistic models of thermosensory neurons by using the expression data to populate models of these neurons with candidate ion channels and regulators. Furthermore, we will extend current evolutionary algorithms to incorporate ion channel expression correlation data to improve fits to in vivo electrophysiological measurements of each cell type. Our goal is twofold: to determine the extent to which a neuron’s firing properties can be extrapolated from its gene expression profile and to explore how underlying variability in a neuron’s repertoire of components can nevertheless lead to robust firing properties. Overall, we expect that our methods and models will bring clarity regarding the patterns of expression of ion channels, receptors, and signal transduction components that are key determinants of the functional properties of neurons in the thermosensory circuit, point to nodes of the network that may be particularly sensitive to perturbation, and make specific predictions on the effects of such perturbations, eventually directing new experiments that exploit cell-type specific RNAi and genetic mutants.