Multi-scale models in infectious disease epidemiology
Models that link within- and between-host processes are potentially important tools in disease ecology, but the disease research community remains somewhat divided about when and how they should be used. We discussed cross-scale modeling at a recent CIDD lunch, and while the discussion was fairly free-wheeling and ad hoc, a few consistent themes emerged. Here, I draw on those themes, as well as a few of my own observations on the role – realized or potential – that multi-scale modeling plays in modern infectious disease epidemiology.
I'm an empiricist working primarily on between-host modeling, and that perspective flavors the challenges I identify. I hope this post will serve as a jump-off point for a broader discussion among folks with more varied backgrounds.
Introduction
Multi-scale models use within-host dynamics to drive between-host epidemiological (or evolutionary) processes. This is implemented by treating state-transition rates in the between-host process (for example, transmission and disease-induced mortality rates in a classic SIR model) as functions of dynamics within the host. In their 2008 TREE paper, Mideo, Alizon, and Day note that multi-scale models are only essential in cases where reciprocal feedbacks link within- and between-host processes (for example, this occurs when SIR state-transitions depend on within-host dynamics, and also within-host dynamics depend on conditions at the population level). Mideo, Alizon, and Day observed that at the time they wrote, these reciprocal feedbacks were not actually included in the majority of publications on multi-scale models for pathogen evolution; in these cases, the resulting inferences could have emerged from examination of a single scale.
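To make the phrase "state-transition rates as functions of within-host dynamics" concrete, here is a minimal sketch of a one-way (within-host to between-host) link. Everything in it is an assumption chosen for illustration rather than taken from any of the papers discussed here: the toy pathogen-immune equations, the saturating map from pathogen load to transmission rate, the parameter values, and the function names. It integrates a within-host model over infection age, converts load into a per-day transmission rate, and sums that rate over the course of infection to obtain a crude R_0.

```python
def within_host_load(r=1.0, k=1e-3, rho=1.5, phi=1e3,
                     p0=1.0, x0=1.0, days=30.0, dt=0.01):
    """Euler-integrate a toy within-host model (illustrative assumption).

    dP/dt = r*P - k*P*X            pathogen growth minus immune killing
    dX/dt = rho*X*P / (P + phi)    immune expansion stimulated by pathogen

    Returns infection ages (days) and pathogen load at each age.
    """
    steps = int(round(days / dt))
    p, x = p0, x0
    ages, loads = [0.0], [p0]
    for i in range(1, steps + 1):
        dp = r * p - k * p * x
        dx = rho * x * p / (p + phi)
        p = max(p + dt * dp, 0.0)  # clamp so load can never go negative
        x = x + dt * dx
        ages.append(i * dt)
        loads.append(p)
    return ages, loads


def transmission_rate(load, beta_max=0.5, half_sat=1e4):
    """Hypothetical saturating map from pathogen load to transmission rate."""
    return beta_max * load / (load + half_sat)


def r0_from_within_host(beta_max=0.5, half_sat=1e4, dt=0.01, **within_host_kwargs):
    """Crude R_0: transmission rate summed over infection age.

    Assumes a fully susceptible population and no host death during
    infection, so infectiousness is set entirely by pathogen load.
    """
    _, loads = within_host_load(dt=dt, **within_host_kwargs)
    return sum(transmission_rate(p, beta_max, half_sat) * dt for p in loads)


if __name__ == "__main__":
    ages, loads = within_host_load()
    peak = max(loads)
    print("peak load at day", ages[loads.index(peak)])
    print("R_0 (toy parameters):", r0_from_within_host())
```

Note that this sketch has no reciprocal feedback: nothing at the population level changes the within-host trajectory. In the sense of Mideo, Alizon, and Day, it is the kind of model whose conclusions could probably be reached from a single scale; adding feedback would mean letting something like the initial dose `p0` depend on the state of the population.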
Yet in many situations, within- and between-host processes do have reciprocal feedbacks, whether our models account for them or not. For example, reciprocal dependencies exist in systems where within-host dynamics depend on initial dose, and where individual infectiousness or morbidity depends on both immune and pathogen dynamics. Since both of these dependencies are pervasive among infectious diseases, there is reason to think that multi-scale models may have broad applicability.
The apparent biological plausibility of multi-scale models raises two questions. 1) Which systems stand to benefit the most from multi-scale approaches? 2) If multi-scale models have so much utility, why aren't we using them more frequently?
What impedes multi-scale model development?
I see three clear barriers inhibiting multi-scale model development, each of which I describe below.
1. Language and style
2. Articulating model objectives
3. Gathering and incorporating data
1. Language and style
Within- and between-host modeling cultures are fairly distinct from one another, and for the inexperienced practitioner this poses a challenge for model design. Researchers trained in between-host dynamics come from a culture in which the vast majority of models derive from a single basic structure, Susceptible-Infected-Recovered (SIR). A rich knowledge of SIR-like models is crucial for disease ecologists, but comes with two important limitations.
First, SIR-focused modelers may not be particularly well-versed in other kinds of consumer-resource interactions, like mutualisms, competition, or predator-prey dynamics (for a good synthesis of consumer-resource models, see Lafferty et al. Science 2015). This limits the set of model structures we draw from when considering the various co-stimulatory relationships governing interactions between pathogen and host immune responses.
Second, population-level disease modelers are used to working on models with relatively well-defined states. S, I, and R are good jump-off points for almost every between-host disease model. Although one state, pathogen population size, is an obvious fixture of most within-host models, constraining the host immune response to a limited number of appropriate compartments is really difficult. This issue is somewhat compounded by the immunology community's emphasis on specificity: binning different groups of immune markers into single entities is counter to current research in that academic culture. I suspect that the juxtaposition of population-level disease modeling's tendency to bin and immunology's tendency to split states stymies many multi-scale modeling efforts before they even begin.
2. Articulating model objectives
Disease researchers trained on between-host processes (or at least, the one writing this post) sometimes have a tendency to model first, and ask questions later. Because the equilibrium conditions of the SIR model and its various derivatives have been so extensively studied, just constructing and parameterizing an SIR-like model can offer a number of useful insights for a given system. Selecting which insight is most important from the outset of the modeling endeavor isn't always imperative.
The ability to choose model goals after-the-fact is less available for within-host – and consequently, multi-scale – models. Modeling first and asking questions later isn't feasible, since the set of possible model states and configurations in-host is so broad. Instead, within-host modelers tend to emphasize developing a specific question from the outset of an investigation, and selecting model states aimed at addressing that particular question.
3. Gathering and incorporating data
There is a huge volume of published data on within-host dynamics for all kinds of pathogen-host combinations (I can attest that this is true for Mycoplasma ovipneumoniae, which very few people care about, so I'm willing to postulate that it's probably true for whatever agent you're studying right now). Making sense of these studies, however, isn't trivial, and this is especially true for an outsider with limited microbiological and immunological literacy. Here's why: many between-host datasets are subject to relatively few (well-studied) assumptions with respect to data collection (e.g., sightability, population closure), whereas the assumptions underlying many within-host datasets, which often come from studies that directly manipulate the host's health through gene knock-outs, particular nutritional regimens, or chronic stress, are daunting. Understanding and adjusting for potential biases also requires a reasonable working knowledge of a wide range of immunological and microbiological methods and their idiosyncrasies (how accurate IS that ELISA, anyway?), a good understanding of the study conditions, and an appreciation for how those conditions differ from conditions in nature. Furthermore, longitudinal datasets, or other data that capture temporal dynamics within hosts, are relatively rare in immunology. For ecological modelers whose data gold standard is rich, replicated timeseries, a lack of analogous data within the host seems acutely problematic.
However, this need not be the case. Other data structures (especially multi-sectional datasets) do contain relevant (albeit sometimes less) information, and replication under slightly varied conditions is exactly the kind of setting that Bayesian inference handles well. The empirical challenge is to figure out ways to appropriately leverage these less-than-ideal datasets. Methodologically, that challenge has largely been solved through partially observed Markov process modeling, approximate Bayesian computation, and other approaches, but these are still rarely implemented. Conflicting attitudes about when particular datasets can and should be used may impede this process, but the statistical infrastructure for these models already exists.
When is a multi-scale model worth the effort?
Despite the challenges, I am convinced that multi-scale models are crucial to advancing infectious disease epidemiology. Here's why:
We're currently very good at modeling epidemics for which Infected is Infected is Infected. Despite the clamour surrounding Ebola modeling, I think between-host models generally perform very well for acutely immunizing infections, in which the preponderance of heterogeneity is attributable to host behavior.
BUT
With the notable exception of HIV (perhaps a special case, given that so much heterogeneity in secreted load can be explained by infection age), we do not have good population-level models of most chronic pathogens. Population-level epidemiology of pathogens like Herpes Simplex Virus, Cryptosporidium, or tuberculosis that are typically latent but occasionally “flare up” (apparently at random) within particular hosts is not well-characterized*. Understanding transmission dynamics of these pathogens requires understanding the within-host processes that allow for re-emergence, and the population-level consequences of increased individual infectiousness. In short, what allows sporadic epidemic transmission of a pathogen that was locally endemic the whole time? The answer is sometimes pathogen evolution, but I posit that other factors may also contribute. Within-host complexities that lead to sustained infectious periods and pathogen re-invigoration have been well-reviewed (for example, see the excellent review by Virgin et al. 2009), but I have yet to see these ideas extensively translated into population-level predictions.
I suspect that multi-scale modeling may be most useful at a different point in a research program than between-host models. While between-host models often occupy the ends of epidemiological projects (e.g., through assessing R_0's sensitivity to particular aspects of the disease process), and lead directly to management recommendations, multi-scale models might be best used to generate hypotheses, identify key pieces of missing information, and constrain the relevant space of unknown parameters for future experimental investigation. As a consequence, the emphasis of model outputs might shift from estimation, as in between-host models, to mathematical sensitivity. This transition will require some flexibility on the part of practitioners. Although the benefits are not entirely clear from the outset, multi-scale approaches hold a lot of promise, especially for pathogens that continue to defy clear population-level modeling.
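As a rough illustration of what a sensitivity-focused output could look like, the sketch below computes elasticities (proportional sensitivities) of R_0 to both within-host and between-host parameters by central finite differences. The R_0 expression, the parameter names, and the baseline values are all hypothetical and chosen only to keep the example small: a within-host set-point load (replication divided by clearance) raises both transmission and disease-induced mortality.

```python
def r0(params):
    """Toy multi-scale R_0. All functional forms are illustrative assumptions."""
    # Within-host scale: set-point pathogen load.
    load = params["replication"] / params["clearance"]
    # Cross-scale links: load drives transmission and disease-induced mortality.
    beta = params["beta_max"] * load / (load + params["half_sat"])
    alpha = params["virulence_per_load"] * load
    # Between-host scale: transmission rate times mean infectious duration.
    return beta / (params["background_mortality"] + params["recovery"] + alpha)


def elasticity(f, params, name, rel_step=1e-4):
    """Approximate (theta / f) * df/dtheta by a central finite difference."""
    up = dict(params)
    up[name] = params[name] * (1 + rel_step)
    down = dict(params)
    down[name] = params[name] * (1 - rel_step)
    return (f(up) - f(down)) / (2 * rel_step * f(params))


# Hypothetical baseline values, not estimates for any real pathogen.
baseline = {
    "replication": 2.0,
    "clearance": 0.5,
    "beta_max": 0.6,
    "half_sat": 4.0,
    "virulence_per_load": 0.01,
    "background_mortality": 0.02,
    "recovery": 0.1,
}

elasticities = {name: elasticity(r0, baseline, name) for name in baseline}

# Rank parameters by how strongly a 1% change moves R_0.
for name, value in sorted(elasticities.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>22s}  {value:+.3f}")
```

A ranking like this says nothing about parameter values themselves; its use is in flagging which poorly known within-host quantities would be most worth measuring experimentally.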
Parting thought
Perhaps the real challenge, then, is that multi-scale models might be most useful in situations that defy standard assumptions both within and between hosts. I think that the following kinds of infections fall into this group:
infections with relevant spatial compartmentalization within the host (e.g., HepC, HSV)
epithelial infections (e.g., infections of the respiratory, GI, or reproductive tracts)
infections that stimulate autoimmune elements of the immune system
infections in which immune regulation misfires (either by over- or under-response)
References
Mideo N, Alizon S, Day T. 2008. Linking within- and between-host dynamics in the evolutionary epidemiology of infectious diseases. Trends in Ecology and Evolution 23(9); 511-517.
Handel A, Rohani P. 2015. Crossing the scale from within-host infection dynamics to between-host transmission fitness: a discussion of current assumptions and knowledge. Philosophical Transactions of the Royal Society B 370; 20140302.
Lafferty KD, DeLeo G, Briggs CJ, Dobson AP, Gross T, Kuris AM. 2015. A general consumer-resource population model. Science 349(6250); 854-857.
Virgin HW, Wherry EJ, Ahmed R. 2009. Redefining chronic viral infection. Cell 138(1); 30-50.
Comments and discussion
Jessica Conway made the following comment, which I'll quote directly:
[You claim that] "Understanding transmission dynamics of these pathogens requires understanding the within-host processes that allow for re-emergence, and the population-level consequences of increased individual infectiousness." It doesn't really, does it? Take HSV. To understand population-level spread, you need outbreak pattern data, length, duration, and frequency. Do you need to know what drives them? Probably not to understand spread. Maybe yes, to determine interventions that minimize spread, but for that you pretty much want to stop flare-ups, and the between-host scale is not helpful. To understand evolution however likely requires both scales.
This is a really good point. In-host models are probably sufficient, so long as initial dose isn't a critical determinant of infection outcome in the host. If dose is critical, then I think there is an opportunity for feedback across scales that could make multi-scale models helpful.