Evolution involves random events, including new mutations, and natural selection, which favors fitter variants over others.
When is evolution predictable? For fast-evolving viruses, limited predictability emerges from three key factors. First, because mutation rates are high, the population continuously produces new variants, some of which have a fitness advantage. Second, selection between variants is strong, so fitter variants spread rapidly. Third, important selection components are computable from past data, notably the immunity in the human population induced by previous variants.
Evolutionary predictions are concerned with genetic changes of a virus, their effects on viral functions and fitness, and the resulting shifts in the population of circulating viruses. Evolutionary predictions are different from epidemiological predictions, which are estimates of future case numbers.
Current predictive methods span periods of about one year into the future. Recent approaches also predict key properties of new viral variants even before they are observed.
Predictive analysis can be extended to estimate how the viral population responds to human interventions. In this way, it can inform vaccination strategies that are effective against future variants. For more details, see selected publications [1,2,6,7].
Importantly, predictive analysis estimates the probability of future evolutionary trajectories but does not give certainty. Model-based predictions are limited by noisy and incomplete input data, fitness effects not included in the model, and the intrinsic stochasticity of evolution (e.g., the emergence and effects of new mutations).
Our methods combine worldwide genetic, epidemiological, and molecular data.
Sampling of viral isolates provides genome sequences of circulating strains, resolved by time and region of collection. Sequence data serves to estimate the frequencies of variants and to build evolutionary trees of the virus.
Epidemiological data, including case numbers of infections and vaccinations, enter estimates of immunity present in the human population.
Molecular experiments provide information relevant for viral fitness, notably the neutralization of viruses by human immune antisera and by monoclonal antibodies.
All data sources of our predictive analysis are listed here.
Human cohorts. Each human has a unique immunological history with regard to respiratory diseases. Understanding the collective evolutionary pressure exerted by the human population is key to predicting viral evolution. At Previr, we have access to several cohorts and biobanks established at the University Hospital of Cologne or within cross-center cooperation projects. These cover a diverse population with a variety of virus-host interactions and provide data for predictions. We characterize the humoral immune response within cohorts to learn about commonalities and variation of response patterns between individuals.
Monoclonal antibodies. A major part of our experimental research deals with the analysis of monoclonal antibodies. We perform genotypic and phenotypic characterization in the context of natural infection and vaccination. To this end, we have developed high-throughput protocols to isolate hundreds of monoclonal antibodies from an individual. A network of international collaboration partners provides expertise in structural and further functional analysis of monoclonal antibodies. For more details, see selected publications [3,4,5].
Our predictions are based on strain- and variant-specific models for viral fitness. Fitness differences between co-circulating variants are used to compute likely future changes in their frequencies. We use a fitness model of the basic form
\[
f_\alpha (t)= f_{\rm{int},\alpha}-\sum_\kappa c_{\alpha \kappa} r_\kappa (t).
\]
This model computes the growth rate \(f_\alpha(t) \), or absolute fitness, of a given variant \(\alpha\) at a given time \(t\). The first term \(f_{\rm{int},\alpha}\) is the intrinsic fitness component, which is related to the basic reproductive number. Viral traits relevant for intrinsic fitness include protein fold stability and binding to human cell receptors. The second term is the antigenic fitness component. This term measures the selection pressure induced by immunity of the human population, which in turn is generated by previous infections and vaccinations. Here \(c_{\alpha\kappa}\) is the amount of immunity against variant \(\alpha\) in humans with a given infection/vaccination history \(\kappa\) as estimated from neutralization data, and \(r_\kappa(t)\) is the fraction of humans in group \(\kappa\) at time \(t\). This form can be generalized to include immune waning. More details are given in selected publications [1,6,7].
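The basic form of the fitness model can be illustrated with a minimal Python sketch. The function name `variant_fitness` and all numerical values below are hypothetical and chosen only for illustration; they are not taken from the Previr pipeline or from fitted data. The sketch assumes that the intrinsic fitness values, the immunity matrix \(c_{\alpha\kappa}\), and the group fractions \(r_\kappa(t)\) at one time point are already available as arrays.

```python
import numpy as np


def variant_fitness(f_int, c, r):
    """Evaluate f_alpha = f_int_alpha - sum_kappa c[alpha, kappa] * r[kappa].

    f_int : intrinsic fitness per variant, shape (n_variants,)
    c     : immunity against each variant in each immune group,
            shape (n_variants, n_groups)
    r     : fraction of the population in each immune group at one
            time point, shape (n_groups,)
    """
    f_int = np.asarray(f_int, dtype=float)
    c = np.asarray(c, dtype=float)
    r = np.asarray(r, dtype=float)
    # The matrix-vector product performs the sum over immune groups kappa.
    return f_int - c @ r


# Hypothetical example: two variants, two immune groups (illustrative numbers).
f_int = [0.10, 0.12]
c = [[0.8, 0.2],   # variant 0 is strongly neutralized by group 0
     [0.3, 0.1]]   # variant 1 partially escapes both groups
r = [0.5, 0.25]

fitness = variant_fitness(f_int, c, r)
fitness_difference = fitness[1] - fitness[0]
```

In this toy example the second variant has the higher fitness because population immunity acts less strongly against it. Only fitness differences between co-circulating variants matter for frequency changes.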
Previr uses a comprehensive computational pipeline for integrative processing of sequence data, epidemiological data, and neutralization data for a given pathogen. This pipeline is continuously adapted to new research questions and public health demands. Our analysis has the following steps.
Data curation of viral sequences. This step filters out low-quality or incomplete isolate sequences.
Sequence alignment. This step maps nucleotide substitutions along the genome between all sequences of the data set.
Reconstruction of the strain tree. This step builds a family tree for all isolates of the data set based on their sequence similarity.
Tracking of clade frequencies. We estimate worldwide clade frequency trajectories, using sequence data counts mapped to clades together with epidemiological case data.
Inference of antigenic evolution. We map antigenic escape evolution using curated neutralization titers of multiple test strains against multiple reference antisera.
Evolution of population immunity. We track population immunity trajectories, using neutralization data and immune weights inferred from past epidemiological data.
Computation of viral fitness and predictions. We compute the relative fitness of viral clades and use this to predict the evolution of clade frequencies.
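The last step, going from clade fitness to predicted clade frequencies, can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline itself: it holds each clade's fitness constant over the prediction interval and applies exponential growth followed by normalization, a common textbook form of frequency propagation under selection. The function name `predict_frequencies` and all numbers are hypothetical.

```python
import numpy as np


def predict_frequencies(x0, fitness, dt):
    """Propagate clade frequencies over a time interval dt.

    Assumes constant fitness over dt (a simplification):
        x_alpha(t + dt) is proportional to x_alpha(t) * exp(f_alpha * dt),
    normalized so that the frequencies sum to one.

    x0      : current clade frequencies, shape (n_clades,)
    fitness : growth rate per clade, in units of 1 / time
    dt      : prediction interval, in the same time units
    """
    x0 = np.asarray(x0, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    log_w = fitness * dt
    # Subtracting the maximum leaves the normalized result unchanged
    # and avoids overflow in the exponential.
    log_w = log_w - log_w.max()
    x = x0 * np.exp(log_w)
    return x / x.sum()


# Hypothetical example: a rare clade with a fitness advantage of 0.1 per day.
x0 = [0.9, 0.1]
clade_fitness = [0.0, 0.1]
predicted = predict_frequencies(x0, clade_fitness, dt=10.0)
```

Because the result is normalized, adding the same constant to every clade's fitness leaves the prediction unchanged. This is why relative fitness is sufficient for predicting frequency changes.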
[1] Meijers M., Ruchnewitz D., Eberhard J., Łuksza M., Lässig M. Population immunity predicts evolutionary trajectories of SARS-CoV-2. Cell 186, 1–14 (2023).
[2] Lässig M., Mustonen V. & Nourmohammad A. Steering and controlling evolution — from bioengineering to fighting pathogens. Nature Reviews Genetics (2023).
[3] Gruell H., …, Klein F. SARS-CoV-2 Omicron sublineages exhibit distinct antibody escape patterns. Cell Host & Microbe, 30(9), 1231-1241 e1236 (2022).
[4] Gieselmann L., Kreer C., …, Klein F. Effective high-throughput isolation of fully human antibodies targeting infectious pathogens. Nature Protocols, 16(7), 3639-3671 (2021).
[5] Kreer C., …, Klein F. Longitudinal Isolation of Potent Near-Germline SARS-CoV-2-Neutralizing Antibodies from COVID-19 Patients. Cell, 182(6), 1663-1673 (2020).
[6] Lässig M., Mustonen V. & Walczak A. Predicting evolution. Nature Ecology and Evolution 1, 0077 (2017).
[7] Łuksza M., Lässig M. A predictive fitness model for influenza. Nature 507, 57–61 (2014).