Fred Hutchinson Cancer Center / Howard Hughes Medical Institute

8 Oct 2024

KITP Workshop on Interactions and Co-evolution between Viruses and Immune Systems

University of California Santa Barbara

Slides at: bedford.io/talks

- Evolutionary patterns across endemic human viruses
- Frequency dynamics and fitness estimation
- Evolutionary forecasting

ONS Infection Survey provides a rare source of ground truth: roughly 1 in 3 infections were detected in 2021, versus 1 in 40 in 2023

Data from ONS

~110% population attack rate from March 2022 to March 2023

Post-Omicron period shows consistent IFR of 0.04%

Future frequency $x_i(t+\Delta t)$ of strain $i$ derives from strain fitness $f_i$ and present day frequency $x_i(t)$, such that

$$x_i(t+\Delta t) = \frac{1}{Z(t)} \, x_i(t) \, \mathrm{exp}(f_i \, \Delta t)$$

Strain frequencies at each timepoint are normalized by total frequency $Z(t)$. Strain fitness $f_i$ is estimated from viral attributes (primarily number of epitope and non-epitope mutations).
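As a minimal sketch of the projection above, the following Python applies exponential growth to each strain and renormalizes by $Z(t)$. The strain names, frequencies, and fitness values are hypothetical and chosen only for illustration; the estimation of $f_i$ from viral attributes is not shown.

```python
import math

def project_frequencies(freqs, fitness, dt):
    """Project strain frequencies forward by dt under exponential growth.

    freqs:   dict mapping strain -> present-day frequency x_i(t)
    fitness: dict mapping strain -> fitness f_i
    """
    unnormalized = {s: x * math.exp(fitness[s] * dt) for s, x in freqs.items()}
    z = sum(unnormalized.values())  # normalizing constant Z(t)
    return {s: v / z for s, v in unnormalized.items()}

# Hypothetical two-strain example (values are assumptions, not data)
current = {"A": 0.7, "B": 0.3}
fitness = {"A": 0.0, "B": 1.0}
projected = project_frequencies(current, fitness, dt=1.0)
print(projected)
```

Because of the normalization, only differences in fitness between strains affect the projected frequencies.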

$x' = \frac{x \, (1+s)}{x \, (1+s) + (1-x)}$ for frequency $x$ over one generation with selective advantage $s$

$x(t) = \frac{x_0 \, (1+s)^t}{x_0 \, (1+s)^t + (1-x_0)}$ for initial frequency $x_0$ over $t$ generations

Trajectories are linear once logit transformed via $\mathrm{log}(\frac{x}{1 - x})$
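The relationship between the one-generation update, the closed form, and the logit transform can be checked numerically. This is a sketch with assumed values $x_0 = 0.01$ and $s = 0.1$; the function names are illustrative. Each generation should add a constant $\log(1+s)$ on the logit scale.

```python
import math

def step(x, s):
    """One generation of selection for a variant at frequency x."""
    return x * (1 + s) / (x * (1 + s) + (1 - x))

def closed_form(x0, s, t):
    """Frequency after t generations starting from x0."""
    g = (1 + s) ** t
    return x0 * g / (x0 * g + (1 - x0))

def logit(x):
    return math.log(x / (1 - x))

# Assumed illustrative values
x0, s = 0.01, 0.1

# Iterate the one-generation update 50 times
traj = [x0]
for _ in range(50):
    traj.append(step(traj[-1], s))

# Per-generation change on the logit scale
logit_increments = [logit(b) - logit(a) for a, b in zip(traj, traj[1:])]
print(traj[-1], closed_form(x0, s, 50))
print(logit_increments[0], math.log(1 + s))
```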

Multinomial logistic regression across $n$ variants models the probability of a virus sampled at time $t$ belonging to variant $i$ as

$$\mathrm{Pr}(X = i) = x_i(t) = \frac{p_i \, \mathrm{exp}(f_i \, t)}{\sum_{1 \le j \le n} p_j \, \mathrm{exp}(f_j \, t) }$$

with $2n$ parameters consisting of $p_i$ the frequency of variant $i$ at initial timepoint and $f_i$ the growth rate or fitness of variant $i$.
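A minimal sketch of evaluating the expression for $x_i(t)$, assuming three hypothetical variants with made-up initial frequencies $p_i$ and growth rates $f_i$. Fitting these parameters to sequence counts is not shown here.

```python
import math

def mlr_frequencies(p, f, t):
    """Variant frequencies x_i(t) given initial frequencies p and growth rates f."""
    weights = [p_i * math.exp(f_i * t) for p_i, f_i in zip(p, f)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical parameters (assumptions for illustration only)
p = [0.90, 0.09, 0.01]   # initial frequencies
f = [0.00, 0.05, 0.10]   # growth rates per day

for t in (0, 30, 60):
    print(t, [round(x, 3) for x in mlr_frequencies(p, f, t)])
```

Adding the same constant to every $f_i$ leaves the frequencies unchanged, so in practice growth rates are interpreted relative to a reference variant.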

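| location | variant | date | sequences |
| --- | --- | --- | --- |
| Japan | 22B | 2023-02-10 | 242 |
| Japan | 22B | 2023-02-11 | 56 |
| Japan | 22B | 2023-02-12 | 70 |
| Japan | 22E | 2023-02-10 | 80 |
| Japan | 22E | 2023-02-11 | 21 |
| Japan | 22E | 2023-02-12 | 27 |
| USA | 22B | 2023-02-10 | 41 |
| USA | 22B | 2023-02-11 | 23 |
| USA | 22B | 2023-02-12 | 23 |
| USA | 22E | 2023-02-10 | 368 |
| USA | 22E | 2023-02-11 | 236 |
| USA | 22E | 2023-02-12 | 246 |
| ... | | | |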
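Counts in this shape can be turned into observed daily variant frequencies per location, the empirical counterpart of $x_i(t)$. The sketch below uses only the rows shown above, so the frequencies are relative to the two listed variants; the function name is illustrative.

```python
from collections import defaultdict

# Rows copied from the excerpt above: (location, variant, date, sequences)
rows = [
    ("Japan", "22B", "2023-02-10", 242),
    ("Japan", "22B", "2023-02-11", 56),
    ("Japan", "22B", "2023-02-12", 70),
    ("Japan", "22E", "2023-02-10", 80),
    ("Japan", "22E", "2023-02-11", 21),
    ("Japan", "22E", "2023-02-12", 27),
    ("USA", "22B", "2023-02-10", 41),
    ("USA", "22B", "2023-02-11", 23),
    ("USA", "22B", "2023-02-12", 23),
    ("USA", "22E", "2023-02-10", 368),
    ("USA", "22E", "2023-02-11", 236),
    ("USA", "22E", "2023-02-12", 246),
]

def observed_frequencies(rows):
    """Return {(location, variant, date): count / total count for that location and date}."""
    totals = defaultdict(int)
    for location, _variant, date, n in rows:
        totals[(location, date)] += n
    return {
        (location, variant, date): n / totals[(location, date)]
        for location, variant, date, n in rows
    }

freqs = observed_frequencies(rows)
for key in sorted(freqs):
    print(key, round(freqs[key], 3))
```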

Model from Figgins and Bedford. 2022. medRxiv.

Constant clade fitness within each window, USA data only, ignoring within-clade fitness variation

Line thickness is proportional to variant frequency

Retrospective projections twice monthly during 2022

30 days out, countries range from 5 to 15% mean absolute error

Correlates with data availability (median number of sequences available from the previous 30 days):

- USA: ~45k sequences
- Australia: ~4k sequences
- South Africa: 170 sequences
- Vietnam: 30 sequences
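For reference, mean absolute error between forecast and retrospectively observed variant frequencies can be computed as below. This is a sketch only: the exact averaging used for the figures above (across variants, dates, or forecast origins) is an assumption here, and the numbers are hypothetical.

```python
def mean_absolute_error(predicted, observed):
    """Mean of |predicted - observed| frequency across variants."""
    errors = [abs(predicted[v] - observed[v]) for v in observed]
    return sum(errors) / len(errors)

# Hypothetical 30-day-out forecast versus what was later observed
predicted = {"22B": 0.30, "22E": 0.70}
observed = {"22B": 0.40, "22E": 0.60}
mae = mean_absolute_error(predicted, observed)
print(f"MAE: {mae:.1%}")
```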

This approach improves poor model accuracy in countries with less intensive genomic surveillance

Rapid sweep of JN.1 from Dec 2023 to Jan 2024

Rather than estimate variant-specific fitness $f_i$ directly, we instead parameterize the model in terms of the "innovation" in fitness in going from parent lineage $p$ to child lineage $i$, $\psi_i = (f_i - f_p)$.

We then compare a non-informative model of

$$\psi_i = (f_i - f_p) \sim \mathrm{Normal}(0, \sigma)$$

to a model where each "innovation" value has an informed prior based on a linear combination of predictors such as ACE2 binding, immune escape and S1 mutations, where $z_k$ represents the value of predictor $k$

$$\psi_i = (f_i - f_p) \sim \mathrm{Normal}\left(\sum_k \beta_k \, z_k, \sigma\right)$$
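To make the parameterization concrete, the sketch below computes the informed prior mean $\sum_k \beta_k \, z_k$ and recovers a lineage's fitness by summing innovations $\psi$ along its ancestry. The lineage names, coefficients, and predictor values are hypothetical, and this illustrates the bookkeeping only, not the inference procedure.

```python
def prior_mean(beta, z):
    """Informed prior mean for an innovation: sum over predictors of beta_k * z_k."""
    return sum(beta[k] * z[k] for k in beta)

def lineage_fitness(lineage, parent, psi, root_fitness=0.0):
    """Fitness of a lineage as root fitness plus innovations along its ancestry.

    parent: dict mapping child lineage -> parent lineage (root has no entry)
    psi:    dict mapping child lineage -> innovation f_child - f_parent
    """
    total = root_fitness
    while lineage in parent:
        total += psi[lineage]
        lineage = parent[lineage]
    return total

# Hypothetical lineage tree and innovation values (assumptions)
parent = {"child": "root", "grandchild": "child"}
psi = {"child": 0.25, "grandchild": 0.5}
print(lineage_fitness("grandchild", parent, psi))

# Hypothetical coefficients and predictor values (assumptions)
beta = {"ace2_binding": 0.5, "immune_escape": 2.0}
z = {"ace2_binding": 0.5, "immune_escape": 0.25}
print(prior_mean(beta, z))
```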

Figgins et al. In prep.

especially out of sample

**SARS-CoV-2 genomic epi**: Data producers from all over the world, GISAID

**Nextstrain**: Richard Neher, Ivan Aksamentov, John SJ Anderson, Kim Andrews, Jennifer Chang,
James Hadfield, Emma Hodcroft, John Huddleston, Jover Lee, Victor Lin, Cornelius Roemer, Thomas Sibley

**Adaptive evolution across human endemic viruses**: Katie Kistler

**MLR and evolutionary forecasting**: Marlin Figgins, Eslam Abousamra, Jover Lee, James Hadfield,
John Huddleston, Jesse Bloom, Cornelius Roemer, Richard Neher

**Bedford Lab**:
John Huddleston,
James Hadfield,
Katie Kistler,
Thomas Sibley,
Jover Lee,
Miguel Paredes,
Marlin Figgins,
Victor Lin,
Jennifer Chang,
Nashwa Ahmed,
Cécile Tran Kiem,
Kim Andrews,
Cristian Ovaduic,
Philippa Steinberg,
Jacob Dodds,
John SJ Anderson,
Amin Bemanian