Fred Hutchinson Cancer Center / Howard Hughes Medical Institute

11 Jul 2023

Monthly Meeting

Northwest PGCoE

Slides at: bedford.io/talks

$x' = \frac{x \, (1+s)}{x \, (1+s) + (1-x)}$ for frequency $x$ over one generation with selective advantage $s$

$x(t) = \frac{x_0 \, (1+s)^t}{x_0 \, (1+s)^t + (1-x_0)}$ for initial frequency $x_0$ over $t$ generations

Trajectories are linear once logit transformed via $\mathrm{log}(\frac{x}{1 - x})$
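As a quick numerical check of the three statements above, the following minimal sketch iterates the one-generation update, compares it with the closed form, and confirms that the logit grows by a constant $\mathrm{log}(1+s)$ per generation. The function names and the values of `x0` and `s` are illustrative assumptions, not taken from the talk.

```python
import math

def next_frequency(x, s):
    """One generation of selection: x' = x(1+s) / (x(1+s) + (1-x))."""
    return x * (1 + s) / (x * (1 + s) + (1 - x))

def frequency_at(x0, s, t):
    """Closed-form frequency after t generations from initial frequency x0."""
    w = x0 * (1 + s) ** t
    return w / (w + (1 - x0))

def logit(x):
    return math.log(x / (1 - x))

x0, s = 0.01, 0.1  # illustrative values only
trajectory = [x0]
for _ in range(50):
    trajectory.append(next_frequency(trajectory[-1], s))

# Iterating the recurrence reproduces the closed form.
assert all(math.isclose(trajectory[t], frequency_at(x0, s, t)) for t in range(51))

# On the logit scale every generation adds the same increment, log(1 + s).
steps = [logit(b) - logit(a) for a, b in zip(trajectory, trajectory[1:])]
assert all(math.isclose(step, math.log(1 + s)) for step in steps)
```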

Multinomial logistic regression across $n$ variants models the probability of a virus sampled at time $t$ belonging to variant $i$ as

$$\mathrm{Pr}(X = i) = x_i(t) = \frac{p_i \, \mathrm{exp}(f_i \, t)}{\sum_{1 \le j \le n} p_j \, \mathrm{exp}(f_j \, t) }$$

with $2n$ parameters: $p_i$, the frequency of variant $i$ at the initial timepoint, and $f_i$, the growth rate or fitness of variant $i$.

The model is fit by minimizing the "log loss" between predicted and observed variant across all observations in the dataset, which is equivalent to maximizing the multinomial likelihood.
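The frequency formula and the log loss can be sketched in a few lines of plain Python. This is a minimal illustration, not the fitting code behind the talk: the function names, the `(t, variant_index, count)` observation layout and the trial parameter values are assumptions, and no optimizer is included. The counts are the Japan rows from the example data, with days numbered from 2023-02-10.

```python
import math

def variant_frequencies(p, f, t):
    """x_i(t) = p_i exp(f_i t) / sum_j p_j exp(f_j t)."""
    weights = [p_i * math.exp(f_i * t) for p_i, f_i in zip(p, f)]
    total = sum(weights)
    return [w / total for w in weights]

def log_loss(p, f, observations):
    """Mean negative log probability of the observed variants.

    Each observation is (t, variant_index, count): `count` sequences of
    variant `variant_index` sampled at time `t`.
    """
    total_count = sum(count for _, _, count in observations)
    loss = 0.0
    for t, i, count in observations:
        loss -= count * math.log(variant_frequencies(p, f, t)[i])
    return loss / total_count

# Variant index 0 is 22B, index 1 is 22E.
japan = [
    (0, 0, 242), (1, 0, 56), (2, 0, 70),
    (0, 1, 80), (1, 1, 21), (2, 1, 27),
]

# Hypothetical trial parameters; a fit would search for the p and f
# that make this loss as small as possible.
loss = log_loss(p=[0.75, 0.25], f=[0.0, 0.05], observations=japan)
print(loss)
```

Adding the same constant to every $f_i$ leaves all $x_i(t)$ unchanged, so only differences in growth rate between variants affect the loss.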

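| location | variant | date | sequences |
|---|---|---|---|
| Japan | 22B | 2023-02-10 | 242 |
| Japan | 22B | 2023-02-11 | 56 |
| Japan | 22B | 2023-02-12 | 70 |
| Japan | 22E | 2023-02-10 | 80 |
| Japan | 22E | 2023-02-11 | 21 |
| Japan | 22E | 2023-02-12 | 27 |
| USA | 22B | 2023-02-10 | 41 |
| USA | 22B | 2023-02-11 | 23 |
| USA | 22B | 2023-02-12 | 23 |
| USA | 22E | 2023-02-10 | 368 |
| USA | 22E | 2023-02-11 | 236 |
| USA | 22E | 2023-02-12 | 246 |
| ... | ... | ... | ... |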
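To connect these counts to the frequencies being modeled, here is a small sketch that turns the rows shown above into observed daily variant frequencies per location. The tuple layout and variable names are assumptions for illustration. Only the rows listed are used, so the frequencies are relative to the two variants shown.

```python
from collections import defaultdict

# (location, variant, date, sequences), copied from the rows above.
rows = [
    ("Japan", "22B", "2023-02-10", 242),
    ("Japan", "22B", "2023-02-11", 56),
    ("Japan", "22B", "2023-02-12", 70),
    ("Japan", "22E", "2023-02-10", 80),
    ("Japan", "22E", "2023-02-11", 21),
    ("Japan", "22E", "2023-02-12", 27),
    ("USA", "22B", "2023-02-10", 41),
    ("USA", "22B", "2023-02-11", 23),
    ("USA", "22B", "2023-02-12", 23),
    ("USA", "22E", "2023-02-10", 368),
    ("USA", "22E", "2023-02-11", 236),
    ("USA", "22E", "2023-02-12", 246),
]

# Total sequences per location and day across the listed variants.
totals = defaultdict(int)
for location, variant, date, n in rows:
    totals[(location, date)] += n

# Observed frequency of each variant within its location and day.
frequencies = {
    (location, variant, date): n / totals[(location, date)]
    for location, variant, date, n in rows
}

for key in sorted(frequencies):
    print(key, round(frequencies[key], 3))
```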

Model from Figgins and Bedford. 2022. medRxiv.

Data from UKHSA

Roughly 1 in 3 infections were detected in 2021, versus 1 in 40 in 2023

Data from ONS

Model from Figgins and Bedford. 2022. medRxiv.

The hierarchical model allows pooling of growth advantages across locations. This allows us to include locations with fewer sequences and to better estimate growth advantage of rare lineages.

| | Initial frequency | | Growth advantage | |
|---|---|---|---|---|
| Japan | $p_{23A}$ | $p_{23B}$ | $f_{23A}$ | $f_{23B}$ |
| USA | $p_{23A}$ | $p_{23B}$ | $f_{23A}$ | $f_{23B}$ |
| hierarchical | | | $f_{23A}$ | $f_{23B}$ |
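One common way to write this kind of partial pooling is sketched below. It is an illustrative assumption, not the exact specification of the model in the talk: the location subscript $\ell$, the pooled mean $\bar{f}_i$, the spread $\sigma_i$ and the choice of a Normal distribution are all introduced here for illustration. Each location keeps its own initial frequency $p_{i,\ell}$ and growth advantage $f_{i,\ell}$, while the location-specific growth advantages are drawn around a shared value corresponding to the "hierarchical" row:

$$f_{i,\ell} \sim \mathrm{Normal}(\bar{f}_i, \, \sigma_i^2), \qquad x_{i,\ell}(t) = \frac{p_{i,\ell} \, \mathrm{exp}(f_{i,\ell} \, t)}{\sum_{1 \le j \le n} p_{j,\ell} \, \mathrm{exp}(f_{j,\ell} \, t)}$$

Under this sketch, a location with few sequences has its $f_{i,\ell}$ pulled toward $\bar{f}_i$, while a location with many sequences is determined mostly by its own data.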

Escape from antibodies that potently neutralize BA.2

- Application of MLR models to other pathogens, such as seasonal influenza
- Assessing and improving accuracy of "live" models at nextstrain.org/sars-cov-2/forecasts/
- Implementing DMS priors to predict fitness of emerging and yet-to-emerge lineages

**SARS-CoV-2 genomic epi**: Data producers from all over the world, GISAID and the Nextstrain team

**Bedford Lab**:
John Huddleston,
James Hadfield,
Katie Kistler,
Thomas Sibley,
Jover Lee,
Cassia Wagner,
Miguel Paredes,
Nicola Müller,
Marlin Figgins,
Victor Lin,
Jennifer Chang,
Allison Li,
Eslam Abousamra,
Donna Modrell,
Nashwa Ahmed,
Cécile Tran Kiem