## Tracking and forecasting SARS-CoV-2 variant spread

Trevor Bedford (@trvrb)
Fred Hutchinson Cancer Center / Howard Hughes Medical Institute
30 Sep 2022
HB Friday Faculty Talk
Fred Hutch

# Variant frequency dynamics

### Population genetic expectation of variant frequency under selection

$x' = \frac{x \, (1+s)}{x \, (1+s) + (1-x)}$ for frequency $x$ over one generation with selective advantage $s$

$x(t) = \frac{x_0 \, (1+s)^t}{x_0 \, (1+s)^t + (1-x_0)}$ for initial frequency $x_0$ over $t$ generations

Trajectories are linear once logit transformed via $\mathrm{log}(\frac{x}{1 - x})$
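As a quick numerical check of the two formulas above, the following minimal sketch iterates the one-generation update and compares it with the closed form. The values of `x0` and `s` are illustrative assumptions, not estimates from the talk.

```python
import math

def step(x, s):
    """One generation of selection acting on a variant at frequency x."""
    return x * (1 + s) / (x * (1 + s) + (1 - x))

def closed_form(x0, s, t):
    """Frequency after t generations, starting from initial frequency x0."""
    growth = (1 + s) ** t
    return x0 * growth / (x0 * growth + (1 - x0))

def logit(x):
    return math.log(x / (1 - x))

# Illustrative (assumed) values: 1% initial frequency, 10% advantage per generation
x0, s = 0.01, 0.1

x = x0
logits = [logit(x)]
max_gap = 0.0
for t in range(1, 51):
    x = step(x, s)
    # Track the largest gap between the iterated and closed-form frequencies
    max_gap = max(max_gap, abs(x - closed_form(x0, s, t)))
    logits.append(logit(x))

# Per-generation change in logit frequency
slopes = [b - a for a, b in zip(logits, logits[1:])]
```

Each generation adds $\mathrm{log}(1+s)$ to the logit frequency, so every entry of `slopes` should agree with `math.log(1 + s)` up to floating-point error.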

### Multinomial logistic regression

Multinomial logistic regression models the probability of a virus sampled at time $t$ belonging to variant $i$ as

$$\mathrm{Pr}(X = i) = x_i(t) = \frac{p_i \, \mathrm{exp}(f_i \, t)}{\sum_{1 \le j \le n} p_j \, \mathrm{exp}(f_j \, t) }$$

where the model has $2n$ parameters for $n$ variants: $p_i$, the frequency of variant $i$ at the initial timepoint, and $f_i$, the growth rate or fitness of variant $i$.

The model is fit by minimizing the "log loss" of the predicted variant against the observed variant across all observations in the dataset.
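A minimal sketch of the model frequencies and the log loss is given below, using only the Python standard library. The function names, the toy observations and the parameter values are all hypothetical and chosen for illustration. The sketch only evaluates the loss for given parameters; it is not the fitting procedure used in the talk.

```python
import math

def variant_freqs(p, f, t):
    """x_i(t) = p_i exp(f_i t) / sum_j p_j exp(f_j t) for every variant i."""
    weights = [p_i * math.exp(f_i * t) for p_i, f_i in zip(p, f)]
    total = sum(weights)
    return [w / total for w in weights]

def log_loss(p, f, observations):
    """Mean negative log probability assigned to the observed variants.

    observations: list of (t, i) pairs, where t is the sampling time
    and i is the index of the variant that was observed.
    """
    total = sum(-math.log(variant_freqs(p, f, t)[i]) for t, i in observations)
    return total / len(observations)

# Hypothetical toy data: variant 1 rises from 1 in 4 samples to 3 in 4
obs = [
    (0, 0), (0, 0), (0, 0), (0, 1),
    (10, 0), (10, 0), (10, 1), (10, 1),
    (20, 0), (20, 1), (20, 1), (20, 1),
]

p = [0.75, 0.25]  # assumed initial frequencies

# Growth rates that favor the rising variant should give a lower loss
loss_growing = log_loss(p, [0.0, 0.1], obs)
loss_shrinking = log_loss(p, [0.0, -0.1], obs)
```

Only relative values matter in this parameterization: rescaling all $p_i$ by a constant, or adding a constant to all $f_i$, leaves the frequencies unchanged. This is why the sketch pins the first variant's growth rate at zero.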

### May 17: fitness advantage of BA.4/BA.5 clear even though rare

Multinomial logistic regression should work well for SARS-CoV-2 prediction, except that the emergence of new variants limits the prediction horizon

# Emerging variants

### Over 350 Pango lineages designated in 2022

Emerging sublineages within BA.2.75, focusing on BA.2.75.2, and emerging lineages within BA.5, focusing on BA.5.3.1.1.1.1.1, i.e. BQ.1

### BA.2.75.2 and BQ.1 show selective advantage in UK and US

BA.2.75.2 has R346T and F486S on top of BA.2.75 and BQ.1 has K444T, N460K on top of BA.5

# Seasonality and near-term circulation

### Seasonality clearly evident in 2020-2021 winter epidemic

Rt analysis suggests a ~30% transmission advantage going from Aug 2020 to Nov 2020

Data from rt.live

### Heading into fall, we very roughly have:

• 40% of the population with BA.2, BA.4 or BA.5 infection from Mar 1 to Sep 15 (16.3M confirmed cases and 8x under-reporting)
• 40% of the population with BA.1 infection in Jan/Feb (24.1M confirmed cases and 5x under-reporting)
• 20% of the population vaccinated, boosted or with pre-Omicron infection
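The rough arithmetic behind the two infection figures above can be sketched as follows. The population size of 332 million is an assumption (an approximate US population); the slides do not state the denominator.

```python
# Assumed denominator: approximate US population (not stated in the slides)
POPULATION = 332e6

def fraction_infected(confirmed_cases, under_reporting):
    """Estimated infections as a fraction of the population."""
    return confirmed_cases * under_reporting / POPULATION

# Confirmed case counts and under-reporting multipliers from the bullets above
ba2_ba4_ba5 = fraction_infected(16.3e6, 8)  # Mar 1 to Sep 15
ba1 = fraction_infected(24.1e6, 5)          # Jan/Feb

print(f"BA.2/BA.4/BA.5: {ba2_ba4_ba5:.0%}")
print(f"BA.1: {ba1:.0%}")
```

Under this assumed denominator both estimates land somewhat below 40%, consistent with the "very roughly" framing. Treating the three groups as non-overlapping is a further simplification, since reinfections would reduce the total.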

# Continued work on forecasting

### Could we predict the spread of new mutations using DMS data?

Perhaps this worked for 486V in BA.4/BA.5

### Acknowledgements

SARS-CoV-2 genomic epi: Data producers from all over the world, GISAID and the Nextstrain team

Bedford Lab: John Huddleston, James Hadfield, Katie Kistler, Maya Lewinsohn, Thomas Sibley, Jover Lee, Cassia Wagner, Miguel Paredes, Nicola Müller, Marlin Figgins, Denisse Sequeira, Victor Lin, Jennifer Chang, Allison Li, Eslam Abousamra, Donna Modrell, Nashwa Ahmed