Trevor Bedford (@trvrb)

4 Apr 2017

D-BSSE Seminar Series

ETH Zurich

Haeckel 1879

Darwin 1859

Project to provide a real-time view of the evolving influenza population

All in collaboration with Richard Neher

- Download all recent HA sequences from GISAID
- Filter to remove outliers
- Subsample across time and space
- Align sequences
- Build tree
- Estimate clade frequencies
- Infer antigenic phenotypes
- Export for visualization
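
As an illustration of the "subsample across time and space" step above, here is a minimal Python sketch that caps the number of sequences kept per region and month. The record fields (`region`, `date`), the `per_group` cap, and the strain names are assumptions made for this example, not the pipeline's actual schema or settings.

```python
import random
from collections import defaultdict


def subsample(records, per_group=2, seed=0):
    """Keep at most `per_group` records per (region, year, month) bin."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    bins = defaultdict(list)
    for rec in records:
        year, month = rec["date"].split("-")[:2]
        bins[(rec["region"], year, month)].append(rec)

    kept = []
    for key in sorted(bins):
        group = bins[key]
        rng.shuffle(group)  # random choice within each bin
        kept.extend(group[:per_group])
    return kept


# Hypothetical metadata records, for illustration only.
records = [
    {"name": "strain1", "region": "europe", "date": "2017-01-03"},
    {"name": "strain2", "region": "europe", "date": "2017-01-15"},
    {"name": "strain3", "region": "europe", "date": "2017-01-28"},
    {"name": "strain4", "region": "asia", "date": "2017-01-10"},
    {"name": "strain5", "region": "europe", "date": "2017-02-02"},
]

kept = subsample(records, per_group=2)
```

Binning by region and month keeps densely sampled places and periods from dominating the tree.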

The future is here, it's just not evenly distributed yet

— William Gibson

Clade frequencies $X$ derive from the fitnesses $f$ and frequencies $x$ of constituent viruses, such that

$$\hat{X}_v(t+\Delta t) = \sum_{i:v} x_i(t) \, \mathrm{exp}(f_i \, \Delta t)$$

This captures clonal interference between competing lineages
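
A minimal Python sketch of the projection above, assuming per-virus frequencies and fitnesses are held in plain dictionaries. The first function follows the formula as written. The second divides by the projected total over all viruses, which is an assumption added here so that projected frequencies sum to one; the slide does not state it. All numbers are made up.

```python
import math


def project_clade(x, f, clade, dt=1.0):
    """Sum x_i * exp(f_i * dt) over the viruses i in the clade."""
    return sum(x[i] * math.exp(f[i] * dt) for i in clade)


def project_clade_normalized(x, f, clade, dt=1.0):
    """Same projection, rescaled by the projected total (assumption)."""
    total = project_clade(x, f, list(x), dt)
    return project_clade(x, f, clade, dt) / total


# Hypothetical frequencies and fitnesses for three viruses.
x = {"a": 0.5, "b": 0.3, "c": 0.2}
f_neutral = {"a": 0.0, "b": 0.0, "c": 0.0}
f_fit = {"a": 1.0, "b": 1.0, "c": 0.0}

unchanged = project_clade(x, f_neutral, ["a", "b"])
growing = project_clade_normalized(x, f_fit, ["a", "b"])
```

With equal fitnesses the clade frequency does not change. When viruses `a` and `b` are fitter than `c`, the normalized frequency of their clade rises at the expense of `c`.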

A simple predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{ep} \, f_i^\mathrm{ep} + \beta^\mathrm{ne} \, f_i^\mathrm{ne}$$

where $f_i^\mathrm{ep}$ measures cross-immunity via substitutions at epitope sites and $f_i^\mathrm{ne}$ measures mutational load via substitutions at non-epitope sites
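
The fitness estimate is a weighted sum of the two predictors. A small Python sketch, with coefficient and predictor values invented purely for illustration:

```python
def estimate_fitness(f_ep, f_ne, beta_ep, beta_ne):
    """Linear fitness estimate: beta_ep * f_ep + beta_ne * f_ne."""
    return beta_ep * f_ep + beta_ne * f_ne


# Hypothetical values: a positive weight on the epitope predictor and a
# negative weight on the non-epitope (mutational load) predictor.
fitness = estimate_fitness(f_ep=0.5, f_ne=0.25, beta_ep=2.0, beta_ne=-1.0)
```

The signs and magnitudes of the coefficients are not given on the slide; the values above are placeholders only.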

- Clade frequency change
- Antigenic advancement

Growing clades have high fitness

Our predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{freq} \, f_i^\mathrm{freq} + \beta^\mathrm{HI} \, f_i^\mathrm{HI}$$

We learn the coefficients and validate the model on the previous 15 H3N2 seasons
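
To make "learning coefficients" concrete, here is a toy Python sketch that picks $\beta^\mathrm{freq}$ and $\beta^\mathrm{HI}$ by grid search, minimizing the squared error between projected and observed frequencies. The grid search, the normalization of projected frequencies, and all data are assumptions for illustration; the slide does not say how the fit is actually done.

```python
import math
from itertools import product


def combine(b_freq, b_hi, f_freq, f_hi):
    """Per-virus fitness as a weighted sum of the two predictors."""
    return [b_freq * a + b_hi * b for a, b in zip(f_freq, f_hi)]


def project(x, fitness, dt=1.0):
    """Projected frequencies, normalized to sum to one (assumption)."""
    grown = [xi * math.exp(fi * dt) for xi, fi in zip(x, fitness)]
    total = sum(grown)
    return [g / total for g in grown]


def fit_coefficients(x, f_freq, f_hi, observed, grid):
    """Return the (b_freq, b_hi) grid point with the lowest squared error."""
    best = None
    for b_freq, b_hi in product(grid, repeat=2):
        predicted = project(x, combine(b_freq, b_hi, f_freq, f_hi))
        err = sum((p - o) ** 2 for p, o in zip(predicted, observed))
        if best is None or err < best[0]:
            best = (err, b_freq, b_hi)
    return best[1], best[2]


# Made-up data: three viruses, each treated as its own clade.
x = [0.5, 0.3, 0.2]
f_freq = [1.0, 0.0, -1.0]
f_hi = [0.0, 1.0, 0.0]

# Synthetic "observed" frequencies generated from known coefficients.
observed = project(x, combine(1.0, 0.5, f_freq, f_hi))

betas = fit_coefficients(x, f_freq, f_hi, observed, grid=[0.0, 0.5, 1.0, 1.5])
```

Because the synthetic observations come from the same model, the search recovers the generating coefficients. With real seasons the fit would be noisy and would need held-out seasons for validation.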

with Gytis Dudas, Luiz Carvalho, Marc Suchard, Philippe Lemey, Andrew Rambaut and many others

with Nuno Faria, Nick Loman, Oli Pybus, Luiz Alcantara, Ester Sabino, Josh Quick, Allison Black, Ingra Morales, Julien Thézé, Marcio Nunes, Jacqueline de Jesus, Marta Giovanetti, Moritz Kraemer, Sarah Hill and many others

with Kristian Andersen, Nathan Grubaugh, Jason Ladner, Gustavo Palacios, Sharon Isern, Oli Pybus, Moritz Kraemer, Gytis Dudas, Amanda Tan, Karthik Gangavarapu, Michael Wiley, Stephen White, Julien Thézé, Scott Michael, Leah Gillis, Pardis Sabeti and many others

- Timely analysis and sharing of results critical
- Dissemination must be scalable
- Integrate many data sources
- Results must be easily interpretable and queryable

Project to conduct real-time molecular epidemiology and evolutionary analysis of emerging epidemics

Richard Neher, Trevor Bedford, Colin Megill, James Hadfield, Charlton Callender, Sidney Bell, Barney Potter, Sarah Murata

**Influenza**: WHO Global Influenza Surveillance Network, Worldwide Influenza Centre at the Francis Crick Institute, Richard Neher, Colin Russell, Boris Shraiman

**Ebola**: data producers, Gytis Dudas, Andrew Rambaut, Luiz Carvalho, Philippe Lemey,
Marc Suchard, Andrew Tatem, Nick Loman, Ian Goodfellow, Matt Cotten, Paul Kellam, Kristian Andersen,
Pardis Sabeti, many others

**Zika**: data producers, Nick Loman, Nuno Faria, Oliver Pybus, Josh Quick,
Allison Black, Kristian Andersen, Nathan Grubaugh, Gytis Dudas, many others

**Nextstrain**: Richard Neher, Colin Megill, James Hadfield, Charlton Callender,
Sarah Murata, Sidney Bell, Barney Potter