Trevor Bedford (@trvrb)

24 Jun 2016

Federation Meeting of Korean Basic Medical Scientists

Incheon, Republic of Korea

Haeckel 1879

Darwin 1859

Project to provide a real-time view of the evolving influenza population

All in collaboration with Richard Neher

- Download all recent HA sequences from GISAID
- Filter to remove outliers
- Subsample across time and space
- Align sequences
- Build tree
- Estimate frequencies
- Export for visualization
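
As an illustration of the "subsample across time and space" step, here is a minimal sketch that keeps at most a fixed number of sequences per (month, region) bin. The record fields (`name`, `date`, `region`), the function name `subsample`, and the `per_group` parameter are hypothetical and chosen for this example only; the actual pipeline may bin and prioritize sequences differently.

```python
import random
from collections import defaultdict

def subsample(records, per_group=2, seed=0):
    """Keep at most `per_group` records per (month, region) bin (illustrative only)."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for rec in records:
        month = rec["date"][:7]  # "YYYY-MM" prefix of an ISO date string
        bins[(month, rec["region"])].append(rec)
    kept = []
    for key in sorted(bins):  # deterministic bin order
        group = bins[key]
        rng.shuffle(group)  # random choice within each bin
        kept.extend(group[:per_group])
    return kept

# Toy metadata, made up for this example
records = [
    {"name": "A/1", "date": "2016-01-03", "region": "asia"},
    {"name": "A/2", "date": "2016-01-15", "region": "asia"},
    {"name": "A/3", "date": "2016-01-20", "region": "asia"},
    {"name": "A/4", "date": "2016-01-09", "region": "europe"},
    {"name": "A/5", "date": "2016-02-11", "region": "asia"},
]
sampled = subsample(records, per_group=2)
```

Capping each bin evens out sampling so that heavily sequenced months or regions do not dominate the tree.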

The future is here, it's just not evenly distributed yet

— William Gibson

Clade frequencies $X$ derive from the fitnesses $f$ and frequencies $x$ of constituent viruses, such that

$$\hat{X}_v(t+\Delta t) = \sum_{i:v} x_i(t) \, \mathrm{exp}(f_i \, \Delta t)$$

This captures clonal interference between competing lineages
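
A small worked example of the projection formula, assuming each virus is stored as a dict with a frequency `x`, a fitness `f`, and the set of `clades` it belongs to (these names are hypothetical). The second function divides by the total over all viruses so that projected frequencies sum to one; that normalization is an assumption of this sketch and is not written in the formula above.

```python
import math

def project_clade_frequency(viruses, clade, dt):
    """Sum x_i(t) * exp(f_i * dt) over viruses i belonging to `clade`."""
    return sum(v["x"] * math.exp(v["f"] * dt)
               for v in viruses if clade in v["clades"])

def project_normalized(viruses, clade, dt):
    """Same projection, rescaled by the total over all viruses (assumed normalization)."""
    total = sum(v["x"] * math.exp(v["f"] * dt) for v in viruses)
    return project_clade_frequency(viruses, clade, dt) / total

# Toy population, made up for this example
viruses = [
    {"x": 0.25, "f": math.log(2), "clades": {"A"}},
    {"x": 0.25, "f": 0.0, "clades": {"A", "B"}},
    {"x": 0.50, "f": 0.0, "clades": {"C"}},
]
raw_a = project_clade_frequency(viruses, "A", dt=1.0)
norm_c = project_normalized(viruses, "C", dt=1.0)
```

In this toy case clade C has fitness 0 yet its normalized frequency falls from 0.5 to 0.4, because a fitter virus in clade A grows at its expense.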

A simple predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{ep} \, f_i^\mathrm{ep} + \beta^\mathrm{ne} \, f_i^\mathrm{ne}$$

where $f_i^\mathrm{ep}$ measures cross-immunity via substitutions at epitope sites and $f_i^\mathrm{ne}$ measures mutational load via substitutions at non-epitope sites
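
A crude sketch of the linear fitness model, assuming the epitope term is a plain count of substitutions at epitope sites relative to a reference sequence and the non-epitope term is the negative count at all other sites. Both choices are simplifying stand-ins made for this example, not the definitions used in the model, and the function names and site indices are hypothetical.

```python
def count_substitutions(seq, ref, sites):
    """Number of listed sites at which `seq` differs from `ref`."""
    return sum(1 for s in sites if seq[s] != ref[s])

def estimate_fitness(seq, ref, epitope_sites, beta_ep, beta_ne):
    """Linear combination of an epitope term and a non-epitope (load) term."""
    epitope = set(epitope_sites)
    non_epitope = [s for s in range(len(ref)) if s not in epitope]
    f_ep = count_substitutions(seq, ref, epitope)       # assumed proxy for cross-immunity escape
    f_ne = -count_substitutions(seq, ref, non_epitope)  # assumed proxy for mutational load
    return beta_ep * f_ep + beta_ne * f_ne

# Toy sequences and coefficients, made up for this example
fitness = estimate_fitness("ABABAA", "AAAAAA", epitope_sites=[1, 2],
                           beta_ep=2.0, beta_ne=0.5)
```

Here one epitope substitution contributes +2.0 and one non-epitope substitution contributes -0.5.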

- Clade frequency change
- Antigenic advancement

growing clades have high fitness

drifted clades have high fitness

Our predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{freq} \, f_i^\mathrm{freq} + \beta^\mathrm{HI} \, f_i^\mathrm{HI}$$

We learn coefficients and validate the model based on the previous 15 H3N2 seasons
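
One way to picture the learning of coefficients is a grid search that minimizes squared error between projected and observed clade frequencies across past seasons. This is only a sketch: the grid search, the squared-error loss, the field names (`x`, `f_freq`, `f_hi`, `observed`), and the two toy seasons are all assumptions made for illustration, and the fitting procedure actually used may differ.

```python
import math
from itertools import product

def projected(viruses, beta_freq, beta_hi, dt=1.0):
    """Projected clade frequency: sum of x_i * exp(f_i * dt) with linear fitness f_i."""
    return sum(
        v["x"] * math.exp((beta_freq * v["f_freq"] + beta_hi * v["f_hi"]) * dt)
        for v in viruses
    )

def fit_coefficients(seasons, grid):
    """Return the (beta_freq, beta_hi) pair on the grid with the lowest squared error."""
    best = None
    for bf, bh in product(grid, grid):
        err = sum((projected(s["viruses"], bf, bh) - s["observed"]) ** 2
                  for s in seasons)
        if best is None or err < best[0]:
            best = (err, bf, bh)
    return best[1], best[2]

# Toy seasons, generated here from known coefficients (1.0, 0.5)
clade_1 = [{"x": 0.2, "f_freq": 1.0, "f_hi": 0.0}]
clade_2 = [{"x": 0.3, "f_freq": 0.0, "f_hi": 1.0}]
seasons = [
    {"viruses": clade_1, "observed": projected(clade_1, 1.0, 0.5)},
    {"viruses": clade_2, "observed": projected(clade_2, 1.0, 0.5)},
]
grid = [0.0, 0.5, 1.0, 1.5]
beta_freq, beta_hi = fit_coefficients(seasons, grid)
```

Validation would then hold out seasons not used for fitting and compare projected against observed frequencies.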

- Integrate data predictors and data sources, e.g. geography
- Possible to build predictive models for H1N1 and B and to forecast NA evolution

Rambaut 2015

Dudas et al 2016

Dudas et al 2016

- Rapid sharing of sequence data, *genetic context critical*
- Technologies for rapid diagnostics and sequencing
- Technologies to rapidly conduct phylogenetic inference
- Technologies to explore genetic relationships and inform epidemiological investigation

**Influenza**: WHO Global Influenza Surveillance Network, Worldwide Influenza Centre
at the Francis Crick Institute, Richard Neher, Colin Russell, Andrew Rambaut

**Ebola**: data producers, Gytis Dudas, Andrew Rambaut, Philippe Lemey, Richard Neher,
Nick Loman, Ian Goodfellow, Paul Kellam, Danny Park, Kristian Andersen, Pardis Sabeti

**Zika**: data producers, Nick Loman, Nuno Faria, Andrew Rambaut, Oliver Pybus, Richard Neher,
Charlton Callender, Allison Black, Luiz Alcantara and the rest of the ZiBRA team

- Website: bedford.io
- Twitter: @trvrb
- Slides: bedford.io/talks/real-time-tracking-fmkbms/