Trevor Bedford (@trvrb)

31 Mar 2016

Structure and Computation Affinity Group Seminar

Scripps Research Institute

Haeckel 1879

Darwin 1859

Project to provide a real-time view of the evolving influenza population

All in collaboration with Richard Neher

- Download all recent HA sequences from GISAID
- Filter to remove outliers
- Subsample across time and space
- Align sequences
- Build tree
- Estimate frequencies
- Export for visualization
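
As an illustration of the "subsample across time and space" step, here is a minimal sketch that bins sequences by region and month and keeps a capped number from each bin. The record fields (`region`, `date`), the `per_bin` parameter, and the function name are hypothetical choices for this example, not the pipeline's actual interface.

```python
import random
from collections import defaultdict

def subsample(records, per_bin=2, seed=0):
    """Keep at most `per_bin` records per (region, year-month) bin.

    Assumes each record is a dict with a 'region' string and an ISO
    'date' string (YYYY-MM-DD); both field names are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    bins = defaultdict(list)
    for rec in records:
        key = (rec["region"], rec["date"][:7])  # e.g. ("europe", "2016-01")
        bins[key].append(rec)
    kept = []
    for key in sorted(bins):
        group = bins[key]
        rng.shuffle(group)  # random choice within the bin
        kept.extend(group[:per_bin])
    return kept
```

Capping each bin keeps heavily sampled regions or months from dominating the tree.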

The future is here, it's just not evenly distributed yet

— William Gibson

Clade frequencies $X$ derive from the fitnesses $f$ and frequencies $x$ of constituent viruses, such that

$$\hat{X}_v(t+\Delta t) = \sum_{i:v} x_i(t) \, \mathrm{exp}(f_i \, \Delta t)$$

This captures clonal interference between competing lineages
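
The projection can be sketched in a few lines of Python. The function names and the toy numbers are illustrative only, and the renormalization step is an assumption added here for the case where the clades are non-overlapping and together cover the whole population; the formula above does not include it.

```python
import math

def project_clade_frequency(freqs, fitnesses, dt):
    """Sum of x_i(t) * exp(f_i * dt) over the viruses in one clade."""
    return sum(x * math.exp(f * dt) for x, f in zip(freqs, fitnesses))

def project_all(clades, dt):
    """Project every clade, then renormalize so frequencies sum to 1.

    `clades` maps a clade name to (virus frequencies, virus fitnesses).
    Renormalizing is an assumption of this sketch, valid only when the
    clades partition the population.
    """
    raw = {name: project_clade_frequency(x, f, dt)
           for name, (x, f) in clades.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

# Toy example: two clades that start at equal frequency
clades = {
    "A": ([0.25, 0.25], [0.0, 0.0]),                    # neutral viruses
    "B": ([0.25, 0.25], [math.log(3), math.log(3)]),    # fitter viruses
}
projected = project_all(clades, dt=1.0)
```

In the toy example the fitter clade rises in frequency at the direct expense of the neutral one, which is the competition between lineages that the sum captures.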

A simple predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{ep} \, f_i^\mathrm{ep} + \beta^\mathrm{ne} \, f_i^\mathrm{ne}$$

where $f_i^\mathrm{ep}$ measures cross-immunity via substitutions at epitope sites and $f_i^\mathrm{ne}$ measures mutational load via substitutions at non-epitope sites

- Clade frequency change
- Antigenic advancement

growing clades have high fitness

drifted clades have high fitness

Our predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{freq} \, f_i^\mathrm{freq} + \beta^\mathrm{HI} \, f_i^\mathrm{HI}$$
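
Both fitness models have the same shape: a weighted sum of per-virus predictors. A minimal sketch, assuming the predictors have already been computed for each virus; the function name, the dictionary keys, and the numeric values are hypothetical and chosen only for illustration.

```python
def estimate_fitness(predictors, coefficients):
    """Weighted sum of predictor values for one virus.

    `predictors` and `coefficients` are dicts keyed by predictor name,
    e.g. "freq" and "HI"; the keys are hypothetical labels.
    """
    return sum(coefficients[name] * value
               for name, value in predictors.items())

# Illustrative values only, not fitted results
virus_predictors = {"freq": 0.5, "HI": 1.0}
betas = {"freq": 2.0, "HI": 1.5}
fitness = estimate_fitness(virus_predictors, betas)  # 2.0*0.5 + 1.5*1.0
```

Keeping the predictors in a dictionary means the epitope and non-epitope model can reuse the same function with different keys.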

We learn the coefficients and validate the model on the previous 15 H3N2 seasons

Model | Epitope coefficient | HI coefficient | Frequency error | Growth correlation |
---|---|---|---|---|
Epitope only | 2.36 | -- | 0.10 | 0.57 |
HI only | -- | 2.05 | 0.08 | 0.63 |
Epitope + HI | -0.11 | 2.15 | 0.08 | 0.67 |

- Integrate data predictors and data sources, e.g. geography
- Possible to build predictive models for H1N1 and B and to forecast NA evolution

Rambaut 2015

Dudas et al 2016

Dudas et al 2016

- Rapid sharing of sequence data, *genetic context critical*
- Technologies to rapidly conduct phylogenetic inference
- Technologies to explore genetic relationships and inform epidemiological investigation

**Influenza**: WHO Global Influenza Surveillance Network, GISAID, Worldwide Influenza Centre at the Francis Crick Institute, Richard Neher, Colin Russell, Andrew Rambaut

**Ebola**: data producers, Gytis Dudas, Andrew Rambaut, Philippe Lemey, Richard Neher, Nick Loman, Ian Goodfellow, Paul Kellam, Danny Park, Kristian Andersen, Pardis Sabeti

**Zika**: data producers, Nuno Faria, Andrew Rambaut, Richard Neher, Charlton Callender

- Website: bedford.io
- Twitter: @trvrb
- Slides: bedford.io/talks/real-time-tracking-scripps/