Trevor Bedford (@trvrb)

27 Jan 2016

Combi Seminar

Genome Sciences, University of Washington

Haeckel 1879

Darwin 1859

Project to provide a real-time view of the evolving influenza population

All in collaboration with Richard Neher

- Download all recent HA sequences from GISAID
- Filter to remove outliers
- Subsample across time and space
- Align sequences
- Build tree
- Estimate frequencies
- Export for visualization
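The "subsample across time and space" step can be sketched as binning sequences by region and month and keeping a capped number per bin. This is a minimal illustration, not the pipeline's actual code: the record fields (`name`, `region`, `date`), the `per_bin` parameter, and the example strain names are all hypothetical.

```python
import random
from collections import defaultdict

def subsample(records, per_bin=2, seed=0):
    """Keep at most `per_bin` sequences per (region, year-month) bin."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for rec in records:
        # rec["date"] is assumed to be an ISO string such as "2015-11-03",
        # so the first seven characters give the year and month
        bins[(rec["region"], rec["date"][:7])].append(rec)
    kept = []
    for key in sorted(bins):
        group = bins[key]
        rng.shuffle(group)  # random choice within each bin
        kept.extend(group[:per_bin])
    return kept

# Hypothetical records; real input would come from the downloaded sequences
records = [
    {"name": "A/1", "region": "north_america", "date": "2015-11-03"},
    {"name": "A/2", "region": "north_america", "date": "2015-11-17"},
    {"name": "A/3", "region": "north_america", "date": "2015-11-29"},
    {"name": "A/4", "region": "europe", "date": "2015-11-08"},
    {"name": "A/5", "region": "europe", "date": "2015-12-01"},
    {"name": "A/6", "region": "europe", "date": "2015-12-20"},
]
sample = subsample(records, per_bin=1)
```

Capping each bin keeps heavily sampled regions and months from dominating the tree, at the cost of discarding some data.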

The future is here, it's just not evenly distributed yet

— William Gibson

Clade frequencies $X$ derive from the fitnesses $f$ and frequencies $x$ of constituent viruses, such that

$$\hat{X}_v(t+\Delta t) = \sum_{i:v} x_i(t) \, \mathrm{exp}(f_i \, \Delta t)$$

This captures clonal interference between competing lineages
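As a worked illustration of the projection formula, the sketch below sums $x_i \exp(f_i \Delta t)$ over the viruses in each clade. The frequencies and fitnesses are made-up numbers, and the final renormalization across clades is an added assumption (it is not written in the formula above), included only to show how one clade's share depends on its competitors.

```python
import math

def project_clade_frequency(freqs, fitnesses, dt):
    """Projected clade frequency: sum of x_i * exp(f_i * dt) over its viruses."""
    return sum(x * math.exp(f * dt) for x, f in zip(freqs, fitnesses))

# Two competing clades with hypothetical frequencies and fitnesses
clade_a = project_clade_frequency([0.2, 0.1], [0.5, 0.5], dt=1.0)
clade_b = project_clade_frequency([0.7], [-0.2], dt=1.0)

# Assumption: renormalize so projected frequencies sum to one
total = clade_a + clade_b
normalized_a = clade_a / total
normalized_b = clade_b / total
```

Here clade A starts at a combined frequency of 0.3 and gains share because its viruses are fitter than those of clade B.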

A simple predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{ep} \, f_i^\mathrm{ep} + \beta^\mathrm{ne} \, f_i^\mathrm{ne}$$

where $f_i^\mathrm{ep}$ measures cross-immunity via substitutions at epitope sites and $f_i^\mathrm{ne}$ measures mutational load via substitutions at non-epitope sites
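A rough sketch of this two-term model follows. It uses raw substitution counts relative to a reference sequence as the predictors, which is a simplification chosen for illustration; the epitope site list, the sequences, the coefficients, and the sign convention (non-epitope substitutions count negatively, as load) are all assumptions.

```python
def count_substitutions(seq, ref, sites):
    """Number of listed sites where seq differs from ref."""
    return sum(1 for s in sites if seq[s] != ref[s])

def estimate_fitness(seq, ref, epitope_sites, beta_ep, beta_ne):
    """Linear fitness estimate: beta_ep * f_ep + beta_ne * f_ne."""
    epitope = set(epitope_sites)
    non_epitope = [s for s in range(len(ref)) if s not in epitope]
    # Assumption: epitope changes raise fitness (escape from cross-immunity)
    f_ep = count_substitutions(seq, ref, epitope_sites)
    # Assumption: non-epitope changes lower fitness (mutational load)
    f_ne = -count_substitutions(seq, ref, non_epitope)
    return beta_ep * f_ep + beta_ne * f_ne

# Hypothetical 10-residue sequences; sites 3 and 4 are treated as epitope sites
ref = "MKTIIALSYI"
seq = "MKTLIALSFI"  # differs from ref at site 3 (epitope) and site 8 (non-epitope)
fitness = estimate_fitness(seq, ref, epitope_sites=[3, 4], beta_ep=1.0, beta_ne=0.5)
```

With these numbers the epitope substitution contributes +1.0 and the non-epitope substitution contributes -0.5.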

- Clade frequency change
- Antigenic advancement

growing clades have high fitness

drifted clades have high fitness

Our predictive model estimates the fitness $f$ of virus $i$ as

$$\hat{f}_i = \beta^\mathrm{freq} \, f_i^\mathrm{freq} + \beta^\mathrm{HI} \, f_i^\mathrm{HI}$$

We learn the coefficients and validate the model on the previous 15 H3N2 seasons
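The slides do not say how the coefficients are learned. One simple possibility, shown here purely as an assumption, is a grid search that picks the coefficient pair minimizing the mean absolute error between projected and observed clade frequencies. The data below are synthetic, generated from known coefficients so the search can be checked.

```python
import itertools
import math

def project(viruses, betas, dt):
    """Sum of x_i * exp(f_i * dt), where f_i = sum_k beta_k * predictor_k."""
    total = 0.0
    for x, predictors in viruses:
        f = sum(b * p for b, p in zip(betas, predictors))
        total += x * math.exp(f * dt)
    return total

def fit_betas(seasons, grid, dt=1.0):
    """Grid search over coefficient pairs for the lowest mean absolute error."""
    best, best_err = None, float("inf")
    for betas in itertools.product(grid, repeat=2):
        err = sum(abs(project(v, betas, dt) - obs) for v, obs in seasons) / len(seasons)
        if err < best_err:
            best, best_err = betas, err
    return best, best_err

# Synthetic clades: each virus is (frequency, (freq predictor, HI predictor))
true_betas = (0.5, 1.0)
clades = [
    [(0.2, (1.0, 0.5)), (0.1, (0.0, 1.0))],
    [(0.3, (-1.0, 0.2))],
    [(0.15, (0.5, -0.5)), (0.05, (2.0, 0.0))],
]
# "Observed" frequencies are generated from the true coefficients
seasons = [(v, project(v, true_betas, 1.0)) for v in clades]

grid = [i * 0.25 for i in range(9)]  # candidate values 0.0, 0.25, ..., 2.0
betas, err = fit_betas(seasons, grid)
```

On real data the error would not reach zero, and held-out seasons would be needed to validate the fitted coefficients.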

Model | Epitope coefficient | HI coefficient | Frequency error | Growth correlation |
---|---|---|---|---|
Epitope only | 2.36 | -- | 0.10 | 0.57 |
HI only | -- | 2.05 | 0.08 | 0.63 |
Epitope + HI | -0.11 | 2.15 | 0.08 | 0.67 |

- Integrate additional predictors and data sources, e.g. we plan to investigate a geographic predictor
- Possible to build predictive models for H1N1 and B and to forecast NA evolution

Analyses must be rapid and widely available

Predictive models can flag clades for experimental follow-up and creation of vaccine candidates

Rambaut 2015

- Rapid sharing of sequence data, *genetic context critical*
- Technologies to rapidly conduct phylogenetic inference
- Technologies to explore genetic relationships and inform epidemiological investigation

Richard Neher (Max Planck Tübingen), Andrew Rambaut (University of Edinburgh), Colin Russell (Cambridge University), Michael Lässig (University of Cologne), Marta Łuksza (Institute for Advanced Study), Gytis Dudas (University of Edinburgh), Pardis Sabeti (Harvard University), Danny Park (Harvard University), Nick Loman (University of Birmingham), Matthew Cotten (Sanger Institute), Paul Kellam (Sanger Institute), WHO Global Influenza Surveillance Network, GISAID

- Website: bedford.io
- Twitter: @trvrb
- Slides: bedford.io/talks/real-time-tracking-combi/