Forecasting the Future of Flu
Trevor Bedford (@trvrb)
28 Aug 2016
Options IX
Chicago, IL
Want to forecast the makeup of the future flu population from the population that exists today
Real-time updates as new information rolls in
Population turnover (in H3N2) is extremely rapid
Clades emerge, die out and take over
Clades show rapid turnover
Dynamics driven by antigenic drift
Drift variants emerge and rapidly take over in the virus population
As a side effect, drift variants evade existing vaccine formulations
Drift necessitates vaccine updates
H3N2 vaccine updates occur every ~2 years
Timely surveillance and rapid analysis essential to vaccine strain selection
nextflu
Project to provide a real-time view of the evolving influenza population
All in collaboration with Richard Neher
nextflu
pipeline
- Download all recent HA sequences from GISAID
- Filter to remove outliers
- Subsample across time and space
- Align sequences
- Build tree
- Estimate clade frequencies
- Infer antigenic phenotypes
- Export for visualization
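The "subsample across time and space" step can be illustrated with a minimal sketch. It assumes each sequence record carries a `region` and an ISO `date`; the field names, strain labels, and the `per_bin` cap are hypothetical and not taken from the nextflu code.

```python
import random
from collections import defaultdict

def subsample(records, per_bin=2, seed=0):
    """Keep at most `per_bin` records from each (region, year, month) bin."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for rec in records:
        year, month = rec["date"].split("-")[:2]
        bins[(rec["region"], year, month)].append(rec)
    kept = []
    for key in sorted(bins):
        group = bins[key]
        rng.shuffle(group)  # random choice within a bin
        kept.extend(group[:per_bin])
    return kept

# Hypothetical records: one over-sampled bin and two sparse ones
records = [
    {"strain": "s1", "region": "north_america", "date": "2016-01-05"},
    {"strain": "s2", "region": "north_america", "date": "2016-01-12"},
    {"strain": "s3", "region": "north_america", "date": "2016-01-20"},
    {"strain": "s4", "region": "north_america", "date": "2016-01-28"},
    {"strain": "s5", "region": "europe", "date": "2016-01-09"},
    {"strain": "s6", "region": "north_america", "date": "2016-02-02"},
]
kept = subsample(records, per_bin=2)
```

Capping each bin keeps heavily sampled regions and months from dominating the tree.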
Up-to-date analysis publicly available at:
Influenza hemagglutination inhibition (HI) assay
HI measures cross-reactivity across viruses
Data in the form of table of maximum inhibitory titers
Antigenic cartography compresses HI measurements into an interpretable diagram
Instead of a geometric model, we sought a phylogenetic model of HI titer data
Identify phylogeny branches associated with drops in HI titer
Model can be used to interpolate across tree and predict phenotype of untested viruses
Model is highly predictive of missing titer values
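A minimal sketch of the idea behind the tree model: if each branch carries an estimated drop in log2 titer, the predicted drop between a test virus and a reference virus is the sum of branch effects along the path connecting them. The tree, branch effects, and function names are hypothetical. A fuller model would likely include serum- and virus-specific terms, which the slides do not describe.

```python
def path_to_root(node, parent):
    """Branches (named by their child node) from `node` up to the root."""
    branches = []
    while parent[node] is not None:
        branches.append(node)
        node = parent[node]
    return branches

def predicted_titer_drop(virus, reference, parent, branch_effect):
    """Sum branch effects along the tree path between two tips."""
    a = set(path_to_root(virus, parent))
    b = set(path_to_root(reference, parent))
    # Branches on exactly one of the two root paths form the tip-to-tip path
    path = a ^ b
    return sum(branch_effect.get(branch, 0.0) for branch in path)

# Hypothetical tree: root -> n1 -> (A, B), root -> C
parent = {"root": None, "n1": "root", "A": "n1", "B": "n1", "C": "root"}
# Hypothetical log2 titer drops assigned to two branches
branch_effect = {"n1": 2.0, "B": 0.5}

drop_a_b = predicted_titer_drop("A", "B", parent, branch_effect)
drop_a_c = predicted_titer_drop("A", "C", parent, branch_effect)
drop_b_c = predicted_titer_drop("B", "C", parent, branch_effect)
```

Because the prediction depends only on tree position, it can be computed for virus pairs that were never measured together.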
Recent HI data from London WHO Collaborating Center
"The future is here, it's just not evenly distributed yet"
— William Gibson
USA music industry, 2011 dollars per capita
Influenza population turnover
Vaccine strain selection timeline
Seek to explain change in clade frequencies over 1 year
Fitness models can project clade frequencies
Clade frequencies $X$ derive from the fitnesses $f$ and frequencies $x$ of constituent viruses, such that
$$\hat{X}_v(t+\Delta t) = \sum_{i:v} x_i(t) \, \mathrm{exp}(f_i \, \Delta t)$$
This captures clonal interference between competing lineages
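The projection can be sketched directly from the formula. Each clade is a list of `(x_i, f_i)` pairs, and the numbers are hypothetical. The final renormalization so that projected frequencies sum to 1 is an assumption added here; it is not part of the formula as written on the slide.

```python
import math

def project_clade(viruses, dt):
    """Unnormalized projection: sum of x_i * exp(f_i * dt) over the clade."""
    return sum(x * math.exp(f * dt) for x, f in viruses)

def project_frequencies(clades, dt, normalize=True):
    """Project every clade forward by dt; optionally rescale to sum to 1."""
    projected = {name: project_clade(viruses, dt) for name, viruses in clades.items()}
    if normalize:  # assumption: rescale so frequencies remain a distribution
        total = sum(projected.values())
        projected = {name: value / total for name, value in projected.items()}
    return projected

# Hypothetical clades: (current frequency x_i, fitness f_i) per virus
clades = {
    "clade_a": [(0.3, 0.0), (0.2, 0.0)],
    "clade_b": [(0.5, math.log(3.0))],
}
forecast = project_frequencies(clades, dt=1.0)
```

With renormalization, a clade of unchanged absolute fitness still loses frequency when a competitor is fitter, which is how the projection reflects competition between lineages.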
The question of forecasting becomes: how do we accurately estimate fitnesses of circulating viruses?
Fortunately, there is plenty of training data, and previously successful strains have had:
- Amino acid changes at epitope sites
- Antigenic novelty based on HI
- Rapid phylogenetic growth
Predictor: calculate HI drop from ancestor; drifted clades have high fitness
Predictor: project frequencies forward; growing clades have high fitness
We predict fitness based on a simple formula
where the fitness $f$ of virus $i$ is estimated as
$$\hat{f}_i = \beta^\mathrm{HI} \, f_i^\mathrm{HI} + \beta^\mathrm{freq} \, f_i^\mathrm{freq}$$
where $f_i^\mathrm{HI}$ measures antigenic drift via HI and $f_i^\mathrm{freq}$ measures clade growth/decline
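As a small worked example of the formula, the estimated fitness is a weighted sum of the two predictors. The coefficient and predictor values below are hypothetical placeholders, not the fitted values from the talk.

```python
def estimate_fitness(f_hi, f_freq, beta_hi, beta_freq):
    """Linear fitness estimate: beta_HI * f_HI + beta_freq * f_freq."""
    return beta_hi * f_hi + beta_freq * f_freq

# Hypothetical predictor values per virus: (f_HI, f_freq)
predictors = {
    "virus_1": (1.0, -0.5),  # antigenically drifted, but declining
    "virus_2": (0.0, 1.0),   # not drifted, but growing
}
# Hypothetical coefficients (in practice learned from past seasons)
beta_hi, beta_freq = 0.8, 0.4

fitness = {
    name: estimate_fitness(f_hi, f_freq, beta_hi, beta_freq)
    for name, (f_hi, f_freq) in predictors.items()
}
```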
We learn coefficients and validate model based on previous 15 H3N2 seasons
Clade growth rate is well predicted (ρ = 0.66)
Growth vs decline correct in 84% of cases
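The two summary statistics can be computed as sketched below. This assumes that ρ denotes a rank (Spearman) correlation, which the slides do not state, and that growth is scored as a signed quantity with positive values meaning growth. The data are hypothetical.

```python
import math

def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

def direction_accuracy(predicted, observed):
    """Fraction of clades where predicted and observed growth signs agree."""
    hits = sum((p > 0) == (o > 0) for p, o in zip(predicted, observed))
    return hits / len(predicted)

# Hypothetical predicted vs observed clade growth (positive = growth)
predicted = [0.5, -0.2, 0.1, -0.4]
observed = [0.3, -0.1, -0.05, -0.6]

rho = spearman(predicted, observed)
accuracy = direction_accuracy(predicted, observed)
```

The two measures are complementary: the third clade here is ranked correctly but its direction is predicted wrongly.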
Trajectories show more detailed congruence
This model is similar in formulation and performance to Łuksza and Lässig
When does the forecast fail?
Emerging clades are difficult to forecast: little antigenic data and little evidence of "past performance"
Models work well for clades at >10% frequency, but less well for clades at <5%
New mutations are difficult to foresee
Models can project forward from circulating strains, but cannot foresee the appearance of new mutations
Intrinsically limits the timescale of forecasting to ~1 year
Model is only as good as the data
Requires rapid shipping of samples, rapid sequencing and rapid antigenic characterization
Issuing reports online in Feb and Sep
In February we stated
"Barring substantial changes in other clades, we predict the (HA1:171K, HA2:77V/155E) variant to dominate"
Let's see how we did
The (H3N2) world today
3c2.a viruses continue to predominate (except in the USA)
Within 3c2.a, clades are emerging, in particular (HA1:171K, HA2:77V/155E)
The 171K clade has recently risen in frequency
We predict the 171K clade will continue to be successful (unless supplanted by a novel mutant)
- Has an amino acid change at predicted epitope site
- Has some evidence of antigenic novelty based on HI
- Shows recent rapid expansion
Further improvements to predictive modeling
- Extend to other seasonal viruses
- Forecast NA evolution
- Integrate neutralization (FRA) assay data
- Model effects of egg adaptation
- Incorporate an explicit geographic model
Phylogeny of H3 with geographic history
Geographic location of phylogeny trunk
More generally, real-time analyses may be useful for other viruses
Major opportunity to track evolution of non-human influenza viruses
Adaptation to avian flu viruses by Yujia Zhou and Justin Bahl
All tools are completely open source and we encourage other groups to get involved and push the project forward
General purpose genomic surveillance tool
Acknowledgements
Analysis: Richard Neher, Colin Russell, Charlton Callender, Colin Megill, Andrew Rambaut,
Charles Cheung, Marc Suchard, Steven Riley, Philippe Lemey, Gytis Dudas, Boris Shraiman,
Marta Łuksza, Michael Lässig
GISRS/GISAID: Ian Barr, Shobha Broor, Mandeep Chadha, Nancy Cox, Rod Daniels, Becky Garten,
Palani Gunasekaran, Aeron Hurt, Anne Kelso, Jackie Katz, Nicola Lewis, Xiyan Li, John McCauley,
Takato Odagiri, Varsha Potdar, Yuelong Shu, Eugene Skepner, Masato Tashiro, Dayan Wang, Dave Wentworth,
Xiyan Xu
Contact
- Website: bedford.io
- Twitter: @trvrb
- Slides: bedford.io/talks/forecasting-flu-options/