Genomic surveillance of emerging threats and epidemic spread


Trevor Bedford (@trvrb)
17 Oct 2018
Grand Challenges Annual Meeting
Berlin, Germany

Sequencing to reconstruct pathogen spread

Epidemic process

Sample some individuals

Sequence and determine phylogeny

Pathogen genomes may reveal:

  • Evolution of new adaptive variants
  • Epidemic origins
  • Patterns of geographic spread
  • Animal-to-human spillover
  • Transmission chains

Influenza: Forecasting spread of new variants for vaccine strain selection

Zika: Uncovering origins of the epidemic in the Americas

Ebola: Revealing spatial spread and persistence in West Africa

MERS: Quantifying camel-to-human spillover

TB: Tracking individual transmission chains

Actionable inferences

Genomic analyses are mostly done retrospectively

Dudas and Rambaut 2016

Key challenges to making genomic epidemiology actionable

  • Timely analysis and sharing of results are critical
  • Dissemination must be scalable
  • Integrate many data sources
  • Results must be easily interpretable and queryable

Nextstrain

Project to conduct real-time molecular epidemiology and evolutionary analysis of emerging epidemics


with Richard Neher, James Hadfield, Emma Hodcroft, Tom Sibley,
John Huddleston, Colin Megill, Sidney Bell, Barney Potter,
Charlton Callender

Nextstrain architecture

All code open source at github.com/nextstrain

Two central aims: (1) rapid and flexible phylodynamic analysis and
(2) interactive visualization

Rapid build pipeline for 1600 Ebola genomes

  • Align with MAFFT (34 min)
  • Build ML tree with RAxML (54 min)
  • Temporally resolve tree and geographic ancestry with TreeTime (16 min)
  • Total pipeline (1 hr 46 min)
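
The three stages above can be sketched as a sequence of command lines. This is an illustrative plan only, not Nextstrain's own build code: the file names, the helper functions, the `GTRCAT` model and the `country` attribute are assumptions made for the example, and the flags are the ones documented in each tool's README, so they should be checked against the installed versions. The sketch only assembles the commands; it does not run them.

```python
# Hypothetical helpers: each returns an argv list for one pipeline stage.

def mafft_command(sequences):
    # MAFFT writes the alignment to stdout, so the caller redirects it to a file.
    return ["mafft", "--auto", sequences]


def raxml_command(alignment, run_name, seed=12345):
    # RAxML names its outputs after run_name, e.g. RAxML_bestTree.<run_name>.
    return ["raxmlHPC", "-s", alignment, "-n", run_name,
            "-m", "GTRCAT", "-p", str(seed)]


def treetime_command(tree, alignment, dates, outdir):
    # Temporally resolves the tree using sampling dates.
    return ["treetime", "--tree", tree, "--aln", alignment,
            "--dates", dates, "--outdir", outdir]


def mugration_command(tree, states, attribute):
    # Reconstructs a discrete trait (here, geography) on the tree.
    return ["treetime", "mugration", "--tree", tree,
            "--states", states, "--attribute", attribute]


def build_plan(workdir, run_name="ebola"):
    """Return (stage, argv) pairs in execution order. All paths are assumed."""
    aligned = f"{workdir}/aligned.fasta"
    tree = f"{workdir}/RAxML_bestTree.{run_name}"
    metadata = f"{workdir}/metadata.csv"
    return [
        ("align", mafft_command(f"{workdir}/sequences.fasta")),
        ("tree", raxml_command(aligned, run_name)),
        ("timetree", treetime_command(tree, aligned, metadata,
                                      f"{workdir}/timetree")),
        ("geography", mugration_command(tree, metadata, "country")),
    ]


if __name__ == "__main__":
    for stage, argv in build_plan("work"):
        print(f"{stage}: {' '.join(argv)}")
```

Each argv list could be passed to `subprocess.run`, with the MAFFT output redirected to the alignment file that the later stages read.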

Nextstrain is two things

  • a bioinformatics toolkit and visualization app, which can be used for a broad range of datasets
  • a collection of real-time pathogen analyses kept up-to-date on the website nextstrain.org

nextstrain.org

Newly released feature

  • "Community" builds to promote frictionless sharing of results

Genomic epidemiology of Lassa virus in Nigeria

Data and builds courtesy of Paul Oluniyi, Christian Happi and the African Center of Excellence for Genomics of Infectious Diseases (ACEGID) at Redeemer's University, Ede, Nigeria

Live at nextstrain.org/community/pauloluniyi/lassa/s

Moving forward

Recent headway towards "actionable" genomic epidemiology

  • affordable and portable full genome sequencing (ONT MinION, iSeq)
  • rapid phylodynamic methods (TreeTime, treedater, etc.)
  • rapid distribution of results (nextstrain.org, microreact.org, virological.org)

Sketch of data flow

The importance of networks: WHO Global Influenza Surveillance and Response System

Databases should have APIs

Analysis tools should be interoperable

  • We have a simple public API to our inferences that looks like: data.nextstrain.org/ebola_tree.json
  • ARTIC has packaged up the Nextstrain auspice viz and is shipping it with their bioinformatics toolkit
  • In the future, one should be able to select some genomes in IDseq or Pathogenwatch and pipe them over to Nextstrain for a detailed view
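
As a rough illustration of what a simple public API makes possible, the sketch below fetches the tree JSON named above and lists its tips. The `https` scheme, the nested `children` lists and the `strain` key are assumptions about the JSON layout rather than a documented schema, and the URL may no longer resolve, so the traversal is written to run on any nested dictionary of that shape.

```python
import json
from urllib.request import urlopen

# URL from the slide; the https scheme is an assumption.
TREE_URL = "https://data.nextstrain.org/ebola_tree.json"


def fetch_tree(url=TREE_URL):
    """Download and parse the tree JSON (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


def iter_tips(root):
    """Yield leaf nodes left to right, assuming children live under 'children'."""
    stack = [root]
    while stack:
        node = stack.pop()
        children = node.get("children") or []
        if children:
            # Reverse so the leftmost child is popped first.
            stack.extend(reversed(children))
        else:
            yield node


def tip_names(root):
    """Return tip labels, assuming each tip stores its name under 'strain'."""
    return [tip.get("strain", "unknown") for tip in iter_tips(root)]


if __name__ == "__main__":
    tree = fetch_tree()
    names = tip_names(tree)
    print(f"{len(names)} tips; first few: {names[:5]}")
```

The traversal uses an explicit stack instead of recursion so that deep trees with many genomes do not hit Python's recursion limit.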

Phylogenetic tracking has the capacity to revolutionize epidemiology

Acknowledgements

Bedford Lab: Alli Black, Sidney Bell, John Huddleston, Barney Potter,
James Hadfield, Louise Moncla, Tom Sibley, Maya Lewinsohn

Genomic epi: Gytis Dudas, Andrew Rambaut, Luiz Carvalho, Nick Loman, Jenn Gardy, Paul Oluniyi, Christian Happi, Nuno Faria, Oli Pybus, Josh Quick, Ingra Claro, Julien Thézé, Jaqueline de Jesus, Marta Giovanetti, Matt Cotten, Ian Goodfellow

Nextstrain: Richard Neher, James Hadfield, Emma Hodcroft, Tom Sibley, John Huddleston, Sidney Bell, Barney Potter, Colin Megill, Charlton Callender