Real-time genomic surveillance of pathogen evolution and spread


Trevor Bedford (@trvrb)
18 May 2017
Applied Bioinformatics and Public Health Microbiology
Hinxton, UK

Spread of plague in 14th century

Spread of swine flu in 2009

Sequencing to reconstruct pathogen spread

Epidemic process

Sample some individuals

Sequence and determine phylogeny

Localized Middle Eastern MERS-CoV phylogeny

Regional West African Ebola phylogeny

Global influenza phylogeny

Phylogenetic tracking has the capacity to revolutionize epidemiology

Outline

  • Ebola spread in West Africa
  • Zika spread in the Americas
  • "Real-time" analyses

Ebola

Tracking geographic spread of the Ebola epidemic

with Gytis Dudas, Andrew Rambaut, Luiz Carvalho, Marc Suchard, Philippe Lemey, and many others

Sequencing of 1610 Ebola virus genomes collected during the 2013-2016 West African epidemic

Phylogenetic reconstruction of evolution and spread

Initial emergence from Guéckédou

Tracking migration events

Factors influencing migration rates

Effect of borders on migration rates

Spatial structure at the country level

Substantial mixing at the regional level

Regional outbreaks due to multiple introductions

Each introduction results in a minor outbreak

Ebola spread in West Africa followed a gravity model with moderate slowing by international borders, in which spread is driven by short-lived migratory clusters
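
The gravity-model summary above can be illustrated with a minimal sketch. It assumes a hypothetical functional form in which the migration rate between two locations grows with the product of their population sizes, falls with distance, and is multiplied by a penalty when the pair straddles an international border. The function name, exponents, and penalty value are illustrative assumptions, not the fitted estimates from the Ebola analysis.

```python
def gravity_rate(pop_origin, pop_dest, distance_km, crosses_border,
                 pop_exponent=1.0, dist_exponent=1.0,
                 border_penalty=0.5, scale=1.0):
    """Relative migration rate under a simple gravity model.

    All parameter defaults are hypothetical placeholders chosen for
    illustration only.
    """
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    # Larger populations attract more movement; distance reduces it.
    rate = scale * (pop_origin * pop_dest) ** pop_exponent
    rate /= distance_km ** dist_exponent
    # Pairs separated by an international border are slowed by a
    # multiplicative penalty.
    if crosses_border:
        rate *= border_penalty
    return rate


# Hypothetical pair of locations, within a country and across a border
within = gravity_rate(100_000, 50_000, 80.0, crosses_border=False)
across = gravity_rate(100_000, 50_000, 80.0, crosses_border=True)
farther = gravity_rate(100_000, 50_000, 160.0, crosses_border=False)
print(across / within, farther / within)
```

With these placeholder values, the cross-border pair and the pair twice as far apart both receive lower relative rates than the baseline pair.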

Zika

Zika's arrival and spread in the Americas

Tracking origins of the Zika epidemic

with Nuno Faria, Nick Loman, Oli Pybus, Luiz Alcantara, Ester Sabino, Josh Quick, Allison Black, Ingra Morales, Julien Thézé, Marcio Nunes, Jacqueline de Jesus, Marta Giovanetti, Moritz Kraemer, Sarah Hill and many others

Road trip through northeast Brazil to collect samples and sequence

Phylogeny infers an origin in northeast Brazil

Local spread of Zika in Florida

with Nathan Grubaugh, Kristian Andersen, Jason Ladner, Gustavo Palacios, Sharon Isern, Oli Pybus, Moritz Kraemer, Gytis Dudas, Amanda Tan, Karthik Gangavarapu, Michael Wiley, Stephen White, Julien Thézé, Scott Michael, Leah Gillis, Pardis Sabeti, and many others

Outbreak of locally acquired infections focused in Miami-Dade County

Phylogeny shows a surprising degree of clustering

Clustering suggests fewer, longer transmission chains and higher R0

Extrapolate R0 to predict introduction counts driving outbreak
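
One way to see how R0 constrains the number of introductions: in a simple subcritical branching process (R0 < 1), each imported case is expected to generate R0 / (1 - R0) locally acquired cases in total, so a higher R0 means fewer introductions are needed to explain the same number of local cases. The sketch below is a minimal illustration under that assumption; the case count and R0 values are hypothetical, not the estimates from the Florida analysis.

```python
def expected_local_cases_per_introduction(r0):
    """Expected locally acquired cases seeded by one imported case.

    Sum of the geometric series R0 + R0^2 + ... for a subcritical
    branching process (0 < R0 < 1).
    """
    if not 0 < r0 < 1:
        raise ValueError("this sketch assumes 0 < R0 < 1")
    return r0 / (1.0 - r0)


def expected_introductions(local_cases, r0):
    """Introductions needed, on average, to produce `local_cases`."""
    return local_cases / expected_local_cases_per_introduction(r0)


# Hypothetical outbreak of 200 locally acquired cases
for r0 in (0.3, 0.5, 0.8):
    print(r0, round(expected_introductions(200, r0), 1))
```

The printed values fall as R0 rises, matching the intuition that clustering implies fewer, longer transmission chains.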

Flow of infected travelers greatest from Caribbean

Southern Florida has high potential for Aedes-borne outbreaks

Genomic epidemiology of Zika in the US Virgin Islands

with Allison Black, Barney Potter, Esther Ellis, Brett Ellis, Kristian Andersen, Nathan Grubaugh, Leora Feldstein, and others

Preliminary analysis of 11 genomes shows multiple introductions to USVI

These analyses are important; let's make them faster and more automated

Key challenges

  • Timely analysis and sharing of results critical
  • Dissemination must be scalable
  • Integrate many data sources
  • Results must be easily interpretable and queryable

nextstrain

Project to conduct real-time molecular epidemiology and evolutionary analysis of emerging epidemics


Richard Neher, Trevor Bedford, James Hadfield, Colin Megill, Sidney Bell, Charlton Callender, Barney Potter, Sarah Murata

Nextstrain architecture (github.com/nextstrain)

Fauna

RethinkDB database of virus and titer data

  • Harmonizes data from different sources
  • Integrates different types of data (serology, sequences, case details)
  • Provides an interface for downstream analysis

Augur

Build scripts to align sequences, build trees and annotate

  • Flexible build scripts to incorporate different viruses and analyses
  • Constructs time-resolved phylogenies
  • Annotates with geographic transitions and mutation events
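
As a rough illustration of what "time-resolved" means, the sketch below estimates a clock rate and a root date by regressing root-to-tip divergence on sampling date. This is a simpler stand-in, not the maximum-likelihood approach that TreeTime provides in the augur pipeline; the function name and the toy data are hypothetical.

```python
def root_to_tip_regression(dates, divergences):
    """Least-squares fit of divergence = rate * (date - root_date).

    dates:        sampling dates as decimal years
    divergences:  root-to-tip distances (substitutions per site)
    Returns (rate, root_date).
    """
    n = len(dates)
    if n < 2 or n != len(divergences):
        raise ValueError("need at least two matched (date, divergence) pairs")
    mean_t = sum(dates) / n
    mean_d = sum(divergences) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, divergences))
    rate = sxy / sxx                    # substitutions per site per year
    if rate == 0:
        raise ValueError("no temporal signal: estimated rate is zero")
    intercept = mean_d - rate * mean_t
    root_date = -intercept / rate       # date at which divergence is zero
    return rate, root_date


# Toy data generated from a hypothetical rate of 0.001 and root at 2013.9
toy_dates = [2014.0, 2014.5, 2015.0, 2015.5]
toy_divergences = [0.001 * (t - 2013.9) for t in toy_dates]
rate, root_date = root_to_tip_regression(toy_dates, toy_divergences)
print(rate, root_date)
```

Real data are noisy and tips are not independent, which is one reason a likelihood-based method is preferred in practice.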

Example augur pipeline for 1600 Ebola genomes

  • Align with MAFFT (34 min)
  • Build ML tree with RAxML (54 min)
  • Infer ML temporally-resolved tree with TreeTime (16 min)
  • Infer ML geographic ancestry with TreeTime (0.01 min)
  • Total pipeline (1 hr 46 min)
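
The step timings above can be tallied directly. The itemized steps sum to about 104 minutes, slightly under the stated total of 1 hr 46 min (106 min); the sketch below assumes the difference is time spent in steps not itemized on the slide.

```python
# Step timings in minutes, copied from the list above
step_minutes = {
    "Align with MAFFT": 34,
    "Build ML tree with RAxML": 54,
    "Temporally-resolved tree with TreeTime": 16,
    "Geographic ancestry with TreeTime": 0.01,
}

listed_total = sum(step_minutes.values())
stated_total = 60 + 46  # "1 hr 46 min" expressed in minutes

# Share of the stated total taken by each itemized step
for step, minutes in step_minutes.items():
    print(f"{step}: {minutes / stated_total:.1%}")

# Assumption: the remainder is overhead or steps not listed on the slide
unitemized = stated_total - listed_total
print(f"Itemized: {listed_total:.2f} min, unitemized: {unitemized:.2f} min")
```

Alignment and tree building dominate the run time, while the geographic inference is negligible by comparison.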

Auspice

Web visualization of resulting trees

  • Interactive data exploration and filtering
  • Built with React and D3
  • Connects phylogeny, geography and genotypes

nextstrain.org

Rapid on-the-ground sequencing by Ian Goodfellow, Matt Cotten and colleagues

Desired analytics are pathogen specific and tied to response measures

Acknowledgements


Ebola: Gytis Dudas, Andrew Rambaut, Luiz Carvalho, Philippe Lemey, Marc Suchard, Andrew Tatem, Nick Loman, Ian Goodfellow, Matt Cotten, Paul Kellam, Kristian Andersen, Pardis Sabeti, many others

Zika: Nick Loman, Nuno Faria, Oliver Pybus, Josh Quick, Kristian Andersen, Nathan Grubaugh, Jason Ladner, Gustavo Palacios, Sharon Isern, Gytis Dudas, Allison Black, Barney Potter, Esther Ellis, many others

Nextstrain: Richard Neher, James Hadfield, Colin Megill, Sidney Bell, Charlton Callender, Barney Potter, Sarah Murata

Wellcome Trust Collaborative Award to put "genomic surveillance at the heart of viral epidemic response" with Andrew Rambaut, Nick Loman, Ian Goodfellow, Philippe Lemey, Christophe Fraser and Marc Suchard.

Looking for postdocs. Contact @arambaut or @pathogenomenick.