Genomic surveillance of emerging threats and epidemic spread

Trevor Bedford (@trvrb)
22 Jun 2018
Emerging Infections and Pandemic Risk
Institut Pasteur

Spread of plague in 14th century

Spread of swine flu in 2009

Sequencing to reconstruct pathogen spread

Epidemic process

Sample some individuals

Sequence and determine phylogeny

Localized Middle Eastern MERS-CoV phylogeny

Regional West African Ebola phylogeny

Global influenza phylogeny

Phylogenetic tracking critical to pandemic warning systems and outbreak response

Monitoring of animal-to-human spillover

  • MERS spillover in the Arabian Peninsula

Monitoring of human-to-human transmission

  • Ebola spread in West Africa
  • Zika spread in the Americas


Middle East respiratory syndrome coronavirus (MERS-CoV)

  • First identified in Saudi Arabia in 2012
  • 2220 confirmed cases to date and 790 deaths
  • Camels thought to be the intermediate host
  • 30% of common colds due to endemic human coronaviruses

Ongoing incidence, but lack of epidemic growth

Cases localized to the Arabian Peninsula

Hypotheses for MERS transmission

MERS-CoV spillover at the camel-human interface

with Gytis Dudas, Luiz Carvalho and Andrew Rambaut

Genomic dataset

  • 174 virus genomes from human infections
  • 100 virus genomes from camel infections

MERS tree with host state

Phylodynamic reconstruction of host state

Humans are transient hosts

Asymmetric migration rates

  • 56 (48–63) camel-to-human transmission events resulting in 174 sequenced human infections
  • 3 (0–12) human-to-camel transmission events
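
Counts like these come from tallying host changes along the branches of a tree whose internal nodes carry inferred host states. Below is a toy sketch of that tally only, not the phylodynamic method behind the estimates above. The tree, node names and host states are invented for illustration.

```python
from collections import Counter

# Hypothetical toy tree: node -> (parent, inferred host state).
# The root has parent None. All names and states are made up.
TREE = {
    "root": (None, "camel"),
    "n1": ("root", "camel"),
    "n2": ("root", "camel"),
    "h1": ("n1", "human"),
    "h2": ("h1", "human"),   # onward human-to-human transmission
    "h3": ("n1", "human"),
    "c1": ("n2", "camel"),
    "h4": ("n2", "human"),
}


def count_host_jumps(tree):
    """Count branches whose parent and child host states differ."""
    jumps = Counter()
    for node, (parent, state) in tree.items():
        if parent is None:
            continue  # the root has no incoming branch
        parent_state = tree[parent][1]
        if parent_state != state:
            jumps[(parent_state, state)] += 1
    return jumps


jumps = count_host_jumps(TREE)
print("camel-to-human:", jumps[("camel", "human")])
print("human-to-camel:", jumps[("human", "camel")])
```

In this toy tree, four human infections trace back to three separate camel-to-human jumps and no jump goes the other way. That is the same kind of asymmetry the slide reports at much larger scale.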

Introduction events tend to occur between April and July

Dromedary camel calving occurs between Nov and Feb

Monte Carlo simulation

Phylogenetic clustering suggests $R_0$ below 1.0 and ~2000 human cases driven by ~600 introduction events
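
The link between cluster sizes and $R_0$ can be illustrated with a subcritical branching process. If each introduction seeds a chain whose cases each infect a Poisson($R_0$) number of others, the expected final chain size is $1 / (1 - R_0)$. Roughly 2000 cases from roughly 600 introductions is about 3.3 cases per introduction, which under this simple model corresponds to $R_0$ near 0.7. The sketch below is a minimal Monte Carlo check of that relationship. The Poisson offspring distribution and the value 0.7 are assumptions made for illustration, not the model or estimate from the analysis itself.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_cluster(rng, r0, max_size=10_000):
    """Final size of one transmission chain seeded by a single spillover."""
    size = 1      # the index case
    active = 1    # cases in the current generation
    while active > 0 and size < max_size:
        new_cases = sum(poisson(rng, r0) for _ in range(active))
        size += new_cases
        active = new_cases
    return size


def mean_cluster_size(r0, n_introductions, seed=1):
    """Average chain size over many independent introductions."""
    rng = random.Random(seed)
    sizes = [simulate_cluster(rng, r0) for _ in range(n_introductions)]
    return sum(sizes) / len(sizes)


# Assumed value: R0 = 0.7, chosen so that 1 / (1 - R0) is close to 2000 / 600.
r0 = 0.7
print("simulated mean cluster size:", mean_cluster_size(r0, 20_000))
print("theoretical mean 1/(1 - R0):", 1 / (1 - r0))
```

With $R_0$ below 1 every chain eventually dies out, so total case counts are driven by the number of introductions rather than by sustained human-to-human spread.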

Similar results when we assume larger clusters are more likely to be sequenced

Critically, no evidence of increasing cluster sizes through time

Many other viruses exhibit stuttering chains of human infection

  • Nipah virus (fruit bats / pigs, Southeast Asia)
  • Lassa virus (rodents, West Africa)
  • Avian influenza (birds, mainland China)


Ebola epidemic of 2014-2016 was unprecedented in scope

Ebola epidemic in West Africa

Virus genomes reveal factors that spread and sustained the Ebola epidemic

with Gytis Dudas, Andrew Rambaut, Luiz Carvalho, Marc Suchard, Philippe Lemey,
and many others

Sequencing of 1610 Ebola virus genomes collected during the 2013-2016 West African epidemic

Sequenced genomes were representative of spatiotemporal diversity

Phylogenetic reconstruction of epidemic

Tracking migration events

Factors influencing migration rates

Effect of borders on migration rates

Spatial structure at the country level

Substantial mixing at the regional level

Regional outbreaks due to multiple introductions

Each introduction results in a minor outbreak


Zika's arrival and spread in the Americas

Establishment and cryptic transmission of Zika virus in Brazil and the Americas

with Nuno Faria, Nick Loman, Oli Pybus, Luiz Alcantara, Ester Sabino, Josh Quick,
Alli Black, Ingra Morales, Julien Thézé, Marcio Nunes, Jaqueline de Jesus,
Marta Giovanetti, Moritz Kraemer, Sarah Hill and many others

Road trip through northeast Brazil to collect samples and sequence

Case reports and diagnostics suggest initiation in northeast Brazil

Phylogeny infers an origin in northeast Brazil

Actionable inferences

Genomic analyses were mostly done in a retrospective manner

Dudas and Rambaut 2016

Key challenges to making genomic epidemiology actionable

  • Timely analysis and sharing of results critical
  • Dissemination must be scalable
  • Integrate many data sources
  • Results must be easily interpretable and queryable


Project to conduct real-time molecular epidemiology and evolutionary analysis of emerging epidemics

with Richard Neher, James Hadfield, Colin Megill,
Sidney Bell, John Huddleston, Barney Potter,
Emma Hodcroft, Charlton Callender

Nextstrain architecture

All code open source at

Example pipeline for 1600 Ebola genomes

  • Align with MAFFT (34 min)
  • Build ML tree with RAxML (54 min)
  • Temporally resolve tree and geographic ancestry with TreeTime (16 min)
  • Total pipeline (1 hr 46 min)
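
The align, tree and time-resolve stages above could be chained roughly as follows. This is a hedged sketch, not the Nextstrain pipeline itself. The file names, run name and working directory are hypothetical, the command-line flags should be checked against the installed versions of each tool, and the geographic ancestry step is left out.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(sequences, dates, workdir):
    """Return (command, stdout_path) pairs for the three stages.

    stdout_path is set only for tools that write their result to stdout.
    """
    workdir = Path(workdir)
    alignment = workdir / "aligned.fasta"       # hypothetical file name
    run_name = "ebola"                          # hypothetical run name
    # RAxML names its best tree RAxML_bestTree.<run name> in the -w directory.
    tree = workdir / f"RAxML_bestTree.{run_name}"
    return [
        # MAFFT prints the alignment to stdout.
        (["mafft", "--auto", str(sequences)], alignment),
        # RAxML expects an absolute path for -w.
        (["raxmlHPC", "-s", str(alignment), "-n", run_name,
          "-m", "GTRGAMMA", "-p", "12345", "-w", str(workdir.resolve())], None),
        # TreeTime dates the tree using a table of sampling dates.
        (["treetime", "--tree", str(tree), "--aln", str(alignment),
          "--dates", str(dates), "--outdir", str(workdir / "treetime")], None),
    ]


def run_pipeline(sequences, dates, workdir):
    """Run each stage in order, stopping at the first failure."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for command, stdout_path in build_commands(sequences, dates, workdir):
        if shutil.which(command[0]) is None:
            raise RuntimeError(f"{command[0]} not found on PATH")
        if stdout_path is None:
            subprocess.run(command, check=True)
        else:
            with open(stdout_path, "w") as handle:
                subprocess.run(command, check=True, stdout=handle)


# Print the planned commands without running anything.
for command, _ in build_commands("ebola.fasta", "dates.csv", "results"):
    print(" ".join(command))
```

Calling `run_pipeline("ebola.fasta", "dates.csv", "results")` would run the stages for real, provided the three tools are installed and on the PATH.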

Rapid on-the-ground sequencing by Ian Goodfellow, Matt Cotten and colleagues

Build out pipelines for different pathogens, improve databasing and lower
the bioinformatics bar


Bedford Lab: Alli Black, Sidney Bell, Gytis Dudas, John Huddleston,
Barney Potter, James Hadfield, Louise Moncla, Tom Sibley

MERS: Gytis Dudas, Andrew Rambaut, Luiz Carvalho

Ebola: Gytis Dudas, Andrew Rambaut, Luiz Carvalho, Philippe Lemey, Marc Suchard, Andrew Tatem

Zika: Nick Loman, Nuno Faria, Oli Pybus, Josh Quick, Ingra Claro, Julien Thézé, Jaqueline de Jesus, Marta Giovanetti, Moritz Kraemer, Sarah Hill, Allison Black, Ester Sabino, Luiz Alcantara

Nextstrain: Richard Neher, James Hadfield, Colin Megill, Sidney Bell, Charlton Callender, Barney Potter, John Huddleston, Emma Hodcroft