Key insights from genomic epidemiology


Trevor Bedford

Fred Hutchinson Cancer Center / Howard Hughes Medical Institute
25 Aug 2023
CDC-PGCoE visit
Northwest PGCoE

Pathogen genomes may reveal

  • Evolution of new adaptive variants
  • Epidemic origins
  • Patterns of geographic spread
  • Animal-to-human spillover
  • Transmission chains

Influenza: Forecasting spread of new variants for vaccine strain selection

Zika: Uncovering origins of the epidemic in the Americas

Ebola: Revealing spatial spread and persistence in West Africa

MERS: Repeated spillover into the human population from camel reservoir

Actionable inferences


Project to conduct real-time genomic epidemiology and evolutionary analysis of emerging epidemics

Nextstrain architecture

Central aims: (1) rapid and flexible bioinformatic workflows, (2) interactive visualization and (3) always up-to-date analyses at

All code open source at

Operationalized during the 2018-2020 outbreak of Ebola in the Democratic Republic of the Congo

Genomic epidemiology during the COVID-19 pandemic

Over 15M SARS-CoV-2 genomes shared to GISAID and evolution tracked in real-time at

Richard Neher, Ivan Aksamentov, Jennifer Chang, James Hadfield, Emma Hodcroft, John Huddleston, Jover Lee, Victor Lin, Cornelius Roemer, Thomas Sibley

Three key insights that genomic epi provided during pandemic

  1. Rapid human-to-human spread in Wuhan beyond initial market outbreak
  2. Extensive local transmission while testing was rare
  3. Identification of variants of concern and mapping of increased transmission rates

Jan 11: First five genomes showed a novel SARS-like coronavirus

Initially thought clustering due to epi investigation of linked cases at Huanan seafood market

Data from CAMS, China CDC, Fudan University, WIV; Figure from

Jan 19: First 12 genomes from Wuhan (blue) and Bangkok (red) showed lack of genetic diversity

Data from CAMS, China CDC, Fudan University, Hubei CDC, Thai MOPH, WIV; Figure from

Jan 23: Introduction into the human population between Nov 15 and Dec 15 and subsequent rapid human-to-human spread

Rapid global epidemic spread from China

Epidemic in the USA was introduced from China in late Jan and from Europe during Feb

Early sequencing provided best estimate of extent of local outbreak

After initial wave, with mitigation efforts and decreased travel, regional clades emerge

Emergence of Alpha in the UK with excess spike mutations

Alpha described in Rambaut et al. 2020. Figure from

Further emergence of variants with increased transmissibility

Variant emergence and spread has continued

New variants emerge that escape existing population immunity and spread rapidly

Future of genomic epidemiology

The COVID-19 pandemic has pushed the field perhaps ~5 years into the future

Outbreak                        Confirmed cases   Genomes
2013-16 Ebola in West Africa    29k               1610
2015-17 Zika in the Americas    223k              942
2018-19 seasonal flu in US      290k              8864
2020-22 COVID-19 pandemic       732M              14.5M
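
The counts above imply quite different sequencing intensities across outbreaks. A minimal sketch in Python, using only the figures listed on this slide, computes genomes per confirmed case. The names `OUTBREAKS`, `sequenced_fraction` and `summarize` are illustrative and not part of any Nextstrain tooling, and the sketch assumes that 29k, 223k, 290k, 732M and 14.5M are rounded values.

```python
# Figures copied from the slide: (outbreak, confirmed cases, genomes).
# Rounded slide values (e.g. 29k) are expanded as-is; this is an assumption.
OUTBREAKS = [
    ("2013-16 Ebola in West Africa", 29_000, 1_610),
    ("2015-17 Zika in the Americas", 223_000, 942),
    ("2018-19 seasonal flu in US", 290_000, 8_864),
    ("2020-22 COVID-19 pandemic", 732_000_000, 14_500_000),
]


def sequenced_fraction(cases: int, genomes: int) -> float:
    """Return genomes per confirmed case as a fraction between 0 and 1."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return genomes / cases


def summarize(outbreaks):
    """Pair each outbreak name with its genomes-per-case fraction."""
    return [(name, sequenced_fraction(cases, genomes))
            for name, cases, genomes in outbreaks]


if __name__ == "__main__":
    for name, fraction in summarize(OUTBREAKS):
        print(f"{name}: {fraction:.2%} of confirmed cases sequenced")
```

The ratio is only a rough proxy for sequencing coverage, since confirmed cases undercount infections and not every genome corresponds to a distinct confirmed case.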

Current research program and work with NW PGCoE

  • Continue two major threads
    1. Evolutionary forecasting: focus on seasonal influenza and SARS-CoV-2 for impact on vaccine strain selection
    2. Genomic outbreak investigation: across pathogens with focus on spatial dynamics and reconstructing spread
  • Continue to build out the Nextstrain software platform for broader use by the community

Clade and lineage forecasts continuously updated

Detailed geographic transmission patterns from identical sequences

Tran Kiem et al.

Lay groundwork for near-ubiquitous pathogen sequencing


SARS-CoV-2 genomic epi: Data producers from all over the world, GISAID and the Nextstrain team

Bedford Lab: John Huddleston, James Hadfield, Katie Kistler, Thomas Sibley, Jover Lee, Cassia Wagner, Miguel Paredes, Nicola Müller, Marlin Figgins, Victor Lin, Jennifer Chang, Allison Li, Eslam Abousamra, Donna Modrell, Nashwa Ahmed, Cécile Tran Kiem