Nextstrain build for novel coronavirus SARS-CoV-2

Compiled Nextstrain SARS-CoV-2 resources are available on the Nextstrain website. Follow @nextstrain for continual data updates.

This phylogeny shows evolutionary relationships of SARS-CoV-2 viruses from the ongoing COVID-19 pandemic. Although the genetic relationships among sampled viruses are quite clear, there is considerable uncertainty surrounding estimates of specific transmission dates and the reconstruction of geographic spread. Please be aware that specific inferred geographic transmission patterns and temporal estimates are only hypotheses.

There are millions of complete SARS-CoV-2 genomes available, and this number increases every day. For performance and legibility reasons, this visualization can only handle ~4000 genomes in a single view, so we subsample the available genome data for these analysis views. Our primary global analysis subsamples to ~600 genomes per continental region, with ~400 from the previous 4 months and ~200 from before that. This results in a more equitable global sequence distribution, but hides samples from regions that are doing lots of sequencing. To mitigate this, we’ve set up separate analyses that focus on particular regions. They are available in the “Dataset” dropdown on the left or by clicking on the following links: Africa, Asia, Europe, North America, Oceania and South America.
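
To make the subsampling scheme above concrete, here is a minimal Python sketch. It is not the actual Nextstrain pipeline code: the function name `subsample`, the record fields (`strain`, `region`, `date`), the use of uniform random sampling, and the approximation of "4 months" as 120 days are all assumptions made for illustration. With six continental regions, the ~600 genomes per region add up to roughly 3600, which fits under the ~4000 genome limit.

```python
import random
from collections import defaultdict
from datetime import date, timedelta


def subsample(records, reference_date, recent_n=400, early_n=200,
              recent_days=120, seed=0):
    """Pick up to recent_n recent and early_n earlier genomes per region.

    records: iterable of dicts with keys 'strain', 'region' and 'date'
    (a datetime.date). These field names are hypothetical.
    """
    rng = random.Random(seed)
    # Assumption: "previous 4 months" is approximated as 120 days.
    cutoff = reference_date - timedelta(days=recent_days)

    # Group records by region, then by time window.
    buckets = defaultdict(lambda: {"recent": [], "early": []})
    for rec in records:
        window = "recent" if rec["date"] >= cutoff else "early"
        buckets[rec["region"]][window].append(rec)

    selected = []
    for region in sorted(buckets):
        for window, quota in (("recent", recent_n), ("early", early_n)):
            pool = buckets[region][window]
            # Assumption: uniform random sampling without replacement;
            # take the whole pool when it is smaller than the quota.
            selected.extend(rng.sample(pool, min(quota, len(pool))))
    return selected
```

Note how a region with few sequences contributes everything it has, while a heavily sequenced region is capped at its quota. This is the trade-off described above: a more even global distribution at the cost of hiding data from regions that sequence a lot.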

Site numbering and genome structure use Wuhan-Hu-1/2019 as the reference. The phylogeny is rooted relative to early samples from Wuhan. Temporal resolution assumes a nucleotide substitution rate of 8 × 10^-4 substitutions per site per year. Mutational fitness is calculated using results from Obermeyer et al. (under review). Full details on bioinformatic processing can be found here.
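
As a rough illustration of what the assumed substitution rate implies, the sketch below converts between elapsed time and expected substitutions per genome under a strict molecular clock. The genome length of 29,903 nucleotides is an assumption taken from the Wuhan-Hu-1 reference sequence, and the function names are hypothetical. The actual temporal inference in the pipeline is more involved than this back-of-the-envelope calculation.

```python
RATE_PER_SITE_PER_YEAR = 8e-4  # substitution rate stated in the text
GENOME_LENGTH = 29_903         # assumption: length of the Wuhan-Hu-1 reference


def expected_substitutions(years, rate=RATE_PER_SITE_PER_YEAR,
                           genome_length=GENOME_LENGTH):
    """Expected substitutions per genome after `years` under a strict clock."""
    return rate * genome_length * years


def years_from_substitutions(n_subs, rate=RATE_PER_SITE_PER_YEAR,
                             genome_length=GENOME_LENGTH):
    """Elapsed time (in years) implied by `n_subs` substitutions."""
    return n_subs / (rate * genome_length)


if __name__ == "__main__":
    per_year = expected_substitutions(1.0)
    print(f"Expected substitutions per genome per year: {per_year:.1f}")
    print(f"Expected substitutions per genome per month: {per_year / 12:.1f}")
```

Under these assumptions a lineage accumulates roughly 24 substitutions per year, or about two per month. This is why two genomes sampled only a few weeks apart are often identical or differ by a single mutation, and why dates for specific transmission events carry considerable uncertainty.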

We gratefully acknowledge the authors, originating and submitting laboratories of the genetic sequence and metadata made available through GISAID on which this research is based. An attribution table is available by clicking on “Download Data” at the bottom of the page and then clicking on “Acknowledgments” in the resulting dialog box.

At the specific request of GISAID, we:

  • maintain the prefix hCoV-19/ in the names of viral isolates
  • disable download of the full metadata TSV and instead provide an acknowledgments TSV via the “Download Data” link at the bottom of the page
  • refrain from sharing alignments or other intermediate files computed in our pipeline