This phylogeny shows evolutionary relationships of SARS-CoV-2 viruses from the ongoing COVID-19 pandemic. Although the genetic relationships among sampled viruses are generally quite clear, there is considerable uncertainty surrounding estimates of specific transmission dates and the reconstruction of geographic spread. Please be aware that specific inferred geographic transmission patterns and temporal estimates are only hypotheses.
There are millions of complete SARS-CoV-2 genomes available and this number increases every day. This visualization can only handle ~4000 genomes in a single view for performance and legibility reasons. Because of this, we subsample the available genome data for our analysis views. We provide multiple views that focus subsampling on different geographic regions and different time periods. These views are available through the “Dataset” dropdown on the left or by clicking on the following links:
| past 1 month | past 2 months | past 6 months | all time |
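To illustrate the general idea of subsampling down to a view-sized set, here is a minimal sketch. It is not the pipeline's actual method: the `subsample` function, the `region` and `date` field names, and the round-robin strategy are all assumptions made for illustration. It groups records by region and month, then draws one record per group in turn until the cap is reached, so that heavily sequenced groups do not crowd out sparse ones.

```python
import random
from collections import defaultdict


def subsample(records, max_total=4000, seed=0):
    """Pick at most max_total records, spread evenly across (region, month) groups.

    Hypothetical sketch: each record is a dict with assumed keys
    "region" and "date" (formatted YYYY-MM-DD).
    """
    rng = random.Random(seed)

    # Bucket records by region and calendar month.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["region"], rec["date"][:7])].append(rec)

    # Shuffle within each group so the picks inside a group are random.
    for members in groups.values():
        rng.shuffle(members)

    # Round-robin over the groups until the cap is hit or all groups are empty.
    picked = []
    keys = sorted(groups)
    while len(picked) < max_total and keys:
        remaining = []
        for key in keys:
            if len(picked) >= max_total:
                break
            members = groups[key]
            picked.append(members.pop())
            if members:
                remaining.append(key)
        keys = remaining
    return picked
```

With this scheme, a group that holds only a handful of genomes is kept in full before the larger groups fill the remaining slots. Focusing a view on one region or time period would amount to filtering or weighting the groups before the draw.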
Site numbering and genome structure use Wuhan-Hu-1/2019 as reference. The phylogeny is rooted relative to early samples from Wuhan. Temporal resolution assumes a nucleotide substitution rate of 8 × 10^-4 substitutions per site per year. Mutational fitness is calculated using results from Obermeyer et al. (under review). Full details on bioinformatic processing can be found here.
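As a rough worked example of what this substitution rate implies, the sketch below converts it to a per-genome figure. The genome length of 29,903 nucleotides for the Wuhan-Hu-1 reference is an assumption not stated in the text above, and the function names are hypothetical. This is simple arithmetic under a strict clock, not the dating method used in the analysis.

```python
CLOCK_RATE = 8e-4        # substitutions per site per year (from the text)
GENOME_LENGTH = 29_903   # assumed length of the Wuhan-Hu-1 reference, in nucleotides


def expected_substitutions(years):
    """Expected substitutions across the whole genome after `years`."""
    return CLOCK_RATE * GENOME_LENGTH * years


def years_for_divergence(n_substitutions):
    """Time a lineage would need to accumulate `n_substitutions`, on average."""
    return n_substitutions / (CLOCK_RATE * GENOME_LENGTH)


if __name__ == "__main__":
    print(round(expected_substitutions(1.0), 1))       # per year
    print(round(expected_substitutions(1.0 / 12), 1))  # per month
```

Under these assumptions a lineage accumulates roughly 24 substitutions per year, or about 2 per month, which is why date estimates for closely related viruses carry several weeks of uncertainty.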
We gratefully acknowledge the authors, originating and submitting laboratories of the genetic sequences and metadata made available through GISAID on which this research is based. An attribution table is available by clicking on “Download Data” at the bottom of the page and then clicking on “Acknowledgments” in the resulting dialog box.
At the specific request of GISAID, we:
- maintain the prefix hCoV-19/ in the names of viral isolates
- disable download of full metadata TSV and provide instead an acknowledgments TSV in the “Download Data” link at the bottom of the page
- refrain from sharing alignments or other intermediate files computed in our pipeline