Can't find your sequences in Nextstrain? Check here for common reasons why your sequences may not be appearing.
The hCoV-19 / SARS-CoV-2 genomes were generously shared via GISAID. We gratefully acknowledge the Authors, Originating and Submitting laboratories of the genetic sequence and metadata made available through GISAID on which this research is based.
To download the GISAID data and run the analysis yourself, please see Running a SARS-CoV-2 analysis.
Please note that data/metadata.tsv is no longer included as part of this repo and should be downloaded directly from GISAID.
For information about how clades are defined and which clades are currently named, please see here.
Site numbering and genome structure use Wuhan-Hu-1/2019 as the reference. The phylogeny is rooted relative to early samples from Wuhan. Temporal resolution assumes a nucleotide substitution rate of 8 × 10^-4 substitutions per site per year. SNPs present in the first and last few bases of the alignment of the nCoV samples were masked as likely sequencing artifacts.
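To make the rate and masking statements above concrete, here is a minimal Python sketch. It is not the code used by the workflow. It assumes a reference length of 29,903 nt for Wuhan-Hu-1 (GenBank MN908947.3), which is not stated in this README. The helper names `expected_substitutions` and `mask_ends` and the mask widths in the usage lines are hypothetical, chosen only for illustration.

```python
# Assumption: length of the Wuhan-Hu-1 reference (GenBank MN908947.3) in nucleotides.
GENOME_LENGTH = 29_903
# Rate quoted above: substitutions per site per year.
CLOCK_RATE = 8e-4


def expected_substitutions(years: float,
                           rate: float = CLOCK_RATE,
                           length: int = GENOME_LENGTH) -> float:
    """Expected number of substitutions genome-wide over `years` (hypothetical helper)."""
    return rate * length * years


def mask_ends(seq: str, n_start: int, n_end: int, mask_char: str = "N") -> str:
    """Replace the first `n_start` and last `n_end` characters with `mask_char`.

    Hypothetical helper; the number of bases the real workflow masks is not
    given in this README.
    """
    if n_start < 0 or n_end < 0:
        raise ValueError("mask widths must be non-negative")
    if n_start + n_end >= len(seq):
        # Masks overlap or cover everything: mask the whole sequence.
        return mask_char * len(seq)
    middle = seq[n_start:len(seq) - n_end]
    return mask_char * n_start + middle + mask_char * n_end


if __name__ == "__main__":
    per_year = expected_substitutions(1.0)
    print(f"Expected substitutions per genome per year: {per_year:.1f}")
    print(f"Expected substitutions per genome per month: {per_year / 12:.1f}")
    # Illustrative mask widths only.
    print(mask_ends("ACGTACGTAC", n_start=2, n_end=3))
```

Under these assumptions the rate works out to roughly 24 substitutions per genome per year, or about two per month, which is the scale at which the time tree can resolve dates.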
If you'd like to customize and run the analysis yourself, please see the developer docs.
How to run using your own data
- How to format the metadata
- Running a SARS-CoV-2 analysis
- Understanding the structure of the workflow including directory structure and the order in which Snakemake runs rules
We welcome contributions from the community! Please note that we strictly adhere to the Contributor Covenant Code of Conduct.
Contributing to translations of our situation reports
Please see the translations repo to get started!
Contributing to software or documentation
Please see our Contributor Guide to get started!
Please note that we automatically pick up any hCoV-19 data that is submitted to GISAID.
If you're a lab and you'd like to get started sequencing, please see:
- Protocols from the ARTIC network
- Funding opportunities for sequencing efforts
- Or, if these don't meet your needs, get in touch