Nextstrain build for novel coronavirus SARS-CoV-2

A Getting Started Guide to the Genomic Epidemiology of SARS-CoV-2

This template and tutorial will walk you through the process of running a basic phylogenetic analysis on SARS-CoV-2 data. We’ve created these resources with the goal of enabling Departments of Public Health to start using Nextstrain to understand their SARS-CoV-2 genomic data within 1-2 hours. In addition to the phylogenetic analysis described here, you can use our “drag-and-drop” tool for clade assignment, mutation calling, and basic sequence quality checks at

We also recommend this 1-hour video overview by Heather Blankenship on how to deploy Nextstrain for a Public Health lab.

Overview: complete walkthrough

Getting started with analysis

The starting point for this section is a FASTA file with sequence data and a TSV file with metadata. Alternatively, you can use our example data to start.

  1. Setup and installation
  2. Preparing your data
  3. Orientation: analysis workflow
  4. Orientation: which files should I touch?
  5. Running & troubleshooting
  6. Customizing your analysis
  7. Customizing your visualization
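
Before starting, it can help to confirm that the FASTA and TSV inputs describe the same samples. The following is a minimal sketch of such a check, not part of the Nextstrain tooling: the function names are hypothetical, the metadata ID column is assumed to be called `strain`, and sequence names are assumed to be the first whitespace-delimited token of each FASTA header. Adjust these to match your own files.

```python
import csv
import io


def fasta_ids(handle):
    """Return the sequence names found on FASTA header lines."""
    ids = []
    for line in handle:
        if line.startswith(">"):
            # Assumption: the name is the first token after ">".
            parts = line[1:].split()
            if parts:
                ids.append(parts[0])
    return ids


def metadata_ids(handle, id_column="strain"):
    """Return the ID column values from a tab-separated metadata file.

    Assumption: the ID column is named "strain"; pass another name if not.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[id_column] for row in reader]


def compare_ids(fasta_handle, metadata_handle, id_column="strain"):
    """Report names present in one input but missing from the other."""
    seqs = set(fasta_ids(fasta_handle))
    meta = set(metadata_ids(metadata_handle, id_column))
    return {
        "missing_metadata": sorted(seqs - meta),
        "missing_sequence": sorted(meta - seqs),
    }


# Tiny in-memory example; in practice pass open file handles instead.
example_fasta = io.StringIO(">sampleA\nACGT\n>sampleB\nACGT\n")
example_metadata = io.StringIO(
    "strain\tdate\nsampleA\t2020-03-01\nsampleC\t2020-03-05\n"
)
report = compare_ids(example_fasta, example_metadata)
print(report)
```

Any name listed under `missing_metadata` has a sequence but no metadata row, and any name under `missing_sequence` has a metadata row but no sequence.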

Getting started with visualization & interpretation

The starting point for this section is a JSON file. Alternatively, you can use our examples to start.

  1. Options for visualizing and sharing results
  2. Interpreting your results
  3. Writing a narrative to highlight key findings

Multiple inputs

  1. Running the pipeline starting with multiple inputs

Reference guides


If something in this tutorial is broken or unclear, please open an issue so we can improve it for everyone.

If you have a specific question, post a note over at the discussion board – we’re happy to help!