Genomic epidemiological studies have been used in academic contexts to reconstruct regional transmission of Ebola during the West African outbreak, estimate when Zika arrived in Brazil, and investigate how seasonal influenza circulates around the world. But these types of studies have moved out of the ivory tower, and public health agencies regularly sequence and analyze whole pathogen genomes to support surveillance and epidemiologic investigations of foodborne diseases, tuberculosis, and influenza, among others. Indeed, almost every infectious disease program at the Centers for Disease Control and Prevention now uses pathogen genomics, with increasing adoption by state and local health departments as well.

Pathogen genomics is a great addition to the public health toolbox. However, genomic data is complex and must be transformed from its raw form before it can be analyzed. Increasing use of pathogen genomics will require that public health agencies invest in advanced computational infrastructure, develop a broader technical workforce, and investigate new approaches to integrated data management and stewardship. As the number of agencies with genomic surveillance capabilities grows, we’ll need a unified network of validated, reproducible ways to analyze data. The question, then, is how we build that ecosystem.

In collaboration with the CDC’s Office of Advanced Molecular Detection (OAMD), we’ve written a whitepaper describing ten recommendations for supporting open pathogen genomic analysis in public health settings, which we’ve just posted to preprints.org (bioRxiv doesn’t take editorial content such as this).

To get a sense of the current landscape of pathogen genomic analysis in public health agencies, including the challenges they have encountered and overcome, we conducted a series of long-form interviews with public health practitioners who use pathogen genomic data. We spoke with various branches and divisions at CDC, as well as state public health labs in the United States, provincial public health labs in Canada, and representatives from the European CDC. In a concurrent effort, the Africa CDC investigated similar questions and assessed capabilities for building genomic surveillance across the African continent. We learned a lot from these interviews about which parts of genomic surveillance are working well in public health agencies and which need to be improved. This information forms the basis of our proposals.

This paper is just the first step in what we hope will be a community-based effort to discuss and develop standards and tools for everything from databases to pipelines to data visualization. These community-based efforts will be guided and supported by the Public Health Alliance for Genomic Epidemiology (PHA4GE). Announced in October 2019, PHA4GE is a global coalition that is actively working to establish consensus standards; document and share best practices; improve the availability of critical bioinformatic tools and resources; and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. If you’re interested in joining this effort, please get in touch!