Rethink database to support real-time virus analysis

ZIKA Pipeline Notes


  • Update citation fields
    • python vdb/ -db vdb -v zika --update_citations
    • Updates authors, title and url fields from GenBank files
    • If you get ERROR: Couldn't connect with entrez, just run the command again
  • Update location fields
    • Run after hand-editing locations in chateau
    • python vdb/ -db vdb -v zika --update_locations
    • Updates division, country, region, latitude, longitude fields
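
The "run the command again" advice for the entrez connection error can be automated with a small wrapper. The sketch below is not part of the pipeline: `run_with_retry` is a hypothetical helper, and it assumes a failed run either exits with a non-zero status or prints the error text quoted above.

```python
import subprocess
import time

# Error text quoted in the notes above.
ENTREZ_ERROR = "Couldn't connect with entrez"


def run_with_retry(cmd, attempts=3, wait_seconds=5):
    """Run cmd (a list of arguments) until it succeeds.

    Assumption: a failed run exits non-zero or prints ENTREZ_ERROR.
    Returns True on success, False if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        print(result.stdout, end="")
        if result.returncode == 0 and ENTREZ_ERROR not in result.stdout:
            return True
        print("attempt %d of %d failed" % (attempt, attempts))
        if attempt < attempts:
            time.sleep(wait_seconds)
    return False
```

Pass the update command from these notes as a list of arguments, one list element per word of the command line.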


python vdb/ -db vdb -v zika --fstem zika --resolve_method choose_genbank


ViPR sequences

  1. Download sequences
    • Select year >= 2013 and genome length >= 5000
    • Download as Genome Fasta
    • Set Custom Format Fields to 0: GenBank Accession, 1: Strain Name, 2: Segment, 3: Date, 4: Host, 5: Country, 6: Subtype, 7: Virus Species
  2. Move downloaded sequences to fauna/data
  3. Upload to vdb database
    • python vdb/ -db vdb -v zika --source genbank --locus genome --fname GenomeFastaResults.fasta
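
Before uploading, it can help to confirm that the downloaded headers follow the Custom Format Fields order from step 1. The sketch below is a standalone check, not the uploader's own parser. The `|` separator, the key names in `VIPR_FIELDS`, and both function names are assumptions made for illustration, so compare them against an actual header from your download.

```python
# Field order copied from the Custom Format Fields setting in step 1.
# The key names are illustrative, not the database's field names.
VIPR_FIELDS = [
    "accession", "strain", "segment", "date",
    "host", "country", "subtype", "virus_species",
]


def parse_vipr_header(header, sep="|"):
    """Split one FASTA header line into a dict keyed by VIPR_FIELDS.

    Assumption: fields are separated by "|".
    """
    parts = [p.strip() for p in header.strip().lstrip(">").split(sep)]
    if len(parts) != len(VIPR_FIELDS):
        raise ValueError(
            "expected %d fields, got %d: %r"
            % (len(VIPR_FIELDS), len(parts), header)
        )
    return dict(zip(VIPR_FIELDS, parts))


def iter_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

For example, opening `GenomeFastaResults.fasta` and looping over `iter_fasta(handle)` gives each record's header and sequence. Calling `parse_vipr_header` on each header raises an error for any record that does not have eight fields, and `len(sequence) >= 5000` rechecks the genome length filter.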

Fred Hutch sequences

Upload with:

python vdb/ -db vdb -v zika --source fh --locus genome --authors "Black et al" --fname zika_usvi_good.fasta --url
python vdb/ -db vdb -v zika --source fh --locus genome --authors "Black et al" --fname zika_usvi_partial.fasta --url