RethinkDB database to support real-time virus analysis

Download data from Genbank

  • Genbank search URL
  • The search term is: measles[title] AND viruses[filter] AND ("5000"[SLEN] : "20000"[SLEN])
  • Send to : Complete Record : File : Accession List
  • This downloads the file sequence.seq
  • Remove the version suffix (.1, .2, etc.) from the accession numbers in sequence.seq. The -i '' form below is for BSD/macOS sed; with GNU sed, use -i alone:

      sed -i '' -e 's/\.[0-9][0-9]*$//' sequence.seq
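
The accession cleanup step above can also be sketched in Python, which avoids differences between sed implementations. This is a minimal sketch, not part of fauna: the function names `strip_version` and `strip_versions_in_file` are hypothetical, and it assumes sequence.seq holds one accession per line. The pattern escapes the dot so that only a literal ".N" suffix is removed and an accession that merely ends in a digit is left alone.

```python
import re
from pathlib import Path

# Matches a trailing ".<digits>" version suffix, e.g. the ".1" in "AB012345.1".
# The dot is escaped so it only matches a literal period.
VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(accession: str) -> str:
    """Return the accession with surrounding whitespace and any version suffix removed."""
    return VERSION_SUFFIX.sub("", accession.strip())


def strip_versions_in_file(path: str) -> None:
    """Rewrite an accession list in place, one cleaned accession per line.

    Assumes one accession per line; blank lines are dropped.
    """
    p = Path(path)
    cleaned = [strip_version(line) for line in p.read_text().splitlines() if line.strip()]
    p.write_text("".join(accession + "\n" for accession in cleaned))
```

Calling `strip_versions_in_file("sequence.seq")` would then play the same role as the sed command. The accession "AB012345.1" in the comment is a made-up example, not one taken from the search results.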
    

Upload to fauna

python3 vdb/measles_upload.py \
  -db vdb \
  -v measles \
  --ftype accession \
  --source genbank \
  --locus genome \
  --fname sequence.seq
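
Before running the upload, it may be worth checking the accession list for repeated entries. The following is an optional sanity check sketched under the assumption that sequence.seq contains one accession per line; `summarize_accessions` is a hypothetical helper and is not part of fauna or of measles_upload.py.

```python
from collections import Counter


def summarize_accessions(lines):
    """Return (unique accessions in first-seen order, sorted accessions listed more than once).

    `lines` is any iterable of strings, such as an open file handle.
    Blank lines are ignored.
    """
    accessions = [line.strip() for line in lines if line.strip()]
    counts = Counter(accessions)
    unique = list(dict.fromkeys(accessions))  # dict keys preserve insertion order
    duplicates = sorted(acc for acc, n in counts.items() if n > 1)
    return unique, duplicates


# Hypothetical usage:
#     with open("sequence.seq") as handle:
#         unique, duplicates = summarize_accessions(handle)
#     print(len(unique), "unique accessions;", len(duplicates), "listed more than once")
```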

Download from fauna

python3 vdb/measles_download.py \
  -db vdb \
  -v measles \
  --fstem measles \
  --resolve_method choose_genbank