EBOLA Pipeline Notes
Upload documents to VDB
- Download Ebola sequence data from https://github.com/ebov/space-time/tree/master/Data
- Move `Makona_1610_genomes_genomes.fasta` sequences to `fauna/data`
- Replace `SLE` with `sierra_leone`, `LBR` with `liberia` and `GIN` with `guinea`
- Replace `sierra_leone\|\?` with `sierra_leone|sierra_leone`, `liberia\|\?` with `liberia|liberia` and `guinea\|\?` with `guinea|guinea`
- Replace `sierra_leone\|\|` with `sierra_leone|sierra_leone|`, `liberia\|\|` with `liberia|liberia|` and `guinea\|\|` with `guinea|guinea|`
- Upload to vdb database
python2 vdb/ebola_upload.py -db vdb -v ebola --source genbank --locus genome --fname Makona_1610_genomes_genbank.fasta
- Hand-edit author and URL info into the other 153 genomes
- Upload to vdb database
python2 vdb/ebola_upload.py -db vdb -v ebola --source genbank --locus genome --fname Makona_1610_genomes_quick.fasta --authors "Quick et al" --url https://github.com/nickloman/ebov/
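
The three find-and-replace steps above can also be scripted rather than done in an editor. The following is a minimal sketch, not part of the pipeline itself: the function names `normalize_header` and `normalize_fasta` are hypothetical, the sample header in the usage note is made up, and restricting the edits to lines starting with `>` is an assumption (the notes do not say whether the replacements were applied to the whole file). It treats the patterns as literal strings, i.e. `\|\?` is read as a literal `|?`. It should run under both Python 2 and Python 3.

```python
# Step 1 mapping: country codes as they appear in the headers -> country names.
COUNTRY_CODES = {"SLE": "sierra_leone", "LBR": "liberia", "GIN": "guinea"}


def normalize_header(header):
    """Apply the three replace steps, in order, to one FASTA header line."""
    # Step 1: SLE -> sierra_leone, LBR -> liberia, GIN -> guinea
    for code, country in COUNTRY_CODES.items():
        header = header.replace(code, country)
    # Step 2: "country|?" -> "country|country"
    for country in COUNTRY_CODES.values():
        header = header.replace(country + "|?", country + "|" + country)
    # Step 3: "country||" -> "country|country|"
    for country in COUNTRY_CODES.values():
        header = header.replace(country + "||", country + "|" + country + "|")
    return header


def normalize_fasta(lines):
    """Rewrite header lines only (assumption); sequence lines pass through."""
    return [normalize_header(line) if line.startswith(">") else line
            for line in lines]
```

For example, `normalize_header(">A|LBR||2014")` returns `">A|liberia|liberia|2014"`. Like the manual steps, this is a plain substring replacement, so a header that happens to contain `SLE`, `LBR` or `GIN` elsewhere would also be rewritten.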
Update
- Update citation fields
python2 vdb/ebola_update.py -db vdb -v ebola --update_citations
- Updates `authors`, `title`, `url`, `journal` and `puburl` fields from genbank files
- If you get `ERROR: Couldn't connect with entrez, please run again`, just run the command again
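
The "just run the command again" advice can be automated with a small retry wrapper. This is a sketch under assumptions: the helper names `run_command` and `run_until_connected` are hypothetical, and it assumes the error message is written to stdout or stderr, which the notes do not confirm. It only detects the failure by looking for the error text in the captured output.

```python
import subprocess
import time

# Error text quoted in the notes; assumed to appear in the command's output.
ENTREZ_ERROR = "ERROR: Couldn't connect with entrez"

# The update command from the notes, split into arguments.
CMD = ["python2", "vdb/ebola_update.py", "-db", "vdb", "-v", "ebola",
       "--update_citations"]


def run_command(cmd=CMD):
    """Run the command and return its combined stdout/stderr as text."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return output.decode("utf-8", "replace")


def run_until_connected(run=run_command, max_attempts=5, delay_seconds=10):
    """Call run() until its output no longer contains the Entrez error.

    Returns the number of attempts used; raises RuntimeError if every
    attempt reported the connection error.
    """
    for attempt in range(1, max_attempts + 1):
        if ENTREZ_ERROR not in run():
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before retrying
    raise RuntimeError("Entrez still unreachable after %d attempts"
                       % max_attempts)
```

Calling `run_until_connected()` from the repository root would rerun the update up to five times. The attempt limit and delay are arbitrary defaults chosen for illustration.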
Download documents from VDB
python2 vdb/ebola_download.py -db vdb -v ebola --fstem ebola --resolve_method choose_genbank