Pipelines to do MinION sequencing of Zika virus

Overlap Graphs

To prepare libraries for this pipeline:

  1. poretools fasta --type 2D <path/to/base/called/reads/> > <name.fasta>
  2. bwa mem -x ont2d <indexed_reference.fasta> <name.fasta> | samtools view -bS - | samtools sort -o <name.sorted.bam> -
  3. samtools depth <name.sorted.bam> > <name.coverage>
    • head <name.coverage> # This finds the name of the 'chromosome'; there may be >1.
  4. awk '$1 == "<chromosomename>" {print $0}' <name.coverage> > chr1.coverage
  5. Repeat steps 1-4 for the paired library
  6. Enter the two resulting coverage files as pool1 and pool2 in depth_coverage.R
  7. Stats were acquired using poretools stats <path/to/base/called/reads/>
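
Steps 3 and 4 can be sketched as two small shell functions. This is a minimal sketch, assuming the usual three-column, tab-separated output of samtools depth (reference name, position, depth). The function names list_chromosomes and extract_chromosome are hypothetical and are not part of this pipeline.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the pipeline itself.
# Assumes tab-separated input with columns: reference name, position, depth.

# Print every distinct reference ('chromosome') name in a coverage file.
# Unlike head, this scans the whole file, so additional names are not missed.
list_chromosomes() {
    cut -f1 "$1" | sort -u
}

# Print only the rows of a coverage file that belong to one reference name.
# $1 = coverage file, $2 = reference name
extract_chromosome() {
    awk -F '\t' -v chr="$2" '$1 == chr' "$1"
}
```

For example, list_chromosomes <name.coverage> lists the names, and extract_chromosome <name.coverage> "<chromosomename>" > chr1.coverage writes the same rows as the awk command in step 4.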

The reported p20 and p40 values are the fraction of the Zika genome covered by at least 20 and at least 40 reads, respectively.
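
A value of this kind can be computed directly from a single-chromosome coverage file. The following is a minimal sketch and not necessarily how depth_coverage.R does it: the function name coverage_fraction is hypothetical, and the genome length has to be supplied by the caller, because samtools depth omits zero-depth positions unless run with -a.

```shell
#!/bin/sh
# Hypothetical helper for illustration; depth_coverage.R may compute this differently.
# Prints the fraction of the genome with depth >= a threshold.
# $1 = coverage file (tab-separated: reference name, position, depth)
# $2 = minimum depth (e.g. 20 or 40)
# $3 = genome length in bases (assumed to be known and supplied by the caller)
coverage_fraction() {
    awk -F '\t' -v min="$2" -v len="$3" '
        $3 >= min { n++ }                      # count positions at or above the threshold
        END { if (len > 0) printf "%.4f\n", n / len }
    ' "$1"
}
```

For example, coverage_fraction chr1.coverage 20 <genome_length> gives a p20-style value, and the same call with 40 gives a p40-style value.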

Overlap graphs and stats on pass libraries

NB01-NB07 Overlap

p20: 0.9702 p40: 0.9671

statistic           NB01       NB07
total reads         9729       7530
total base pairs    5225355    3992373
mean                537.09     530.20
median              547        550
min                 213        201
max                 1781       1827
N25                 580        577
N50                 552        555
N75                 520        508

NB02-NB08 Overlap

p20: 0.8412 p40: 0.7145

statistic           NB02       NB08
total reads         8277       6462
total base pairs    4402300    3463771
mean                531.87     536.02
median              541        554
min                 176        213
max                 1721       1309
N25                 570        578
N50                 544        558
N75                 518        523

NB03-NB09 Overlap

p20: 0.6571 p40: 0.5022

statistic           NB03       NB09
total reads         3618       4845
total base pairs    1888020    2638147
mean                521.84     544.51
median              531        557
min                 265        277
max                 1662       1696
N25                 562        579
N50                 535        561
N75                 501        531

NB04-NB10 Overlap

p20: 0.9384 p40: 0.9087

statistic           NB04       NB10
total reads         12381      7224
total base pairs    6450344    3853127
mean                520.99     533.38
median              538        550
min                 192        207
max                 1939       2260
N25                 575        577
N50                 544        555
N75                 498        514

NB05-NB11 Overlap

p20: 0.9066 p40: 0.8737

statistic           NB05       NB11
total reads         13479      9558
total base pairs    7376656    5043864
mean                547.27     527.71
median              551        549
min                 183        168
max                 1846       1874
N25                 578        578
N50                 554        555
N75                 530        504

NB06-NB12 Overlap

p20: 0 p40: 0

statistic           NB06       NB12
total reads         81         33
total base pairs    28077      16649
mean                346.63     504.52
median              323        517
min                 145        322
max                 600        596
N25                 518        574
N50                 341        546
N75                 302        476