Data schema
Nanopore reads
Input data to the Zika pipeline arrives in the data/ directory.
- data/
  - usvi-library1-2016-12-10/ - library
    - raw_reads/ - squiggle graphs in fast5 format; all subdirectories are written automatically by MinKNOW
      - 0/ - contains ~4000 raw .fast5 files
      - 1/ - contains ~4000 raw .fast5 files
      - etc …
    - basecalled_reads/ - basecalled with Albacore 1.0.4; all subdirectories are written automatically by Albacore
      - sequencing_summary.txt - summary of Albacore run; automatically made by Albacore
      - pipeline.log - summary of Albacore run; automatically made by Albacore
      - workspace/ - contains all basecalled, demultiplexed reads
        - barcode01/ - basecalled, demultiplexed reads with ONT barcode NB01
          - 0/ - contains ~4000 basecalled, demultiplexed .fast5 files
          - 1/ - contains ~4000 basecalled, demultiplexed .fast5 files
          - etc …
        - barcode02/ - basecalled, demultiplexed reads with ONT barcode NB02
          - 0/ - contains ~4000 basecalled, demultiplexed .fast5 files
          - 1/ - contains ~4000 basecalled, demultiplexed .fast5 files
          - etc …
        - …
        - barcode12/ - basecalled, demultiplexed reads with ONT barcode NB12
          - 0/ - contains ~4000 basecalled, demultiplexed .fast5 files
          - 1/ - contains ~4000 basecalled, demultiplexed .fast5 files
          - etc …
        - unclassified/ - basecalled, non-demultiplexed reads
          - 0/ - contains ~4000 basecalled, non-demultiplexed .fast5 files
          - 1/ - contains ~4000 basecalled, non-demultiplexed .fast5 files
          - etc …
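
To make the layout concrete, here is a minimal sketch of how a script might count the basecalled .fast5 files in each barcode bin. The function name `count_fast5_by_bin` is hypothetical, and the sketch assumes that reads sit exactly two levels below `basecalled_reads/workspace` (bin directory, then numbered batch directory), as described above. It is not taken from the pipeline's own code.

```python
from collections import Counter
from pathlib import Path


def count_fast5_by_bin(library_dir):
    """Count .fast5 files per bin (barcode01 ... barcode12, unclassified).

    Assumed layout (illustrative, based on the schema described above):
    <library_dir>/basecalled_reads/workspace/<bin>/<batch>/<read>.fast5
    """
    workspace = Path(library_dir) / "basecalled_reads" / "workspace"
    counts = Counter()
    # Each read sits two levels below workspace: <bin>/<batch>/<read>.fast5
    for fast5 in workspace.glob("*/*/*.fast5"):
        bin_name = fast5.parent.parent.name
        counts[bin_name] += 1
    return dict(counts)
```

Calling `count_fast5_by_bin("data/usvi-library1-2016-12-10")` would return a dictionary mapping each bin name to its file count. A missing `workspace` directory simply yields an empty dictionary.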
Sample metadata
Sample metadata for the Zika pipeline arrives in the samples/ directory.
- samples/
  - samples.tsv - line list of sample metadata
  - runs.tsv - line list of run metadata
samples.tsv
Must be tab-separated (TSV). Columns are looked up by header name rather than by position, so column order does not matter.
| sample_id | strain | collection_date | country | division | location |
|---|---|---|---|---|---|
| ZBRD116 | ZBRD116 | 2015-08-28 | brazil | alagoas | arapiraca |
| ZBRC301 | ZBRC301 | 2015-05-13 | brazil | pernambuco | paulista |
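
As an illustration of reading a file by header name rather than by column position, the following sketch loads a TSV with Python's standard `csv.DictReader`. The function name `read_tsv_by_header`, the constant `SAMPLE_COLUMNS`, and the choice to reject files that lack any of the six columns shown above are assumptions made for this example, not documented behavior of the pipeline.

```python
import csv

# Columns shown in the samples.tsv example above; treating all six as
# required is an assumption made for this sketch.
SAMPLE_COLUMNS = (
    "sample_id", "strain", "collection_date",
    "country", "division", "location",
)


def read_tsv_by_header(path, required_columns):
    """Read a TSV into a list of dicts keyed by column header.

    Column order in the file is irrelevant; only the header names matter.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        missing = set(required_columns) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"{path} is missing columns: {sorted(missing)}")
        return list(reader)
```

With this approach, `read_tsv_by_header("samples/samples.tsv", SAMPLE_COLUMNS)` returns one dictionary per sample, and a file whose columns appear in a different order is read identically.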
runs.tsv
Must be tab-separated (TSV). Columns are looked up by header name rather than by position, so column order does not matter.
| run_name | barcode_id | sample_id | primer_scheme |
|---|---|---|---|
| library1 | NB01 | ZBRD116 | v2_500.amplicons.ver2 |
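
The two tables are linked by `sample_id`, and `barcode_id` values such as NB01 correspond to demultiplexed directories such as barcode01. The sketch below shows one way these relationships could be resolved once both files have been parsed into lists of dictionaries. The helper names `barcode_dir_name` and `join_runs_to_samples`, the added `barcode_dir` key, and the decision to raise an error on an unknown `sample_id` are all assumptions for illustration.

```python
def barcode_dir_name(barcode_id):
    """Map an ONT barcode ID such as 'NB01' to its directory name 'barcode01'."""
    if not (barcode_id.startswith("NB") and barcode_id[2:].isdigit()):
        raise ValueError(f"unexpected barcode_id: {barcode_id!r}")
    return "barcode" + barcode_id[2:]


def join_runs_to_samples(runs, samples):
    """Attach sample metadata to each run row, matching on sample_id.

    Both arguments are lists of dicts keyed by column header.
    """
    samples_by_id = {row["sample_id"]: row for row in samples}
    joined = []
    for run in runs:
        sample = samples_by_id.get(run["sample_id"])
        if sample is None:
            raise KeyError(f"unknown sample_id in runs: {run['sample_id']!r}")
        # Merge sample and run fields, then add the derived directory name.
        merged = {**sample, **run}
        merged["barcode_dir"] = barcode_dir_name(run["barcode_id"])
        joined.append(merged)
    return joined
```

For the example rows shown above, the joined record for run library1 would carry the metadata for ZBRD116 along with a `barcode_dir` of barcode01.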