#Analysis of Illumina Data

###Data Volume

Create a named data volume that makes the local .fastq.gz files in data/ available within the container:

    docker create --name illumina-data -v /path/to/local/illumina/data/data:/illumina_data zibra/zibra

Modify align_reads/run_pipeline.sh and get_coverage/run_pipeline.sh to execute snakemake either on a single machine or through a scheduling system on a cluster. By default, each script executes the pipeline on the local machine.
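As a rough sketch of what the two modes might look like, the helper below prints a snakemake command for either local or cluster execution. The function name build_snakemake_cmd, its mode and jobs arguments, and the sbatch submission command (which assumes a SLURM scheduler) are all assumptions made for illustration; the actual run_pipeline.sh scripts may be organised differently. The --cores, --jobs and --cluster options exist in Snakemake releases before version 8; newer releases replace --cluster with executor plugins.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): prints the snakemake
# command for the chosen mode.
#   $1  mode: "local" (default) or "cluster"
#   $2  number of cores (local) or concurrent jobs (cluster), default 1
build_snakemake_cmd() {
    mode="${1:-local}"
    jobs="${2:-1}"
    case "$mode" in
        local)
            # Run every rule on this machine, using up to $jobs cores.
            echo "snakemake --cores $jobs"
            ;;
        cluster)
            # Submit each rule as a separate job; sbatch assumes SLURM.
            echo "snakemake --jobs $jobs --cluster sbatch"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}
```

For example, build_snakemake_cmd cluster 20 prints a command that keeps at most 20 jobs submitted at a time. Printing the command rather than running it makes it easy to inspect before committing to a long run.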

Run ./run_pipeline.sh in align_reads/ to create aligned bam files. By default, output files are written to a folder in /build/illumina_analysis/ named with the run's timestamp. To generate alignment statistics for the merged bam files, first modify the src parameter in get_coverage/Snakefile to point to the new output folder, then run ./run_pipeline.sh in the get_coverage/ directory. By default, the statistics file is written to a _pileup/ directory in _aligned_bams/.
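Because the output folder name changes with every run, it can help to look up the newest one before editing the Snakefile. The following is a minimal sketch; the function name latest_run_dir is hypothetical, and it assumes the most recently modified subdirectory of /build/illumina_analysis/ is the run you want and that folder names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): prints the most
# recently modified subdirectory of the given base directory.
#   $1  base directory, default /build/illumina_analysis
latest_run_dir() {
    base="${1:-/build/illumina_analysis}"
    # -d lists the directories themselves, -t sorts newest first;
    # prints nothing if the base directory has no subdirectories.
    ls -1dt "$base"/*/ 2>/dev/null | head -n 1
}
```

Running latest_run_dir with no arguments prints the path (with a trailing slash) to use as the src value in get_coverage/Snakefile.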

Example directory structure of the output folder:

    .
    |-- _aligned_bams
    |   |-- Sample1.aligned.sorted.bam
    |   |-- Sample1.trimmed.aligned.sorted.bam
    |   |-- Sample2.aligned.sorted.bam
    |   |-- Sample2.trimmed.aligned.sorted.bam
    |   `-- _pileup
    |       |-- Sample1.aligned.sorted.tsv
    |       |-- Sample1.trimmed.aligned.sorted.tsv
    |       |-- Sample2.aligned.sorted.tsv
    |       |-- Sample2.trimmed.aligned.sorted.tsv
    |       `-- statistics.md
    |-- _reads
    |   |-- Sample1_R1.fastq
    |   |-- Sample1_R2.fastq
    |   |-- Sample2_R1.fastq
    |   `-- Sample2_R2.fastq
    `-- _reports
        |-- Sample1.alignreport.txt
        `-- Sample2.alignreport.txt