User Guide
Learn how to use ZaroPGx to submit a sample for processing and receive insightful reports.
Last revised 2025-10-06
Web Interface
The Main Dashboard provides:
File Upload: Drag and drop or click to upload genomic files: single datafile is fine unless you have a raw FASTQ; in that case please upload both paired reads. If you have an existing index file, you may upload it as well, though it may not be necessary as a new one may be generated anyhow at some point throughout the pipeline. You may also enter an identifier for the sample.
System Status: Monitor service health visually by observing the service glyphs, progress bar, and processing log.
Quick Actions: Common tasks and shortcuts: you can check the header of a sample without running the pipeline. You can cancel a running pipeline cleanly. While uploading a sample, a cancel button is not provided, you may simply press the home button to reset the display. Service glyphs may be interacted with to toggle enable/disable override certain stages in the pipeline.
Report Screen: Upon completion of a workflow, a report screen will appear, offering PharmCAT and custom ZaroPGx reports.
Uploading Supported File Types
Format |
Extension |
Description |
Processing |
|---|---|---|---|
VCF |
|
Variant calls |
Direct analysis |
BAM |
|
Aligned reads |
HLA typing → Analysis |
CRAM |
|
Compressed BAM |
GATK → HLA typing → Analysis |
SAM |
|
Text alignment |
GATK → HLA typing → Analysis |
FASTQ |
|
Raw sequences |
HLA typing → GATK → Analysis |
Upload Process
Select Files: Choose one or more genomic files
Configure Options:
Sample Identifier: Optional patient/sample name
Reference Genome: hg38 (default) or hg19 (coming in 0.3)
Processing Options: Enable/disable specific tools
Start Analysis: Click “Upload and Analyze”
Upload Options
Reference Genome Selection:
hg38/GRCh38: Recommended
hg19/GRCh37: Supported with automatic bcftools liftover (coming in 0.3)
T2T: Not yet supported
Processing Toggles:
GATK Processing: Enable/Disable use of GATK Tools including conversion and variant calling
HLA Typing: Enable/Disable HLA allele calling
PyPGx Analysis: Enable/Disable PyPGx’s comprehensive star allele calling
Report Generation: Enable/Disable custom ZaroPGx PDF and HTML reports
Analysis Workflow
Processing Stages
File Validation: Verify file format and integrity
Header Analysis: Extract metadata and contig information
Preprocessing: Convert files to VCF format if needed
Allele Calling:
HLA typing (if enabled); computationally intensive
PyPGx analysis (if enabled, recommended)
PharmCAT analysis (required)
Report Generation: Create reports
Data Export: Optional FHIR export (first XML, coming in v0.3)
Monitoring Progress
Real-time Updates:
Progress percentage (estimated)
Current processing stage (live logs)
Detailed Logs:
Use
docker compose logs -fSee the /data directory for detailed service and Nextflow logs
Container-specific logs can also be accessed via
docker compose logs -f {container-name}
Nextflow logs will show:
Error messages and warnings
Processing statistics
Reports
Custom PDF Report
Executive Summary: Key findings and recommendations
Gene Analysis: Detailed pharmacogene results
Drug Analysis: Detailed overview of identified drugs
Clinical Guidelines: CPIC, DPWG, and FDA-based recommendations
Technical Details: Methodology, parameters, sampled header, etc.
Interactive HTML Report
Detailed Annotations: Gene-specific information
Everything in PDF Report: And more
Export Options: Download data in various formats (FHIR in XML coming in 0.3)
Interactive Tables: Sortable, filterable results (coming soon)
Visualizations: Charts and diagrams (coming soon)
Raw Data Files
PharmCAT HTML: Original PharmCAT report
PharmCAT JSON: Machine-readable results
PharmCAT TSV: Tab-separated data
VCF Files: Processed variant calls, if you enable intermediate file retaining
Understanding Results
Star Allele Notation
Format:
*1/*2(diplotype) or*1(haplotype), or*3+*15or similar (atypical cases)Interpretation:
*1: Reference allele. Typically synonymous with wild (pheno)type.*2,*3, etc.: Variant alleles*N: Novel or undefined alleles
Phenotype Categories
Normal Metabolizer: Typical drug processing
Intermediate Metabolizer: Reduced drug processing
Poor Metabolizer: Significantly reduced processing
Rapid Metabolizer: Increased drug processing
Ultrarapid Metabolizer: Very high drug processing
API Usage (NEEDS REVIEW)
API Reference: See https://pgx.zimerguz.net/api-reference for the standard ZaroPGx implementation’s public API reference
REST API Endpoints
Upload Genomic Data
curl -X POST \
-F "file=@sample.vcf" \
-F "sample_identifier=patient_001" \
-F "reference_genome=hg38" \
http://localhost:8765/upload/genomic-data
Check Analysis Status
curl http://localhost:8765/status/{job_id}
Get Report URLs
curl http://localhost:8765/reports/{job_id}
Download Reports
curl -O http://localhost:8765/reports/{patient_id}/{report_file}
API Response Format
{
"job_id": "uuid-string",
"status": "completed",
"progress": 100,
"pdf_report_url": "/reports/patient_id/report.pdf",
"html_report_url": "/reports/patient_id/report.html",
"diplotypes": {
"CYP2D6": "*1/*2",
"CYP2C19": "*1/*1"
},
"recommendations": [
{
"gene": "CYP2D6",
"recommendation": "Consider alternative dosing",
"severity": "yellow"
}
]
}
Data Management
File Organization
Upload Directory: /data/uploads/
Original uploaded files
Temporary processing files
Index files (.bai, .crai, .csi, .tbi)
Reports Directory: /data/reports/{patient_id}/
Generated reports (PDF, HTML)
Raw analysis outputs
Intermediate processing files
Reference Directory: /reference/
Reference genome files
Annotation databases
Tool-specific references
Data Retention
(NOTE: The ZaroPGx Demo Reference server at pgx.zimerguz.net is for DEMO purposes only! Do not upload your sensitive data)
Uploaded Files: Retained indefinitely (configurable)
Processing Logs: Retained for 30 days (configurable)
Reports: Retained indefinitely (configurable)
Temporary Files: Cleaned up after processing
Data Export
FHIR Export
curl -X POST \
http://localhost:8765/reports/{report_id}/export-to-fhir
Bulk Export
# Export all reports for a patient
curl http://localhost:8765/patients/{patient_id}/export
Best Practices
File Preparation
Choose the highest fidelity genomic datafile for submission: Computing resources aside, make sure you choose the best file out of the files available for a given sample to upload.
Include Index Files: Provide the accompanying index file (.bai, .crai, .csi, .tbi,) if available
Check Quality: Verify file integrity before upload
Analysis Configuration
Enable Relevant Tools: Only enable tools your device can afford to run (the program will attempt to match your hardware, but if memory or storage runs out, it may hang or crash)
Monitor Resources: Watch CPU and memory usage during processing
Review Logs: Check docker compose container logs and nextflow logs for warnings or errors
Result Interpretation
Understand Limitations: Be aware of tool-specific limitations, especially if a VCF sample was submitted
Review Quality Metrics: Check confidence scores and coverage (see the header matter)
Consider Broader Context: Review the findings in a broad context
Validate Findings: Follow up with a qualified professional and when applicable, an accredited laboratory
Troubleshooting
Upload Failures:
Check file format and size
Verify network connectivity
Review server logs
Processing Errors:
Check file quality and format
Verify reference genome availability
Review container logs
Report Generation Issues:
Check drive space availability and permissions
Verify that all software dependencies are properly configured
Review report generation logs
Getting Help
Check Logs: Review container and application logs
Documentation: Consult this guide
Community: Check discussions on GitHub, or start a new thread
Issues: Report bugs, request features, and suggest changes on GitHub
Next Steps
Learn about file formats: Supported File Formats [NEEDS CURATION]
Understand reports: Understanding Reports
Configure advanced settings: Advanced Configuration
Troubleshoot issues: Troubleshooting Guide