---
title: User Guide
---

# User Guide
Learn how to use ZaroPGx to submit a sample for processing and receive insightful reports.
- *Last revised 2025-10-06*

## Web Interface
- The **Main Dashboard** provides:

- **File Upload**: Drag and drop or click to upload genomic files: single datafile is fine unless you have a raw FASTQ; in that case please upload both paired reads. If you have an existing index file, you may upload it as well, though it may not be necessary as a new one may be generated anyhow at some point throughout the pipeline. You may also enter an identifier for the sample.
- **System Status**: Monitor service health visually by observing the service glyphs, progress bar, and processing log.
- **Quick Actions**: Common tasks and shortcuts: you can check the header of a sample without running the pipeline. You can cancel a running pipeline cleanly. While uploading a sample, a cancel button is not provided, you may simply press the home button to reset the display. Service glyphs may be interacted with to toggle enable/disable override certain stages in the pipeline. 
- **Report Screen**: Upon completion of a workflow, a report screen will appear, offering PharmCAT and custom ZaroPGx reports. 

### Uploading Supported File Types

| Format | Extension | Description | Processing |
| --- | --- | --- | --- |
| **VCF** | `.vcf`, `.vcf.gz` | Variant calls | Direct analysis |
| **BAM** | `.bam` | Aligned reads | HLA typing → Analysis |
| **CRAM** | `.cram` | Compressed BAM | GATK → HLA typing → Analysis |
| **SAM** | `.sam` | Text alignment | GATK → HLA typing → Analysis |
| **FASTQ** | `.fastq`, `.fastq.gz` | Raw sequences | HLA typing → GATK → Analysis |

#### Upload Process

1. **Select Files**: Choose one or more genomic files
2. **Configure Options**:
   - **Sample Identifier**: Optional patient/sample name
   - **Reference Genome**: hg38 (default) or hg19 (coming in 0.3)
   - **Processing Options**: Enable/disable specific tools
3. **Start Analysis**: Click "Upload and Analyze"

#### Upload Options

**Reference Genome Selection:**
- **hg38/GRCh38**: Recommended
- **hg19/GRCh37**: Supported with automatic bcftools liftover (coming in 0.3)
- **T2T**: Not yet supported

**Processing Toggles:**
- **GATK Processing**: Enable/Disable use of GATK Tools including conversion and variant calling
- **HLA Typing**: Enable/Disable HLA allele calling
- **PyPGx Analysis**: Enable/Disable PyPGx's comprehensive star allele calling
- **Report Generation**: Enable/Disable custom ZaroPGx PDF and HTML reports

## Analysis Workflow

### Processing Stages

1. **File Validation**: Verify file format and integrity
2. **Header Analysis**: Extract metadata and contig information
3. **Preprocessing**: Convert files to VCF format if needed
4. **Allele Calling**: 
   - HLA typing (if enabled); computationally intensive
   - PyPGx analysis (if enabled, recommended)
   - PharmCAT analysis (required)
5. **Report Generation**: Create reports
6. **Data Export**: Optional FHIR export (first XML, coming in v0.3)

### Monitoring Progress

**Real-time Updates:**
- Progress percentage (estimated)
- Current processing stage (live logs)

**Detailed Logs:**
- Use `docker compose logs -f`
- See the /data directory for detailed service and Nextflow logs
  - Container-specific logs can also be accessed via `docker compose logs -f {container-name}`
- Nextflow logs will show:
  - Error messages and warnings
  - Processing statistics

## Reports

#### Custom PDF Report
- **Executive Summary**: Key findings and recommendations
- **Gene Analysis**: Detailed pharmacogene results
- **Drug Analysis**: Detailed overview of identified drugs
- **Clinical Guidelines**: CPIC, DPWG, and FDA-based recommendations
- **Technical Details**: Methodology, parameters, sampled header, etc.

#### Interactive HTML Report
- **Detailed Annotations**: Gene-specific information
- **Everything in PDF Report**: And more
- **Export Options**: Download data in various formats (FHIR in XML coming in 0.3)
- **Interactive Tables**: Sortable, filterable results (coming soon)
- **Visualizations**: Charts and diagrams (coming soon)

#### Raw Data Files
- **PharmCAT HTML**: Original PharmCAT report
- **PharmCAT JSON**: Machine-readable results
- **PharmCAT TSV**: Tab-separated data
- **VCF Files**: Processed variant calls, if you enable intermediate file retaining

### Understanding Results

#### Star Allele Notation
- **Format**: `*1/*2` (diplotype) or `*1` (haplotype), or `*3+*15` or similar (atypical cases)
- **Interpretation**: 
  - `*1`: Reference allele. Typically synonymous with wild (pheno)type.
  - `*2`, `*3`, etc.: Variant alleles
  - `*N`: Novel or undefined alleles

#### Phenotype Categories
- **Normal Metabolizer**: Typical drug processing
- **Intermediate Metabolizer**: Reduced drug processing
- **Poor Metabolizer**: Significantly reduced processing
- **Rapid Metabolizer**: Increased drug processing
- **Ultrarapid Metabolizer**: Very high drug processing

## API Usage (NEEDS REVIEW)
- **API Reference**: See https://pgx.zimerguz.net/api-reference for the standard ZaroPGx implementation's public API reference 

### REST API Endpoints

#### Upload Genomic Data
```bash
curl -X POST \
  -F "file=@sample.vcf" \
  -F "sample_identifier=patient_001" \
  -F "reference_genome=hg38" \
  http://localhost:8765/upload/genomic-data
```

#### Check Analysis Status
```bash
curl http://localhost:8765/status/{job_id}
```

#### Get Report URLs
```bash
curl http://localhost:8765/reports/{job_id}
```

#### Download Reports
```bash
curl -O http://localhost:8765/reports/{patient_id}/{report_file}
```

### API Response Format

```json
{
  "job_id": "uuid-string",
  "status": "completed",
  "progress": 100,
  "pdf_report_url": "/reports/patient_id/report.pdf",
  "html_report_url": "/reports/patient_id/report.html",
  "diplotypes": {
    "CYP2D6": "*1/*2",
    "CYP2C19": "*1/*1"
  },
  "recommendations": [
    {
      "gene": "CYP2D6",
      "recommendation": "Consider alternative dosing",
      "severity": "yellow"
    }
  ]
}
```

## Data Management

### File Organization

**Upload Directory**: `/data/uploads/`
- Original uploaded files
- Temporary processing files
- Index files (.bai, .crai, .csi, .tbi)

**Reports Directory**: `/data/reports/{patient_id}/`
- Generated reports (PDF, HTML)
- Raw analysis outputs
- Intermediate processing files

**Reference Directory**: `/reference/`
- Reference genome files
- Annotation databases
- Tool-specific references

### Data Retention
# (NOTE: The ZaroPGx Demo Reference server at pgx.zimerguz.net is for DEMO purposes only! Do not upload your sensitive data)

- **Uploaded Files**: Retained indefinitely (configurable)
- **Processing Logs**: Retained for 30 days (configurable)
- **Reports**: Retained indefinitely (configurable)
- **Temporary Files**: Cleaned up after processing

### Data Export

#### FHIR Export
```bash
curl -X POST \
  http://localhost:8765/reports/{report_id}/export-to-fhir
```

#### Bulk Export
```bash
# Export all reports for a patient
curl http://localhost:8765/patients/{patient_id}/export
```

## Best Practices

### File Preparation
1. **Choose the highest fidelity genomic datafile for submission**: Computing resources aside, make sure you choose the best file out of the files available for a given sample to upload.
2. **Include Index Files**: Provide the accompanying index file (.bai, .crai, .csi, .tbi,) if available
3. **Check Quality**: Verify file integrity before upload

### Analysis Configuration
1. **Enable Relevant Tools**: Only enable tools your device can afford to run (the program will attempt to match your hardware, but if memory or storage runs out, it may hang or crash)
2. **Monitor Resources**: Watch CPU and memory usage during processing
3. **Review Logs**: Check docker compose container logs and nextflow logs for warnings or errors

### Result Interpretation
1. **Understand Limitations**: Be aware of tool-specific limitations, especially if a VCF sample was submitted
2. **Review Quality Metrics**: Check confidence scores and coverage (see the header matter)
3. **Consider Broader Context**: Review the findings in a broad context
4. **Validate Findings**: Follow up with a qualified professional and when applicable, an accredited laboratory

## Troubleshooting

**Upload Failures:**
- Check file format and size
- Verify network connectivity
- Review server logs

**Processing Errors:**
- Check file quality and format
- Verify reference genome availability
- Review container logs

**Report Generation Issues:**
- Check drive space availability and permissions
- Verify that all software dependencies are properly configured
- Review report generation logs

### Getting Help
1. **Check Logs**: Review container and application logs
2. **Documentation**: Consult this guide
3. **Community**: Check discussions on GitHub, or start a new thread
4. **Issues**: Report bugs, request features, and suggest changes on GitHub

## Next Steps
- **Learn about file formats**: {doc}`file-formats`
- **Understand reports**: {doc}`reports`
- **Configure advanced settings**: {doc}`../advanced-configuration`
- **Troubleshoot issues**: {doc}`troubleshooting`