This page was composed with the aid of generative artificial intelligence; it is fully curated.

Input Format Support Priorities

Pipeline Function

Add GRCh37 support via bcftools liftover; surface accuracy caveats after conversion
Clarify workflow vs job IDs; define single source for workflow definition and per-run job state
Represent workflows as a finite state matrix; assign each unique, deterministic workflow an ID that can be quickly spot-checked
Nextflow orchestration
- Dynamic resource allocation (CPU/memory by attempt, file type+size, etc.)
- Track active tool/stage; reflect in UI icons and progress
Improve progress calculation by normalizing step/substep points to 100%
Accept uploads by URL (streamed) and multi-file selects (main + index) with proper pairing
Recognize and/or regenerate index files as needed; map unaligned reads to the appropriate reference (currently GRCh38.p14)
Consider preprocessing to complement PyPGx-led VCF generation (evaluate necessity)
Add mtdna-server-2
Finish wiring in ZaroHLA
Improve analysis by making better use of samtools and bcftools
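
A minimal sketch of the progress normalization item above, assuming each step declares a fixed number of substep points. The step names and point values are hypothetical and only illustrate the arithmetic; the real pipeline may weight steps differently.

```python
def normalize_progress(steps: dict, completed: dict) -> float:
    """Return overall progress as a percentage between 0 and 100.

    steps:     hypothetical mapping of step name -> total substep points
    completed: mapping of step name -> points finished so far
    """
    total = sum(steps.values())
    if total == 0:
        return 0.0
    # Clamp each step to [0, its own total] so a step that over-reports
    # (or reports a negative value) cannot push the bar outside 0-100.
    done = sum(
        max(0, min(completed.get(name, 0), points))
        for name, points in steps.items()
    )
    return round(100.0 * done / total, 1)


# Hypothetical workflow: 2 + 5 + 3 = 10 points in total.
steps = {"liftover": 2, "pypgx": 5, "pharmcat": 3}
print(normalize_progress(steps, {"liftover": 2, "pypgx": 1}))
```

Because the denominator is the sum over whichever steps a given workflow contains, the same function works whether or not a run includes optional stages such as liftover.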

Implement a translation layer (lexicon) that maps calls from outside tools to recognized nomenclature
Implement an optional, intelligent switch that toggles assume-reference behavior for missing calls
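
One possible shape for the lexicon is sketched below, assuming it is a plain lookup from normalized spellings to one canonical name. The entries, the normalization rule, and the `translate` function are hypothetical illustrations, not output copied from any caller.

```python
import re
from typing import Optional

# Hypothetical lexicon: canonical name -> alternative spellings.
# The entries are illustrative only.
LEXICON = {
    "CYP2D6*4": ["CYP2D6 *4", "cyp2d6*4", "CYP2D6_*4"],
    "HLA-B*57:01": ["HLA-B 57:01", "hla-b*5701", "B*57:01"],
}


def _key(call: str) -> str:
    """Lowercase the call and drop everything except letters, digits and '*'."""
    return re.sub(r"[^a-z0-9*]", "", call.lower())


# Build the reverse index once: normalized spelling -> canonical name.
_INDEX = {}
for canonical, aliases in LEXICON.items():
    for spelling in [canonical, *aliases]:
        _INDEX[_key(spelling)] = canonical


def translate(call: str) -> Optional[str]:
    """Return the canonical name for a call, or None if it is not in the lexicon."""
    return _INDEX.get(_key(call))


print(translate("cyp2d6 *4"))
print(translate("CYP2C19*2"))
```

Returning `None` for unknown spellings, rather than guessing, lets the caller flag the call for review.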

Unified report generation combining PharmCAT clinical recommendations with PyPGx gene coverage
Add demographics mini-section: mitochondrial lineage/haplogroup and variant rarity context
Standardize folder naming of generated reports (timestamp-based) and place logs under data/logs/
Display the workflow-ID-specific Kroki/Mermaid workflow diagram in both HTML and PDF outputs
Add clear wording: sample vs patient terminology; avoid assumptions of medical context
Abstract report theme so cross-pipeline outputs remain stylistically consistent
Custom reports: add a QR code containing the raw data
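
A small sketch of the timestamp-based folder naming item above. The `reports` base directory, the UTC `YYYYMMDDTHHMMSSZ` stamp, and the sanitizing rule are assumptions made for illustration; only the `data/logs/` location comes from the list itself.

```python
import re
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def report_dir_name(sample_id: str, when: Optional[datetime] = None) -> str:
    """Build a sortable folder name such as 20250102T030405Z_sample_01."""
    # Expects a timezone-aware datetime; defaults to the current UTC time.
    when = when or datetime.now(timezone.utc)
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # Replace anything outside a conservative character set with '_'.
    safe_id = re.sub(r"[^A-Za-z0-9._-]+", "_", sample_id)
    return f"{stamp}_{safe_id}"


def report_paths(base: Path, sample_id: str, when: Optional[datetime] = None):
    """Return (report folder, log file) paths without creating anything on disk."""
    name = report_dir_name(sample_id, when)
    return base / name, Path("data/logs") / f"{name}.log"


run_time = datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
folder, log_file = report_paths(Path("reports"), "sample 01", run_time)
print(folder)
print(log_file)
```

Putting the timestamp first means a plain alphabetical listing of the reports directory is also chronological.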

Responsive glyphs: wrapping on small screens; grey-out non-applicable steps; size/flex adjustments
Add a preprocessing glyph (e.g., Liftover) where applicable, plus an mtDNA glyph
Unify/clean redundant text

PostgreSQL 17: add extensions; implement schemas
Adopt JSONB where appropriate; ensure escaping for special characters (done)
Begin persisting normalized results; build a lexicon layer that translates between caller-specific spellings
Consolidate reference and sample material (FASTA/CPIC dumps) into a single references/ area

Ensure self-hosted deployments never transmit genomic data externally
Add cookie/consent footer for public deployments with per-user access gating (configurable via .env)
Add Privacy Policy and legal page

Clean compose stack; prefer compose.yml naming and remove legacy docker-compose.yml if redundant
Implement a CI/CD GitHub Action for Docker Hub image builds
Clean up deprecated flags

Achieve complete docs curation
Provide example .env guidance; clarify build/run expectations for local Docker

Modularize large Python modules into smaller, focused files to improve readability and maintainability

Where should indexing responsibility live (always regenerate vs recognize existing)?
How to unify pipeline progress across heterogeneous inputs (FASTQ/BAM/VCF)?
Which schema should ultimately be implemented?
Visualizations: what would be useful?
Should we integrate ClinPGx datasets directly for annotations, instead of (or alongside) a lexicon layer?