Contributing Guide
How to contribute to ZaroPGx development.
Getting Started
Prerequisites
Git
Docker and Docker Compose
Python 3.12+
Basic understanding of pharmacogenomics
Familiarity with FastAPI and SQLAlchemy
Development Setup
Fork the repository on GitHub
Clone your fork:
git clone https://github.com/your-username/ZaroPGx.git
cd ZaroPGx
Set up development environment:
cp .env.local .env
docker compose up -d --build
Verify setup:
curl http://localhost:8765/health
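Containers can take a while to start, so a single curl call may fail even when the setup is fine. The following is a minimal sketch of a polling check, assuming the health endpoint answers with HTTP 200 once the service is ready. The function names, retry count, and delay are illustrative and not part of ZaroPGx.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def wait_for_health(
    url: str = "http://localhost:8765/health",
    attempts: int = 10,
    delay_seconds: float = 3.0,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Poll `url` until it answers with HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except (urllib.error.URLError, OSError):
            # The service is probably still starting; retry after a pause.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Calling wait_for_health() with its defaults polls the URL used in the curl command above. The fetch parameter exists only so the function can be exercised without a running service.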
Contribution Types
Bug Reports
Before reporting a bug:
Check existing issues
Search closed issues
Test with latest version
Gather relevant information
Bug report template:
## Bug Description
Brief description of the bug
## Steps to Reproduce
1. Step one
2. Step two
3. Step three
## Expected Behavior
What should happen
## Actual Behavior
What actually happens
## Environment
- OS: [e.g., Ubuntu 20.04]
- Docker version: [e.g., 20.10.7]
- ZaroPGx version: [e.g., 1.0.0]
## Additional Context
Screenshots, logs, or other relevant information
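The Environment section of the template can be filled in partly by script. This is a sketch that uses only the Python standard library. The Python version line is an addition not present in the template, and the ZaroPGx version is left as a placeholder because this guide does not say where to read it from.

```python
import platform
import shutil
import subprocess


def docker_version() -> str:
    """Return the output of `docker --version`, or a placeholder if unavailable."""
    if shutil.which("docker") is None:
        return "docker not found on PATH"
    try:
        result = subprocess.run(
            ["docker", "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return "docker --version failed"
    return result.stdout.strip() or "unknown"


def environment_section() -> str:
    """Format the '## Environment' part of the bug report template."""
    lines = [
        "## Environment",
        f"- OS: {platform.platform()}",
        f"- Python version: {platform.python_version()}",
        f"- Docker version: {docker_version()}",
        "- ZaroPGx version: [fill in manually]",
    ]
    return "\n".join(lines)
```

Printing environment_section() gives text that can be pasted into the report.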
Feature Requests
Before requesting a feature:
Check existing feature requests
Consider if it fits the project scope
Think about implementation complexity
Consider backward compatibility
Feature request template:
## Feature Description
Brief description of the feature
## Use Case
Why is this feature needed?
## Proposed Solution
How should this feature work?
## Alternatives Considered
What other approaches were considered?
## Additional Context
Mockups, examples, or references
Code Contributions
Types of contributions:
Bug fixes
New features
Documentation improvements
Performance optimizations
Test coverage improvements
Development Workflow
1. Create Feature Branch
# Update main branch
git checkout main
git pull origin main
# Create feature branch
git checkout -b feature/your-feature-name
Branch naming conventions:
feature/description: New features
bugfix/description: Bug fixes
docs/description: Documentation
refactor/description: Code refactoring
test/description: Test improvements
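A branch name can be checked against these prefixes before pushing. The sketch below assumes descriptions are lowercase words separated by hyphens, as in feature/your-feature-name. That restriction is an assumption, not a stated project rule.

```python
import re

# Prefixes taken from the naming conventions above.
BRANCH_PREFIXES = ("feature", "bugfix", "docs", "refactor", "test")

# Assumed description format: lowercase letters and digits joined by hyphens.
BRANCH_PATTERN = re.compile(
    r"(?P<prefix>" + "|".join(BRANCH_PREFIXES) + r")"
    r"/(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if `name` follows the prefix/description convention."""
    return BRANCH_PATTERN.fullmatch(name) is not None
```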
2. Make Changes
Code style guidelines:
Follow PEP 8 for Python code
Use type hints
Write docstrings for functions
Use meaningful variable names
Keep functions small and focused
Example code style:
from typing import Optional

from sqlalchemy.orm import Session


def process_genomic_data(
    file_path: str,
    db: Session,
    sample_identifier: Optional[str] = None,
) -> dict:
    """
    Process genomic data file and return analysis results.

    Args:
        file_path: Path to the genomic data file
        db: Database session
        sample_identifier: Optional sample identifier

    Returns:
        Dictionary containing analysis results

    Raises:
        ValueError: If file format is not supported
    """
    # Implementation here
    pass
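To show the same guidelines applied to a function with a real body, here is a small sketch that rejects unsupported files with ValueError. The function name and the exact suffix list are hypothetical. The formats mirror those named in the upload endpoint documentation, and the compressed variants are an assumption.

```python
from pathlib import Path

# Formats named in the upload endpoint documentation; the exact suffix
# spellings, including the .gz variants, are assumptions.
SUPPORTED_SUFFIXES = (
    ".vcf", ".vcf.gz", ".bam", ".cram", ".sam", ".fastq", ".fastq.gz",
)


def detect_genomic_format(file_path: str) -> str:
    """
    Return the lowercase suffix of a supported genomic file.

    Args:
        file_path: Path to the genomic data file

    Returns:
        The matched suffix, for example ".vcf" or ".fastq.gz"

    Raises:
        ValueError: If file format is not supported
    """
    name = Path(file_path).name.lower()
    for suffix in SUPPORTED_SUFFIXES:
        if name.endswith(suffix):
            return suffix
    raise ValueError(f"Unsupported file format: {file_path}")
```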
3. Write Tests
Test requirements:
Write tests for new functionality
Maintain test coverage above 80%
Use descriptive test names
Test both success and failure cases
Example test:
import pytest
from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)


def test_upload_genomic_data_success():
    """Test successful genomic data upload."""
    with open("test_data/sample.vcf", "rb") as f:
        response = client.post(
            "/upload/genomic-data",
            # The form field name matches the endpoint's `files` parameter.
            files={"files": f},
            data={"sample_identifier": "test_sample"},
        )
    assert response.status_code == 200
    assert "job_id" in response.json()
    assert response.json()["status"] == "uploaded"


def test_upload_genomic_data_invalid_file():
    """Test upload with invalid file format."""
    with open("test_data/invalid.txt", "rb") as f:
        response = client.post(
            "/upload/genomic-data",
            files={"files": f},
        )
    assert response.status_code == 400
    assert "error" in response.json()
4. Update Documentation
Documentation requirements:
Update relevant documentation
Add docstrings for new functions
Update API documentation
Add examples for new features
Documentation types:
Code comments and docstrings
API documentation
User guides
Developer guides
README updates
5. Run Tests and Linting
# Run all tests
pytest
# Run with coverage
pytest --cov=app --cov-report=html
# Format code
black app/
isort app/
# Lint code
flake8 app/
mypy app/
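The commands above can be chained in one script that reports every failing tool in a single run. This is a sketch, not a project script. It runs black and isort in check mode so that nothing is rewritten, and it uses the pytest-cov option --cov-fail-under with the 80% coverage target from this guide. The function names are illustrative.

```python
import subprocess
from typing import Callable, List, Sequence

# Check-only variants of the commands above; 80 mirrors the coverage target.
CHECKS: List[List[str]] = [
    ["black", "--check", "app/"],
    ["isort", "--check-only", "app/"],
    ["flake8", "app/"],
    ["mypy", "app/"],
    ["pytest", "--cov=app", "--cov-fail-under=80"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code (127 if it is not installed)."""
    try:
        return subprocess.run(list(command), check=False).returncode
    except FileNotFoundError:
        return 127


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Run every check in order and return the names of the tools that failed."""
    failed = []
    for command in checks:
        if runner(command) != 0:
            failed.append(command[0])
    return failed
```

An empty list from run_checks() means every tool exited with status 0.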
6. Commit Changes
# Stage changes
git add .
# Commit with descriptive message
git commit -m "feat: add support for CRAM file processing
- Add CRAM file detection and validation
- Implement GATK preprocessing for CRAM files
- Add tests for CRAM processing workflow
- Update documentation with CRAM support"
Commit message format:
type(scope): description
Longer description if needed
- Bullet point 1
- Bullet point 2
Closes #123
Commit types:
feat: New feature
fix: Bug fix
docs: Documentation
style: Code style changes
refactor: Code refactoring
test: Test additions/changes
chore: Maintenance tasks
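The first line of a commit message can be checked mechanically against this format. The sketch below treats the scope as optional, since the example commit above omits it. The function name is illustrative and the project may enforce the format differently, or not at all.

```python
import re

# Commit types listed above.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# Matches "type: description" and "type(scope): description".
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def parse_commit_header(message: str) -> dict:
    """Parse the first line of a commit message; raise ValueError if invalid."""
    header = message.splitlines()[0] if message else ""
    match = HEADER_PATTERN.match(header)
    if match is None:
        raise ValueError(f"Invalid commit header: {header!r}")
    return match.groupdict()
```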
7. Push and Create Pull Request
# Push branch
git push origin feature/your-feature-name
# Create pull request on GitHub
Pull request template:
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update
## Testing
- [ ] Tests pass locally
- [ ] New tests added
- [ ] All tests pass in CI
## Checklist
- [ ] Code follows style guidelines
- [ ] Self-review completed
- [ ] Documentation updated
- [ ] No breaking changes
Code Review Process
Review Criteria
Code quality:
Follows coding standards
Has appropriate tests
Handles errors gracefully
Is well-documented
Functionality:
Solves the intended problem
Doesn’t break existing functionality
Is performant
Is secure
Documentation:
Code is self-documenting
Docstrings are present
README is updated if needed
API docs are updated
Review Process
Automated checks must pass
At least one reviewer must approve
All conversations must be resolved
CI/CD pipeline must pass
Maintainer approval for significant changes
Responding to Reviews
Be responsive:
Address feedback promptly
Ask questions if unclear
Explain your reasoning
Be open to suggestions
Common responses:
“Done” - Simple fix applied
“Good catch, fixed” - Bug found and fixed
“I disagree because…” - Explain reasoning
“Can you clarify…” - Ask for more details
Testing Guidelines
Test Types
Unit tests:
Test individual functions
Mock external dependencies
Test edge cases
Aim for coverage above 80%
Integration tests:
Test service interactions
Use test database
Test API endpoints
Test error handling
End-to-end tests:
Test complete workflows
Use real data (anonymized)
Test user scenarios
Validate outputs
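As an illustration of mocking an external dependency in a unit test, the sketch below replaces a status client with unittest.mock.Mock and covers one success case and one failure case. The describe_job function and the get_status method are hypothetical and do not come from the ZaroPGx codebase.

```python
from unittest.mock import Mock


def describe_job(job_id: str, status_client) -> str:
    """Return a one-line summary for a job using an external status client."""
    try:
        status = status_client.get_status(job_id)
    except ConnectionError:
        return f"{job_id}: status service unavailable"
    return f"{job_id}: {status}"


def test_describe_job_success():
    """The summary contains the status reported by the client."""
    client = Mock()
    client.get_status.return_value = "completed"
    assert describe_job("job-1", client) == "job-1: completed"
    client.get_status.assert_called_once_with("job-1")


def test_describe_job_service_down():
    """A connection failure is reported instead of propagating."""
    client = Mock()
    client.get_status.side_effect = ConnectionError("refused")
    assert describe_job("job-1", client) == "job-1: status service unavailable"
```

Because the client is passed in as an argument, the tests need no patching and no network access.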
Test Data
Test data requirements:
Use anonymized data only
Include edge cases
Cover different file formats
Include invalid data
Test data organization:
tests/
├── fixtures/ # Test fixtures
├── data/ # Test data files
│ ├── valid/ # Valid test files
│ ├── invalid/ # Invalid test files
│ └── edge_cases/ # Edge case files
└── conftest.py # Test configuration
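Tests can build file paths from this layout instead of hard-coding them. The helpers below are a sketch with hypothetical names. They assume that files under invalid/ are the ones an upload should reject.

```python
from pathlib import Path

# Category directories shown in the layout above.
CATEGORIES = ("valid", "invalid", "edge_cases")


def data_path(tests_root: Path, category: str, filename: str) -> Path:
    """Build the path tests_root/data/<category>/<filename>."""
    if category not in CATEGORIES:
        raise ValueError(f"Unknown test data category: {category!r}")
    return tests_root / "data" / category / filename


def expects_rejection(path: Path) -> bool:
    """Return True when the file sits in the 'invalid' category directory."""
    return path.parent.name == "invalid"
```

A fixture in conftest.py could wrap data_path with the tests directory as tests_root.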
Documentation Guidelines
Code Documentation
Docstring format:
def process_file(file_path: str, options: dict) -> dict:
    """
    Process a genomic file and return analysis results.

    This function handles file validation, preprocessing,
    and analysis orchestration.

    Args:
        file_path: Path to the genomic file to process
        options: Dictionary of processing options

    Returns:
        Dictionary containing analysis results with keys:
        - status: Processing status
        - results: Analysis results
        - metadata: File metadata

    Raises:
        FileNotFoundError: If file doesn't exist
        ValueError: If file format is not supported

    Example:
        >>> results = process_file("sample.vcf", {"reference": "hg38"})
        >>> print(results["status"])
        completed
    """
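Examples written in this docstring format can be executed with the standard library doctest module, which catches examples that drift from the code. The sketch below uses a hypothetical helper, normalize_sample_identifier, purely for illustration. The check_docstring_examples function is likewise an assumption about how one might run the examples, not an existing project utility.

```python
import doctest


def normalize_sample_identifier(identifier: str) -> str:
    """
    Normalize a sample identifier for consistent lookups.

    Args:
        identifier: Raw identifier as entered by the user

    Returns:
        The identifier with surrounding whitespace removed, in lowercase

    Raises:
        ValueError: If the identifier is empty after stripping

    Example:
        >>> normalize_sample_identifier("  Patient_001 ")
        'patient_001'
    """
    cleaned = identifier.strip().lower()
    if not cleaned:
        raise ValueError("sample identifier must not be empty")
    return cleaned


def check_docstring_examples(func) -> doctest.TestResults:
    """Run the >>> examples in a function's docstring and return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)
```

The expected output in a doctest is the repr of the value, which is why the string appears in single quotes. An example that calls print would show the text without quotes.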
API Documentation
Endpoint documentation:
@router.post("/upload/genomic-data", response_model=UploadResponse)
async def upload_genomic_data(
    files: List[UploadFile] = File(...),
    sample_identifier: Optional[str] = Form(None),
    db: Session = Depends(get_db),
):
    """
    Upload genomic data files for pharmacogenomic analysis.

    This endpoint accepts various genomic file formats (VCF, BAM, CRAM, SAM, FASTQ)
    and initiates the Nextflow-based processing pipeline.

    Args:
        files: List of genomic data files to upload
        sample_identifier: Optional patient/sample identifier
        db: Database session dependency

    Returns:
        UploadResponse containing job information

    Raises:
        HTTPException: If upload fails or file format is invalid

    Example:
        ```bash
        curl -X POST \\
            -F "files=@sample.vcf" \\
            -F "sample_identifier=patient_001" \\
            http://localhost:8765/upload/genomic-data
        ```
    """
Release Process
Version Numbering
Semantic versioning:
MAJOR.MINOR.PATCH
MAJOR: Breaking changes
MINOR: New features (backward compatible)
PATCH: Bug fixes (backward compatible)
Examples:
1.0.0: Initial release
1.1.0: New features added
1.1.1: Bug fixes
2.0.0: Breaking changes
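The bump rules can be written as a short function. This is a sketch that handles plain MAJOR.MINOR.PATCH versions only. Pre-release and build suffixes are out of scope, and the function names are illustrative.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a MAJOR.MINOR.PATCH string into three integers."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version, re.ASCII)
    if match is None:
        raise ValueError(f"Not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def bump_version(version: str, part: str) -> str:
    """Return the next version after a 'major', 'minor', or 'patch' change."""
    major, minor, patch = parse_version(version)
    if part == "major":
        # Breaking change: reset MINOR and PATCH.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward-compatible feature: reset PATCH.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")
```

Applied to the examples above, a minor bump of 1.0.0 gives 1.1.0, a patch bump of 1.1.0 gives 1.1.1, and a major bump of 1.1.1 gives 2.0.0.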
Release Checklist
Before release:
All tests pass
Documentation updated
Version numbers updated
CHANGELOG updated
Release notes prepared
Release process:
Update version in pyproject.toml
Update CHANGELOG.md
Create release branch
Tag release
Create GitHub release
Update documentation
Community Guidelines
Code of Conduct
Be respectful:
Use welcoming language
Be respectful of differing viewpoints
Accept constructive criticism
Focus on what’s best for the community
Be collaborative:
Help others learn
Share knowledge
Be patient with newcomers
Work together constructively
Getting Help
Resources:
GitHub Discussions
Issue tracker
Documentation
Code comments
Asking questions:
Search existing issues first
Provide context and examples
Be specific about the problem
Include relevant logs or error messages
Recognition
Contributors
Contributor recognition:
Listed in CONTRIBUTORS.md
Mentioned in release notes
GitHub contributor status
Community recognition
Types of contributions:
Code contributions
Documentation improvements
Bug reports
Feature requests
Community support
Next Steps
Development Setup: Development Setup
Architecture: System Architecture
API Reference: API Reference
Deployment: Deployment Guide