This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
# Initial setup (using uv for modern Python package management)
make setup-env # Create .venv using uv
make install # Install all dependencies via uv sync
source .venv/bin/activate # Activate virtual environment
# Database setup
docker-compose up -d # Start IRIS database
make setup-db # Initialize database
make load-data # Load sample data
# Run tests using the script runner
./scripts/ci/run-tests.sh # Run all tests
./scripts/ci/run-tests.sh -t unit # Run only unit tests
./scripts/ci/run-tests.sh -t integration -v # Integration tests with verbose output
./scripts/ci/run-tests.sh -p -c # Parallel execution without coverage
# Direct pytest execution
pytest tests/ # All tests
pytest tests/unit/ # Unit tests only
pytest tests/integration/ # Integration tests only
pytest tests/e2e/ # End-to-end tests only
pytest --cov=iris_rag --cov=rag_templates # With coverage
# Backend mode testing (Feature 035)
make test-community # Test with Community Edition mode (1 connection)
make test-enterprise # Test with Enterprise Edition mode (999 connections)
make test-backend-contracts # Run backend mode contract tests
IRIS_BACKEND_MODE=community pytest tests/ # Manual backend mode override
# Format code (apply isort and black per pyproject.toml configuration)
black .
isort .
# Lint code
flake8 .
mypy iris_rag/
# Core services
make docker-up # Start core services (IRIS, Redis, API, Streamlit)
make docker-down # Stop all services
make docker-logs # View logs from all services
# Development environment
make docker-up-dev # Start with Jupyter notebook
make docker-shell # Open shell in API container
make docker-iris-shell # Open IRIS database shell
# Full development setup
make docker-dev # Start dev environment, wait for health, init data
# Quick evaluation on sample data
make test-ragas-sample
# Full evaluation on 1000 PMC documents
make test-ragas-1000
# Dockerized evaluation
make test-ragas-sample-docker
make test-ragas-1000-docker
- iris_rag/: Main RAG framework package
- core/: Abstract base classes (RAGPipeline, VectorStore) and models
- pipelines/: RAG pipeline implementations (BasicRAG, CRAG, GraphRAG, HybridGraphRAG)
- storage/: Vector store implementations, primarily IRISVectorStore
- services/: Business logic services (entity extraction, storage management)
- config/: Configuration management and pipeline-specific configs
- validation/: Pipeline validation and requirements checking
- memory/: Memory management and incremental indexing components
- basic → BasicRAGPipeline - Standard vector similarity search
- basic_rerank → BasicRAGRerankingPipeline - Vector search + cross-encoder reranking
- crag → CRAGPipeline - Corrective RAG with self-evaluation
- graphrag → HybridGraphRAGPipeline - Hybrid search (vector + text + graph + RRF)
- pylate_colbert → PyLateColBERTPipeline - ColBERT late interaction retrieval
Additional Pipeline (Direct Import): IRIS-Global-GraphRAG - Academic papers with 3D visualization and global communities
- Vector Database: InterSystems IRIS with native vector search capabilities
- LLM Integration: OpenAI and Anthropic APIs via common.utils.get_llm_func
- Bridge Adapters: Generic memory components for external system integration
- Validation Framework: Automated pipeline requirement validation and setup
from iris_rag import create_pipeline
# Create with validation (recommended)
pipeline = create_pipeline(
pipeline_type="basic", # basic, basic_rerank, crag, graphrag, pylate_colbert
validate_requirements=True, # Auto-validate DB setup
auto_setup=False # Set to True to auto-fix detected issues
)
# All pipelines share the same standardized API
result = pipeline.query(query="What is diabetes?", top_k=5)
# Standardized response format (100% LangChain & RAGAS compatible):
# - result["answer"]: LLM-generated answer
# - result["retrieved_documents"]: List[Document] with full metadata
# - result["contexts"]: List[str] for RAGAS evaluation
# - result["sources"]: Source references in metadata
# - result["metadata"]: Pipeline-specific metadata fields
- Unit Tests: tests/unit/ - Component-level testing
- Integration Tests: tests/integration/ - Cross-component functionality
- E2E Tests: tests/e2e/ - Full pipeline workflows
- Contract Tests: tests/contract/ - API contract validation (TDD approach)
- Enterprise Scale Tests: 10K document testing with mocking support
Constitutional Requirement: All integration and E2E tests with ≥10 entities MUST use .DAT fixtures loaded via iris-devtools. See .specify/memory/constitution.md for complete IRIS testing principles.
Performance Benefits:
- .DAT fixtures: 0.5-2 seconds for 100 entities (binary IRIS format)
- JSON fixtures: 39-75 seconds for same data
- Speedup: 100-200x faster test execution
When to Use What:
Need test data?
├─ Unit test (mocked components)?
│ └─ Use programmatic fixtures (Python code)
│
├─ Integration test (real IRIS database)?
│ ├─ < 10 entities or simple data?
│ │ └─ Use programmatic fixtures
│ │
│ └─ ≥ 10 entities or complex relationships?
│ └─ Use .DAT fixtures (REQUIRED)
│
└─ E2E test (full pipeline)?
└─ Use .DAT fixtures (REQUIRED)
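The decision tree above can be encoded as a small helper. This is a sketch only: the function and argument names are illustrative, while the tiers and the 10-entity threshold come from the tree itself.

```python
def choose_fixture_strategy(tier: str, n_entities: int = 0,
                            complex_relationships: bool = False) -> str:
    """Encode the fixture decision tree: unit tests use programmatic
    fixtures; integration tests switch to .DAT at >= 10 entities or
    complex relationships; E2E tests always use .DAT."""
    if tier == "unit":
        return "programmatic"
    if tier == "integration":
        if n_entities >= 10 or complex_relationships:
            return "dat"
        return "programmatic"
    if tier == "e2e":
        return "dat"
    raise ValueError(f"unknown test tier: {tier!r}")
```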
Fixture Management Commands:
# List available fixtures
make fixture-list
# Get fixture details
make fixture-info FIXTURE=medical-graphrag-20
# Load fixture into IRIS
make fixture-load FIXTURE=medical-graphrag-20
# Create new fixture from current database
make fixture-create FIXTURE=my-test-data
# Validate fixture integrity
make fixture-validate FIXTURE=medical-graphrag-20
Using Fixtures in Tests:
# Automatic fixture loading via pytest marker
@pytest.mark.dat_fixture("medical-graphrag-20")
def test_with_fixture():
# Fixture automatically loaded before test
# Database contains 21 entities, 15 relationships
pass
# Manual fixture loading via FixtureManager
from tests.fixtures.manager import FixtureManager
def test_manual_load():
manager = FixtureManager()
result = manager.load_fixture(
fixture_name="medical-graphrag-20",
cleanup_first=True,
validate_checksum=True,
)
assert result.success
Fixture Infrastructure (✅ Production Ready): The unified fixture infrastructure provides:
- Fast .DAT Loading: 100-200x faster than JSON (via iris-devtools)
- Checksum Validation: SHA256 integrity checking for data consistency
- Version Management: Semantic versioning with migration history tracking
- State Tracking: Session-wide fixture state to prevent schema loops
- pytest Integration: Automatic cleanup via the @pytest.mark.dat_fixture decorator
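As one concrete piece of the above, the SHA256 integrity check can be sketched as follows. This is an assumption-level sketch: the real FixtureManager's hashing granularity and API may differ.

```python
import hashlib

def fixture_checksum(path: str) -> str:
    """Stream a fixture file through SHA-256 and return the hex digest,
    the kind of value a stored checksum would be compared against."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 64 KiB blocks so large .DAT files are not loaded whole.
        for block in iter(lambda: f.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()
```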
Fixture Documentation:
- Complete Status: FIXTURE_INFRASTRUCTURE_COMPLETE.md (implementation overview)
- CLI Reference: python -m tests.fixtures.cli --help
- API Documentation: tests/fixtures/manager.py (FixtureManager class)
- Constitution: .specify/memory/constitution.md (Principle II)
Purpose: Prevent license pool exhaustion in IRIS Community Edition while allowing parallel execution in Enterprise Edition.
Modes:
- Community: Single connection limit, sequential test execution
- Enterprise: 999 connections, parallel test execution
Configuration Precedence (highest to lowest):
1. IRIS_BACKEND_MODE environment variable
2. .specify/config/backend_modes.yaml file
3. Default (community mode)
Usage Examples:
# Pytest fixtures (auto-configured)
def test_example(iris_connection, backend_configuration):
assert backend_configuration.max_connections == 1 # community mode
# Manual configuration
from iris_rag.testing import load_configuration, ConnectionPool
config = load_configuration()
pool = ConnectionPool(mode=config.mode)
with pool.acquire() as conn:
# Use connection
pass
Troubleshooting:
- License pool exhaustion: Switch to IRIS_BACKEND_MODE=community
- Tests timing out: Check connection pool limits with config.max_connections
- Edition mismatch error: Set IRIS_BACKEND_MODE to match your IRIS edition
- Default Config: iris_rag/config/default_config.yaml
- Pipeline Configs: config/pipelines.yaml
- Environment: .env file for API keys and database connections
- Docker Compose: Multiple compose files for different deployment scenarios
The HybridGraphRAG pipeline requires iris-vector-graph for operation:
Installation:
pip install rag-templates[hybrid-graphrag]
This installs the iris-vector-graph package, providing iris_graph_core integration for 50x performance improvements.
Requirements:
- iris-vector-graph>=1.6.0 is now a mandatory dependency
- No fallback mechanisms - the pipeline will fail fast with clear error messages if the package is missing
- All retrieval methods (hybrid, rrf, text, vector, kg) require iris-vector-graph
Important: HybridGraphRAG integration tests are intentionally skipped in CI because they require:
- Configured LLM for entity extraction from documents
- iris-vector-graph tables populated with embeddings and optimized indexes
- Full knowledge graph (entities + relationships) extracted from documents
Test fixtures cannot provide this setup because:
- Entity extraction requires LLM API calls (not available/practical in test environment)
- iris-vector-graph requires optimized HNSW tables with real embeddings
- Simple 3-document fixtures cannot replicate the complexity of real knowledge graphs
Three-Tier Testing Strategy:
GraphRAG testing uses a pragmatic three-tier approach:
Tier 1: Contract Tests (Automated CI) ✅
pytest tests/contract/test_graphrag_fixtures.py # 13/13 passing
- Purpose: Validate API interfaces and fixture loading
- Coverage: Data structures, fixture service, validation logic
- Run in CI: Yes - fast (< 1s), reliable, no dependencies
- When to run: Always (part of standard test suite)
Tier 2: Realistic Integration Tests (Manual, Development) ℹ️
# Run against real database with 221K+ entities
IRIS_PORT=21972 pytest tests/integration/test_graphrag_realistic.py -v
IRIS_PORT=21972 pytest tests/integration/test_graphrag_with_real_data.py -v
- Purpose: Validate GraphRAG against production-like data
- Coverage: KG traversal, vector fallback, metadata completeness
- Run in CI: No - requires IRIS_PORT environment configuration
- When to run: During development, before major releases
- Database requirement: 100+ entities, 50+ relationships
Tier 3: E2E HybridGraphRAG Tests (Skipped) ⏭️
pytest tests/integration/test_hybridgraphrag_e2e.py # All skipped with clear reasons
- Purpose: End-to-end validation of all 5 query methods
- Status: Intentionally skipped - requires LLM + iris-vector-graph setup
- Alternative: Manual testing with real data (see below)
Why Integration Tests are Skipped:
- Previous "passing" integration tests were false positives - they used 2,376 pre-existing documents in the database, not the 3-document test fixtures
- Maintaining complex LLM mocking + iris-vector-graph setup is brittle and provides little value
- Contract tests + manual validation with real data provides better signal
- Document Ingestion: Load documents via pipeline.load_documents()
- Chunking & Embedding: Automatic text segmentation and vector generation
- Storage: Vectors and metadata stored in IRIS vector tables
- Query Processing: Multi-modal retrieval (vector, text, graph) depending on pipeline
- Generation: LLM synthesis with retrieved context
- Response: Standardized response format with sources and metadata
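The six stages can be illustrated with a dependency-free toy. Every name below is illustrative: real pipelines embed chunks, store them in IRIS vector tables, and call an LLM, none of which this sketch does.

```python
def chunk(text: str, size: int = 80) -> list[str]:
    """Stage 2: naive fixed-width chunking (real pipelines also embed)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

class ToyPipeline:
    def __init__(self):
        self.store: list[str] = []  # stands in for IRIS vector tables (stage 3)

    def load_documents(self, docs: list[str]) -> None:
        for doc in docs:                   # stage 1: ingestion
            self.store.extend(chunk(doc))  # stages 2-3: chunk and store

    def query(self, query: str, top_k: int = 3) -> dict:
        # Stage 4: retrieval -- keyword overlap stands in for vector similarity
        words = set(query.lower().split())
        contexts = sorted(self.store,
                          key=lambda c: -len(words & set(c.lower().split())))[:top_k]
        # Stage 5: generation (stubbed; a real pipeline calls the LLM here)
        answer = f"(answer synthesized from {len(contexts)} contexts)"
        # Stage 6: standardized response shape (field names from the docs above)
        return {"answer": answer, "contexts": contexts,
                "retrieved_documents": contexts, "metadata": {"pipeline": "toy"}}
```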
Location: iris_rag/api/
The REST API provides production-ready HTTP endpoints for all RAG pipelines with enterprise features:
Features:
- API key authentication (bcrypt-hashed)
- Three-tier rate limiting (60/100/1000 requests/min)
- Request/response logging with audit trail
- WebSocket streaming for real-time progress
- Async document upload with validation
- Health monitoring for all components
- Elasticsearch-inspired error responses
- 100% LangChain & RAGAS compatible
Quick Start:
# Setup database tables
make api-setup-db
# Create API key
make api-create-key NAME="My Key" EMAIL=user@example.com
# Start server (development mode)
make api-run
# Start server (production mode, 4 workers)
make api-run-prod
# Open API documentation
make api-docs # http://localhost:8000/docs
CLI Commands:
# Server operations
python -m iris_rag.api.cli run [--host HOST] [--port PORT] [--workers N] [--reload]
python -m iris_rag.api.cli health
# API key management
python -m iris_rag.api.cli create-key --name NAME --owner-email EMAIL [--tier TIER]
python -m iris_rag.api.cli list-keys [--owner-email EMAIL]
python -m iris_rag.api.cli revoke-key --key-id KEY_ID
# Database operations
python -m iris_rag.api.cli setup-db
# Cleanup job (run daily via cron)
python -m iris_rag.api.cleanup_job
API Endpoints:
- POST /api/v1/{pipeline}/_search - Execute query (requires auth)
- GET /api/v1/pipelines - List available pipelines (public)
- GET /api/v1/pipelines/{name} - Get pipeline details (public)
- POST /api/v1/documents/upload - Upload documents (requires write permission)
- GET /api/v1/documents/operations/{id} - Track upload progress
- GET /api/v1/health - System health check (public)
- WS /ws - WebSocket streaming (requires auth)
Authentication:
# All requests (except /health and /pipelines) require API key
Authorization: ApiKey <base64(key_id:key_secret)>
# Example
Authorization: ApiKey N2M5ZTY2NzktNzQyNS00MGRlLTk0NGItZTA3ZmMxZjkwYWU3Om15X3NlY3JldF9rZXk=
Query Example:
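The Authorization value used in the query below can be produced with a small Python helper (helper name is illustrative; the encoding follows the scheme shown in the Authentication section above):

```python
import base64

def api_key_header(key_id: str, key_secret: str) -> dict[str, str]:
    """Base64-encode "key_id:key_secret" and wrap it in the ApiKey scheme."""
    token = base64.b64encode(f"{key_id}:{key_secret}".encode()).decode()
    return {"Authorization": f"ApiKey {token}"}
```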
curl -X POST http://localhost:8000/api/v1/basic/_search \
-H "Authorization: ApiKey <your-key>" \
-H "Content-Type: application/json" \
-d '{
"query": "What are the symptoms of diabetes?",
"top_k": 5
}'
Response Format (RAGAS compatible):
{
"response_id": "uuid",
"request_id": "uuid",
"answer": "Generated answer text...",
"retrieved_documents": [
{
"doc_id": "uuid",
"content": "Document text...",
"score": 0.95,
"metadata": {"source": "file.pdf", "page_number": 127}
}
],
"sources": ["file.pdf"],
"contexts": ["Document text..."],
"pipeline_name": "basic",
"execution_time_ms": 1456,
"retrieval_time_ms": 345,
"generation_time_ms": 1089,
"tokens_used": 2345
}
Rate Limiting:
| Tier | Requests/Minute | Requests/Hour | Max Concurrent |
|---|---|---|---|
| Basic | 60 | 1,000 | 5 |
| Premium | 100 | 5,000 | 10 |
| Enterprise | 1,000 | 50,000 | 20 |
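A sliding-window sketch of the per-minute column above (illustrative only: the real middleware also enforces hourly and concurrency limits, and the class and field names here are assumptions):

```python
import time
from collections import deque

PER_MINUTE = {"basic": 60, "premium": 100, "enterprise": 1000}  # from the table

class MinuteWindowLimiter:
    def __init__(self, tier: str, clock=time.monotonic):
        self.limit = PER_MINUTE[tier]
        self.clock = clock              # injectable clock for testing
        self.hits: deque[float] = deque()

    def allow(self) -> bool:
        """Admit a request if fewer than `limit` landed in the last 60s."""
        now = self.clock()
        while self.hits and now - self.hits[0] >= 60.0:
            self.hits.popleft()         # drop hits that left the window
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False
```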
Error Handling (Elasticsearch-inspired):
{
"error": {
"type": "validation_exception",
"reason": "Invalid parameter value",
"details": {
"field": "top_k",
"rejected_value": -5,
"message": "Must be positive integer between 1 and 100",
"min_value": 1,
"max_value": 100
}
}
}
Database Cleanup:
# Run cleanup job manually
python -m iris_rag.api.cleanup_job
# Schedule with cron (daily at 2 AM)
0 2 * * * cd /path/to/rag-templates && .venv/bin/python -m iris_rag.api.cleanup_job >> logs/cleanup.log 2>&1
Testing:
make api-test # Run all API tests
make api-test-contracts # Run contract tests (TDD)
make api-test-integration # Run integration tests
Configuration: config/api_config.yaml
server:
host: 0.0.0.0
port: 8000
workers: 4
database:
pool_size: 20
max_overflow: 10
pipelines:
enabled: [basic, basic_rerank, crag, graphrag, pylate_colbert]
rate_limiting:
max_concurrent_per_key: 10
logging:
retention_days: 30
Complete Documentation: iris_rag/api/README.md
The project uses three repositories for selective public sharing:
origin (private) → isc-tdyar/iris-vector-rag-private
fork (public) → isc-tdyar/iris-vector-rag
upstream (community)→ intersystems-community/iris-vector-rag
Current remotes:
git remote -v
# origin https://github.com/isc-tdyar/iris-vector-rag-private.git
# fork https://github.com/isc-tdyar/iris-vector-rag.git
# upstream https://github.com/intersystems-community/iris-vector-rag.git
1. Private Work (Default)
# Work on features privately
git commit -am "feat: experimental feature"
git push origin main # Push to private repo only
2. Selective Public Sharing
# Cherry-pick commits for public release
git checkout -b public/feature-name
git cherry-pick <commit-hash> # Select specific commits
git push fork public/feature-name
# Create PR on GitHub: fork:public/feature-name → upstream:main
3. Emergency Sync (Rare)
# Sync all repositories immediately (requires write access)
git push origin main && git push fork main && git push upstream main
Version Bump and Publish:
# 1. Update version
vim pyproject.toml # version = "0.5.x"
# 2. Build and publish to PyPI
uv build
twine upload dist/iris_vector_rag-*.whl dist/iris_vector_rag-*.tar.gz
# 3. Commit and tag
git commit -am "chore: bump version to 0.5.x"
git tag -a v0.5.x -m "Release v0.5.x"
# 4. Push to all repositories
git push origin main
git push fork main
git push upstream main
git push --tags
IMPORTANT: Always use twine for PyPI publishing (not uv publish). See Constitution Principle X.
Divergent branches error:
# Use merge strategy to reconcile
git pull fork main --no-rebase --no-edit
Check remote status:
git remote -v
git fetch --all
git log --oneline --graph --all --decorate -10
- Complete Git Guide: CONTRIBUTING.md
- Constitution: .specify/memory/constitution.md (Principle XI)
- Python 3.10+ (existing codebase uses 3.10-3.12) (051-enterprise-enhancements)
- InterSystems IRIS database (RAG.SourceDocuments table - existing) (051-enterprise-enhancements)