Production-ready Retrieval-Augmented Generation (RAG) pipelines powered by InterSystems IRIS Vector Search
Build intelligent applications that combine large language models with your enterprise data using battle-tested RAG patterns and native vector search capabilities.
- **Production-Ready** - Six proven RAG architectures ready to deploy, not research prototypes
- **Blazing Fast** - Native IRIS vector search with HNSW indexing, no external vector databases needed
- **Unified API** - Swap between RAG strategies with a single line of code
- **Enterprise-Grade** - ACID transactions, connection pooling, and horizontal scaling built-in
- **100% Compatible** - Works seamlessly with LangChain, RAGAS, and your existing ML stack
- **Fully Validated** - Comprehensive test suite with automated contract validation
| Pipeline Type | Use Case | Retrieval Method | When to Use |
|---|---|---|---|
| basic | Standard retrieval | Vector similarity | General Q&A, getting started, baseline comparisons |
| basic_rerank | Improved precision | Vector + cross-encoder reranking | Higher accuracy requirements, legal/medical domains |
| crag | Self-correcting | Vector + evaluation + web search fallback | Dynamic knowledge, fact-checking, current events |
| graphrag | Knowledge graphs | Vector + text + graph + RRF fusion | Complex entity relationships, research, medical knowledge |
| multi_query_rrf | Multi-perspective | Query expansion + reciprocal rank fusion | Complex queries, comprehensive coverage needed |
| pylate_colbert | Fine-grained matching | ColBERT late interaction embeddings | Nuanced semantic understanding, high precision |
```bash
# Clone repository
git clone https://github.com/intersystems-community/iris-rag-templates.git
cd iris-rag-templates

# Setup environment (requires uv package manager)
make setup-env
make install
source .venv/bin/activate
```
```bash
# Start IRIS with Docker Compose
docker-compose up -d

# Initialize database schema
make setup-db

# Optional: Load sample medical data
make load-data
```
```bash
cat > .env << 'EOF'
OPENAI_API_KEY=your-key-here
ANTHROPIC_API_KEY=your-key-here  # Optional, for Claude models
IRIS_HOST=localhost
IRIS_PORT=1972
IRIS_NAMESPACE=USER
IRIS_USERNAME=_SYSTEM
IRIS_PASSWORD=SYS
EOF
```
```python
from iris_vector_rag import create_pipeline
from iris_rag.core.models import Document

# Create pipeline with automatic validation
pipeline = create_pipeline('basic', validate_requirements=True)

# Load your documents
docs = [
    Document(
        page_content="RAG combines retrieval with generation for accurate AI responses.",
        metadata={"source": "rag_basics.pdf", "page": 1}
    ),
    Document(
        page_content="Vector search finds semantically similar content using embeddings.",
        metadata={"source": "vector_search.pdf", "page": 5}
    )
]
pipeline.load_documents(documents=docs)

# Query with LLM-generated answer
result = pipeline.query(
    query="What is RAG?",
    top_k=5,
    generate_answer=True
)

print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
print(f"Retrieved: {len(result['retrieved_documents'])} documents")
```
Switch RAG strategies with one line - all pipelines share the same interface:
```python
from iris_vector_rag import create_pipeline

# Start with basic
pipeline = create_pipeline('basic')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# Upgrade to basic_rerank for better accuracy
pipeline = create_pipeline('basic_rerank')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# Try graphrag for entity reasoning
pipeline = create_pipeline('graphrag')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# All pipelines return the same response format
print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
print(f"Retrieved: {len(result['retrieved_documents'])} documents")
```
100% LangChain & RAGAS compatible responses:
```python
{
    "query": "What is diabetes?",
    "answer": "Diabetes is a chronic metabolic condition...",  # LLM answer
    "retrieved_documents": [Document(...)],                    # LangChain Documents
    "contexts": ["context 1", "context 2"],                    # RAGAS contexts
    "sources": ["medical.pdf p.12", "diabetes.pdf p.3"],       # Source citations
    "execution_time": 0.523,
    "metadata": {
        "num_retrieved": 5,
        "pipeline_type": "basic",
        "retrieval_method": "vector",
        "generated_answer": True,
        "processing_time": 0.523
    }
}
```
Each pipeline uses the same API - just change the pipeline type:
- `basic` - Fast vector similarity search, great for getting started
- `basic_rerank` - Vector + cross-encoder reranking for higher accuracy
- `crag` - Self-correcting with web search fallback for current events
- `graphrag` - Multi-modal: vector + text + knowledge graph fusion
- `multi_query_rrf` - Query expansion with reciprocal rank fusion
- `pylate_colbert` - ColBERT late interaction for fine-grained matching

**Complete Pipeline Guide →** - Decision tree, performance comparison, configuration examples
IRIS provides everything you need in one database.
Automatic concurrency management:
```python
from iris_rag.storage import IRISVectorStore

# Connection pool handles concurrency automatically
store = IRISVectorStore()

# Safe for multi-threaded applications
# Pool manages connections, no manual management needed
```
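For example, a minimal sketch of concurrent queries (assuming, per the note above, that the pooled connections make a shared pipeline safe to use across threads):

```python
from concurrent.futures import ThreadPoolExecutor

from iris_vector_rag import create_pipeline

pipeline = create_pipeline('basic')
questions = ["What is RAG?", "What is vector search?", "What is HNSW indexing?"]

# The pooled store checks out a connection per call; no explicit locking here.
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(lambda q: pipeline.query(q, top_k=3), questions))

for question, result in zip(questions, results):
    print(f"{question}: {len(result['retrieved_documents'])} documents retrieved")
```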
Database schema created and migrated automatically:
```python
pipeline = create_pipeline('basic', validate_requirements=True)
# ✅ Checks database connection
# ✅ Validates schema exists
# ✅ Migrates to latest version if needed
# ✅ Reports validation results
```
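A minimal sketch of failing fast at startup; the exact exception types raised on validation failure are an assumption to check against the validation framework:

```python
import sys

from iris_vector_rag import create_pipeline

try:
    pipeline = create_pipeline('basic', validate_requirements=True)
except Exception as exc:  # narrow this to the library's validation exception
    print(f"Pipeline validation failed: {exc}", file=sys.stderr)
    sys.exit(1)
```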
Measure your RAG pipeline performance:
```bash
# Evaluate all pipelines on your data
make test-ragas-sample
```

Generates detailed metrics:

- Answer Correctness
- Faithfulness
- Context Precision
- Context Recall
- Answer Relevance
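Because every response already exposes RAGAS-style `contexts` (see the response format above), you can also score a single pipeline directly. A minimal sketch, assuming the `ragas` and `datasets` packages are installed; the metric selection is illustrative:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

from iris_vector_rag import create_pipeline

pipeline = create_pipeline('basic')
result = pipeline.query("What is diabetes?", top_k=5, generate_answer=True)

# Pipeline responses already carry the fields RAGAS expects.
dataset = Dataset.from_dict({
    "question": [result["query"]],
    "answer": [result["answer"]],
    "contexts": [result["contexts"]],
})

print(evaluate(dataset, metrics=[faithfulness, answer_relevancy]))
```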
Automatic embedding generation with model caching eliminates repeated model-loading overhead, speeding up document vectorization.
Key Features:
```python
from iris_vector_rag import create_pipeline

# Enable IRIS EMBEDDING support
pipeline = create_pipeline(
    'basic',
    embedding_config='medical_embeddings_v1'
)

# Documents auto-vectorize on INSERT
pipeline.load_documents(documents=docs)
```
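To see the caching effect described above, a rough timing sketch (illustrative only; absolute numbers depend on the model and hardware):

```python
import time

from iris_rag.core.models import Document
from iris_vector_rag import create_pipeline

pipeline = create_pipeline('basic', embedding_config='medical_embeddings_v1')

batch_one = [Document(page_content="First batch of text.", metadata={})]
batch_two = [Document(page_content="Second batch of text.", metadata={})]

start = time.perf_counter()
pipeline.load_documents(documents=batch_one)   # first call loads the embedding model
print(f"first batch:  {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
pipeline.load_documents(documents=batch_two)   # later calls reuse the cached model
print(f"second batch: {time.perf_counter() - start:.2f}s")
```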
**Complete IRIS EMBEDDING Guide →** - Configuration, performance tuning, multi-field vectorization, troubleshooting
Expose RAG pipelines as MCP tools for Claude Desktop and other MCP clients, enabling conversational RAG workflows where Claude queries your documents mid-conversation.
```bash
# Start MCP server
python -m iris_vector_rag.mcp
```
All pipelines available as MCP tools: `rag_basic`, `rag_basic_rerank`, `rag_crag`, `rag_graphrag`, `rag_multi_query_rrf`, `rag_pylate_colbert`.
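To exercise the server outside Claude Desktop, a minimal client sketch using the official `mcp` Python SDK; the `query` argument name is an assumption, so list the tools first to confirm each tool's input schema:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(command="python", args=["-m", "iris_vector_rag.mcp"])

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Available tools:", [tool.name for tool in tools.tools])
            # The 'query' argument name is assumed; check the tool schema above.
            result = await session.call_tool("rag_basic", {"query": "What is RAG?"})
            print(result)

asyncio.run(main())
```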
**Complete MCP Integration Guide →** - Claude Desktop setup, configuration, testing, production deployment
Framework-first design with abstract base classes (`RAGPipeline`, `VectorStore`) and concrete implementations for six production-ready pipelines.
Key Components: Core abstractions, pipeline implementations, IRIS vector store, MCP server, REST API, validation framework.
**Comprehensive Architecture Guide →** - System design, component interactions, extension points
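As a hypothetical illustration of the extension point, a toy class that mirrors the shared `load_documents`/`query` surface; a real extension would inherit from `RAGPipeline` and follow the contract in the architecture guide:

```python
from iris_rag.core.models import Document


class KeywordOnlyPipeline:  # illustrative stand-in; subclass RAGPipeline in practice
    """Toy strategy: rank documents by naive keyword overlap with the query."""

    def __init__(self):
        self.docs: list[Document] = []

    def load_documents(self, documents):
        self.docs.extend(documents)

    def query(self, query: str, top_k: int = 5, **kwargs):
        terms = set(query.lower().split())
        scored = sorted(
            self.docs,
            key=lambda d: len(terms & set(d.page_content.lower().split())),
            reverse=True,
        )
        return {"query": query, "retrieved_documents": scored[:top_k]}
```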
Comprehensive documentation for every use case.
```bash
make test                  # Run comprehensive test suite
pytest tests/unit/         # Unit tests
pytest tests/integration/  # Integration tests
```
This implementation is based on peer-reviewed research.
We welcome contributions! See CONTRIBUTING.md for development setup, testing guidelines, and pull request process.
MIT License - see https://github.com/intersystems-community/iris-vector-rag/blob/main/LICENSE for details.