Multi-Tenant Setup¶
Configure CheckStream for multiple tenants with isolated policies and backends.
Overview¶
Multi-tenant mode allows a single CheckStream instance to serve multiple clients with:
- Separate LLM backends per tenant
- Independent policies per tenant
- Custom streaming formats
- Isolated audit trails
Enabling Multi-Tenancy¶
Tenant Resolution¶
CheckStream resolves tenants in priority order:
1. Header-Based (Recommended)¶
Client sends:
2. Path Prefix¶
Client sends:
3. API Key Mapping¶
tenants:
resolution:
- api_key_mapping: true
api_key_map:
"sk-acme-123": "acme-corp"
"sk-beta-456": "beta-inc"
4. Combined Resolution¶
tenants:
resolution:
- header: "X-Tenant-ID" # Try header first
- path_prefix: "/tenant/" # Then path
- api_key_mapping: true # Then API key
default_tenant: "default" # Fallback
Tenant Configuration¶
Basic Tenant¶
tenants:
enabled: true
configs:
acme-corp:
backend_url: "https://api.openai.com/v1"
policy_path: "./policies/acme.yaml"
Full Tenant Configuration¶
tenants:
enabled: true
configs:
acme-corp:
# Backend
backend_url: "https://api.openai.com/v1"
backend_timeout_ms: 30000
# Policies
policy_path: "./policies/acme.yaml"
# Streaming format
streaming_format: "openai"
# Thresholds (override global)
thresholds:
safety: 0.9
chunk: 0.8
# Pipeline (override global)
pipeline:
ingress:
classifiers:
- prompt_injection
- custom_acme_classifier
midstream:
classifiers:
- toxicity
# Rate limiting
rate_limit:
requests_per_minute: 1000
tokens_per_minute: 100000
# Audit
audit:
enabled: true
path: "./audit/acme"
Different Backends Per Tenant¶
OpenAI + Anthropic¶
tenants:
configs:
openai-tenant:
backend_url: "https://api.openai.com/v1"
streaming_format: "openai"
anthropic-tenant:
backend_url: "https://api.anthropic.com"
streaming_format: "anthropic"
headers:
anthropic-version: "2024-01-01"
Self-Hosted LLMs¶
tenants:
configs:
vllm-tenant:
backend_url: "http://vllm-server:8000/v1"
streaming_format: "openai"
ollama-tenant:
backend_url: "http://ollama:11434/v1"
streaming_format: "openai"
Per-Tenant Policies¶
Strict Policy (Finance)¶
# policies/finance.yaml
version: "1.0"
name: "finance-policy"
policies:
- name: block_investment_advice
trigger:
classifier: financial_advice
threshold: 0.7
action: stop
regulation: "FCA COBS 9A.2.1R"
Permissive Policy (Internal)¶
# policies/internal.yaml
version: "1.0"
name: "internal-policy"
policies:
- name: log_safety
trigger:
classifier: toxicity
threshold: 0.5
action: log
# No blocking, just logging
Tenant Configuration¶
tenants:
configs:
finance-team:
policy_path: "./policies/finance.yaml"
thresholds:
safety: 0.9
internal-team:
policy_path: "./policies/internal.yaml"
thresholds:
safety: 0.5
Per-Tenant Classifiers¶
Load different classifiers per tenant:
tenants:
configs:
healthcare-tenant:
classifiers:
- toxicity
- pii_detector
- medical_advice # Healthcare-specific
finance-tenant:
classifiers:
- toxicity
- pii_detector
- financial_advice # Finance-specific
Rate Limiting¶
Per-Tenant Limits¶
tenants:
configs:
free-tier:
rate_limit:
requests_per_minute: 60
tokens_per_minute: 10000
pro-tier:
rate_limit:
requests_per_minute: 600
tokens_per_minute: 100000
enterprise:
rate_limit:
requests_per_minute: 6000
tokens_per_minute: 1000000
Response Headers¶
Isolated Audit Trails¶
tenants:
configs:
tenant-a:
audit:
enabled: true
path: "./audit/tenant-a"
retention_days: 90
tenant-b:
audit:
enabled: true
path: "./audit/tenant-b"
retention_days: 365 # Longer retention
Authentication¶
Pass-through (Client's Keys)¶
tenants:
configs:
byok-tenant: # Bring Your Own Key
auth_passthrough: true
# Client provides their own API key
Managed Keys¶
tenants:
configs:
managed-tenant:
auth:
type: bearer
token_env: "TENANT_A_API_KEY"
# CheckStream uses its own key
Metrics Per Tenant¶
Metrics include tenant labels:
checkstream_requests_total{tenant="acme-corp",status="success"} 1234
checkstream_latency_ms{tenant="acme-corp",phase="ingress",quantile="0.95"} 3.2
checkstream_policy_triggers_total{tenant="acme-corp",rule="block_toxicity"} 56
Complete Example¶
server:
host: "0.0.0.0"
port: 8080
tenants:
enabled: true
resolution:
- header: "X-Tenant-ID"
- api_key_mapping: true
default_tenant: "default"
api_key_map:
"sk-acme-prod": "acme-corp"
"sk-beta-test": "beta-inc"
configs:
default:
backend_url: "https://api.openai.com/v1"
policy_path: "./policies/default.yaml"
rate_limit:
requests_per_minute: 100
acme-corp:
backend_url: "https://api.openai.com/v1"
policy_path: "./policies/acme.yaml"
thresholds:
safety: 0.9
chunk: 0.85
rate_limit:
requests_per_minute: 1000
audit:
enabled: true
path: "./audit/acme"
beta-inc:
backend_url: "https://api.anthropic.com"
streaming_format: "anthropic"
policy_path: "./policies/beta.yaml"
headers:
anthropic-version: "2024-01-01"
rate_limit:
requests_per_minute: 500
pipeline:
ingress:
enabled: true
classifiers:
- prompt_injection
midstream:
enabled: true
classifiers:
- toxicity
Testing Multi-Tenancy¶
Test Tenant Resolution¶
# Header-based
curl -H "X-Tenant-ID: acme-corp" http://localhost:8080/v1/chat/completions
# Path-based
curl http://localhost:8080/tenant/acme-corp/v1/chat/completions
# API key mapping
curl -H "Authorization: Bearer sk-acme-prod" http://localhost:8080/v1/chat/completions
Verify Tenant¶
{
"tenant": "acme-corp",
"backend": "https://api.openai.com/v1",
"policy": "acme.yaml",
"rate_limit": {
"requests_per_minute": 1000,
"remaining": 950
}
}
Next Steps¶
- Configuration Reference - Full configuration options
- Policy Engine - Write tenant-specific policies
- Deployment - Scale multi-tenant deployments