Configuration Reference¶
Overview¶
Membrain is configured entirely through environment variables and optional .env files. The configuration system is built on pydantic-settings, which provides typed, validated settings with sensible defaults.
All settings are defined in a single Settings class located at src/membrain/config.py. Because no env_prefix is configured, environment variable names match the uppercase version of each field name directly (e.g., the host field is set via the HOST environment variable). The one exception is upstream_anthropic_url, which uses the alias UPSTREAM_ANTHROPIC_URL to avoid colliding with ANTHROPIC_BASE_URL (set by Claude Code to point at the proxy itself).
Settings are loaded once at application startup via Settings() in the lifespan function. The .env file is read automatically from the working directory.
Configuration Precedence¶
Settings are resolved in the following order (highest priority first):
- Environment variables -- explicitly set in the shell or container runtime
- `.env` file -- key-value pairs in a `.env` file in the working directory
- Defaults -- hardcoded defaults in the `Settings` class
The `model_config` also sets `extra = "ignore"`, so unrecognized environment variables are silently ignored rather than causing validation errors.
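The resolution order can be sketched in plain Python. This is an illustration of the precedence rules only, not pydantic-settings' actual implementation:

```python
import os

DEFAULTS = {"HOST": "0.0.0.0", "PORT": "8000"}  # hardcoded defaults

def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            pairs[key.strip()] = value.strip()
    return pairs

def resolve(key, dotenv):
    """An explicit environment variable beats .env, which beats the default."""
    if key in os.environ:
        return os.environ[key]
    if key in dotenv:
        return dotenv[key]
    return DEFAULTS.get(key)

dotenv = parse_dotenv("PORT=9000\n# comment line\n")
print(resolve("PORT", dotenv))   # .env overrides the default
os.environ["PORT"] = "8080"
print(resolve("PORT", dotenv))   # an explicit env var wins over .env
```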
Configuration Table¶
Core / Server¶
Settings that control the HTTP server and general gateway behavior.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `host` | `HOST` | `str` | `"0.0.0.0"` | Address the HTTP server binds to. Use `0.0.0.0` to listen on all interfaces or `127.0.0.1` for localhost only. |
| `port` | `PORT` | `int` | `8000` | Port the HTTP server listens on. The Anthropic proxy (`/v1/messages`) and OpenAI-compatible endpoint (`/v1/chat/completions`) are both served on this port. |
| `default_provider` | `DEFAULT_PROVIDER` | `str` | `"claude_cli"` | Default LLM provider when none is specified. Options: `"claude_cli"`, `"openai"`, `"anthropic"`, `"ollama"`, `"litellm"`. |
| `default_model` | `DEFAULT_MODEL` | `str` | `"sonnet"` | Default model name when none is specified in a request. |
Database¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `database_url` | `DATABASE_URL` | `str` | `""` (empty) | PostgreSQL connection string using asyncpg. When empty, the gateway runs in in-memory mode (no persistence, no knowledge system, no auth). Example: `postgresql+asyncpg://membrain:membrain@localhost:5432/membrain`. |
Redis / Cache¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `redis_url` | `REDIS_URL` | `str` | `""` (empty) | Redis connection URL. When empty, caching, rate limiting, and budget enforcement are disabled. Example: `redis://localhost:6379`. Supports `rediss://` for TLS. |
Provider API Keys¶
API keys for upstream LLM providers. Add keys for the providers you want to route requests to. Keys are not required if you only use local models (Ollama) or non-proxy features.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `openai_api_key` | `OPENAI_API_KEY` | `str \| None` | `None` | OpenAI API key (e.g., `sk-...`). Enables the OpenAI provider and GPT model family. Also used by the `openai` embedding backend. |
| `anthropic_api_key` | `ANTHROPIC_API_KEY` | `str \| None` | `None` | Anthropic API key (e.g., `sk-ant-...`). Enables the Anthropic provider and Claude model family. |
| `google_api_key` | `GOOGLE_API_KEY` | `str \| None` | `None` | Google API key. Enables Google/Gemini models when used with the LiteLLM provider. |
Anthropic Proxy¶
Settings specific to the Anthropic-compatible proxy endpoint (/v1/messages).
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `upstream_anthropic_url` | `UPSTREAM_ANTHROPIC_URL` | `str` | `"https://api.anthropic.com"` | Base URL for the upstream Anthropic API. This uses an explicit alias to avoid collision with `ANTHROPIC_BASE_URL`, which Claude Code sets to point at this proxy. Change this if you need to route through a different Anthropic endpoint. |
| `anthropic_proxy_timeout` | `ANTHROPIC_PROXY_TIMEOUT` | `float` | `300.0` | Timeout in seconds for upstream Anthropic API requests. Applies to both streaming and non-streaming calls. |
Knowledge / Embeddings¶
Settings that control the knowledge system, including embedding generation, semantic search, summarization, topic tagging, deduplication, and content lifecycle.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `embedding_backend` | `EMBEDDING_BACKEND` | `str` | `"local"` | Embedding model backend. `"local"` uses sentence-transformers (all-MiniLM-L6-v2, runs on CPU, no API calls). `"openai"` uses the OpenAI embeddings API (requires `OPENAI_API_KEY`). |
| `embedding_dimension` | `EMBEDDING_DIMENSION` | `int` | `384` | Dimensionality of embedding vectors. Must match the backend: 384 for local sentence-transformers, 1536 for OpenAI text-embedding-ada-002. Mismatches will cause pgvector errors. |
| `knowledge_similarity_threshold` | `KNOWLEDGE_SIMILARITY_THRESHOLD` | `float` | `0.3` | Minimum cosine similarity score for knowledge entries to be injected as context into LLM requests. Lower values return more (potentially less relevant) results; higher values are more selective. Range: 0.0 to 1.0. |
| `knowledge_summarization` | `KNOWLEDGE_SUMMARIZATION` | `str` | `"off"` | Knowledge summarization mode. `"off"` stores entries verbatim. `"basic"` produces concise summaries. `"aggressive"` produces minimal bullet-point summaries. Summarization reduces storage and improves retrieval but loses some detail. |
| `knowledge_ttl_days` | `KNOWLEDGE_TTL_DAYS` | `int` | `0` | Default time-to-live in days for knowledge entries. Entries older than this are considered stale and down-weighted in search results (see `knowledge_stale_weight`). 0 means entries never expire. |
| `knowledge_stale_weight` | `KNOWLEDGE_STALE_WEIGHT` | `float` | `0.3` | Weight multiplier applied to stale knowledge entries during search ranking. A value of 0.3 means stale entries contribute 30% of their original relevance score. Range: 0.0 (completely ignore stale) to 1.0 (no penalty). |
| `knowledge_dedup_similarity_threshold` | `KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD` | `float` | `0.95` | Cosine similarity threshold for near-duplicate detection when adding knowledge entries. Entries with similarity above this threshold to an existing entry are rejected as duplicates. Range: 0.0 to 1.0. |
| `knowledge_auto_topics` | `KNOWLEDGE_AUTO_TOPICS` | `bool` | `False` | Enable automatic topic extraction for knowledge entries. When `True`, new entries are tagged with topics using keyword-based extraction. Set via `KNOWLEDGE_AUTO_TOPICS=true` or `KNOWLEDGE_AUTO_TOPICS=1`. |
| `knowledge_consolidation_schedule` | `KNOWLEDGE_CONSOLIDATION_SCHEDULE` | `str` | `""` (empty) | Cron expression for automatic knowledge consolidation (dedup, stale removal, re-ranking). Empty string means consolidation is manual only (triggered via API). Example: `"0 2 * * *"` for daily at 2 AM. |
| `knowledge_source_ttl_overrides` | `KNOWLEDGE_SOURCE_TTL_OVERRIDES` | `dict` | `{}` (empty) | Per-source TTL overrides in days. Allows different content sources to have different expiration periods. Example (as JSON env var): `{"auto_extracted": 30, "api_manual": 90}`. |
| `knowledge_source_trust_levels` | `KNOWLEDGE_SOURCE_TRUST_LEVELS` | `dict` | `{}` (empty) | Per-source trust level multipliers for search ranking. Higher trust means entries from that source rank higher. Example (as JSON env var): `{"auto_extracted": 0.7, "api_manual": 1.0}`. Range per value: 0.0 to 1.0. |
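Taken together, the similarity threshold, TTL, and stale weight describe a ranking rule that can be sketched as follows. This illustrates the documented semantics only, not the actual pgvector-backed search implementation:

```python
def rank_results(entries, threshold=0.3, ttl_days=0, stale_weight=0.3):
    """entries: list of (cosine_similarity, age_in_days) pairs.
    Drop entries below the similarity threshold, then down-weight
    entries older than the TTL (ttl_days=0 means nothing goes stale)."""
    scores = []
    for similarity, age_days in entries:
        if similarity < threshold:
            continue  # too dissimilar to inject as context
        score = similarity
        if ttl_days and age_days > ttl_days:
            score *= stale_weight  # stale entry keeps only this fraction
        scores.append(score)
    return sorted(scores, reverse=True)
```

With the defaults plus `KNOWLEDGE_TTL_DAYS=90`, a fresh 0.8-similarity entry outranks a 100-day-old 0.5-similarity entry (scored 0.5 × 0.3 = 0.15), and a 0.2-similarity entry is filtered out entirely.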
Rate Limiting¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `rate_limit_rpm` | `RATE_LIMIT_RPM` | `int` | `60` | Maximum requests per minute per API key. Requires Redis to be configured. Set to 0 to disable rate limiting entirely. Per-key overrides can be configured via the admin API. |
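The per-key RPM semantics can be illustrated with a minimal fixed-window counter. This sketch is single-process only; the real gateway keeps its counters in Redis so that limits hold across workers:

```python
import time

class FixedWindowLimiter:
    def __init__(self, rpm):
        self.rpm = rpm
        self.windows = {}  # api_key -> (minute_window, request_count)

    def allow(self, api_key, now=None):
        """Return True if this request fits in the key's per-minute budget."""
        if self.rpm == 0:
            return True  # RATE_LIMIT_RPM=0 disables limiting
        minute = int((time.time() if now is None else now) // 60)
        start, count = self.windows.get(api_key, (minute, 0))
        if start != minute:
            start, count = minute, 0  # new window, reset the counter
        if count >= self.rpm:
            return False
        self.windows[api_key] = (start, count + 1)
        return True
```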
Budget Enforcement¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `budget_daily_limit_usd` | `BUDGET_DAILY_LIMIT_USD` | `float` | `0.0` | Maximum spend in USD per API key per day. Requires Redis to be configured. Set to 0.0 to disable daily budget limits. |
| `budget_monthly_limit_usd` | `BUDGET_MONTHLY_LIMIT_USD` | `float` | `0.0` | Maximum spend in USD per API key per month. Requires Redis to be configured. Set to 0.0 to disable monthly budget limits. |
Ollama (Local LLM)¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `ollama_url` | `OLLAMA_URL` | `str` | `"http://localhost:11434"` | Base URL of the Ollama server for local LLM inference. The Ollama provider is auto-registered when this URL is reachable. |
Transparent Network Proxy¶
Settings for the TLS-intercepting transparent proxy that enables network-level PII scanning.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `proxy_mode` | `PROXY_MODE` | `str` | `"application"` | Proxy operation mode. `"application"` runs only the HTTP API gateway (default). `"network"` runs only the transparent TLS proxy. `"hybrid"` runs both simultaneously. |
| `ca_cert_path` | `CA_CERT_PATH` | `str` | `""` (empty) | Filesystem path to the organization CA certificate in PEM format. Required for network and hybrid proxy modes. This certificate is used to sign dynamically generated TLS certificates for intercepted connections. |
| `ca_key_path` | `CA_KEY_PATH` | `str` | `""` (empty) | Filesystem path to the organization CA private key in PEM format. Required for network and hybrid proxy modes. Must correspond to the certificate at `CA_CERT_PATH`. |
| `proxy_port` | `PROXY_PORT` | `int` | `8443` | Port the transparent TLS proxy listens on. Only used when `PROXY_MODE` is `"network"` or `"hybrid"`. |
| `proxy_listen` | `PROXY_LISTEN` | `str` | `"0.0.0.0"` | Address the transparent proxy binds to. Only used when `PROXY_MODE` is `"network"` or `"hybrid"`. |
Vault / Secret Storage¶
Settings for the pluggable secret storage backend used to manage API keys and other sensitive values.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `vault_backend` | `VAULT_BACKEND` | `str` | `"env"` | Secret storage backend. `"env"` reads secrets from environment variables (simple, no external dependencies). `"hashicorp"` uses HashiCorp Vault KV v2 (requires Vault server). |
| `vault_addr` | `VAULT_ADDR` | `str` | `""` (empty) | HashiCorp Vault server address. Only used when `VAULT_BACKEND=hashicorp`. Example: `http://127.0.0.1:8200`. |
| `vault_token` | `VAULT_TOKEN` | `str` | `""` (empty) | HashiCorp Vault authentication token. Only used when `VAULT_BACKEND=hashicorp`. In production, use a renewable token or AppRole auth. |
| `vault_mount` | `VAULT_MOUNT` | `str` | `"secret"` | HashiCorp Vault KV v2 mount point. Only used when `VAULT_BACKEND=hashicorp`. |
| `vault_path_prefix` | `VAULT_PATH_PREFIX` | `str` | `"membrain"` | Path prefix inside the Vault mount for Membrain secrets. Only used when `VAULT_BACKEND=hashicorp`. Secrets are stored at `<mount>/data/<prefix>/<key>`. |
PII Detection¶
PII detection is not configured via environment variables in the Settings class. Instead, the PII scanner supports three detection modes configured programmatically:
- `regex` (default) -- Pattern-based detection using 25+ built-in regex patterns covering emails, SSNs, credit cards, phone numbers, IP addresses, API keys (OpenAI, GitHub, AWS, Stripe, Anthropic, GCP, Slack), UUIDs, JWTs, private keys, database URLs, IBANs, MAC addresses, dates of birth, passports, US addresses, connection strings, and bearer tokens.
- `ner` -- ML-based named entity recognition using the `dslim/bert-base-NER` model from HuggingFace. Detects person names, organizations, locations, and miscellaneous entities.
- `hybrid` -- Runs both regex and NER, with regex taking precedence on overlapping spans.
Custom PII patterns can be passed programmatically when constructing a PIIScanner instance.
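To give a flavor of what regex mode does, here is a toy scanner with three illustrative patterns. These patterns and names are hypothetical simplifications; the built-in set is larger and more carefully tuned:

```python
import re

# Illustrative patterns only; the real scanner ships 25+ tuned patterns.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
}

def scan(text):
    """Return (label, matched_text, span) for every pattern hit."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(), match.span()))
    return findings
```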
Alerting¶
The alerting system (alert rules, channels, and thresholds) is configured programmatically rather than through environment variables. Alert rules are defined as AlertRule instances with:
- `name` -- Rule identifier
- `metric` -- Metric name to monitor
- `threshold` -- Value that triggers the alert
- `window_seconds` -- Time window for metric aggregation
- `channel` -- Channel name to deliver the alert to
Notification channels include:
- `WebhookChannel` -- Sends JSON POST payloads to a configured URL
- `SlackChannel` -- Sends formatted messages to a Slack incoming webhook
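A rule carrying those fields might look like the following sketch. The field names come from the list above; the evaluation logic and the example values are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class AlertRule:
    name: str            # rule identifier
    metric: str          # metric name to monitor
    threshold: float     # value that triggers the alert
    window_seconds: int  # time window for metric aggregation
    channel: str         # channel name to deliver the alert to

def should_fire(rule: AlertRule, aggregated_value: float) -> bool:
    """Fire when the metric, aggregated over the window, crosses the threshold."""
    return aggregated_value >= rule.threshold

# Hypothetical rule: alert ops when errors exceed 10/min over 5 minutes.
rule = AlertRule("high-error-rate", "errors_per_minute", 10.0, 300, "ops-slack")
```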
Telemetry¶
Telemetry metrics (counters, histograms, gauges) are collected automatically by the MetricsMiddleware ASGI middleware and exposed at the /metrics endpoint in Prometheus exposition format. Telemetry is always enabled and requires no configuration.
Routing¶
The intelligent model router resolves "auto" model requests to concrete models based on tier, privacy, and cost. Tier mappings are defined in src/membrain/routing/router.py:
| Tier | Models |
|---|---|
| `fast` | `gpt-4o-mini`, `claude-haiku-4-5-20251001` |
| `balanced` | `gpt-4o`, `claude-sonnet-4-20250514` |
| `best` | `o1`, `claude-opus-4-20250514` |
Routing behavior (tier selection, privacy constraints, fallback chains) is controlled via request parameters rather than environment variables.
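Based on the tier table, "auto" resolution can be sketched roughly as below. This is a simplification; the actual router in src/membrain/routing/router.py also handles privacy constraints and fallback chains:

```python
# Tier mappings from the table above.
TIERS = {
    "fast": ["gpt-4o-mini", "claude-haiku-4-5-20251001"],
    "balanced": ["gpt-4o", "claude-sonnet-4-20250514"],
    "best": ["o1", "claude-opus-4-20250514"],
}

def resolve_auto(tier="balanced", allowed_providers=None):
    """Return the first model in the tier whose provider is allowed."""
    for model in TIERS[tier]:
        provider = "anthropic" if model.startswith("claude") else "openai"
        if allowed_providers is None or provider in allowed_providers:
            return model
    raise ValueError(f"no model in tier {tier!r} satisfies the constraints")
```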
Example .env File¶
Below is a complete .env file showing every setting with its default value. Uncomment and modify values as needed.
```shell
# =============================================================================
# Membrain Gateway Configuration
# =============================================================================
# Copy this file to .env and adjust values for your environment.
# All values shown are the defaults unless otherwise noted.

# -----------------------------------------------------------------------------
# Core / Server
# -----------------------------------------------------------------------------
HOST=0.0.0.0
PORT=8000
DEFAULT_PROVIDER=claude_cli
DEFAULT_MODEL=sonnet

# -----------------------------------------------------------------------------
# Database (empty = in-memory mode, no persistence)
# -----------------------------------------------------------------------------
# DATABASE_URL=postgresql+asyncpg://membrain:membrain@localhost:5432/membrain

# -----------------------------------------------------------------------------
# Redis (empty = no caching, rate limiting, or budgets)
# -----------------------------------------------------------------------------
# REDIS_URL=redis://localhost:6379

# -----------------------------------------------------------------------------
# Cloud Provider API Keys (add the providers you use)
# -----------------------------------------------------------------------------
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# GOOGLE_API_KEY=...

# -----------------------------------------------------------------------------
# Anthropic Proxy
# -----------------------------------------------------------------------------
# UPSTREAM_ANTHROPIC_URL=https://api.anthropic.com
# ANTHROPIC_PROXY_TIMEOUT=300.0

# -----------------------------------------------------------------------------
# Knowledge / Embeddings
# -----------------------------------------------------------------------------
# EMBEDDING_BACKEND=local
# EMBEDDING_DIMENSION=384
# KNOWLEDGE_SIMILARITY_THRESHOLD=0.3
# KNOWLEDGE_SUMMARIZATION=off
# KNOWLEDGE_TTL_DAYS=0
# KNOWLEDGE_STALE_WEIGHT=0.3
# KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD=0.95
# KNOWLEDGE_AUTO_TOPICS=false
# KNOWLEDGE_CONSOLIDATION_SCHEDULE=
# KNOWLEDGE_SOURCE_TTL_OVERRIDES={}
# KNOWLEDGE_SOURCE_TRUST_LEVELS={}

# -----------------------------------------------------------------------------
# Rate Limiting (requires Redis; 0 = disabled)
# -----------------------------------------------------------------------------
# RATE_LIMIT_RPM=60

# -----------------------------------------------------------------------------
# Budget Enforcement (requires Redis; 0.0 = disabled)
# -----------------------------------------------------------------------------
# BUDGET_DAILY_LIMIT_USD=0.0
# BUDGET_MONTHLY_LIMIT_USD=0.0

# -----------------------------------------------------------------------------
# Ollama (local LLM)
# -----------------------------------------------------------------------------
# OLLAMA_URL=http://localhost:11434

# -----------------------------------------------------------------------------
# Transparent Network Proxy
# -----------------------------------------------------------------------------
# PROXY_MODE=application
# CA_CERT_PATH=
# CA_KEY_PATH=
# PROXY_PORT=8443
# PROXY_LISTEN=0.0.0.0

# -----------------------------------------------------------------------------
# Vault / Secret Storage
# -----------------------------------------------------------------------------
# VAULT_BACKEND=env
# VAULT_ADDR=
# VAULT_TOKEN=
# VAULT_MOUNT=secret
# VAULT_PATH_PREFIX=membrain
```
Common Configuration Scenarios¶
Minimal Development Setup¶
For local development with Claude CLI only (no database, no Redis):
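One minimal sketch of such a `.env` (these values match the defaults except `HOST`, which is narrowed to localhost for development, so an even emptier file also works):

```shell
# .env -- minimal local setup: the defaults already target the Claude CLI provider
HOST=127.0.0.1
PORT=8000
DEFAULT_PROVIDER=claude_cli
DEFAULT_MODEL=sonnet
```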
Then point Claude Code at the proxy:
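Claude Code reads `ANTHROPIC_BASE_URL`, so pointing it at the gateway's default host and port looks like this (adjust the host and port if you changed them):

```shell
export ANTHROPIC_BASE_URL=http://localhost:8000
```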
Full Production Setup¶
```shell
# Server
HOST=0.0.0.0
PORT=8000

# Database
DATABASE_URL=postgresql+asyncpg://membrain:secret@db.internal:5432/membrain

# Redis
REDIS_URL=redis://redis.internal:6379

# Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
UPSTREAM_ANTHROPIC_URL=https://api.anthropic.com

# Knowledge (OpenAI embeddings for better quality)
EMBEDDING_BACKEND=openai
EMBEDDING_DIMENSION=1536
KNOWLEDGE_SIMILARITY_THRESHOLD=0.4
KNOWLEDGE_SUMMARIZATION=basic
KNOWLEDGE_TTL_DAYS=90
KNOWLEDGE_STALE_WEIGHT=0.3
KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD=0.95
KNOWLEDGE_AUTO_TOPICS=true
KNOWLEDGE_CONSOLIDATION_SCHEDULE=0 2 * * *

# Rate limiting and budgets
RATE_LIMIT_RPM=120
BUDGET_DAILY_LIMIT_USD=50.0
BUDGET_MONTHLY_LIMIT_USD=500.0

# Vault (HashiCorp)
VAULT_BACKEND=hashicorp
VAULT_ADDR=https://vault.internal:8200
VAULT_TOKEN=hvs.xxxxx
VAULT_MOUNT=secret
VAULT_PATH_PREFIX=membrain
```
Network Proxy Mode (TLS Interception)¶
```shell
PROXY_MODE=hybrid
CA_CERT_PATH=/etc/membrain/ca.pem
CA_KEY_PATH=/etc/membrain/ca-key.pem
PROXY_PORT=8443
PROXY_LISTEN=0.0.0.0
```