Configuration Reference¶
Overview¶
Membrain is configured entirely through environment variables and optional .env files. The configuration system is built on pydantic-settings, which provides typed, validated settings with sensible defaults.
All settings are defined in a single Settings class located at src/membrain/config.py. Because no env_prefix is configured, environment variable names match the uppercase version of each field name directly (e.g., the host field is set via the HOST environment variable). The one exception is upstream_anthropic_url, which uses the alias UPSTREAM_ANTHROPIC_URL to avoid colliding with ANTHROPIC_BASE_URL (set by Claude Code to point at the proxy itself).
Settings are loaded once at application startup via Settings() in the lifespan function. The .env file is read automatically from the working directory.
Configuration Precedence¶
Settings are resolved in the following order (highest priority first):
- Environment variables -- explicitly set in the shell or container runtime
- `.env` file -- key-value pairs in a `.env` file in the working directory
- Defaults -- hardcoded defaults in the `Settings` class
The `model_config` also sets `extra = "ignore"`, so unrecognized environment variables are silently ignored rather than causing validation errors.
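The resolution order can be sketched in plain Python. This is an illustration of the precedence rules only, not pydantic-settings' actual implementation:

```python
import os

DEFAULTS = {"HOST": "0.0.0.0", "PORT": "8000"}  # hardcoded defaults

def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            pairs[key.strip()] = value.strip()
    return pairs

def resolve(key, dotenv):
    """An explicit environment variable beats .env, which beats the default."""
    if key in os.environ:
        return os.environ[key]
    if key in dotenv:
        return dotenv[key]
    return DEFAULTS.get(key)

dotenv = parse_dotenv("PORT=9000\n# comment line\n")
print(resolve("PORT", dotenv))   # .env overrides the default
os.environ["PORT"] = "8080"
print(resolve("PORT", dotenv))   # an explicit env var wins over .env
```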
Configuration Table¶
Core / Server¶
Settings that control the HTTP server and general gateway behavior.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `host` | `HOST` | `str` | `"0.0.0.0"` | Address the HTTP server binds to. Use `0.0.0.0` to listen on all interfaces or `127.0.0.1` for localhost only. |
| `port` | `PORT` | `int` | `8000` | Port the HTTP server listens on. The Anthropic proxy (`/v1/messages`) and OpenAI-compatible endpoint (`/v1/chat/completions`) are both served on this port. |
| `default_provider` | `DEFAULT_PROVIDER` | `str` | `"claude_cli"` | Default LLM provider when none is specified. Options: `"claude_cli"`, `"openai"`, `"anthropic"`, `"ollama"`, `"litellm"`. |
| `default_model` | `DEFAULT_MODEL` | `str` | `"sonnet"` | Default model name when none is specified in a request. |
Database¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `database_url` | `DATABASE_URL` | `str` | `""` (empty) | PostgreSQL connection string using asyncpg. When empty, the gateway runs in in-memory mode (no persistence, no knowledge system, no auth). Example: `postgresql+asyncpg://membrain:membrain@localhost:5432/membrain`. |
Redis / Cache¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `redis_url` | `REDIS_URL` | `str` | `""` (empty) | Redis connection URL. When empty, caching, rate limiting, and budget enforcement are disabled. Example: `redis://localhost:6379`. Supports `rediss://` for TLS. |
Provider API Keys¶
API keys for upstream LLM providers. Add keys for the providers you want to route requests to. Keys are not required if you only use local models (Ollama) or non-proxy features.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `openai_api_key` | `OPENAI_API_KEY` | `str \| None` | `None` | OpenAI API key (e.g., `sk-...`). Enables the OpenAI provider and GPT model family. Also used by the `openai` embedding backend. |
| `anthropic_api_key` | `ANTHROPIC_API_KEY` | `str \| None` | `None` | Anthropic API key (e.g., `sk-ant-...`). Enables the Anthropic provider and Claude model family. |
| `google_api_key` | `GOOGLE_API_KEY` | `str \| None` | `None` | Google API key. Enables Google/Gemini models when used with the LiteLLM provider. |
Anthropic Proxy¶
Settings specific to the Anthropic-compatible proxy endpoint (/v1/messages).
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `upstream_anthropic_url` | `UPSTREAM_ANTHROPIC_URL` | `str` | `"https://api.anthropic.com"` | Base URL for the upstream Anthropic API. This uses an explicit alias to avoid collision with `ANTHROPIC_BASE_URL`, which Claude Code sets to point at this proxy. Change this if you need to route through a different Anthropic endpoint. |
| `anthropic_proxy_timeout` | `ANTHROPIC_PROXY_TIMEOUT` | `float` | `300.0` | Timeout in seconds for upstream Anthropic API requests. Applies to both streaming and non-streaming calls. |
Knowledge / Embeddings¶
Settings that control the knowledge system, including embedding generation, semantic search, summarization, topic tagging, deduplication, and content lifecycle.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `embedding_backend` | `EMBEDDING_BACKEND` | `str` | `"local"` | Embedding model backend. `"local"` uses sentence-transformers (all-MiniLM-L6-v2, runs on CPU, no API calls). `"openai"` uses the OpenAI embeddings API (requires `OPENAI_API_KEY`). |
| `embedding_dimension` | `EMBEDDING_DIMENSION` | `int` | `384` | Dimensionality of embedding vectors. Must match the backend: 384 for local sentence-transformers, 1536 for OpenAI text-embedding-ada-002. Mismatches will cause pgvector errors. |
| `knowledge_similarity_threshold` | `KNOWLEDGE_SIMILARITY_THRESHOLD` | `float` | `0.3` | Minimum cosine similarity score for knowledge entries to be injected as context into LLM requests. Lower values return more (potentially less relevant) results; higher values are more selective. Range: 0.0 to 1.0. |
| `knowledge_summarization` | `KNOWLEDGE_SUMMARIZATION` | `str` | `"off"` | Knowledge summarization mode. `"off"` stores entries verbatim. `"basic"` produces concise summaries. `"aggressive"` produces minimal bullet-point summaries. Summarization reduces storage and improves retrieval but loses some detail. |
| `knowledge_ttl_days` | `KNOWLEDGE_TTL_DAYS` | `int` | `0` | Default time-to-live in days for knowledge entries. Entries older than this are considered stale and down-weighted in search results (see `knowledge_stale_weight`). 0 means entries never expire. |
| `knowledge_stale_weight` | `KNOWLEDGE_STALE_WEIGHT` | `float` | `0.3` | Weight multiplier applied to stale knowledge entries during search ranking. A value of 0.3 means stale entries contribute 30% of their original relevance score. Range: 0.0 (completely ignore stale) to 1.0 (no penalty). |
| `knowledge_dedup_similarity_threshold` | `KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD` | `float` | `0.95` | Cosine similarity threshold for near-duplicate detection when adding knowledge entries. Entries with similarity above this threshold to an existing entry are rejected as duplicates. Range: 0.0 to 1.0. |
| `knowledge_auto_topics` | `KNOWLEDGE_AUTO_TOPICS` | `bool` | `False` | Enable automatic topic extraction for knowledge entries. When `True`, new entries are tagged with topics using keyword-based extraction. Set via `KNOWLEDGE_AUTO_TOPICS=true` or `KNOWLEDGE_AUTO_TOPICS=1`. |
| `knowledge_consolidation_schedule` | `KNOWLEDGE_CONSOLIDATION_SCHEDULE` | `str` | `""` (empty) | Cron expression for automatic knowledge consolidation (dedup, stale removal, re-ranking). Empty string means consolidation is manual only (triggered via API). Example: `"0 2 * * *"` for daily at 2 AM. |
| `knowledge_source_ttl_overrides` | `KNOWLEDGE_SOURCE_TTL_OVERRIDES` | `dict` | `{}` (empty) | Per-source TTL overrides in days. Allows different content sources to have different expiration periods. Example (as JSON env var): `{"auto_extracted": 30, "api_manual": 90}`. |
| `knowledge_source_trust_levels` | `KNOWLEDGE_SOURCE_TRUST_LEVELS` | `dict` | `{}` (empty) | Per-source trust level multipliers for search ranking. Higher trust means entries from that source rank higher. Example (as JSON env var): `{"auto_extracted": 0.7, "api_manual": 1.0}`. Range per value: 0.0 to 1.0. |
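Taken together, the similarity threshold, TTL, and stale weight describe a ranking rule that can be sketched as follows. This illustrates the documented semantics only, not the actual pgvector-backed search implementation:

```python
def rank_results(entries, threshold=0.3, ttl_days=0, stale_weight=0.3):
    """entries: list of (cosine_similarity, age_in_days) pairs.
    Drop entries below the similarity threshold, then down-weight
    entries older than the TTL (ttl_days=0 means nothing goes stale)."""
    scores = []
    for similarity, age_days in entries:
        if similarity < threshold:
            continue  # too dissimilar to inject as context
        score = similarity
        if ttl_days and age_days > ttl_days:
            score *= stale_weight  # stale entry keeps only this fraction
        scores.append(score)
    return sorted(scores, reverse=True)
```

With the defaults plus `KNOWLEDGE_TTL_DAYS=90`, a fresh 0.8-similarity entry outranks a 100-day-old 0.5-similarity entry (scored 0.5 × 0.3 = 0.15), and a 0.2-similarity entry is filtered out entirely.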
Rate Limiting¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `rate_limit_rpm` | `RATE_LIMIT_RPM` | `int` | `60` | Maximum requests per minute per API key. Requires Redis to be configured. Set to 0 to disable rate limiting entirely. Per-key overrides can be configured via the admin API. |
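The per-key RPM semantics can be illustrated with a minimal fixed-window counter. This sketch is single-process only; the real gateway keeps its counters in Redis so that limits hold across workers:

```python
import time

class FixedWindowLimiter:
    def __init__(self, rpm):
        self.rpm = rpm
        self.windows = {}  # api_key -> (minute_window, request_count)

    def allow(self, api_key, now=None):
        """Return True if this request fits in the key's per-minute budget."""
        if self.rpm == 0:
            return True  # RATE_LIMIT_RPM=0 disables limiting
        minute = int((time.time() if now is None else now) // 60)
        start, count = self.windows.get(api_key, (minute, 0))
        if start != minute:
            start, count = minute, 0  # new window, reset the counter
        if count >= self.rpm:
            return False
        self.windows[api_key] = (start, count + 1)
        return True
```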
Budget Enforcement¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `budget_daily_limit_usd` | `BUDGET_DAILY_LIMIT_USD` | `float` | `0.0` | Maximum spend in USD per API key per day. Requires Redis to be configured. Set to 0.0 to disable daily budget limits. |
| `budget_monthly_limit_usd` | `BUDGET_MONTHLY_LIMIT_USD` | `float` | `0.0` | Maximum spend in USD per API key per month. Requires Redis to be configured. Set to 0.0 to disable monthly budget limits. |
Ollama (Local LLM)¶
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `ollama_url` | `OLLAMA_URL` | `str` | `"http://localhost:11434"` | Base URL of the Ollama server for local LLM inference. The Ollama provider is auto-registered when this URL is reachable. |
Transparent Network Proxy¶
Settings for the TLS-intercepting transparent proxy that enables network-level PII scanning.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `proxy_mode` | `PROXY_MODE` | `str` | `"application"` | Proxy operation mode. `"application"` runs only the HTTP API gateway (default). `"network"` runs only the transparent TLS proxy. `"hybrid"` runs both simultaneously. |
| `ca_cert_path` | `CA_CERT_PATH` | `str` | `""` (empty) | Filesystem path to the organization CA certificate in PEM format. Required for network and hybrid proxy modes. This certificate is used to sign dynamically generated TLS certificates for intercepted connections. |
| `ca_key_path` | `CA_KEY_PATH` | `str` | `""` (empty) | Filesystem path to the organization CA private key in PEM format. Required for network and hybrid proxy modes. Must correspond to the certificate at `CA_CERT_PATH`. |
| `proxy_port` | `PROXY_PORT` | `int` | `8443` | Port the transparent TLS proxy listens on. Only used when `PROXY_MODE` is `"network"` or `"hybrid"`. |
| `proxy_listen` | `PROXY_LISTEN` | `str` | `"0.0.0.0"` | Address the transparent proxy binds to. Only used when `PROXY_MODE` is `"network"` or `"hybrid"`. |
Vault / Secret Storage¶
Settings for the pluggable secret storage backend used to manage API keys and other sensitive values.
| Field | Environment Variable | Type | Default | Description |
|---|---|---|---|---|
| `vault_backend` | `VAULT_BACKEND` | `str` | `"env"` | Secret storage backend. `"env"` reads secrets from environment variables (simple, no external dependencies). `"hashicorp"` uses HashiCorp Vault KV v2 (requires Vault server). |
| `vault_addr` | `VAULT_ADDR` | `str` | `""` (empty) | HashiCorp Vault server address. Only used when `VAULT_BACKEND=hashicorp`. Example: `http://127.0.0.1:8200`. |
| `vault_token` | `VAULT_TOKEN` | `str` | `""` (empty) | HashiCorp Vault authentication token. Only used when `VAULT_BACKEND=hashicorp`. In production, use a renewable token or AppRole auth. |
| `vault_mount` | `VAULT_MOUNT` | `str` | `"secret"` | HashiCorp Vault KV v2 mount point. Only used when `VAULT_BACKEND=hashicorp`. |
| `vault_path_prefix` | `VAULT_PATH_PREFIX` | `str` | `"membrain"` | Path prefix inside the Vault mount for Membrain secrets. Only used when `VAULT_BACKEND=hashicorp`. Secrets are stored at `<mount>/data/<prefix>/<key>`. |
PII Detection¶
PII detection is not configured via environment variables in the Settings class. Instead, the PII scanner supports three detection modes configured programmatically:
- `regex` (default) -- Pattern-based detection using 25+ built-in regex patterns covering emails, SSNs, credit cards, phone numbers, IP addresses, API keys (OpenAI, GitHub, AWS, Stripe, Anthropic, GCP, Slack), UUIDs, JWTs, private keys, database URLs, IBANs, MAC addresses, dates of birth, passports, US addresses, connection strings, and bearer tokens.
- `ner` -- ML-based named entity recognition using the `dslim/bert-base-NER` model from HuggingFace. Detects person names, organizations, locations, and miscellaneous entities.
- `hybrid` -- Runs both regex and NER, with regex taking precedence on overlapping spans.
Custom PII patterns can be passed programmatically when constructing a PIIScanner instance.
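To give a flavor of what regex mode does, here is a toy scanner with three illustrative patterns. These patterns and names are hypothetical simplifications; the built-in set is larger and more carefully tuned:

```python
import re

# Illustrative patterns only; the real scanner ships 25+ tuned patterns.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
}

def scan(text):
    """Return (label, matched_text, span) for every pattern hit."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(), match.span()))
    return findings
```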
Alerting¶
The alerting system (alert rules, channels, and thresholds) is configured programmatically rather than through environment variables. Alert rules are defined as AlertRule instances with:
- `name` -- Rule identifier
- `metric` -- Metric name to monitor
- `threshold` -- Value that triggers the alert
- `window_seconds` -- Time window for metric aggregation
- `channel` -- Channel name to deliver the alert to
Notification channels include:
- `WebhookChannel` -- Sends JSON POST payloads to a configured URL
- `SlackChannel` -- Sends formatted messages to a Slack incoming webhook
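A rule carrying those fields might look like the following sketch. The field names come from the list above; the evaluation logic and the example values are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class AlertRule:
    name: str            # rule identifier
    metric: str          # metric name to monitor
    threshold: float     # value that triggers the alert
    window_seconds: int  # time window for metric aggregation
    channel: str         # channel name to deliver the alert to

def should_fire(rule: AlertRule, aggregated_value: float) -> bool:
    """Fire when the metric, aggregated over the window, crosses the threshold."""
    return aggregated_value >= rule.threshold

# Hypothetical rule: alert ops when errors exceed 10/min over 5 minutes.
rule = AlertRule("high-error-rate", "errors_per_minute", 10.0, 300, "ops-slack")
```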
Telemetry¶
Telemetry metrics (counters, histograms, gauges) are collected automatically by the MetricsMiddleware ASGI middleware and exposed at the /metrics endpoint in Prometheus exposition format. Telemetry is always enabled and requires no configuration.
Routing¶
The intelligent model router resolves "auto" model requests to concrete models based on tier, privacy, and cost. Tier mappings are defined in src/membrain/routing/router.py:
| Tier | Models |
|---|---|
| `fast` | `gpt-4o-mini`, `claude-haiku-4-5-20251001` |
| `balanced` | `gpt-4o`, `claude-sonnet-4-20250514` |
| `best` | `o1`, `claude-opus-4-20250514` |
Routing behavior (tier selection, privacy constraints, fallback chains) is controlled via request parameters rather than environment variables.
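Based on the tier table, "auto" resolution can be sketched roughly as below. This is a simplification; the actual router in src/membrain/routing/router.py also handles privacy constraints and fallback chains:

```python
# Tier mappings from the table above.
TIERS = {
    "fast": ["gpt-4o-mini", "claude-haiku-4-5-20251001"],
    "balanced": ["gpt-4o", "claude-sonnet-4-20250514"],
    "best": ["o1", "claude-opus-4-20250514"],
}

def resolve_auto(tier="balanced", allowed_providers=None):
    """Return the first model in the tier whose provider is allowed."""
    for model in TIERS[tier]:
        provider = "anthropic" if model.startswith("claude") else "openai"
        if allowed_providers is None or provider in allowed_providers:
            return model
    raise ValueError(f"no model in tier {tier!r} satisfies the constraints")
```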
Example .env File¶
Below is a complete .env file showing every setting with its default value. Uncomment and modify values as needed.
```shell
# =============================================================================
# Membrain Gateway Configuration
# =============================================================================
# Copy this file to .env and adjust values for your environment.
# All values shown are the defaults unless otherwise noted.

# -----------------------------------------------------------------------------
# Core / Server
# -----------------------------------------------------------------------------
HOST=0.0.0.0
PORT=8000
DEFAULT_PROVIDER=claude_cli
DEFAULT_MODEL=sonnet

# -----------------------------------------------------------------------------
# Database (empty = in-memory mode, no persistence)
# -----------------------------------------------------------------------------
# DATABASE_URL=postgresql+asyncpg://membrain:membrain@localhost:5432/membrain

# -----------------------------------------------------------------------------
# Redis (empty = no caching, rate limiting, or budgets)
# -----------------------------------------------------------------------------
# REDIS_URL=redis://localhost:6379

# -----------------------------------------------------------------------------
# Cloud Provider API Keys (add the providers you use)
# -----------------------------------------------------------------------------
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# GOOGLE_API_KEY=...

# -----------------------------------------------------------------------------
# Anthropic Proxy
# -----------------------------------------------------------------------------
# UPSTREAM_ANTHROPIC_URL=https://api.anthropic.com
# ANTHROPIC_PROXY_TIMEOUT=300.0

# -----------------------------------------------------------------------------
# Knowledge / Embeddings
# -----------------------------------------------------------------------------
# EMBEDDING_BACKEND=local
# EMBEDDING_DIMENSION=384
# KNOWLEDGE_SIMILARITY_THRESHOLD=0.3
# KNOWLEDGE_SUMMARIZATION=off
# KNOWLEDGE_TTL_DAYS=0
# KNOWLEDGE_STALE_WEIGHT=0.3
# KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD=0.95
# KNOWLEDGE_AUTO_TOPICS=false
# KNOWLEDGE_CONSOLIDATION_SCHEDULE=
# KNOWLEDGE_SOURCE_TTL_OVERRIDES={}
# KNOWLEDGE_SOURCE_TRUST_LEVELS={}

# -----------------------------------------------------------------------------
# Rate Limiting (requires Redis; 0 = disabled)
# -----------------------------------------------------------------------------
# RATE_LIMIT_RPM=60

# -----------------------------------------------------------------------------
# Budget Enforcement (requires Redis; 0.0 = disabled)
# -----------------------------------------------------------------------------
# BUDGET_DAILY_LIMIT_USD=0.0
# BUDGET_MONTHLY_LIMIT_USD=0.0

# -----------------------------------------------------------------------------
# Ollama (local LLM)
# -----------------------------------------------------------------------------
# OLLAMA_URL=http://localhost:11434

# -----------------------------------------------------------------------------
# Transparent Network Proxy
# -----------------------------------------------------------------------------
# PROXY_MODE=application
# CA_CERT_PATH=
# CA_KEY_PATH=
# PROXY_PORT=8443
# PROXY_LISTEN=0.0.0.0

# -----------------------------------------------------------------------------
# Vault / Secret Storage
# -----------------------------------------------------------------------------
# VAULT_BACKEND=env
# VAULT_ADDR=
# VAULT_TOKEN=
# VAULT_MOUNT=secret
# VAULT_PATH_PREFIX=membrain
```
Common Configuration Scenarios¶
Minimal Development Setup¶
For local development with Claude CLI only (no database, no Redis):
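One minimal sketch of such a `.env` (these values match the defaults except `HOST`, which is narrowed to localhost for development, so an even emptier file also works):

```shell
# .env -- minimal local setup: the defaults already target the Claude CLI provider
HOST=127.0.0.1
PORT=8000
DEFAULT_PROVIDER=claude_cli
DEFAULT_MODEL=sonnet
```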
Then point Claude Code at the proxy:
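Claude Code reads `ANTHROPIC_BASE_URL`, so pointing it at the gateway's default host and port looks like this (adjust the host and port if you changed them):

```shell
export ANTHROPIC_BASE_URL=http://localhost:8000
```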
Full Production Setup¶
```shell
# Server
HOST=0.0.0.0
PORT=8000

# Database
DATABASE_URL=postgresql+asyncpg://membrain:secret@db.internal:5432/membrain

# Redis
REDIS_URL=redis://redis.internal:6379

# Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
UPSTREAM_ANTHROPIC_URL=https://api.anthropic.com

# Knowledge (OpenAI embeddings for better quality)
EMBEDDING_BACKEND=openai
EMBEDDING_DIMENSION=1536
KNOWLEDGE_SIMILARITY_THRESHOLD=0.4
KNOWLEDGE_SUMMARIZATION=basic
KNOWLEDGE_TTL_DAYS=90
KNOWLEDGE_STALE_WEIGHT=0.3
KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD=0.95
KNOWLEDGE_AUTO_TOPICS=true
KNOWLEDGE_CONSOLIDATION_SCHEDULE=0 2 * * *

# Rate limiting and budgets
RATE_LIMIT_RPM=120
BUDGET_DAILY_LIMIT_USD=50.0
BUDGET_MONTHLY_LIMIT_USD=500.0

# Vault (HashiCorp)
VAULT_BACKEND=hashicorp
VAULT_ADDR=https://vault.internal:8200
VAULT_TOKEN=hvs.xxxxx
VAULT_MOUNT=secret
VAULT_PATH_PREFIX=membrain
```
Network Proxy Mode (TLS Interception)¶
```shell
PROXY_MODE=hybrid
CA_CERT_PATH=/etc/membrain/ca.pem
CA_KEY_PATH=/etc/membrain/ca-key.pem
PROXY_PORT=8443
PROXY_LISTEN=0.0.0.0
```