Configuration Reference

Overview

Membrain is configured entirely through environment variables and optional .env files. The configuration system is built on pydantic-settings, which provides typed, validated settings with sensible defaults.

All settings are defined in a single Settings class located at src/membrain/config.py. Because no env_prefix is configured, environment variable names match the uppercase version of each field name directly (e.g., the host field is set via the HOST environment variable). The one exception is upstream_anthropic_url, which uses the alias UPSTREAM_ANTHROPIC_URL to avoid colliding with ANTHROPIC_BASE_URL (set by Claude Code to point at the proxy itself).

Settings are loaded once at application startup via Settings() in the lifespan function. The .env file is read automatically from the working directory.


Configuration Precedence

Settings are resolved in the following order (highest priority first):

  1. Environment variables -- explicitly set in the shell or container runtime
  2. .env file -- key-value pairs in a .env file in the working directory
  3. Defaults -- hardcoded defaults in the Settings class

The model_config also sets extra = "ignore", so unrecognized environment variables are silently ignored rather than causing validation errors.
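
The precedence rules can be sketched in plain Python. This is an illustrative stdlib-only model of the behavior, not the actual pydantic-settings implementation (which also handles quoting, type coercion, and validation):

```python
import os

def load_dotenv_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file. Illustrative only --
    pydantic-settings handles comments, quoting, and export syntax properly."""
    values = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    values[key.strip()] = value.strip()
    except FileNotFoundError:
        pass  # missing .env is fine; defaults still apply
    return values

def resolve_setting(name, default, dotenv):
    """Highest priority first: real environment, then .env, then the default.
    Unknown variables are simply never looked up (extra = "ignore")."""
    if name in os.environ:
        return os.environ[name]
    if name in dotenv:
        return dotenv[name]
    return default

dotenv = load_dotenv_file()
port = resolve_setting("PORT", "8000", dotenv)
```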


Configuration Table

Core / Server

Settings that control the HTTP server and general gateway behavior.

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| host | HOST | str | "0.0.0.0" | Address the HTTP server binds to. Use 0.0.0.0 to listen on all interfaces or 127.0.0.1 for localhost only. |
| port | PORT | int | 8000 | Port the HTTP server listens on. The Anthropic proxy (/v1/messages) and OpenAI-compatible endpoint (/v1/chat/completions) are both served on this port. |
| default_provider | DEFAULT_PROVIDER | str | "claude_cli" | Default LLM provider when none is specified. Options: "claude_cli", "openai", "anthropic", "ollama", "litellm". |
| default_model | DEFAULT_MODEL | str | "sonnet" | Default model name when none is specified in a request. |

Database

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| database_url | DATABASE_URL | str | "" (empty) | PostgreSQL connection string using asyncpg. When empty, the gateway runs in in-memory mode (no persistence, no knowledge system, no auth). Example: postgresql+asyncpg://membrain:membrain@localhost:5432/membrain. |

Redis / Cache

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| redis_url | REDIS_URL | str | "" (empty) | Redis connection URL. When empty, caching, rate limiting, and budget enforcement are disabled. Example: redis://localhost:6379. Supports rediss:// for TLS. |

Provider API Keys

API keys for upstream LLM providers. Add keys for the providers you want to route requests to. None are required if you use only local models (Ollama) or non-proxy features.

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| openai_api_key | OPENAI_API_KEY | str \| None | None | OpenAI API key (e.g., sk-...). Enables the OpenAI provider and GPT model family. Also used by the openai embedding backend. |
| anthropic_api_key | ANTHROPIC_API_KEY | str \| None | None | Anthropic API key (e.g., sk-ant-...). Enables the Anthropic provider and Claude model family. |
| google_api_key | GOOGLE_API_KEY | str \| None | None | Google API key. Enables Google/Gemini models when used with the LiteLLM provider. |

Anthropic Proxy

Settings specific to the Anthropic-compatible proxy endpoint (/v1/messages).

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| upstream_anthropic_url | UPSTREAM_ANTHROPIC_URL | str | "https://api.anthropic.com" | Base URL for the upstream Anthropic API. This uses an explicit alias to avoid collision with ANTHROPIC_BASE_URL, which Claude Code sets to point at this proxy. Change this if you need to route through a different Anthropic endpoint. |
| anthropic_proxy_timeout | ANTHROPIC_PROXY_TIMEOUT | float | 300.0 | Timeout in seconds for upstream Anthropic API requests. Applies to both streaming and non-streaming calls. |

Knowledge / Embeddings

Settings that control the knowledge system, including embedding generation, semantic search, summarization, topic tagging, deduplication, and content lifecycle.

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| embedding_backend | EMBEDDING_BACKEND | str | "local" | Embedding model backend. "local" uses sentence-transformers (all-MiniLM-L6-v2, runs on CPU, no API calls). "openai" uses the OpenAI embeddings API (requires OPENAI_API_KEY). |
| embedding_dimension | EMBEDDING_DIMENSION | int | 384 | Dimensionality of embedding vectors. Must match the backend: 384 for local sentence-transformers, 1536 for OpenAI text-embedding-ada-002. Mismatches will cause pgvector errors. |
| knowledge_similarity_threshold | KNOWLEDGE_SIMILARITY_THRESHOLD | float | 0.3 | Minimum cosine similarity score for knowledge entries to be injected as context into LLM requests. Lower values return more (potentially less relevant) results; higher values are more selective. Range: 0.0 to 1.0. |
| knowledge_summarization | KNOWLEDGE_SUMMARIZATION | str | "off" | Knowledge summarization mode. "off" stores entries verbatim. "basic" produces concise summaries. "aggressive" produces minimal bullet-point summaries. Summarization reduces storage and improves retrieval but loses some detail. |
| knowledge_ttl_days | KNOWLEDGE_TTL_DAYS | int | 0 | Default time-to-live in days for knowledge entries. Entries older than this are considered stale and down-weighted in search results (see knowledge_stale_weight). 0 means entries never expire. |
| knowledge_stale_weight | KNOWLEDGE_STALE_WEIGHT | float | 0.3 | Weight multiplier applied to stale knowledge entries during search ranking. A value of 0.3 means stale entries contribute 30% of their original relevance score. Range: 0.0 (completely ignore stale) to 1.0 (no penalty). |
| knowledge_dedup_similarity_threshold | KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD | float | 0.95 | Cosine similarity threshold for near-duplicate detection when adding knowledge entries. Entries with similarity above this threshold to an existing entry are rejected as duplicates. Range: 0.0 to 1.0. |
| knowledge_auto_topics | KNOWLEDGE_AUTO_TOPICS | bool | False | Enable automatic topic extraction for knowledge entries. When True, new entries are tagged with topics using keyword-based extraction. Set via KNOWLEDGE_AUTO_TOPICS=true or KNOWLEDGE_AUTO_TOPICS=1. |
| knowledge_consolidation_schedule | KNOWLEDGE_CONSOLIDATION_SCHEDULE | str | "" (empty) | Cron expression for automatic knowledge consolidation (dedup, stale removal, re-ranking). Empty string means consolidation is manual only (triggered via API). Example: "0 2 * * *" for daily at 2 AM. |
| knowledge_source_ttl_overrides | KNOWLEDGE_SOURCE_TTL_OVERRIDES | dict | {} (empty) | Per-source TTL overrides in days. Allows different content sources to have different expiration periods. Example (as JSON env var): {"auto_extracted": 30, "api_manual": 90}. |
| knowledge_source_trust_levels | KNOWLEDGE_SOURCE_TRUST_LEVELS | dict | {} (empty) | Per-source trust level multipliers for search ranking. Higher trust means entries from that source rank higher. Example (as JSON env var): {"auto_extracted": 0.7, "api_manual": 1.0}. Range per value: 0.0 to 1.0. |
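
To make the interaction between these knobs concrete, here is an illustrative scoring sketch. The multiplicative combination below is an assumption for illustration; the exact formula used by Membrain's search ranking may differ:

```python
def rank_score(similarity, age_days, source,
               threshold=0.3, ttl_days=90, stale_weight=0.3,
               trust_levels=None):
    """Combine cosine similarity with staleness and source trust.
    Entries below the similarity threshold are filtered out entirely."""
    if similarity < threshold:       # KNOWLEDGE_SIMILARITY_THRESHOLD
        return None                  # not injected as context at all
    score = similarity
    if ttl_days and age_days > ttl_days:   # KNOWLEDGE_TTL_DAYS (0 = never stale)
        score *= stale_weight              # KNOWLEDGE_STALE_WEIGHT
    trust = (trust_levels or {}).get(source, 1.0)
    return score * trust                   # KNOWLEDGE_SOURCE_TRUST_LEVELS

# A fresh, fully trusted entry keeps its similarity score unchanged;
# a stale auto-extracted entry is down-weighted twice.
trust = {"auto_extracted": 0.7, "api_manual": 1.0}
fresh = rank_score(0.8, age_days=10, source="api_manual", trust_levels=trust)
stale = rank_score(0.8, age_days=120, source="auto_extracted", trust_levels=trust)
```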

Rate Limiting

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| rate_limit_rpm | RATE_LIMIT_RPM | int | 60 | Maximum requests per minute per API key. Requires Redis to be configured. Set to 0 to disable rate limiting entirely. Per-key overrides can be configured via the admin API. |
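
Redis is required because the per-key counter must be shared across gateway instances. A fixed-window sketch of the idea, with a plain dict standing in for Redis (in Redis this would be INCR plus EXPIRE on a per-window key); the actual algorithm Membrain uses may differ:

```python
import time

class FixedWindowLimiter:
    """Per-key requests-per-minute limiter. In production the counter would
    live in Redis, e.g. INCR + EXPIRE on a key like "rl:{api_key}:{window}"."""

    def __init__(self, rpm=60):
        self.rpm = rpm
        self.counters = {}  # (api_key, window) -> request count

    def allow(self, api_key, now=None):
        if self.rpm == 0:             # RATE_LIMIT_RPM=0 disables limiting
            return True
        now = time.time() if now is None else now
        window = int(now // 60)       # one counter per wall-clock minute
        key = (api_key, window)
        count = self.counters.get(key, 0) + 1
        self.counters[key] = count
        return count <= self.rpm

limiter = FixedWindowLimiter(rpm=2)
```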

Budget Enforcement

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| budget_daily_limit_usd | BUDGET_DAILY_LIMIT_USD | float | 0.0 | Maximum spend in USD per API key per day. Requires Redis to be configured. Set to 0.0 to disable daily budget limits. |
| budget_monthly_limit_usd | BUDGET_MONTHLY_LIMIT_USD | float | 0.0 | Maximum spend in USD per API key per month. Requires Redis to be configured. Set to 0.0 to disable monthly budget limits. |

Ollama (Local LLM)

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| ollama_url | OLLAMA_URL | str | "http://localhost:11434" | Base URL of the Ollama server for local LLM inference. The Ollama provider is auto-registered when this URL is reachable. |

Transparent Network Proxy

Settings for the TLS-intercepting transparent proxy that enables network-level PII scanning.

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| proxy_mode | PROXY_MODE | str | "application" | Proxy operation mode. "application" runs only the HTTP API gateway (default). "network" runs only the transparent TLS proxy. "hybrid" runs both simultaneously. |
| ca_cert_path | CA_CERT_PATH | str | "" (empty) | Filesystem path to the organization CA certificate in PEM format. Required for network and hybrid proxy modes. This certificate is used to sign dynamically generated TLS certificates for intercepted connections. |
| ca_key_path | CA_KEY_PATH | str | "" (empty) | Filesystem path to the organization CA private key in PEM format. Required for network and hybrid proxy modes. Must correspond to the certificate at CA_CERT_PATH. |
| proxy_port | PROXY_PORT | int | 8443 | Port the transparent TLS proxy listens on. Only used when PROXY_MODE is "network" or "hybrid". |
| proxy_listen | PROXY_LISTEN | str | "0.0.0.0" | Address the transparent proxy binds to. Only used when PROXY_MODE is "network" or "hybrid". |

Vault / Secret Storage

Settings for the pluggable secret storage backend used to manage API keys and other sensitive values.

| Field | Environment Variable | Type | Default | Description |
| --- | --- | --- | --- | --- |
| vault_backend | VAULT_BACKEND | str | "env" | Secret storage backend. "env" reads secrets from environment variables (simple, no external dependencies). "hashicorp" uses HashiCorp Vault KV v2 (requires Vault server). |
| vault_addr | VAULT_ADDR | str | "" (empty) | HashiCorp Vault server address. Only used when VAULT_BACKEND=hashicorp. Example: http://127.0.0.1:8200. |
| vault_token | VAULT_TOKEN | str | "" (empty) | HashiCorp Vault authentication token. Only used when VAULT_BACKEND=hashicorp. In production, use a renewable token or AppRole auth. |
| vault_mount | VAULT_MOUNT | str | "secret" | HashiCorp Vault KV v2 mount point. Only used when VAULT_BACKEND=hashicorp. |
| vault_path_prefix | VAULT_PATH_PREFIX | str | "membrain" | Path prefix inside the Vault mount for Membrain secrets. Only used when VAULT_BACKEND=hashicorp. Secrets are stored at <mount>/data/<prefix>/<key>. |
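
Under the defaults above, a secret named openai_api_key would therefore live at secret/data/membrain/openai_api_key. A sketch of reading it over Vault's HTTP API (Membrain may use a client library instead; the KV v2 URL shape and the data.data nesting are standard Vault behavior):

```python
import json
import urllib.request

def kv2_url(addr, mount, prefix, key):
    """Build the KV v2 read URL: <addr>/v1/<mount>/data/<prefix>/<key>."""
    return f"{addr}/v1/{mount}/data/{prefix}/{key}"

def read_secret(addr, token, key, mount="secret", prefix="membrain"):
    """Fetch one secret from HashiCorp Vault KV v2."""
    req = urllib.request.Request(
        kv2_url(addr, mount, prefix, key),
        headers={"X-Vault-Token": token},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # KV v2 nests the secret payload under data.data
    return body["data"]["data"]
```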

PII Detection

PII detection is not configured via environment variables in the Settings class. Instead, the PII scanner supports three detection modes configured programmatically:

  • regex (default) -- Pattern-based detection using 25+ built-in regex patterns covering emails, SSNs, credit cards, phone numbers, IP addresses, API keys (OpenAI, GitHub, AWS, Stripe, Anthropic, GCP, Slack), UUIDs, JWTs, private keys, database URLs, IBANs, MAC addresses, dates of birth, passports, US addresses, connection strings, and bearer tokens.
  • ner -- ML-based named entity recognition using the dslim/bert-base-NER model from HuggingFace. Detects person names, organizations, locations, and miscellaneous entities.
  • hybrid -- Runs both regex and NER, with regex taking precedence on overlapping spans.

Custom PII patterns can be passed programmatically when constructing a PIIScanner instance.
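A sketch of the hybrid-mode precedence rule described above (regex wins on overlapping spans), using plain span tuples rather than Membrain's actual scanner types:

```python
def merge_spans(regex_spans, ner_spans):
    """Each span is (start, end, label). Keep every regex span, and keep an
    NER span only if it does not overlap any regex span."""
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]
    kept = list(regex_spans)
    for span in ner_spans:
        if not any(overlaps(span, r) for r in regex_spans):
            kept.append(span)
    return sorted(kept)

# Suppose regex finds an email span, while NER (hypothetically) tags a
# PERSON inside that same span and an ORG elsewhere: regex wins the overlap.
regex_hits = [(6, 23, "EMAIL")]
ner_hits = [(6, 11, "PERSON"), (30, 35, "ORG")]
merged = merge_spans(regex_hits, ner_hits)
```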


Alerting

The alerting system (alert rules, channels, and thresholds) is configured programmatically rather than through environment variables. Alert rules are defined as AlertRule instances with:

  • name -- Rule identifier
  • metric -- Metric name to monitor
  • threshold -- Value that triggers the alert
  • window_seconds -- Time window for metric aggregation
  • channel -- Channel name to deliver the alert to

Notification channels include:

  • WebhookChannel -- Sends JSON POST payloads to a configured URL
  • SlackChannel -- Sends formatted messages to a Slack incoming webhook
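
A hypothetical sketch of how such a rule might be evaluated. The field names mirror the list above, but the evaluation logic (summing samples over the window and comparing against the threshold) is an assumption, not Membrain's implementation:

```python
from dataclasses import dataclass

@dataclass
class AlertRule:
    name: str            # rule identifier
    metric: str          # metric name to monitor
    threshold: float     # value that triggers the alert
    window_seconds: int  # time window for metric aggregation
    channel: str         # channel name to deliver the alert to

def should_fire(rule, samples):
    """samples: list of (timestamp, metric_name, value). Fire when the sum
    of matching samples inside the window exceeds the threshold."""
    if not samples:
        return False
    latest = max(t for t, _, _ in samples)
    total = sum(v for t, m, v in samples
                if m == rule.metric and t >= latest - rule.window_seconds)
    return total > rule.threshold

rule = AlertRule("error-spike", "http_5xx", threshold=10,
                 window_seconds=60, channel="slack")
```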

Telemetry

Telemetry metrics (counters, histograms, gauges) are collected automatically by the MetricsMiddleware ASGI middleware and exposed at the /metrics endpoint in Prometheus exposition format. Telemetry is always enabled and requires no configuration.
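
For reference, Prometheus exposition format is plain text. An illustrative formatter for a single counter (the metric name below is hypothetical, not part of Membrain's actual metric set):

```python
def render_counter(name, help_text, samples):
    """Render one counter in Prometheus exposition format.
    samples: dict mapping a label string like 'provider="openai"' to a value."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        lines.append(f"{name}{{{labels}}} {value}")
    return "\n".join(lines)

text = render_counter(
    "membrain_requests_total",   # hypothetical metric name
    "Total requests handled.",
    {'provider="openai"': 42, 'provider="ollama"': 7},
)
```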


Routing

The intelligent model router resolves "auto" model requests to concrete models based on tier, privacy, and cost. Tier mappings are defined in src/membrain/routing/router.py:

| Tier | Models |
| --- | --- |
| fast | gpt-4o-mini, claude-haiku-4-5-20251001 |
| balanced | gpt-4o, claude-sonnet-4-20250514 |
| best | o1, claude-opus-4-20250514 |

Routing behavior (tier selection, privacy constraints, fallback chains) is controlled via request parameters rather than environment variables.
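
The tier mapping above can be sketched as a simple lookup with a privacy constraint. This is illustrative only; the fallback model name is a placeholder and the actual resolution logic in router.py may differ:

```python
TIERS = {
    "fast": ["gpt-4o-mini", "claude-haiku-4-5-20251001"],
    "balanced": ["gpt-4o", "claude-sonnet-4-20250514"],
    "best": ["o1", "claude-opus-4-20250514"],
}

def resolve_auto(tier="balanced", local_only=False,
                 local_models=("ollama/llama3",)):  # placeholder local model
    """Resolve an "auto" model request to a concrete model name. With a
    privacy constraint, route to a local model instead of any cloud tier
    (assumed behavior for this sketch)."""
    if local_only:
        return local_models[0]
    candidates = TIERS.get(tier, TIERS["balanced"])
    return candidates[0]  # a fallback chain would try the rest in order
```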


Example .env File

Below is a complete .env file showing every setting with its default value. Uncomment and modify values as needed.

# =============================================================================
# Membrain Gateway Configuration
# =============================================================================
# Copy this file to .env and adjust values for your environment.
# All values shown are the defaults unless otherwise noted.

# -----------------------------------------------------------------------------
# Core / Server
# -----------------------------------------------------------------------------
HOST=0.0.0.0
PORT=8000
DEFAULT_PROVIDER=claude_cli
DEFAULT_MODEL=sonnet

# -----------------------------------------------------------------------------
# Database (empty = in-memory mode, no persistence)
# -----------------------------------------------------------------------------
# DATABASE_URL=postgresql+asyncpg://membrain:membrain@localhost:5432/membrain

# -----------------------------------------------------------------------------
# Redis (empty = no caching, rate limiting, or budgets)
# -----------------------------------------------------------------------------
# REDIS_URL=redis://localhost:6379

# -----------------------------------------------------------------------------
# Cloud Provider API Keys (add the providers you use)
# -----------------------------------------------------------------------------
# OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
# GOOGLE_API_KEY=...

# -----------------------------------------------------------------------------
# Anthropic Proxy
# -----------------------------------------------------------------------------
# UPSTREAM_ANTHROPIC_URL=https://api.anthropic.com
# ANTHROPIC_PROXY_TIMEOUT=300.0

# -----------------------------------------------------------------------------
# Knowledge / Embeddings
# -----------------------------------------------------------------------------
# EMBEDDING_BACKEND=local
# EMBEDDING_DIMENSION=384
# KNOWLEDGE_SIMILARITY_THRESHOLD=0.3
# KNOWLEDGE_SUMMARIZATION=off
# KNOWLEDGE_TTL_DAYS=0
# KNOWLEDGE_STALE_WEIGHT=0.3
# KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD=0.95
# KNOWLEDGE_AUTO_TOPICS=false
# KNOWLEDGE_CONSOLIDATION_SCHEDULE=
# KNOWLEDGE_SOURCE_TTL_OVERRIDES={}
# KNOWLEDGE_SOURCE_TRUST_LEVELS={}

# -----------------------------------------------------------------------------
# Rate Limiting (requires Redis; 0 = disabled)
# -----------------------------------------------------------------------------
# RATE_LIMIT_RPM=60

# -----------------------------------------------------------------------------
# Budget Enforcement (requires Redis; 0.0 = disabled)
# -----------------------------------------------------------------------------
# BUDGET_DAILY_LIMIT_USD=0.0
# BUDGET_MONTHLY_LIMIT_USD=0.0

# -----------------------------------------------------------------------------
# Ollama (local LLM)
# -----------------------------------------------------------------------------
# OLLAMA_URL=http://localhost:11434

# -----------------------------------------------------------------------------
# Transparent Network Proxy
# -----------------------------------------------------------------------------
# PROXY_MODE=application
# CA_CERT_PATH=
# CA_KEY_PATH=
# PROXY_PORT=8443
# PROXY_LISTEN=0.0.0.0

# -----------------------------------------------------------------------------
# Vault / Secret Storage
# -----------------------------------------------------------------------------
# VAULT_BACKEND=env
# VAULT_ADDR=
# VAULT_TOKEN=
# VAULT_MOUNT=secret
# VAULT_PATH_PREFIX=membrain

Common Configuration Scenarios

Minimal Development Setup

For local development with Claude CLI only (no database, no Redis):

HOST=127.0.0.1
PORT=8000

Then point Claude Code at the proxy:

export ANTHROPIC_BASE_URL=http://localhost:8000

Full Production Setup

# Server
HOST=0.0.0.0
PORT=8000

# Database
DATABASE_URL=postgresql+asyncpg://membrain:secret@db.internal:5432/membrain

# Redis
REDIS_URL=redis://redis.internal:6379

# Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
UPSTREAM_ANTHROPIC_URL=https://api.anthropic.com

# Knowledge (OpenAI embeddings for better quality)
EMBEDDING_BACKEND=openai
EMBEDDING_DIMENSION=1536
KNOWLEDGE_SIMILARITY_THRESHOLD=0.4
KNOWLEDGE_SUMMARIZATION=basic
KNOWLEDGE_TTL_DAYS=90
KNOWLEDGE_STALE_WEIGHT=0.3
KNOWLEDGE_DEDUP_SIMILARITY_THRESHOLD=0.95
KNOWLEDGE_AUTO_TOPICS=true
KNOWLEDGE_CONSOLIDATION_SCHEDULE=0 2 * * *

# Rate limiting and budgets
RATE_LIMIT_RPM=120
BUDGET_DAILY_LIMIT_USD=50.0
BUDGET_MONTHLY_LIMIT_USD=500.0

# Vault (HashiCorp)
VAULT_BACKEND=hashicorp
VAULT_ADDR=https://vault.internal:8200
VAULT_TOKEN=hvs.xxxxx
VAULT_MOUNT=secret
VAULT_PATH_PREFIX=membrain

Network Proxy Mode (TLS Interception)

PROXY_MODE=hybrid
CA_CERT_PATH=/etc/membrain/ca.pem
CA_KEY_PATH=/etc/membrain/ca-key.pem
PROXY_PORT=8443
PROXY_LISTEN=0.0.0.0

Privacy-First Setup (Local Models Only)

OLLAMA_URL=http://localhost:11434
DEFAULT_PROVIDER=ollama
EMBEDDING_BACKEND=local
EMBEDDING_DIMENSION=384