Right now, someone at your company is pasting customer medical history into ChatGPT. Someone else is feeding source code to Copilot. A third is using Claude CLI from their terminal to summarize legal contracts — complete with client names and account numbers. None of it approved by IT. None of it logged. None of it visible to your security team.
This isn't speculation. It's happening at scale, and the gap between what organizations think they're controlling and what's actually flowing to third-party AI providers is widening every month.
The Scale of the Problem
Shadow AI isn't a new category of risk — it's a new intensity of an old one. Shadow IT has existed since employees first installed unauthorized software. But shadow AI is different in one critical way: data flows in one direction. When someone installs an unauthorized database tool, the data stays on-premises. When someone pastes a contract into ChatGPT, that data leaves your environment, enters a third-party API, and may be used for model training, retained in logs, or accessible to the provider's personnel.
You no longer control it. And in most cases, you never knew it left.
In 2023, Samsung engineers accidentally leaked proprietary source code and internal meeting notes by submitting them to ChatGPT for debugging and summarization. Samsung banned employee use of generative AI tools shortly after — but the data had already been transmitted to OpenAI's servers. The damage was done before anyone noticed.
Consider a scenario unfolding at law firms right now: associates use public AI assistants to draft client communications, inadvertently including privileged case details and opposing-party strategies. DLP tools flag nothing, because they aren't watching AI traffic and the attorneys are using personal browsers on firm-issued laptops.
These aren't edge cases. They're representative of a pattern playing out across enterprises every day, at every scale, in every industry that handles sensitive information.
Why Existing Tools Miss It
Security teams haven't been sitting idle. There's a growing ecosystem of tools marketed as "AI security" solutions. The problem is that each category has a fundamental architectural gap that leaves shadow AI invisible.
AI Gateways (LiteLLM, Portkey, Cloudflare AI Gateway)
AI gateways are excellent at what they do: proxying, routing, and observing AI traffic that has been explicitly configured to flow through them. You set up the gateway, point your application's API calls at it, and it logs, caches, and enforces policies on that traffic.
The gap is obvious once you see it: they only see traffic you route through them. An employee using the ChatGPT web interface? Invisible. A developer running Claude CLI from their terminal? Invisible. A contractor using the Gemini app on their laptop? Invisible. None of these clients know your gateway exists, and none of them will route through it.
Gateways solve a real problem — they're essential for controlling AI usage in applications your engineering team builds. But they do nothing for the human beings at your organization who access AI directly, which is to say, most of the risk.
Traditional DLP (Nightfall, Netskope, Zscaler)
Data Loss Prevention platforms were built for a world of file transfers, email attachments, and web form submissions. They inspect outbound traffic looking for patterns: credit card numbers, SSNs, health record IDs. They're reasonably good at this within their intended scope.
AI protocols are a different problem. A modern AI request is a JSON payload containing a messages array, each message potentially containing nested tool_use blocks, image data, thinking tokens, and multi-turn conversation context. PII doesn't sit in a predictable field — it's embedded in natural language inside deeply nested structures. A patient name might appear in: "role": "user", "content": "My patient John Smith had a glucose reading of..."
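To make that concrete, here is a hedged sketch in Python of the shape an inspector has to traverse. The payload is hypothetical, and the field names follow the general pattern of modern chat APIs rather than any one provider's exact schema:

```python
# Hypothetical chat-completion payload: PII sits in free text at several
# nesting depths, not in any fixed, scannable field.
request_body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user",
         "content": "My patient John Smith had a glucose reading of 212."},
        {"role": "assistant",
         "content": [
             {"type": "tool_use", "name": "lookup_chart",
              "input": {"query": "John Smith, MRN 0048271"}},
         ]},
        {"role": "user",
         "content": [
             {"type": "text", "text": "Also check his SSN 078-05-1120."},
         ]},
    ],
}

def iter_text_fields(node):
    """Recursively yield every string in the payload. A real inspector
    would additionally distinguish content-bearing fields (content,
    input, text) from protocol fields (role, model)."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_text_fields(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_text_fields(item)

for span in iter_text_fields(request_body):
    print(span)
```

A flat regex pass over the serialized JSON sees none of this structure; an AI-native inspector walks it.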
Traditional DLP tools either miss this entirely (they're scanning at the wrong layer), generate overwhelming false positives (regex patterns over-match in conversational text), or cannot handle streaming, where the response arrives in chunks over an HTTP/2 connection. They weren't designed for this, and retrofitting pattern matching onto AI protocols produces poor results in both directions: misses and false alarms.
AI Security Platforms (Varonis Atlas, Cisco AI Defense)
This newer category takes a different approach: instead of inspecting traffic, they discover AI usage by scanning your cloud environment, SaaS applications, code repositories, and connected services. They build an inventory of AI tools your organization is using and flag risky configurations or overpermissioned integrations.
Discovery is useful. Knowing that your sales team has connected Salesforce to an AI tool you don't recognize is genuinely valuable. But discovery is retrospective — it tells you what happened, not what is happening. It doesn't intercept the request, strip the PII, or enforce a policy before the data reaches the provider. By the time the platform reports a risk, the data is already gone.
Real-time enforcement requires sitting in the path. Discovery platforms, by design, do not.
The Network-Level Answer
If you need to see all AI traffic — from every application, every browser, every CLI tool, every managed device — you need to be at the network level. Not at the application level (where only configured traffic flows), not at the cloud level (where you see inventory, not data), but in the actual network path between your employees and the AI providers they're reaching.
The architecture is DNS override plus TLS termination plus AI-native inspection:
- DNS override — Managed devices resolve api.openai.com, api.anthropic.com, generativelanguage.googleapis.com, and similar hostnames to your gateway's IP instead of the provider's IP.
- TLS termination — The gateway presents certificates for those hostnames (via a trusted internal CA deployed to managed devices), terminates the TLS connection, inspects the plaintext request, then re-encrypts and forwards to the real provider.
- AI-native inspection — The gateway understands AI protocols: JSON chat completion format, streaming responses, tool_use blocks, thinking tokens. It can parse and transform the request, not just pattern-match over it.
The result: every AI request from every managed device flows through your security layer, regardless of which application made it, regardless of whether that application was ever configured to use a proxy, regardless of whether the user is using a browser, a CLI tool, or a desktop application.
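As a minimal sketch of the inspection hop, assuming mitmproxy as the terminating proxy (with the DNS override and internal CA already in place), an addon along these lines parses and rewrites chat payloads before they're forwarded upstream. The hostname list and the redact() stub are illustrative placeholders:

```python
# Hedged sketch of AI-native inspection as a mitmproxy addon
# (run e.g. with `mitmdump --mode transparent -s ai_inspect.py`).
import json

AI_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}

def redact(text: str) -> str:
    # Placeholder for a real PII detector; see the stripping sketch below.
    return text

def request(flow):
    """Called by mitmproxy for each intercepted request."""
    if flow.request.pretty_host not in AI_HOSTS:
        return  # not AI traffic: pass through untouched
    try:
        body = json.loads(flow.request.get_text() or "")
    except ValueError:
        return  # non-JSON (e.g. file uploads): handle separately
    for message in body.get("messages", []):
        if isinstance(message.get("content"), str):
            message["content"] = redact(message["content"])
    flow.request.text = json.dumps(body)  # re-serialize before forwarding
```

A production gateway would also have to handle nested content blocks, multi-modal inputs, and streamed responses, which this sketch deliberately ignores.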
| Capability | App Gateway | DLP / CASB | AI Security Platform | Network Proxy |
|---|---|---|---|---|
| Shadow AI coverage | ✗ Configured traffic only | ~ Partial (HTTPS inspection) | ✗ Discovery only | ✓ All managed devices |
| AI protocol awareness | ✓ Full | ✗ Generic DLP patterns | ✗ Metadata only | ✓ Full |
| Real-time enforcement | ✓ Yes | ~ Block / allow only | ✗ No | ✓ Yes (PII strip, rewrite) |
| Works without code changes | ✗ Apps must point at gateway | ✓ Yes | ✓ Yes | ✓ Yes (DNS + CA cert only) |
| Deployment time | Days to weeks | Days to weeks | Days to weeks | Hours |
Network-level deployment doesn't require modifying applications. It requires deploying a certificate to managed devices and updating DNS resolution — both standard MDM operations that most enterprise IT teams can execute in an afternoon.
Beyond Detection: A Cognitive Layer
Detection is necessary but not sufficient. Once you can see every AI request flowing out of your organization, the question becomes: what do you do with that visibility?
The most valuable use of network-level AI interception isn't blocking — it's transformation. And the most powerful transformation isn't just removing PII from a request before it leaves your network. It's building infrastructure that makes every AI interaction smarter, safer, and more valuable over time.
Memory
Every AI response that flows through your gateway is a potential knowledge artifact. A consultant summarizing a client engagement, an engineer documenting a debugging session, a lawyer drafting contract language — these responses contain institutional knowledge that currently lives nowhere. A cognitive layer extracts, embeds, and stores this knowledge, making it retrievable for future interactions. Your organization gets smarter every time someone uses AI, instead of starting from zero each time.
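One way to picture the capture step, with embed() and the store left as stand-ins for whatever embedding model and vector database a deployment actually uses:

```python
# Hypothetical sketch of the memory step: each response that clears
# policy gets embedded and stored so later requests can retrieve it.
import hashlib
from dataclasses import dataclass

@dataclass
class KnowledgeRecord:
    doc_id: str   # content hash, for dedup across repeated queries
    team: str     # who produced it, for scoped retrieval
    text: str     # the response body itself
    vector: list  # embedding used for similarity search

def capture(response_text: str, team: str, embed, store) -> None:
    """Turn one gateway-observed AI response into a retrievable artifact.
    `embed` maps text -> vector; `store` is any store with an upsert."""
    doc_id = hashlib.sha256(response_text.encode()).hexdigest()[:16]
    store.upsert(KnowledgeRecord(doc_id, team, response_text,
                                 embed(response_text)))
```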
PII Stripping
PII stripping at the network level means the data never reaches the provider. Not masked, not tokenized — removed before the TLS connection to the upstream API is opened, replaced with safe placeholders that let the AI response remain useful while the sensitive content stays inside your perimeter. This is categorically different from what DLP can offer.
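Mechanically, the strip-and-placeholder step looks like the toy sketch below. A real deployment would use an NER-based detector rather than two regexes, for exactly the over-matching reasons discussed above, and the local vault shown here is just one possible design for swapping placeholders back inside the perimeter:

```python
# Toy illustration of strip-and-placeholder mechanics.
import re

PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN[- ]?\d{5,}\b"),  # hypothetical chart-number format
}

def strip_pii(text: str):
    """Replace detected spans with stable placeholders; keep a local
    mapping so responses can be re-identified inside the perimeter."""
    vault = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            placeholder = f"[{label}_{i}]"
            vault[placeholder] = match
            text = text.replace(match, placeholder, 1)
    return text, vault

clean, vault = strip_pii("Patient SSN 078-05-1120, MRN 0048271.")
# clean -> "Patient SSN [SSN_0], MRN [MRN_0]."
# vault -> {"[SSN_0]": "078-05-1120", "[MRN_0]": "0048271"}
```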
Judgment
Policies can be applied per team, per user, per project, per data classification. An engineering team can freely use AI for code review. A finance team's AI access can be restricted to deny requests that contain customer financial data. An HR team can be allowed to use AI for policy drafting but blocked from submitting personnel records. These aren't firewall rules — they're context-aware policies that understand what's in the request before deciding what to do with it.
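As a simplified, hedged illustration, the policy table might look like this in code; the team names mirror the examples above, and the content labels are assumed to come from the gateway's own classification step:

```python
# Illustrative per-team policies keyed on content labels the gateway
# derived from AI-native inspection of the request.
POLICIES = {
    "engineering": lambda labels: True,  # free use, e.g. for code review
    "finance":     lambda labels: "customer_financial" not in labels,
    "hr":          lambda labels: "personnel_record" not in labels,
}

def evaluate(team: str, content_labels: set) -> bool:
    """Return True to forward the request, False to deny it."""
    rule = POLICIES.get(team, lambda labels: False)  # default-deny unknowns
    return rule(content_labels)

# evaluate("finance", {"customer_financial"}) -> False (denied)
# evaluate("engineering", {"source_code"})    -> True  (forwarded)
```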
Routing
Not every request needs to go to OpenAI. Requests that contain sensitive information can be routed to self-hosted models running in your own infrastructure, where the data never crosses a network boundary at all. Requests that are cost-sensitive can be routed to smaller, cheaper models. Requests that require compliance logging can be routed through a higher-assurance path. The network layer is the natural place to make these decisions, because it's the only layer that sees all traffic.
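A routing decision at this layer can be sketched as follows; every endpoint and model name here is an assumption for illustration, not a recommendation:

```python
# Illustrative routing decision made at the gateway, per request.
def choose_route(labels: set, needs_audit: bool, cost_sensitive: bool):
    """Return (upstream, model). All names here are hypothetical."""
    if "sensitive" in labels:
        return ("https://llm.internal/v1", "local-llama")    # data never leaves
    if cost_sensitive:
        return ("https://api.openai.com/v1", "gpt-4o-mini")  # smaller, cheaper
    if needs_audit:
        return ("https://audit-proxy.internal/v1", "gpt-4o") # logged path
    return ("https://api.openai.com/v1", "gpt-4o")           # default upstream
```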
Memory, PII stripping, judgment, and routing aren't isolated features — they share state. A policy that routes sensitive requests to a local model can also extract knowledge from those responses. A PII scanner that strips data before forwarding can also log that a particular user or team is regularly sending sensitive content, triggering an alert workflow. The value compounds when these functions operate on the same request, at the same moment, in a single pipeline.
What This Means for Your Organization
Shadow AI isn't a problem that goes away if you ban AI tool usage. The productivity pressure is too strong. Engineers will find workarounds. Employees will use their phones. Banning accelerates shadow AI; it doesn't eliminate it.
The only viable path is to move AI usage into an observable, controllable channel — one that doesn't require employees to change their behavior or developers to modify their applications. Here are four concrete steps for CISOs thinking through this problem:
- Audit your current blind spots. Pull DNS query logs for a week and count how many requests to AI provider hostnames come from devices you have no application-layer visibility into (a first-pass script follows this list). The number is almost always larger than security teams expect. That number is your exposure.
- Think network-level, not application-level. Every solution that requires you to "point your applications at the gateway" will have coverage gaps by definition. The correct architectural question is: how do I intercept AI traffic at the network boundary, before it matters whether the application was configured to cooperate?
- Demand AI-native inspection, not retrofitted DLP. Ask vendors to demonstrate how they handle streaming responses, tool_use blocks, multi-modal inputs, and nested message structures. If the answer is regex patterns over concatenated text, the false positive rate will be unmanageable and the miss rate will be unacceptable.
- Look beyond detection to enforcement and enrichment. Detection tells you what happened. Enforcement changes what happens. Enrichment makes future interactions safer and more valuable. The right architecture does all three, in the path of every request, without adding latency that drives users to find other tools.
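A first pass at that audit can be as simple as the sketch below. The log format (timestamp, client IP, query name per line) is hypothetical; adapt it to whatever your resolver actually emits:

```python
# Count DNS lookups of AI-provider hostnames per client device.
from collections import Counter

AI_HOSTS = ("api.openai.com", "api.anthropic.com",
            "generativelanguage.googleapis.com")

clients = Counter()
with open("dns_queries.log") as log:
    for line in log:
        parts = line.split()
        if len(parts) < 3:
            continue
        client_ip, qname = parts[1], parts[2]
        if qname.rstrip(".").endswith(AI_HOSTS):  # tuple match is fine here
            clients[client_ip] += 1

# Devices querying AI providers without app-layer visibility:
for ip, count in clients.most_common(20):
    print(ip, count)
```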
The organizations that will handle AI risk well are the ones that get ahead of the infrastructure question now, before a Samsung-style incident forces an emergency response. That means investing in network-level visibility, AI-native protocol handling, and policy enforcement that works whether or not your employees are using the tools you intended them to use.
The tools exist. The architecture is well-understood. The remaining question is whether your organization moves first, or waits for the audit finding.
This is why we built MemBrain
Network-level AI governance with PII stripping, knowledge building, and policy enforcement — for every request, from every device.
Talk to us about your deployment →