How to Find Shadow AI Tools Employees Are Using Without Approval
To find unauthorised AI tools employees use, you watch the four places their activity leaves a trace: network egress (DNS and proxy logs to AI domains), identity logs (OAuth grants and SSO sign-ins), the browser and endpoint (where prompts are typed and files pasted), and your existing SaaS, where AI features now hide inside tools you already approved. Shadow AI detection is the practice of correlating those signals into a single picture of who is sending what, to which model, before that data ends up somewhere it shouldn't.
Someone on your team has a browser tab open right now with a paid AI assistant in it, and they have pasted in a client list, a contract, or a chunk of your codebase to save twenty minutes. They are not being reckless. They are being efficient. The problem is that you cannot see it, you never approved it, and the data has already left the building.
This is shadow AI: any generative AI tool used for work that your organisation has not sanctioned, reviewed, or secured. It is not a fringe behaviour. Gartner's 2025 survey of 302 cybersecurity leaders found that 69% of organisations suspect or have evidence that employees are using prohibited public GenAI tools (Gartner, via Infosecurity Magazine). The honest reading of that figure is simpler: if you have more than a handful of staff, some of them are doing this, and you almost certainly have not looked.
Why this is worth an afternoon of your time
The cost is not theoretical. IBM's 2025 Cost of a Data Breach Report found that organisations with high levels of shadow AI saw an average of $670,000 in higher breach costs than those with little or none, and that one in five breaches involved shadow AI (IBM Newsroom, 2025). The same report noted that 63% of breached organisations either had no AI governance policy or were still drafting one. Gartner expects this to harden into a trend: it predicts that by 2030, more than 40% of organisations will suffer a security or compliance incident caused by unauthorised AI use (Gartner, via ITPro).
The uncomfortable part is who is doing it. UpGuard's research, reported by Cybersecurity Dive, found shadow AI use is widespread and that executives and senior managers use unsanctioned tools at higher rates than junior staff (Cybersecurity Dive). The people with the broadest data access are often the ones routing it through tools nobody vetted. So this is not a matter of disciplining a few rule-breakers. It is a visibility gap, and visibility is something you can build.
The four places shadow AI leaves a trace
You do not detect shadow AI by asking people what they use. You detect it by watching where their activity already shows up in systems you control. There are four such places, and a thorough discovery looks at all of them because each catches what the others miss.
1. Network egress — DNS and proxy logs
Every time someone visits an AI service, their device resolves a domain and opens a connection. If you have a firewall, a secure web gateway, or DNS logging, that traffic is already recorded. Pulling a list of resolved domains and matching it against known AI endpoints — the obvious chat assistants, image generators, transcription services, and the long tail of AI wrappers — gives you a fast first map of which tools are in use and roughly how heavily.
This is the cheapest place to start and the most common blind spot. Most teams have the logs and have simply never queried them for this. The limit is that egress tells you a domain was contacted, not what was sent, and it misses anyone working from a personal device or off the corporate network.
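As a minimal sketch of that first query, assume your DNS logs can be exported as plain text with one lookup per line in the form `timestamp client hostname`. That layout is hypothetical; real resolvers and gateways each have their own. The `AI_DOMAINS` watchlist is illustrative and nowhere near complete, and both function names are made up for this example.

```python
from collections import Counter
from typing import Optional

# Illustrative watchlist only; a real audit needs the long tail of AI services.
AI_DOMAINS = {
    "openai.com", "chatgpt.com", "anthropic.com",
    "claude.ai", "gemini.google.com", "perplexity.ai",
}

def matched_ai_domain(hostname: str) -> Optional[str]:
    """Return the watchlist entry that hostname equals or is a subdomain of."""
    host = hostname.strip().lower().rstrip(".")  # tolerate a trailing root dot
    for domain in AI_DOMAINS:
        if host == domain or host.endswith("." + domain):
            return domain
    return None

def count_ai_lookups(log_lines):
    """Count lookups per (client, AI domain).

    Assumes each line is 'timestamp client hostname' (hypothetical format).
    Lines that do not fit are skipped rather than guessed at.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        client, hostname = parts[1], parts[2]
        domain = matched_ai_domain(hostname)
        if domain:
            counts[(client, domain)] += 1
    return counts

sample = [
    "2025-06-01T09:00:00Z 10.0.0.12 chatgpt.com.",
    "2025-06-01T09:00:05Z 10.0.0.12 api.openai.com",
    "2025-06-01T09:01:00Z 10.0.0.40 example.com",
    "2025-06-01T09:02:00Z 10.0.0.12 chatgpt.com",
    "2025-06-01T09:03:00Z 10.0.0.77 notopenai.com",
]

for (client, domain), n in count_ai_lookups(sample).most_common():
    print(client, domain, n)
```

Matching on the full domain or a dot-prefixed suffix, rather than a bare substring, keeps look-alike hostnames such as `notopenai.com` out of the results. The counts give you the "roughly how heavily" part; they still say nothing about what was sent.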
2. Identity — OAuth grants and SSO sign-ins
This is the signal most worth your attention, because it is where the real exposure lives. When an employee clicks "Sign in with Google" or "Connect to Microsoft" on an AI tool, they create an OAuth grant — a standing permission that often inherits their access to email, files, or calendars. That connection persists long after the browser tab closes.
Your Google Workspace or Microsoft 365 admin console lists every third-party app that has been granted access, by user and by scope. Reviewing that list shows you not just which AI tools are connected, but how much they can reach. As Reco, one of the vendors in this space, notes, OAuth grant monitoring and browser/SSO signals are the backbone of modern discovery (Reco). A read-only chat tool is one risk profile; an AI agent with write access to your shared drive is another entirely.
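To make "sort by how much they can reach" concrete, here is a rough sketch that ranks exported grants by their broadest scope. It assumes you have already turned the admin-console export into a list of dictionaries with `user`, `app`, and `scopes` keys; that shape, the app names, and the three-level scoring are all assumptions for illustration, not anything the consoles produce directly. The scope strings follow Google's naming, and Microsoft 365 permissions would need their own markers.

```python
# Substrings that suggest a scope touches mail, files, or calendars.
# This marker list is an assumption and deliberately small.
SENSITIVE_MARKERS = ("gmail", "mail.google.com", "drive", "calendar")

def scope_risk(scope: str) -> int:
    """Score one scope: 0 = identity only, 1 = read-only data, 2 = read/write data."""
    s = scope.lower()
    if not any(marker in s for marker in SENSITIVE_MARKERS):
        return 0  # e.g. openid, email, profile
    return 1 if s.endswith(".readonly") else 2

def rank_grants(grants):
    """Return (risk, grant) pairs, highest risk first.

    A grant's risk is the risk of its broadest scope.
    """
    scored = [
        (max((scope_risk(s) for s in g["scopes"]), default=0), g)
        for g in grants
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

# Hypothetical export rows; the apps and users are invented.
grants = [
    {"user": "ana@example.com", "app": "NotesAssistant",
     "scopes": ["openid", "email"]},
    {"user": "raj@example.com", "app": "DriveAgent",
     "scopes": ["https://www.googleapis.com/auth/drive"]},
    {"user": "li@example.com", "app": "MailSummariser",
     "scopes": ["https://www.googleapis.com/auth/gmail.readonly"]},
]

for risk, grant in rank_grants(grants):
    print(risk, grant["app"], grant["user"])
```

A substring heuristic like this will misjudge some scopes, so treat the ranking as a way to order the review pile, not as a verdict on any one app.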
3. The browser and the endpoint
This is where the actual sensitive data crosses the line — at the moment of typing a prompt or pasting a file. Detecting it means looking at the device itself. Managed-browser extensions and endpoint agents can observe interactions with AI services and, with data-loss-prevention rules layered on, flag when something that looks like a customer record, source code, or financial data is being submitted (Netwrix).
This is the most powerful layer and the one that demands the most care. Prompt-level inspection is, in effect, reading what staff type. In the UK that engages your obligations under UK GDPR and the Data Protection Act 2018: monitoring must be proportionate, staff should be informed, and a data protection impact assessment is the sensible default. Detection that quietly surveils people erodes the trust you need for any policy to hold. We would always pair this layer with a clear, communicated acceptable-use policy rather than deploy it silently.
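For a sense of what a data-loss-prevention rule does at this layer, the sketch below checks a piece of text for a few patterns and returns only the names of the categories it hit, never the matched content, which fits the proportionality point above. The pattern set, the category names, and `flag_prompt` are illustrative assumptions; commercial DLP engines use far richer detection than three regular expressions.

```python
import re

# Illustrative patterns only; real DLP rule sets are much broader.
PATTERNS = {
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "card_like_number": re.compile(r"\b(?:\d[ -]?){13,19}\b"),
}

def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to discard digit runs that cannot be card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def flag_prompt(text: str) -> set:
    """Return the names of the pattern categories found in text."""
    hits = set()
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            if name == "card_like_number":
                digits = re.sub(r"\D", "", match.group())
                if not luhn_valid(digits):
                    continue  # looks like a long number, fails the checksum
            hits.add(name)
    return hits

print(flag_prompt("Draft a reply to jane.doe@example.com about card 4111 1111 1111 1111"))
print(flag_prompt("What is the capital of France?"))
```

Returning category names rather than the text itself is a design choice: it lets you count and triage events without building a store of everything staff have typed.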
4. Embedded AI inside tools you already approved
The hardest shadow AI to find is the kind that hides in plain sight. Your approved CRM ships an AI summariser. Your note-taking app added an assistant. Your design tool now has generative features. Nobody signed off on the AI specifically, because the application was already on the approved list. Netwrix calls out this exact problem: AI now lives "inside already-approved applications," which is why SaaS feature discovery has to be part of any honest audit. This is unsanctioned AI usage that no domain block will ever catch, because the traffic goes to a vendor you trust.
A practical sequence for a first audit
You do not need to buy a platform to start. A useful first pass for shadow AI discovery and governance, in order:
- Pull the logs you already have. Query DNS and proxy egress for the last 90 days against a list of known AI domains. This is an afternoon's work and usually the most revealing single step.
- Audit your identity provider. Export the third-party app grants from Workspace or Microsoft 365. Sort by scope. Anything with read/write access to email or files goes to the top of the review pile.
- Inventory embedded AI. Walk your approved SaaS list and note which products have shipped AI features and whether they train on or retain your inputs. Vendor documentation usually states this.
- Classify, then prioritise. A free chat tool used to draft a tweet is not the risk. A tool ingesting client data, source code, or anything regulated is. Sort by sensitivity of data exposed, not by tool popularity.
- Decide the response per tool. Block, allow, or — the option most teams skip — sanction a safe equivalent. The reason people reach for shadow AI is that it works. Removing the tool without offering an approved alternative just pushes the behaviour onto personal phones, where you have no visibility at all.
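The "classify, then prioritise" step in the list above can be sketched as a sort on data sensitivity, with user count only as a tie-breaker. The sensitivity levels, the `ToolFinding` record, and the sample tools are all hypothetical; you would substitute your own data classification scheme.

```python
from dataclasses import dataclass

# Hypothetical sensitivity scale; higher means more damaging if exposed.
SENSITIVITY = {
    "public": 0,
    "internal": 1,
    "client": 2,
    "source_code": 2,
    "regulated": 3,
}

@dataclass
class ToolFinding:
    name: str
    users: int            # how many people were seen using the tool
    data_classes: tuple   # kinds of data the tool was seen handling

    @property
    def sensitivity(self) -> int:
        """The most sensitive data class this tool touches."""
        return max((SENSITIVITY[c] for c in self.data_classes), default=0)

def prioritise(findings):
    """Order by data sensitivity first; popularity only breaks ties."""
    return sorted(findings, key=lambda f: (-f.sensitivity, -f.users))

# Invented findings for illustration.
findings = [
    ToolFinding("ChatTool", users=120, data_classes=("public",)),
    ToolFinding("ContractSummariser", users=3, data_classes=("client", "regulated")),
    ToolFinding("CodeHelper", users=15, data_classes=("source_code",)),
]

for f in prioritise(findings):
    print(f.sensitivity, f.users, f.name)
```

In this example the tool with three users lands at the top of the pile and the one with 120 at the bottom, which is the point: review order follows what is exposed, not how many people are involved.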
The tooling landscape, in plain terms
When you outgrow a manual audit, the market for shadow AI detection tools in 2026 splits roughly into three camps. Identity- and SaaS-centric platforms (the likes of Reco, Nudge Security, BetterCloud, and Trelica) lean on OAuth and SSO signals to map what is connected to your environment (Reco). Endpoint and DLP-led products focus on the prompt and the paste, catching sensitive data at the moment it moves. Network-level discovery is the third: your existing secure web gateway or CASB likely already does it if you turn the AI-category reporting on.
The trap is treating one camp as the whole answer. Network logs miss embedded AI. Identity audits miss the personal-device user. Endpoint agents miss anyone you have not deployed to. Real shadow AI visibility comes from correlating signals across all four layers — which is precisely why a tidy integration between your identity provider, your network logging, and your endpoint estate matters more than any single product on a comparison page.
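Correlation across the four layers can be sketched as a simple grouping, assuming each source has already been reduced to `(layer, user, tool)` records. That is a large assumption: in practice the hard part is resolving a network log's client address or an endpoint agent's device ID to the same user your identity provider knows. The layer names, function names, and sample records below are invented for illustration.

```python
from collections import defaultdict

LAYERS = {"network", "identity", "endpoint", "embedded"}

def correlate(signals):
    """Group signals so each (user, tool) pair lists the layers that saw it.

    signals: iterable of (layer, user, tool) tuples, already normalised
    to a common user identifier (an assumption of this sketch).
    """
    seen = defaultdict(set)
    for layer, user, tool in signals:
        seen[(user, tool)].add(layer)
    return dict(seen)

def missing_layers(picture):
    """For each (user, tool) pair, list the layers with no signal."""
    return {key: LAYERS - layers for key, layers in picture.items()}

# Invented records for illustration.
signals = [
    ("network", "raj", "DriveAgent"),
    ("identity", "raj", "DriveAgent"),
    ("endpoint", "ana", "ChatTool"),
    ("embedded", "li", "CRM summariser"),
]

picture = correlate(signals)
for (user, tool), layers in picture.items():
    print(user, tool, sorted(layers))
```

A pair seen in only one layer is not necessarily low risk; it may simply mark where your coverage stops, which is what `missing_layers` is meant to surface.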
What this is really about
The instinct, when you first see the scale of it, is to lock everything down. That tends to fail. People adopted these tools because the tools made their work faster and better, and a blanket ban turns a visible problem into an invisible one. The organisations that handle this well treat detection as the start of a conversation, not a hunt for offenders: see what is being used, understand why, sanction the safe options, and put guardrails on the rest.
That requires the plumbing to actually work together — your logs queryable, your identity grants reviewable, your approved tools genuinely good enough that nobody needs to go around them. If that connective tissue is missing, no amount of policy will hold, because the visibility it depends on isn't there.
Honestly, plenty of teams can run the first audit above themselves in an afternoon, and if you can, you should — you'll learn more from your own logs than from any vendor demo. The work becomes worth handing off when the signals need joining up: when "we found 40 AI tools" has to become "here is exactly who can reach what, monitored, with a safe alternative in place." That is a systems problem, and systems are what we build.
Sources
- IBM Newsroom — 2025 Cost of a Data Breach Report (shadow AI findings)
- Gartner GenAI blind spots / 40% by 2030 — via Infosecurity Magazine
- Gartner 69% prohibited GenAI use / shadow AI breaches by 2030 — via ITPro
- Reco — Shadow AI Detection Tools comparison (OAuth/SSO signals)
- Netwrix — Shadow AI Detection Tools (detection methods, embedded AI)
- Cybersecurity Dive — Shadow AI is widespread, executives use it most (UpGuard)