The average company now logs 223 cases per month where an employee sends sensitive data to an AI tool in ways that break policy. In the top quartile, that number is 2,100.
Those numbers come from Netskope's 2026 Cloud and Threat Report, drawn from anonymized telemetry across millions of users worldwide. It's the most detailed look anyone has published at how much sensitive data is flowing to AI tools from inside companies.
The number doubled year over year. And it's almost certainly an undercount. Here's why.
How Netskope sees what it sees
If you've never worked in enterprise security, you may not know what Netskope does. The short version: Netskope is a cloud security company that acts as a toll booth between employees and the internet. Every piece of traffic, every file upload, every prompt typed into ChatGPT or Claude or Gemini passes through Netskope's cloud on the way out. The company decrypts the TLS traffic, inspects the content in real time and applies whatever policies the customer has set: block this, flag that, let the rest through.
This is how enterprise data loss prevention (DLP) has worked for years. Companies like Netskope, Zscaler and Forcepoint sit in the path of the data and watch everything go by. They see the file names, the content, the sender, the destination. When something trips a rule (a spreadsheet with customer records uploaded to a personal Google Drive, a contract draft attached to a Gmail message, a database export heading to an unapproved cloud app), they log it, block it or both. More recently, they've added the same scanning to AI tools, inspecting prompts and uploads to ChatGPT, Claude and Gemini in real time.
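In code terms, the pattern looks something like the sketch below. It's a toy, and every rule name, pattern and app list in it is hypothetical rather than any vendor's actual configuration, but it captures the shape: plaintext in, rules over the top, verdict out.

```typescript
// Minimal sketch of inline DLP inspection. Illustrative only: the rules,
// patterns and app lists here are hypothetical, and real engines use far
// richer classifiers than a couple of regexes.
type Verdict = "allow" | "flag" | "block";

const AI_TOOLS = new Set(["chat.openai.com", "claude.ai", "gemini.google.com"]);
const APPROVED_APPS = new Set(["drive.corp.example.com"]);

interface DlpRule {
  name: string;
  matches: (content: string, destination: string) => boolean;
  verdict: Verdict;
}

const RULES: DlpRule[] = [
  {
    // Records with SSN-shaped data heading to an unapproved app: block.
    name: "ssn-to-unapproved-app",
    matches: (content, dest) =>
      /\b\d{3}-\d{2}-\d{4}\b/.test(content) && !APPROVED_APPS.has(dest),
    verdict: "block",
  },
  {
    // Source code pasted into an AI tool: log it for review.
    name: "source-code-to-ai-tool",
    matches: (content, dest) =>
      /\b(function|class|import|def)\b/.test(content) && AI_TOOLS.has(dest),
    verdict: "flag",
  },
];

// The proxy decrypts outbound TLS, runs every rule over the plaintext,
// and applies the most severe matching verdict before forwarding (or not).
function inspect(content: string, destination: string): Verdict {
  let verdict: Verdict = "allow";
  for (const rule of RULES) {
    if (!rule.matches(content, destination)) continue;
    if (rule.verdict === "block") return "block"; // most severe wins
    verdict = "flag";
  }
  return verdict;
}
```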
The model works. It catches real violations. But it only catches violations in traffic that flows through it.
Why 223 is the floor, not the ceiling
Here's the gap. Even among Netskope's own customer base (mostly large enterprises with security budgets), half don't have DLP tools covering AI apps. For those companies, the number isn't 223. It's zero, not because nothing is leaking, but because nobody's watching.
And Netskope's customers are the ones who already bought a security product. The vast majority of companies, the mid-market firms and small businesses that make up most of the economy, don't run cloud DLP at all. For them, there's no toll booth, no inspection and no count. The real number of AI data violations happening across all companies every month is unknowable. 223 is the floor in a building with no ceiling.
And even in companies that do have DLP, 47% of AI users still access tools through personal accounts. That means they're typing prompts on personal devices, on home Wi-Fi, outside the corporate proxy. The toll booth only works if you drive through it.
Usage grew sixfold while violations only doubled. Prompts went from about 3,000 per month to 18,000 in the average company; in the top 1%, that's over 1.4 million prompts a month. That math doesn't mean companies got safer. It means detection is falling behind.
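The back-of-the-envelope version makes the gap visible. Inferring last year's violation count from the "doubled year over year" figure:

```typescript
// Rough check on the detection-rate trend, using the report's
// average-company figures. Last year's violations (~112) are inferred
// from "doubled year over year"; treat this as a sketch, not the report's math.
const lastYear = { prompts: 3_000, violations: 223 / 2 }; // ≈ 112
const thisYear = { prompts: 18_000, violations: 223 };

const pctFlagged = (y: { prompts: number; violations: number }) =>
  ((y.violations / y.prompts) * 100).toFixed(1);

console.log(pctFlagged(lastYear)); // "3.7" — ≈ 3.7% of prompts flagged
console.log(pctFlagged(thisYear)); // "1.2" — the per-prompt rate fell ~3x
```

Unless employees' prompts somehow got three times safer in a year, a threefold drop in the per-prompt flag rate points at detection lagging, not risk shrinking.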
What's getting pasted
The breakdown of what's leaking matters more than the total count.
Source code is the top category: developers pasting proprietary code into ChatGPT, Copilot or Claude to debug or refactor it. After that come regulated data (personal, financial and health records), intellectual property, and passwords or API keys, the last usually embedded in the code being pasted.
54% of all violations involve regulated data. That's the category with fines attached. HIPAA, GDPR, state privacy laws. Not "we'd rather you didn't." The kind with legal teeth.
And only 3% of AI users in the average company are causing the violations. A small group generating a big problem. Which suggests the fix isn't banning AI tools company-wide. It's catching the data at the point where it's about to leave.
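What does catching it at the exit look like? At its simplest, a scan over the prompt before it's sent. The patterns below are simplified illustrations rather than production detectors, but some high-value formats really are this recognizable: long-term AWS access key IDs start with AKIA, classic GitHub tokens with ghp_.

```typescript
// Illustrative pre-send scan for obvious secrets in a prompt.
// Simplified patterns, not production-grade detectors.
const SECRET_PATTERNS: [string, RegExp][] = [
  ["aws-access-key", /\bAKIA[0-9A-Z]{16}\b/],           // long-term AWS key IDs
  ["github-token", /\bghp_[A-Za-z0-9]{36}\b/],          // classic GitHub PATs
  ["private-key", /-----BEGIN [A-Z ]*PRIVATE KEY-----/],
  ["us-ssn", /\b\d{3}-\d{2}-\d{4}\b/],
];

// Returns the names of everything matched, so the caller can warn the
// user or redact before the prompt leaves.
function findSecrets(prompt: string): string[] {
  return SECRET_PATTERNS.filter(([, p]) => p.test(prompt)).map(([name]) => name);
}

findSecrets('debug this: const key = "AKIAIOSFODNN7EXAMPLE";');
// => ["aws-access-key"]
```

This is also why the 3% figure is encouraging: you don't need to police every prompt from every employee, just intercept the small set of pastes that carry recognizable sensitive data.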
What telemetry collects (and what can go wrong)
Netskope's report is valuable. But it's worth understanding what makes it possible: the company sees everything. Every prompt. Every file. Every URL. That's how DLP works. You can't flag a patient record in a ChatGPT prompt without reading the prompt first.
This kind of telemetry creates a tension. The same data that powers the report also creates a target. And the history of security companies handling user data is not spotless.
In 2020, Motherboard and PCMag revealed that Avast, the antivirus company used by over 435 million people, had been collecting detailed browsing data through its security software and selling it through a subsidiary called Jumpshot. The data included every URL visited, every search query, precise timestamps and unique device identifiers. Jumpshot sold it to over 100 companies, including Google, Microsoft, Pepsi and the advertising giant Omnicom. The FTC fined Avast $16.5 million and banned the company from selling browsing data.
The tagline on Avast's download page while this was happening? "Shield your privacy."
That's not an argument against DLP. Companies need to know what's leaving their networks. But it's a reminder that every tool that can see your data is also a tool that can misuse your data. The question is always: who's watching the watchers?
Why we built BeatMask differently
When we started building BeatMask, we went through the same decision every security product faces. We could collect telemetry. Aggregated detection counts, prompt categories, the types of data people were pasting. It would have given us great data for reports like Netskope's. We could have published our own version of the 223 stat.
We decided not to. No prompt content. No user profiles. No cross-site tracking. Nothing leaves the device. The detection happens locally and stays local.
That decision costs us something. We can't publish telemetry reports. We can't tell you how many API keys we've caught this quarter or what percentage of our users paste source code into ChatGPT. We don't know, by design.
The other side of the tradeoff: you don't have to trust that we'll handle your data responsibly. You can open dev tools, watch the network traffic and see nothing leave. We built it so the question "what do they do with my data?" has a verifiable answer: nothing, because they never had it.
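To sketch what "local and verifiable" means in practice (illustrative only, not our actual source), the entire detection path can be a pure function wired to an input event, with nothing in scope that could make a network request:

```typescript
// Illustrative local-only detection (not BeatMask's actual source).
// Everything runs in the page: no fetch(), no XMLHttpRequest, no
// sendBeacon(). Open the network tab and nothing appears when it fires.
const PATTERNS: RegExp[] = [
  /\bAKIA[0-9A-Z]{16}\b/,  // AWS access key shape
  /\b\d{3}-\d{2}-\d{4}\b/, // US SSN shape
];

document.addEventListener("input", (event) => {
  const el = event.target;
  if (!(el instanceof HTMLTextAreaElement)) return;

  // Pure string matching: content in, boolean out, no I/O of any kind.
  const hit = PATTERNS.some((p) => p.test(el.value));
  el.style.outline = hit ? "2px solid #c00" : ""; // purely visual, purely local
});
```

The warning is the only output. The content never leaves the textarea, which is exactly what the network tab will confirm.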
Netskope's 223 is important. It proves the problem is real, measurable and getting worse. But the architecture that makes it visible is the same architecture that puts all that data in one place. We chose a different path.