For nearly a month, Microsoft 365 Copilot was reading and summarizing emails that organizations had explicitly marked as confidential. The AI ignored sensitivity labels. It bypassed data loss prevention policies. And it cheerfully presented summaries of the protected content to any user who asked.
Microsoft confirmed the bug on February 18 and says a fix has been deployed worldwide. The company’s statement emphasized that no one accessed data they weren’t already authorized to see. That framing misses why organizations use confidentiality labels in the first place.
What Happened
The bug, tracked internally as CW1226324, first appeared on January 21, 2026. It affected the “work tab” in Copilot Chat - the interface that lets users interact with content across their Microsoft 365 environment.
According to Microsoft’s acknowledgment, “a code issue is allowing items in the sent items and draft folders to be picked up by Copilot even though confidential labels are set in place.”
When users applied sensitivity labels like “Confidential” to emails - protections specifically designed to prevent automated processing and restrict access - Copilot ignored them. The AI would read messages in Sent Items and Drafts folders and provide summaries on request, treating labeled content identically to unlabeled content.
Microsoft deployed a configuration update on February 19 that the company says addresses the issue globally for enterprise customers.
Microsoft’s Defense Misses the Point
Microsoft’s official statement emphasized that “this did not provide anyone access to information they weren’t already authorized to see.” The company framed the bug as a UI issue rather than a data breach.
This is technically accurate and entirely beside the point.
Organizations don’t apply confidentiality labels because they want to hide emails from the person who wrote them. They apply labels because they want to control how that data flows through automated systems. The whole point of a DLP policy is to prevent sensitive content from being processed, summarized, extracted, or transmitted in ways the organization hasn’t approved.
When an executive drafts an email about an upcoming acquisition and labels it confidential, they’re not trying to hide it from themselves. They’re trying to ensure it doesn’t get fed into an AI model, summarized in a chat interface, or processed by any system that might leak, log, or expose that content.
Copilot reading and summarizing confidential emails is exactly the scenario these controls are designed to prevent. That it was the same user’s emails doesn’t make it acceptable - it means the fundamental data governance model broke.
Why This Matters for Enterprises
Microsoft 365 Copilot integrates across Word, Excel, Outlook, and other Office applications. It has access to email, documents, calendars, and chat. Organizations adopted it specifically because of this deep integration.
But deep integration requires deep trust. Organizations deploying Copilot have to believe that Microsoft’s security controls actually work - that when they configure DLP policies and apply sensitivity labels, those protections will be honored by every system in the Microsoft ecosystem, including the AI.
This bug demonstrated that, for nearly a month, those controls didn't work. Every Copilot user in affected organizations could have asked the AI to summarize content that DLP policies should have blocked.
The implications extend beyond the specific emails involved:
Compliance uncertainty. Organizations in regulated industries use sensitivity labels to demonstrate control over protected data. If a HIPAA-covered entity applied labels to emails containing patient information, and Copilot summarized that content anyway, they have a potential compliance incident - regardless of whether anyone outside the organization saw it.
Audit trail gaps. When Copilot summarizes an email, does that activity get logged? If an organization needs to demonstrate that confidential content wasn’t processed by AI systems, can they? The answers aren’t obvious.
Policy enforcement trust. Security teams now have to wonder what other controls might fail silently. If DLP policies didn’t work on AI summarization for a month without anyone noticing, what else isn’t working?
The Broader Pattern
This isn’t an isolated incident. Enterprise AI tools are accumulating a track record of security controls failing to apply to AI features:
- GitHub Copilot had three critical RCE vulnerabilities that let attackers execute arbitrary commands through unsanitized inputs
- AI agent frameworks have been caught ignoring security policies entirely in favor of task completion
- Internal research at major AI labs shows that text-based safety training doesn’t transfer to tool calls
The pattern is consistent: security controls designed for traditional software don’t automatically apply to AI features bolted onto that software. AI systems have different attack surfaces, different failure modes, and different ways of processing data than the applications they integrate with.
When Microsoft built Copilot to read emails, they knew it needed to respect sensitivity labels. The bug wasn’t a missing feature - it was a failure in the implementation of a feature they knew they needed. That’s the kind of bug that erodes confidence in the entire system.
What Organizations Should Do
Verify the fix is deployed. Microsoft says the configuration update rolled out globally on February 19. Confirm your tenant has received it. Don’t assume.
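If you have Microsoft Graph access, you can check the advisory programmatically instead of clicking through the admin center. A minimal sketch, assuming the CW1226324 advisory is published to your tenant's service health feed and that your token carries the ServiceHealth.Read.All permission:

```python
# Sketch: check the status of the CW1226324 advisory in the Microsoft 365
# service health feed via Microsoft Graph. Assumes the advisory is visible
# in your tenant; token acquisition (MSAL, etc.) is omitted for brevity.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"  # needs ServiceHealth.Read.All

resp = requests.get(
    f"{GRAPH}/admin/serviceAnnouncement/issues/CW1226324",
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
issue = resp.json()

# A status of 'serviceRestored' is the signal you want; anything else means
# the fix hasn't been confirmed for your tenant yet.
print(issue.get("title"), "->", issue.get("status"))
print("Last updated:", issue.get("lastModifiedDateTime"))
```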
Review Copilot access logs. Determine whether users in your organization requested summaries of content that should have been protected during the vulnerability window. If you can’t determine this, that’s a separate problem worth addressing.
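The Microsoft Purview unified audit log is the natural place to look. The sketch below uses the Graph audit log query API; the `copilotInteraction` record type and the field names are assumptions to verify against your own tenant, and the query runs asynchronously, so you create it and then poll.

```python
# Sketch: query the Purview unified audit log for Copilot interactions
# during the vulnerability window (January 21 - February 19, 2026).
# Assumptions: the /security/auditLog/queries endpoint is enabled for your
# tenant, Copilot activity is recorded under the 'copilotInteraction' record
# type, and TOKEN carries the AuditLogsQuery.Read.All permission.
import time
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

query = {
    "displayName": "Copilot activity during CW1226324 window",
    "filterStartDateTime": "2026-01-21T00:00:00Z",
    "filterEndDateTime": "2026-02-19T23:59:59Z",
    "recordTypeFilters": ["copilotInteraction"],
}

resp = requests.post(f"{GRAPH}/security/auditLog/queries",
                     headers=HEADERS, json=query, timeout=30)
resp.raise_for_status()
query_id = resp.json()["id"]

# Audit queries run asynchronously; poll until this one finishes.
while True:
    status = requests.get(f"{GRAPH}/security/auditLog/queries/{query_id}",
                          headers=HEADERS, timeout=30).json().get("status")
    if status in ("succeeded", "failed"):
        break
    time.sleep(30)

# Page through the matching records and look for summarization of messages
# that carried a sensitivity label.
records = requests.get(f"{GRAPH}/security/auditLog/queries/{query_id}/records",
                       headers=HEADERS, timeout=30)
records.raise_for_status()
for record in records.json().get("value", []):
    print(record.get("userPrincipalName"), record.get("createdDateTime"))
```

An empty result because unified auditing was never enabled is exactly that separate problem.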
Test your controls. Create test emails with sensitivity labels and verify Copilot actually respects them now. Don’t take Microsoft’s word for it.
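A lightweight way to do this is a canary message. The sketch below creates a distinctively named draft via Graph; apply your organization's Confidential label to it in Outlook (label application on mail isn't uniformly scriptable), then ask Copilot Chat to summarize your drafts and see whether the canary shows up. The subject marker is an illustrative convention, not a Microsoft one, and the token needs delegated Mail.ReadWrite.

```python
# Sketch: create a 'canary' draft to test whether Copilot now honors
# sensitivity labels. After running this, apply the Confidential label to
# the draft in Outlook, then ask Copilot Chat to summarize your drafts: the
# canary should NOT appear in the answer.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"  # delegated, with Mail.ReadWrite

draft = {
    "subject": "CANARY-DLP-TEST-2026-02 (synthetic, do not summarize)",
    "body": {
        "contentType": "Text",
        "content": "Synthetic confidential content for label verification.",
    },
}

resp = requests.post(
    f"{GRAPH}/me/messages",  # creates the message in the Drafts folder
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=draft,
    timeout=30,
)
resp.raise_for_status()
print("Canary draft created:", resp.json()["id"])
```

If the canary surfaces after the label is applied, the controls still aren't being honored, and that's worth a support case.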
Assess your exposure. If your organization uses sensitivity labels for regulatory compliance, document the incident and your response. You may need this for audit purposes.
Reconsider your Copilot deployment scope. Some organizations may want to restrict Copilot access until they’re confident that DLP and sensitivity controls are working reliably. The productivity benefits of AI don’t justify exposing confidential communications.
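One blunt but effective lever is license assignment. A minimal sketch using the Graph assignLicense action follows; the Copilot SKU GUID is a placeholder you'd look up with `GET /subscribedSkus` in your own tenant, and the token needs User.ReadWrite.All.

```python
# Sketch: pull the Microsoft 365 Copilot license from a user via the Graph
# assignLicense action. COPILOT_SKU_ID is a placeholder - find the real GUID
# for your tenant with GET /subscribedSkus.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
TOKEN = "<access-token>"  # needs User.ReadWrite.All
COPILOT_SKU_ID = "<copilot-sku-guid>"  # from GET /subscribedSkus

def remove_copilot(upn: str) -> None:
    """Remove the Copilot SKU from one user, leaving other licenses alone."""
    resp = requests.post(
        f"{GRAPH}/users/{upn}/assignLicense",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"addLicenses": [], "removeLicenses": [COPILOT_SKU_ID]},
        timeout=30,
    )
    resp.raise_for_status()

remove_copilot("user@example.com")
```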
The Bottom Line
Microsoft shipped an AI feature that ignored the security labels its own platform uses to protect sensitive content. For nearly a month, any Copilot user could have asked the AI to summarize emails their organization had explicitly marked as confidential.
Microsoft’s defense - that users only saw their own emails - fundamentally misunderstands why organizations use these controls. The purpose of a DLP policy isn’t just to hide data from unauthorized people. It’s to control how data flows through automated systems.
When your AI ignores your own security controls for a month, saying “but no one else saw it” isn’t reassuring. It’s an admission that the controls don’t work as advertised.