Your AI Agent Is the New Insider Threat

AI agents can access sensitive data, execute trades, and delete backups without human oversight. Most companies aren't ready for what happens when they go wrong.

The threat just gained employee privileges.

Today’s Darktrace report on AI security painted a concerning picture: 73% of security professionals say AI-powered threats are already significantly impacting their organizations. But buried in the data is a more specific warning. Issy Richards, Darktrace’s VP of Product, put it bluntly: “Agentic AI introduces a new class of insider risk.”

These systems can act with the reach of an employee - accessing sensitive data, triggering business processes, executing trades - without human context or accountability. And unlike human insiders who work at human speed, a compromised agent can poison 87% of downstream decision-making within four hours.

We’re not talking about hypothetical risk anymore.

The Scale Problem Nobody’s Ready For

Gartner estimates that 40% of enterprise applications will integrate task-specific AI agents by the end of 2026. That’s up from less than 5% in 2025. Non-human identities - service accounts, API keys, AI agents - are expected to exceed 45 billion globally by year’s end. That’s more than twelve times the human workforce.

Yet only 10% of organizations report having a strategy for managing these autonomous systems.

Meanwhile, Palo Alto Networks Chief Security Intel Officer Wendi Whitmore identified AI agents as the primary insider threat for 2026. Her reasoning is straightforward: autonomous agents are always-on and often have privileged access. This makes them high-value targets. Attackers are starting to bypass traditional lateral movement entirely - instead, they “go straight to the internal LLM and start querying the model, having it do all the work on their behalf.”

The insider threat isn’t a person anymore. It’s the helpful AI assistant you deployed last quarter.

How Agent Attacks Actually Work

In December 2025, OWASP released its Top 10 for Agentic Applications - a framework developed with more than 100 security experts to classify the biggest risks. What distinguishes this from the existing LLM Top 10 is the focus on autonomy: these aren’t just language model vulnerabilities, they’re risks that emerge when AI systems can plan, decide, and act across multiple steps and systems.

The real-world incidents behind the framework read like a catalog of everything that can go wrong.

Tool Misuse (ASI02): In July 2025, a malicious pull request injected destructive instructions into Amazon Q’s codebase: “delete file-system and cloud resources using AWS CLI commands.” The extension lived in the wild for five days before detection. Over one million developers had installed it. Amazon confirmed no functional damage occurred, but the attack chain was proven.

Supply Chain Poisoning (ASI04): In September 2025, researchers discovered the first malicious MCP server in the wild - a package on npm impersonating Postmark’s email service. It worked as advertised, but every message sent through it was secretly BCC’d to an attacker. Any AI agent using this for email operations was unknowingly exfiltrating every message it sent.

Another MCP package contained two reverse shells - one triggering at install, one at runtime for redundancy. It downloaded fresh payloads per installation, enabling targeted attacks. The campaign reached 126 packages and 86,000 downloads.

Unexpected Code Execution (ASI05): In November 2025, researchers found three unsanitized command injection flaws in official Anthropic extensions for Claude Desktop - the Chrome connector, iMessage connector, and Apple Notes connector. The attack chain: user asks Claude a question, web search returns an attacker-controlled page, hidden instructions trigger the vulnerable extension, arbitrary code executes with system privileges. Attackers could access SSH keys, AWS credentials, and browser passwords. CVSS 8.9 severity. Anthropic patched it, but the window existed.

Memory Poisoning: The Long Game

The most insidious threat is memory poisoning (ASI06 in the OWASP framework). Unlike prompt injection that ends when the session closes, poisoned memory persists.

Here’s how it works: an adversary feeds an agent subtle false information across multiple interactions. Over weeks or months, these entries integrate into the agent’s operational context. The agent “learns” the malicious instructions and recalls them in future sessions. Its safety guardrails erode not through a single attack, but through gradual corruption.

Research has shown that five crafted documents can manipulate AI responses 90% of the time. The “MINJA” memory injection attack achieves over 98% success rates. A poisoned memory entry might convince a financial agent that a specific, non-existent account is legitimate. A DevOps agent might be steered to misuse a privileged API key.

In healthcare, an EHR agent assisting clinicians could be attacked to redirect patient identifiers - returning medical records for the wrong patient, potentially leading to misdiagnosis or incorrect medications.

As AI systems begin sharing context and collaborating, poisoned memory in one agent could propagate to others through shared preference databases or collaborative memory pools. One compromised assistant could corrupt an entire ecosystem.

When Agents Talk to Agents

Multi-agent systems introduce cascading failures - classified as ASI08 by OWASP. A single fault, whether a hallucination, prompt injection, or corrupted data, propagates across autonomous agents and amplifies into system-wide harm.

Unlike traditional software errors that stay contained, agentic AI failures multiply through agent-to-agent communication, shared memory, and feedback loops.

Galileo AI’s December 2025 research on multi-agent failures quantified the problem: in simulated systems, a single compromised agent poisoned 87% of downstream decision-making within four hours. Studies documenting 1,642 execution traces across production multi-agent systems show failure rates ranging from 41% to 86.7%.

Most organizations still rely on human-centric insider threat detection calibrated for human-velocity attack patterns. Agents don’t work at human speed.

The Superuser Problem

Whitmore identified what Palo Alto Networks calls the “superuser problem.” Autonomous agents granted broad permissions can chain together access to sensitive applications and resources without security teams’ awareness.

A well-crafted prompt injection or tool-misuse exploit can co-opt an organization’s most powerful, trusted “employee.” The attacker suddenly has an autonomous insider that can silently execute trades, delete backups, or exfiltrate entire customer databases.

And here’s the governance failure: only 37% of organizations have formal AI deployment security policies - down 8 points from last year, according to the Darktrace report. Security policies are declining as deployment accelerates.

What Defense Looks Like

The answer isn’t to stop deploying agents - that ship has sailed. But the security model needs to catch up to the deployment model.

Least privilege, applied to agents: “It becomes equally important for us to make sure that we are only deploying the least amount of privileges needed to get a job done, just like we would do for humans,” Whitmore said. OWASP introduces the concept of “least agency” - only grant agents the minimum autonomy required for safe, bounded tasks.

Human-in-the-loop checkpoints: For actions with financial, operational, or security impact, implement validation layers that act as circuit breakers. An agent should never be allowed to transfer funds, delete data, or change access control policies without explicit human approval.

Memory integrity controls: Implement immutable audit trails for agent long-term storage. Know what your agent “remembers” and how it changes over time.

Supply chain scanning: Know what code is inside your agents before deployment. The Barracuda Security report from November 2025 identified 43 agent framework components with embedded vulnerabilities introduced via supply chain compromise.

Identity threat detection: Monitor agent activity for anomalous behavior - unusual access patterns, privilege escalations, unexpected tool usage. Apply the same scrutiny you’d give a human insider with privileged access.

The Accountability Question

California’s AB 316, which took effect January 1, 2026, precludes defendants from using an AI system’s autonomous operation as a defense to liability claims. If your agent causes harm, you cannot argue that you lacked control over its decisions. The “AI did it” defense is legally foreclosed.

Colorado’s AI Act, effective June 2026, will require deployers of high-risk AI systems to conduct annual impact assessments and implement risk management programs.

The regulatory framework is hardening around the assumption that organizations are responsible for what their agents do - even when those agents act autonomously.

The Darktrace report found that 46% of security professionals feel unprepared to defend against AI-driven attacks. That number needs to drop fast. The agent you deployed to improve productivity has the same access as a senior employee.

Treat it like one.


The OWASP Top 10 for Agentic Applications 2026 is available at genai.owasp.org.