AI Security This Week: Cursor RCE, 8,000 Exposed MCP Servers, and LLMs Jailbreaking LLMs

A critical Cursor vulnerability let attackers execute arbitrary code through shell built-in commands that bypassed every security control. Security researchers found over 8,000 MCP servers sitting on the public internet with no authentication. And a new study in Nature Communications shows that large reasoning models can autonomously jailbreak other AI systems with a 97% success rate.

Here’s what went wrong this week and why these vulnerabilities matter for anyone using AI development tools.

Cursor RCE: Shell Built-ins Bypass Everything

Cursor, the AI-powered code editor used by millions of developers, patched a critical vulnerability this month that enabled remote code execution through indirect prompt injection.

The flaw, tracked as CVE-2026-22708, exploited implicit trust in shell built-in commands like export and typeset. These commands could execute without user notification or approval, even when a user’s allowlist was completely empty.

An attacker could chain these built-ins to poison your shell environment. For example, redirecting malicious code to ~/.zshrc (your shell startup script) combined with the export command - which required no approval - gave attackers persistent code execution every time you opened a terminal.

The attack worked in both “zero-click” and “one-click” scenarios. A malicious MCP server, a compromised repository, or even content loaded from the web could trigger it.

Cursor released a fix and now requires explicit user approval for any commands their parser can’t classify. But their security guidelines now explicitly warn against trusting allowlists as a security barrier - even “safe” commands can be weaponized through environmental manipulation.

If you’re using Cursor, update to version 2.3 or later immediately.

8,000 MCP Servers: The Agentic AI Security Crisis

The Model Context Protocol was supposed to make AI agents more capable by giving them structured access to external tools and data. Instead, it’s created a sprawling attack surface that nobody secured.

In February 2026, security researchers scanned the internet and found over 8,000 MCP servers publicly accessible with admin panels, debug endpoints, or API routes exposed without any authentication.

The vulnerabilities keep stacking up:

CVE-2025-6514 in mcp-remote, a popular OAuth proxy for connecting local MCP clients to remote servers, allowed remote code execution through command injection. Over 500,000 developers were affected.

CVE-2026-2256 in MS-Agent’s Shell tool let attackers read secrets like API keys and tokens, drop payloads, modify workspace state, establish persistence, and pivot to internal services - all because the shell command input wasn’t sanitized.

Invariant Labs demonstrated that a malicious MCP server could silently exfiltrate a user’s entire WhatsApp history by combining “tool poisoning” with a legitimate whatsapp-mcp server running in the same agent.

The core problem: MCP was designed with an implicit trust model. It lacks robust built-in security controls, which enables prompt injection, tool poisoning, and credential theft. The protocol’s power comes from giving AI agents access to sensitive systems. That same power makes it dangerous when exposed.

Autonomous Jailbreak Agents: 97% Success Rate

A March 2026 study published in Nature Communications found that large reasoning models can function as autonomous jailbreak agents - LLMs attacking other LLMs - with a 97.14% success rate.

The researchers discovered that the persuasive capabilities of large reasoning models (LRMs) simplify and scale jailbreaking, converting it into an inexpensive activity accessible to non-experts. You don’t need to craft elaborate prompts anymore. You can just ask a reasoning model to do it for you.

A separate study found that Persuasive and Authority Prompting (PAP) outperformed every other jailbreak strategy tested, including the classic DAN persona approach. When a prompt invokes expertise, urgency, or institutional framing - “as a cybersecurity researcher conducting authorized testing…” - the model’s helpfulness training overrides its safety guardrails.

Meanwhile, researchers developing the Head-Masked Nullspace Steering (HMNS) technique presented at ICLR 2026 showed that probing LLMs internally to identify weaknesses in their safety guardrails is more effective than clever prompt manipulation alone. They’re attacking the model’s “decision pathways” directly.

The implications: AI safety measures are failing against AI-powered attacks. Guardrails trained to resist human manipulation aren’t equipped for adversaries that can iterate thousands of times faster and probe model internals systematically.

The Wider Damage: n8n, LangChain, and More

Several other critical vulnerabilities surfaced in the AI tooling ecosystem:

n8n (CVE-2026-21858, CVSS 10.0): A critical unauthenticated RCE vulnerability in the popular workflow automation platform let attackers take over locally deployed instances completely. They could forge administrator sessions, extract credentials, and pivot to connected infrastructure. Affected versions prior to 1.121.0.

LangChain (CVE-2025-68664, CVSS 9.3): A serialization injection vulnerability in LangChain’s dumps() and dumpd() functions enabled secret extraction from environment variables and potential arbitrary code execution through Jinja2 templates. Update to 1.2.5 or 0.3.81.

GitHub Copilot (CVE-2026-21516, CVE-2026-21256): Command injection flaws in Copilot’s JetBrains and Visual Studio integrations scored CVSS 8.8. Developer workstations hold private keys, long-lived tokens, and CI credentials - a single local code execution can cascade into supply chain compromise.

Google Gemini ASCII Smuggling: Google is refusing to patch a vulnerability that lets attackers embed invisible commands in text using Unicode characters. OpenAI’s ChatGPT, Anthropic’s Claude, and Microsoft’s Copilot already sanitize these inputs. Gemini, Grok, and DeepSeek remain vulnerable.

What This Means

This week’s incidents reveal an uncomfortable truth: the AI development ecosystem is deeply insecure.

The tooling is moving faster than security. Cursor’s allowlist bypass, MCP’s authentication gaps, and n8n’s RCE all stem from the same cause - shipping features before securing them. Developers trust these tools with credentials, source code, and system access. That trust is increasingly misplaced.

AI attacking AI is now practical. The 97% jailbreak success rate from autonomous agents isn’t theoretical research. It’s a capability that exists today. Attackers don’t need deep expertise anymore - they can delegate the hard work to reasoning models.

Implicit trust is everywhere. MCP assumes servers are trustworthy. Cursor assumed shell built-ins were safe. LangChain assumed user data wouldn’t contain serialization markers. Every implicit trust assumption becomes an attack vector.

What You Can Do

For Cursor users: Update to version 2.3+ immediately. Review what MCP servers and external content your environment loads. Don’t rely on allowlists for security.

For MCP users: Audit any MCP servers you’re running. Ensure they’re not exposed to the public internet. Require authentication. Be extremely cautious about which servers you connect to - tool poisoning attacks can exfiltrate data through seemingly legitimate servers.

For n8n users: Update to 1.121.0 or later. If you’re running n8n anywhere accessible beyond localhost, assume it may have been compromised if you were on an older version.

For LangChain users: Update to 1.2.5 or 0.3.81. Audit any code that serializes user-controlled data.

For everyone: The common thread is that AI tools are being trusted with system access they haven’t earned. Treat AI development tools like any other software with privileged access: keep them updated, audit their network exposure, and assume they can be manipulated through their inputs.

The velocity of AI feature development shows no signs of slowing. Neither does the rate at which vulnerabilities are being discovered. The security gap is widening.