Chatbots that answer questions are so 2025. Now Microsoft wants an AI that does your actual work - in its own virtual computer, running in the cloud, while you’re off doing something else.
Copilot Tasks, announced this week, is Microsoft’s entry into the rapidly expanding market for agentic AI: systems that don’t just generate text but take actions on your behalf across multiple applications and services.
How It Works
You describe what you want done in plain English - “Arrange three client meetings in Boston next month, draft agendas, and book flexible hotel rooms” - and Copilot breaks that into a multi-step plan.
Once you approve the plan, something unusual happens. Instead of running on your laptop or phone, Copilot spins up its own cloud-hosted computer and browser. It executes the steps in this isolated sandbox, interacting with web services and applications as if it were a remote employee sitting at a virtual desk.
Tasks can run once, on a schedule, or on a recurring basis. The system reports back when it’s done.
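To make the workflow concrete, here is one way such a plan might be represented internally. This is a purely hypothetical sketch - the `TaskPlan` and `Step` names are invented for illustration, not Microsoft's actual API:

```python
from dataclasses import dataclass, field
from enum import Enum

class Schedule(Enum):
    ONCE = "once"
    RECURRING = "recurring"

@dataclass
class Step:
    description: str
    needs_approval: bool = False  # e.g. payments or sending messages

@dataclass
class TaskPlan:
    goal: str
    steps: list[Step] = field(default_factory=list)
    schedule: Schedule = Schedule.ONCE

# The Boston-meetings request, decomposed into approvable steps:
plan = TaskPlan(
    goal="Arrange three client meetings in Boston next month",
    steps=[
        Step("Find mutually free slots in attendees' calendars"),
        Step("Draft an agenda for each meeting"),
        Step("Book flexible hotel rooms", needs_approval=True),
    ],
)
```

The key design point is that each step carries its own approval flag, so the system can execute routine steps autonomously while pausing on consequential ones.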
Three AI Brains, One Interface
Under the hood, Copilot Tasks pairs the base Copilot model with two specialized reasoning agents that Microsoft rolled out separately last year:
Researcher uses OpenAI’s deep research model to conduct multi-step investigations across the web and your work data. It can pull from third-party sources like Salesforce, ServiceNow, and Confluence through connectors.
Analyst runs on OpenAI’s o3-mini reasoning model and handles advanced data analysis. It can write and execute Python code in real time - you can watch the code run and check its work.
A mode selector lets you choose Auto (the system decides which to use), Researcher, or Analyst depending on the task.
What It Can Actually Do
Early examples from Microsoft and testers include:
- Surfacing urgent emails each evening with pre-drafted replies
- Monitoring apartment or rental car listings and booking viewings
- Compiling Monday morning briefings summarizing meetings and travel plans
- Tracking competitor pricing weekly
- Unsubscribing from promotional emails
- Scheduling rides and checking hotel prices
The common thread: recurring tasks that require browsing multiple sites, coordinating across applications, and following conditional logic.
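That conditional logic is the interesting part. A single pass of a recurring monitoring task - say, the weekly competitor-pricing example - might reduce to something like the sketch below (the function and its callbacks are invented for illustration; in a real agent, `fetch_price` would browse the web and `notify` would draft a message):

```python
def run_price_monitor(fetch_price, threshold, notify):
    """One pass of a hypothetical recurring pricing task: fetch a
    competitor's price and act only when a condition is met."""
    price = fetch_price()
    if price < threshold:
        notify(f"Competitor price dropped to {price}")
        return True
    return False

# Stubbed usage with an in-memory notification list:
alerts = []
triggered = run_price_monitor(lambda: 89.0, 100.0, alerts.append)
```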
The Permission Question
Microsoft emphasizes that users retain control. Before taking “significant actions” like making payments or sending messages, the system requests consent. You can review, pause, or cancel tasks at any point.

But this raises the question: how much do you actually review?
The promise of agentic AI is that it handles busywork autonomously. If you have to carefully review every action, you’ve just replaced busywork with review work. If you don’t review carefully, you’re trusting an AI to send messages, make purchases, and modify data on your behalf.
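The consent gate itself is conceptually simple. Here is a minimal sketch of the pattern, assuming a fixed set of "significant" action types and a yes/no confirmation callback - this is an illustration of the general approach, not Microsoft's actual consent mechanism:

```python
SIGNIFICANT_ACTIONS = {"make_payment", "send_message"}

def execute(action, confirm):
    """Run an action, gating 'significant' ones behind an explicit
    user-confirmation callback (illustrative sketch only)."""
    if action in SIGNIFICANT_ACTIONS and not confirm(action):
        return "cancelled"
    return f"executed {action}"

# Routine actions run autonomously; payments wait for a yes/no.
result_read = execute("summarize_inbox", confirm=lambda a: False)
result_pay = execute("make_payment", confirm=lambda a: False)
```

The review-burden problem is visible even here: everything hinges on how often `confirm` fires and how carefully the user reads what it shows.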
Security Concerns Are Real
Microsoft itself has been documenting the security challenges of agentic AI deployments. The company’s security team published research earlier this month identifying common risks:
Cross-prompt injection attacks where malicious content in documents or UI elements overrides agent instructions, potentially leading to data exfiltration or malware installation.
Excessive privileges where agents have broader access than they need, creating larger blast radii when things go wrong.
Insufficient authentication where agents are exposed without proper access controls.
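The standard mitigation for the excessive-privileges risk is least-privilege scoping: an agent can only perform actions whose scopes were explicitly granted. A minimal sketch, with invented scope names for illustration:

```python
def is_allowed(granted_scopes, required_scope):
    """Least-privilege check: permit an action only if its scope was
    explicitly granted to the agent (illustrative sketch)."""
    return required_scope in granted_scopes

# An inbox-summarizing agent granted read-only access cannot send
# mail, limiting the blast radius if it is hijacked via injection.
granted = {"mail.read", "calendar.read"}
can_read = is_allowed(granted, "mail.read")
can_send = is_allowed(granted, "mail.send")
```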
Just weeks ago, researchers disclosed a Microsoft 365 Copilot vulnerability that allowed the AI to bypass data loss prevention policies and access confidential emails it shouldn’t have been able to read.
These aren’t theoretical concerns - they’re active problems Microsoft is working to address even as it expands agentic capabilities.
The Competitive Landscape
Microsoft isn’t alone in this race. The past few months have seen:
- Anthropic launching Claude Cowork, turning its AI into a persistent digital worker that handles recurring tasks autonomously
- Google releasing Gemini task automation for Android, letting the AI operate apps autonomously in a secure virtual window
- OpenAI introducing Agent Mode in ChatGPT
- Perplexity offering Computer use capabilities
Everyone is building AI agents. The differentiation is in execution: how well does it work, how secure is it, and how seamlessly does it integrate with existing workflows?
Microsoft’s advantage is obvious: tight integration with the Microsoft 365 ecosystem that runs much of corporate America. If you’re already paying for Copilot and your company lives in Outlook, Teams, and Office, having an agent that can orchestrate across those applications is compelling.
Who Gets Access
Copilot Tasks launched this week as a “research preview” for a limited group of users. Microsoft plans to expand access gradually, with a waitlist available through Microsoft’s website.
No word yet on pricing, though it’s reasonable to expect it will require a Copilot subscription and potentially additional charges for compute-intensive tasks that spin up cloud resources.
The Bigger Picture
The shift from chatbots to agents represents a genuine evolution in how AI integrates into work. Answering questions is useful. Drafting documents is useful. But actually executing multi-step workflows - booking meetings, monitoring prices, managing email - gets closer to the original promise of AI as a digital assistant.
The tradeoffs are equally real. More autonomy means more trust, and more potential for things to go wrong at scale. A chatbot's bad advice causes harm only if you act on it; an agent's bad actions do the damage directly.
Microsoft is betting that the productivity gains outweigh the risks, and that its security controls are good enough to contain the inevitable problems. For enterprises evaluating agentic AI, that bet is worth examining carefully.
The Bottom Line
Copilot Tasks is Microsoft’s most ambitious AI autonomy play yet: a system that creates its own virtual computer to execute multi-step workflows while you’re not watching. The technology is impressive; the question is whether the guardrails are strong enough to match the capabilities.