Desktop agents, not copilots, are the enterprise battleground for agentic AI

Desktop-resident AI agents represent a step change in what non-technical users can build and deploy as agentic automations. The key unlock is arbitrary code execution. The result is a growing divide between companies that embrace desktop agents and those stuck behind restrictive copilot deployments.

Andrew Bird
Head of AI
Affinda team

For the last two years, most enterprise AI has been delivered as a copilot: summarise this, draft that, suggest the next step inside a sanctioned application. Useful? Yes. Transformative? Only in narrow slices of work.

That frame is now breaking.

A new class of desktop-resident AI agents – OpenClaw, Claude Code, Cowork, Codex, and the products converging around them – does not just answer questions or autocomplete text. These systems can write and execute code, manipulate files, use the browser, call APIs, schedule tasks, and keep working after the prompt ends. In one memorable phrase from CNBC’s Deirdre Bosa, AI has gone “from talking to doing.”

If you want a practical answer to what is agentic AI, start there: agentic AI is software that can take a goal, choose tools, act on the environment, and verify progress with limited supervision. And today, the clearest expression of that idea is not the chatbot. It is the desktop agent.

This matters because it creates a strategic divide in the enterprise AI market. On one side will be companies that find a way to let employees use desktop agents safely within their risk tolerance. On the other will be companies that remain stuck behind restrictive copilot deployments – limited to vendor-controlled, sandboxed experiences that can suggest, but not truly execute.

The first group will build compounding advantages. The second will get incremental productivity and call it a strategy.

What makes desktop agents different? Arbitrary code execution.

The unlock is simple: arbitrary code execution.

Once an AI agent can write and run code on a user’s machine, it is no longer confined to the boundaries of a single app. It can bridge systems the way a human does: open a folder, parse PDFs, clean a spreadsheet, log into a dashboard, call an internal API, move data into a CRM, draft an email, and create a report. In principle, if a human can complete the workflow on a computer, a capable desktop agent can automate large parts of it.
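To make that concrete, here is a minimal sketch of the kind of glue script a desktop agent might write and run on a user's machine: scan a folder of plain-text invoice exports, pull out a couple of fields, and produce a structured summary. The file naming, fields, and patterns are hypothetical; a real agent would target whatever formats the actual workflow uses.

```python
# Sketch of agent-generated glue code bridging a folder of documents
# into a structured output. Field labels and regexes are illustrative.
import csv
import re
from pathlib import Path

def summarise_invoices(folder: str, out_csv: str) -> int:
    """Read plain-text invoice exports, extract vendor and total, write a CSV."""
    rows = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        vendor = re.search(r"Vendor:\s*(.+)", text)
        total = re.search(r"Total:\s*\$?([\d,]+\.\d{2})", text)
        if vendor and total:
            rows.append({
                "file": path.name,
                "vendor": vendor.group(1).strip(),
                "total": float(total.group(1).replace(",", "")),
            })
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["file", "vendor", "total"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Nothing here is sophisticated, and that is the point: the agent's power comes from being able to generate and execute ordinary scripts like this on demand, across whatever systems the user can reach.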

That is why desktop agents are fundamentally more powerful than copilots.

A copilot lives inside an approved context. It can help you in Excel, or in your CRM, or in your email client. A desktop agent works across the messy reality of actual enterprise work: browser tabs, local files, shared drives, APIs, PDFs, spreadsheets, dashboards, internal tools, and all the undocumented handoffs in between.

This is also why the phrase “AI agent takes control of computer” has resonated so strongly. It sounds dramatic, but it is basically correct.

Anthropic’s own product evolution makes the point. When it launched Cowork, it effectively admitted that Claude Code had escaped the coding box. Users, it said, were employing it for “almost everything else.” Anthropic’s Boris Cherny listed examples that had little to do with software engineering: vacation research, slide decks, cleaning up email, recovering wedding photos, even controlling devices. The category is not defined by code as an end goal. Code is just the mechanism that gives the agent reach.

OpenClaw helped popularise the next set of primitives: persistent agents, scheduled tasks, and remote control from a phone. Anthropic quickly followed with remote control and scheduled tasks in Claude Code and Cowork. Perplexity launched “Computer.” Notion launched custom agents. This is not copycat product marketing. It is category convergence. The industry is discovering the same thing at once: agentic automation becomes dramatically more valuable when the agent can keep working, operate across tools, and execute real actions.

For enterprise leaders, that is the key shift. The question is no longer whether AI can help an employee think faster. It is whether AI can help an employee do more of the work itself.

The non-technical revolution is the real story

The narrative around AI agents still leans too heavily on developers. That is understandable – coding was the earliest visible proof point. But it is already incomplete.

The bigger story is that non-technical employees are becoming builders of their own automations.

One non-technical operator described Claude Code as unlocking three things for him: direct access to APIs, the ability to stitch multiple tools together, and the ability to run recurring scripts. Within about an hour, he had set up a daily email that ranked the three messages he most needed to answer. Another user fed Claude Code raw DNA data from an ancestry test and used it to identify health-related genes to monitor. Lenny Rachitsky used Cowork to analyse 320 podcast transcripts and extract the ten most important themes and lessons for product builders.

This is not “AI helps engineers code faster.” This is business users creating their own agentic automation.

That is why Claude Code is so often described as misnamed. As one observer put it, it should be thought of as “Claude Computer,” not Claude Code. That framing is much closer to reality.

The data supports this. In Anthropic’s study of real-world agent use, software engineering accounted for only around half of tool calls. The rest were outside engineering: back-office automation at 9.1%, marketing and copywriting at 4.4%, sales and CRM at 4.3%, finance and accounting at 4.0%, and more beyond that. In other words, roughly half of observed agentic AI use was already outside software engineering.

The AI Daily Brief’s own pulse surveys point in the same direction. Among its leading-edge user base, 62% reported automation or agentic use cases, and 71% had vibe coded in the previous month. That audience is not representative of the whole economy, but it is a useful leading indicator. The frontier has moved.

From Affinda’s vantage point, this matters especially in document-heavy workflows, where so much enterprise work already follows the same pattern: read, interpret, structure, verify, and route. Think claims packs, invoices, contracts, onboarding documents, procurement packets, loan files, and compliance evidence. These are exactly the kinds of workflows that desktop agents can orchestrate – especially when paired with reliable document extraction and validation layers.

This is where agentic AI use cases become real for operations, finance, HR, legal, and go-to-market teams.

Copilot vs. desktop agent is an architecture decision

For enterprise leaders evaluating AI agents for business, the most important choice is not which model wins the latest benchmark. It is which architecture your organisation is willing to embrace.

The copilot model is familiar:

- It lives inside a sanctioned application: Excel, the CRM, the email client.
- It suggests, drafts, and summarises, but does not truly execute.
- The vendor controls the sandbox, the context, and the boundaries.

The desktop agent model is different:

- It can write and run code, manipulate files, use the browser, and call APIs.
- It works across applications rather than inside one.
- It can keep working after the prompt ends, on schedules and long-running tasks.

This is why the divide now matters so much. Copilots are a bounded productivity layer. Desktop agents are execution infrastructure.

That does not mean copilots are obsolete. For many high-control environments, they remain sensible. But leaders should be honest about the trade-off: a copilot strategy is also a constraint strategy. You are choosing a thinner version of AI.

Cowork is interesting because it suggests a middle ground. Anthropic appears to mount only explicitly approved files into a containerised or VM-like environment and to ask for approval before significant actions. That is an important design move. It preserves much of the power of a desktop agent while introducing a governable execution model.

Even then, compared with traditional copilots, it is still frighteningly powerful.

And the risks are real. Anthropic explicitly warns about prompt injection and destructive actions. A desktop agent that can read files, click around the web, and run code creates a larger attack surface than a copilot living inside a sidebar. Enterprises should treat that as a feature of the category, not an edge case.

But the answer is not to pretend these tools do not exist.

In fact, Anthropic’s user behaviour data hints at what mature AI agent governance looks like. More experienced Claude Code users enabled full auto-approval roughly 40% of the time, versus about 20% for newer users. But they also interrupted the agent more often: 9% versus 5%. The lesson is important. Effective governance is not full autonomy and it is not full lockdown. It is active supervision.

The real governance question is not, “Should we allow desktop agents?” The real question is, “Under what conditions, in which environments, with which permissions, approvals, and logs, do we allow them to act?”

That is a much more useful enterprise AI governance question.

The market is not software assistance. It is knowledge work.

This is why desktop agents are a much bigger category than copilots.

SemiAnalysis described the opportunity as a “$15 trillion information work economy” and posed the provocative question: “If agents can eat software, what labour pool can they not touch?” The phrasing is intentionally sharp, but the core insight is correct. Once AI can handle the universal work pattern of read, think, write, verify, the addressable market expands from software assistance to white-collar execution.

That pattern is not unique to code. It is how most knowledge work operates:

- Read the inputs: documents, emails, tickets, records.
- Think: interpret, reconcile, decide.
- Write: produce the report, the entry, the reply.
- Verify: check the output against the source and route it onward.

Software engineering simply reached the threshold first because code is unusually verifiable. The broader logic generalises.

This is also why the market conversation should not get stuck on labour substitution alone. Aaron Levie recently made a Jevons-paradox-style argument for knowledge work: when the cost of producing software, analysis, research, and workflow automation falls, organisations do not simply do the same amount for less money. They do far more of it. More internal tools. More experiments. More segmentation. More reporting. More automation. More edge-case handling.

In that sense, desktop agents are not just “AI employees.” They are a new operating layer for knowledge work.

The strategic importance of the category is already visible in vendor behaviour. OpenAI hired OpenClaw creator Peter Steinberger to work on “the next generation of personal agents,” saying such systems would quickly become core to its product offerings. Anthropic built Cowork in 10 days with four engineers, and most of the code was reportedly written by Claude Code itself. At Notion, Ivan Zhao says the company already has around 700 agents supporting a workforce of roughly 1,000 employees.

The center of gravity is moving.

An AI governance framework for desktop agents

So what should enterprise leaders actually do?

Not ban the category. Not allow a free-for-all. The right answer is to build a workable AI governance framework for desktop agents now.

A practical starting point looks like this:

1. Start where the upside is obvious and the blast radius is limited

Target repetitive, document-heavy, multi-system workflows first: intake, reconciliation, exception handling, research prep, reporting, and internal operations. These are ideal for agentic automation because the labour is real, the interfaces are fragmented, and the gains are measurable.

2. Put agents in managed environments

Use isolated VMs, managed desktops, containerised workspaces, or equivalent controls. Cowork’s design points in the right direction. Do not make an employee’s unconstrained local machine your default deployment model.

3. Separate read, write, and execute permissions

An agent that can read documents is not the same as one that can delete files, send emails, push code, or initiate payments. Treat those as different control classes. Least privilege matters more here than it did with copilots.
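One way to make those control classes concrete is to model them as explicit, combinable scopes and enforce least privilege at every call. The scope names and the example policy below are hypothetical, not any vendor's actual permission model.

```python
# Hypothetical sketch: read, write, and execute as separate control classes.
from enum import Flag, auto

class AgentScope(Flag):
    READ = auto()      # read documents, query systems
    WRITE = auto()     # modify records, create files
    EXECUTE = auto()   # run code, send emails, initiate payments

def allowed(granted: AgentScope, required: AgentScope) -> bool:
    """Least privilege: every required scope must be explicitly granted."""
    return required in granted

# An intake agent might be granted READ and WRITE but never EXECUTE.
intake_agent = AgentScope.READ | AgentScope.WRITE
```

The design choice worth copying is that EXECUTE is never implied by WRITE: an agent that can update a record has not thereby earned the right to send an email or move money.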

4. Require approvals for high-consequence actions

External communications, production changes, deletions, financial actions, and irreversible record updates should sit behind human approval gates. Let the agent do the heavy lifting; reserve judgment and authorisation for the human.
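In code, an approval gate can be as simple as a wrapper that checks an action against a deny-by-default list of high-consequence operations before running it. The action names and the approver hook here are illustrative assumptions, not a real product's API.

```python
# Sketch of a human approval gate. Action names are hypothetical; in a real
# deployment the approver callback would prompt a person and record the answer.
HIGH_CONSEQUENCE = {"send_external_email", "delete_records", "initiate_payment"}

def run_action(name, action, approver):
    """Execute low-risk actions directly; route risky ones through a human."""
    if name in HIGH_CONSEQUENCE and not approver(name):
        return {"status": "blocked", "action": name}
    return {"status": "done", "action": name, "result": action()}
```

The useful property is asymmetry: routine work flows through untouched, while the small set of irreversible actions always stops at a person.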

5. Log everything

Good enterprise AI governance is operational, not rhetorical. Log prompts, actions, generated code, files touched, APIs called, approvals granted, and outputs produced. If you cannot audit it, you cannot scale it.
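A minimal version of that audit trail is a decorator that appends every agent action to an append-only log. This sketch assumes a simple JSON-lines file; a real deployment would ship these events to a SIEM or audit store.

```python
# Sketch of an append-only audit trail for agent actions (JSON-lines file).
import json
import time
from functools import wraps

def audited(log_path):
    """Decorator: record every call to the wrapped action in an audit log."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            event = {
                "ts": time.time(),
                "action": fn.__name__,
                "args": repr(args),
                "result": repr(result),
            }
            with open(log_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(event) + "\n")
            return result
        return wrapper
    return decorator
```

Wrapping each tool the agent can call in something like this is what turns "log everything" from a policy statement into a queryable record.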

6. Train employees to supervise, not just prompt

The valuable skill is no longer asking clever questions. It is specifying goals, providing context, checking intermediate work, and intervening when needed. In other words, employees need to learn how to manage agents.

7. Revisit identity and access design

The hardest governance question may be this: should agents inherit exactly the same permissions as humans? In some workflows, yes. In many others, no. Over time, enterprises will likely move toward agent-specific identities with narrower scopes, better logging, and explicit escalation paths.

That is what enterprise AI governance should mean in this category: not a policy PDF, but a control system that makes powerful agents usable.

The cost of waiting is rising

The case for action is not mainly that employees will inevitably do this on their own anyway, though some will.

The stronger case is competitive.

The capability overhang – the gap between what AI can now do and what most enterprises are actually using it for – is widening. The companies on the leading edge are not just saving time. They are changing operating leverage. They are letting non-technical teams build their own tooling. They are turning document-heavy workflows into agentic automation. They are creating internal distributions of knowledge work that were previously too expensive to attempt.

Those gains compound.

The next enterprise divide will not be between companies that bought copilots and companies that did not. It will be between companies that learned how to let AI act and companies that only let AI suggest.

Desktop agents are riskier than copilots. They are harder to govern. They are also much more important.

Enterprise leaders should take the category seriously now, build a governance model that fits their risk tolerance, and start learning in production-like conditions. Because the organisations that figure out desktop agents first will not just have better AI tooling.

They will have a different cost structure, a faster learning loop, and a more capable workforce.
