
The Five Types of Enterprise AI Agents

  • Writer: Steven Willmott
  • May 13
  • 8 min read

Updated: 2 days ago


How many agents do we have? And when do they start hiring their own agents?


AI agents are taking the enterprise by storm (at least on executive roadmaps!), and many leading AI tech companies are rolling out frameworks and platforms for them. It can be hard to keep track of what works and what doesn't.


There's huge value in many of these deployments, but each presents its own challenges. As the number of agents increases, the interactions between them also become much harder to control.


From our discussions with enterprises, we've typically seen five broad categories of agents, each with its own benefits and risks. Here's a quick guide to the emerging landscape! We'll start with the agent types and what they do, then come back to the challenges and risks of running these systems.


Type 1: Knowledge Agents


Agents that read enterprise content and return grounded answers, summaries, drafts, or citations.


Examples range from Glean search services to RAG agents that synthesise corporate data, and analysis agents that use real-time data feeds to make recommendations.


These agents are deployed in intranets, Teams/Slack, policy portals, enablement libraries, onboarding, compliance Q&A, and internal search. Microsoft Copilot Studio, for example, describes knowledge sources as enterprise data, documents, websites, Dynamics, Power Platform, and external systems used to ground agent responses.


Type 2: Service Agents


Human-facing agents that handle customer or employee requests and escalate when needed.


These are the most obvious type of agent visible from outside an organisation. They are deployed in customer support, IT help desk, HR service, contact centres, order tracking, password resets, returns, scheduling, and case triage. Salesforce describes these service agents as systems that can respond within guardrails and handle issues ranging from FAQs to product returns, and many vendor platforms are emerging to provide these capabilities.


Type 3: Embedded Workflow/Action Agents


These are AI-enabled steps inside existing SaaS, low-code, RPA, CRM, ITSM, BI, or security workflows that classify, summarise, enrich, route, or trigger actions, often without being labeled as "agents".


These are deployed as background automations: lead scoring, invoice triage, support routing, alert summarisation, ticket enrichment, auto-escalation, approval prep, email classification, data aggregation, or "AI assist" features inside platforms. OpenAI’s agent guide defines action tools as capabilities that can update records, send messages, or hand off tickets, which is a useful way to think about these embedded automations once another agent can trigger them.


This category contains a huge range of types of intelligence, potentially from many SaaS and other product vendors. It also contains what one might call "hidden" agents - systems one would not consider intelligent but that automate "something".


Type 4: Specialist Deepwork Agents


Agents that help teams do deep, iterative work. The most widespread use is by technical teams using coding agents such as Codex and Claude Code to inspect systems, write code, query data, run tests, debug issues, or prepare changes. However, we're starting to see them in other areas too, from scientific discovery support to creative work.


In the coding domain, these agents are deployed in IDEs, terminals, notebooks, GitHub/GitLab, CI/CD, data warehouses, BI tools, observability platforms, cloud consoles, and security tooling. Claude Code is described as reading codebases, editing files, running commands, and integrating with development tools; Codex can read, edit, and run code in local and cloud environments. Non-coding deployments vary, but they might be in an R&D lab or in highly interactive video creation with Runway agents.


Type 5: Long-running Operator / Orchestrator Agents


These agents are best exemplified by the increasing use of Claude Code and Codex for knowledge work (Claude Code in the form of Claude Cowork), and even OpenClaw. OpenClaw agents are also the agents the IT team is probably outright banning at the moment. These, and more locked down variants, are persistent agents that sit above other tools, run for extended periods, coordinate tasks, call other agents or automations, and return completed work rather than just answers.


These agents are deployed as cloud coding workers, local machine operators, incident triage agents, workflow managers, research agents, business-process coordinators, and manager agents. Codex cloud and Claude Cowork can each work on background tasks in parallel using their own cloud environments. OpenAI also describes manager-style multi-agent systems where one agent coordinates specialised agents via tool calls. There are also much simpler agent orchestration approaches built into agent platforms like crew.ai that allow you to stitch together agents of different types.


The most important distinction between the different types of orchestrator agent isn't necessarily their power or the model behind them, but where they run. Some run on the user's local computer (and have a great deal of local access), others run in the cloud in a sandbox environment, and others run in the limited "agents-only" environment provided by the agent framework being used.


Agent Value


Each of these types of agent provides a distinct type of value: internal, external, in a workflow, fully automated or as a collaborator to a human team.


The value of agents is not that they make AI more conversational. It is that they package AI into repeatable units of work. Knowledge agents compress search. Service agents compress intake. Embedded workflow agents compress routing and enrichment. Specialist workbench agents compress expert iteration. Long-running orchestrators compress multi-system coordination. The value increases as the agent gets closer to real work, but so does the need to understand its authority.


Agent Risk


Given the obvious pressure to deploy, we have to ask the obvious questions around the risks these systems present when in production. The riskiest agent in the enterprise may not be the one with the most impressive demo. It may be the small, embedded workflow that no one registered as an agent, sitting downstream of a more powerful orchestrator.


At first glance, the unit of risk in deploying agents is the model powering the agent (Gemini vs Claude vs OpenAI vs…), and it’s natural to ask which model is good at what. When it comes to agents, though, the unit of risk is really "what can the agent do?": from producing a wrong answer that later cascades into further decisions, to taking a digital or physical action that causes harm.


The risks should be organised around what the agent can cause, not the use-case itself.


We break this down into several areas of risk:


  1. Hidden automation becomes untracked authority. Many organisations will have agentic behaviour embedded inside SaaS platforms, workflow builders, scripts, connectors, and "AI assist" features. These systems may aggregate data, classify intent, route requests, draft outputs, or trigger actions without being listed in an AI registry. The validation problem is not just "find all chatbots". It is "find all AI-mediated decisions and actions". OWASP’s Agentic Skills guidance recommends maintaining inventories, approval workflows, and audit logs for agent skills and actions.

  2. Trigger chains create second-order risk. A hidden workflow may be acceptable when triggered by a human. It may be unsafe when triggered by another agent. Example: a support service agent classifies a customer as eligible for a refund, then calls a hidden CRM workflow that opens a refund approval, which triggers a finance automation, which notifies the customer. Each step may be reasonable alone. The chain is the risk. OWASP's agentic risk taxonomy explicitly calls out cascading failures, orchestration and multi-agent exploitation, access-control violations, untraceability, and tool misuse as agent-specific risks.

  3. Tool access changes the validation problem. Once an agent can call tools, use APIs, update records, send messages, query databases, run commands, or hand off work, validation is no longer just response-quality testing. OpenAI's agent guide separates data tools, action tools, and orchestration tools; action tools include updating records and sending messages, while orchestration tools can make agents available as tools for other agents.

  4. Local and cloud operator agents expand the blast radius. A long-running agent with access to a terminal, browser, repository, ticketing system, or cloud environment has a much larger operational footprint than a chat assistant. It can touch code, credentials, logs, local files, dependencies, infrastructure scripts, and downstream systems. This does not make such agents unusable, but it does mean they need sandboxing, scoped credentials, network controls, approval gates, and detailed logs.

  5. Multi-agent systems make accountability harder. In a single chat interaction, the user can often inspect the prompt and answer. In a multi-agent workflow, a manager agent may call a specialised agent, which calls a tool, which triggers a hidden automation, which changes a record. OpenAI describes both manager-style systems, where one agent coordinates other specialised agents, and decentralised systems, where agents hand off control to one another. Validation has to capture the full execution path: input, retrieved context, selected tools, intermediate decisions, downstream triggers, final action, and rollback path.

  6. Guardrails need to be tied to action risk. A knowledge agent can usually tolerate different controls than an agent that can issue refunds, change permissions, modify production code, or send customer communications. OpenAI recommends rating tools by factors such as read-only versus write access, reversibility, required account permissions, and financial impact, then using those ratings to trigger human review or additional checks.
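The tool-rating idea in the last point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ToolProfile` fields mirror the four factors named above, while the equal weighting, the threshold of 2, and the two example tools are hypothetical choices, not anything taken from OpenAI's guide.

```python
from dataclasses import dataclass
from enum import Enum


class Review(Enum):
    AUTO = "auto"              # run without a human in the loop
    HUMAN_APPROVAL = "human"   # pause until a person approves


@dataclass(frozen=True)
class ToolProfile:
    """Hypothetical description of one tool an agent can call."""
    name: str
    writes: bool                # False means the tool is read-only
    reversible: bool            # can the effect be undone afterwards?
    elevated_permissions: bool  # needs more than ordinary account rights
    financial_impact: bool      # moves or commits money


def risk_score(tool: ToolProfile) -> int:
    """Count how many risk factors apply (0 = lowest, 4 = highest)."""
    return sum([
        tool.writes,
        not tool.reversible,
        tool.elevated_permissions,
        tool.financial_impact,
    ])


def required_review(tool: ToolProfile, threshold: int = 2) -> Review:
    """Require human approval once the score reaches the assumed threshold."""
    if risk_score(tool) >= threshold:
        return Review.HUMAN_APPROVAL
    return Review.AUTO


# Two illustrative tools: a read-only search and a refund action.
search_policies = ToolProfile("search_policies", False, True, False, False)
issue_refund = ToolProfile("issue_refund", True, False, False, True)

print(required_review(search_policies).value)
print(required_review(issue_refund).value)
```

In practice the factors would probably be weighted rather than counted equally, and the threshold would be set per team or per system. The useful property is that the review decision is attached to the tool, not to the agent or model calling it.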


Deploying the Five Agent Types


Coming back to the five agent types in the taxonomy, how do these risk patterns map onto them? It is useful to first classify the agent by type, then classify its authority. Ask what it can read, what it can change, where its outputs go, what can trigger it, what it can trigger, how long it can run, and how you prove what happened afterward.


A good authority model that tells you about potential risks is:


  1. What can it read? Documents, tickets, CRM records, source code, customer data, logs, emails, secrets, financial data, policies, or prior agent outputs.

  2. What can it write or execute? Messages, tickets, records, code, SQL, shell commands, approvals, payments, workflows, permission changes, or customer-facing communications.

  3. Where do its knowledge outputs go? Are its summaries, classifications, recommendations, or generated answers shown only to a human, inserted into a system of record, used to make decisions, passed to another agent, or used to trigger automation? What happens if the information is wrong?

  4. What can trigger it? A human user, scheduled job, webhook, email, Slack message, ticket update, document upload, monitoring alert, system event, or another agent.

  5. What can it trigger? APIs, hidden workflows, RPA bots, low-code flows, other agents, CI/CD pipelines, customer notifications, approval chains, escalations, or financial actions.

  6. How long can it run? A single response, a multi-turn session, a background task, a scheduled recurring task, a persistent monitor, or a long-running cloud/local operator.

  7. How do we stop, replay, and audit it? Logs, traces, tool-call records, retrieved context, generated outputs, sandbox artifacts, approvals, rollback paths, pause controls, and kill switches.
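One way to make the seven questions concrete is to record them as a registry entry per agent. The sketch below is a minimal illustration, not a standard schema: the `AuthorityProfile` class, its field names, and the example agent are all hypothetical. It also simplifies by treating an empty answer as "not yet answered", whereas a real registry would need to distinguish that from "nothing".

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AuthorityProfile:
    """Hypothetical registry entry: one field per authority question."""
    agent_name: str
    agent_type: str
    reads: List[str] = field(default_factory=list)                # Q1
    writes_or_executes: List[str] = field(default_factory=list)   # Q2
    output_destinations: List[str] = field(default_factory=list)  # Q3
    triggered_by: List[str] = field(default_factory=list)         # Q4
    can_trigger: List[str] = field(default_factory=list)          # Q5
    max_runtime: str = ""                                         # Q6
    audit_controls: List[str] = field(default_factory=list)       # Q7


def unanswered(profile: AuthorityProfile) -> List[str]:
    """Return the names of the fields that are still empty."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Illustrative entry for a service agent with a partly completed profile.
refund_bot = AuthorityProfile(
    agent_name="refund-helper",
    agent_type="service",
    reads=["tickets", "CRM records"],
    writes_or_executes=["refund approvals"],
    triggered_by=["human user"],
)

print(unanswered(refund_bot))
# ['output_destinations', 'can_trigger', 'max_runtime', 'audit_controls']
```

The open questions are often the interesting output: an agent whose `can_trigger` or `audit_controls` nobody can fill in is a candidate for closer review before it goes to production.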


Scoring the five agent types against these seven questions gives a risk matrix that might look approximately like the following heatmap.


Enterprise AI Agent Risk Heatmap showing illustrative risk scores from 1 (low) to 5 (very high) across five agent types — knowledge agents, service agents, embedded workflow agents, technical workbench agents, and long-running orchestrator agents — evaluated against seven validation questions including what the agent can read, write, trigger, and how it can be audited.

The scores are not a claim that every service agent is riskier than every workbench agent, or that every orchestrator should be banned. They are a way to ask better questions. A narrow coding agent that only opens pull requests may be easier to validate than a customer service agent that can issue refunds. A hidden workflow may look harmless until another agent can trigger it. The point of the matrix is to move the conversation from "what model is it using?" to "what authority does this system actually have?"
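The point that a hidden workflow looks harmless until another agent can trigger it can be checked mechanically once "what can it trigger?" is recorded for each system. The sketch below assumes a hypothetical trigger map modelled on the refund example from the risk section and walks it breadth-first to list everything an agent can reach, directly or indirectly. The system names and the `downstream_reach` helper are illustrative only.

```python
from collections import deque

# Hypothetical trigger map: each key can trigger the systems in its list.
TRIGGERS = {
    "support_agent": ["crm_refund_workflow"],
    "crm_refund_workflow": ["finance_automation"],
    "finance_automation": ["customer_notification"],
    "customer_notification": [],
}


def downstream_reach(start, triggers):
    """Breadth-first walk: everything `start` can reach, in discovery order."""
    seen = []
    queue = deque(triggers.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited, which also guards against cycles
        seen.append(node)
        queue.extend(triggers.get(node, []))
    return seen


print(downstream_reach("support_agent", TRIGGERS))
# ['crm_refund_workflow', 'finance_automation', 'customer_notification']
```

The support agent is only wired to one workflow directly, but its effective authority includes the finance automation and the customer notification. That transitive set is what a review of the agent needs to consider.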


Orchestrator agents are clearly the highest-autonomy, and often the highest-access, systems (hence their row is almost all bright red), but other types are not without risk. The values here are only illustrative, since they depend on the permissions, allowable actions, and public-facing nature of the agents.


Conclusions


As teams work to get value from AI, agents provide a great way to encapsulate AI functionality. However, each type of agent behaves very differently and brings its own challenges. We're also entering a time when multiple agents will call each other, making systems more complex still.


Hopefully this short tour of the agent types has been helpful. We also have a more extensive Agent Risk handbook you can find here!


Use of AI in this Post


AI was used for some of the research and phrasing in the article, but all work was human guided and human reviewed. The heatmap image is also AI generated.
