Why One AI Agent Isn't Enough: Designing Specialized Agent Profiles

The first version of my AI assistant had one mode. It did everything — wrote code, ran tests, drafted articles, managed files, answered random questions — all through the same system prompt with the same tool set and the same safety rules.

It worked, in the sense that tasks got completed. It also produced a consistent pattern of problems that I could not fix by tweaking the prompt.

When it was writing code, it would sometimes stop to draft a blog post about the code instead of writing the code. When it was supposed to be testing, it would wander off and start refactoring unrelated files. When I asked it to explore an unfamiliar codebase, it would try to make changes instead of just reading and reporting.

The issue was not intelligence. The model was capable enough for each task individually. The issue was identity confusion. One agent trying to be twelve things at once has no clear role in any given moment, so it defaults to whatever the prompt most recently emphasized.

The Problem With One Prompt

A general-purpose agent prompt looks something like this:

You are a helpful AI assistant. You can write code, run tests, manage files, draft documents, and answer questions. Use the available tools to complete tasks.

That prompt gives the model permission to do everything. It gives the model no guidance on what matters right now. Every task arrives with the full weight of every capability and every instruction. The result is a model that tries to code during a writing task, tries to write during a coding task, and applies the same level of caution to deleting a file as it does to reading one.

You can patch this with longer prompts. "When writing code, focus only on code. When testing, focus only on testing." But the longer the prompt gets, the more the model cherry-picks. It follows the instruction that is closest to its current inclination and ignores the rest.

A general-purpose prompt does not scale across specialized tasks. It produces an agent that is mediocre at everything and excellent at nothing.

What a Profile Actually Is

The fix is not a better prompt. It is multiple prompts.

An agent profile is a configuration that defines four things:

1. System prompt. Who the agent is right now. What it cares about. What it ignores.

2. Tool access. Which tools the agent can use. A testing agent needs to run tests. A documentation agent does not need to delete files.

3. Safety level. How much oversight the agent needs. Read-only agents can browse freely. Agents that make changes need approval gates.

4. Context scope. What information the agent sees. An exploration agent loads the full file tree. A coding agent loads only the files relevant to the current task.

These four dimensions let you create agents that are genuinely different from each other — not just the same model with a different hat.
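One way to picture the four dimensions is as a small record type. The sketch below is illustrative only and is not Karl Code's actual configuration format: the field names, the tool names, and the three `SafetyLevel` values are assumptions chosen to mirror the list above.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SafetyLevel(Enum):
    # Hypothetical levels, named after the behaviors described in this article.
    READ_ONLY = "read_only"      # cannot modify anything
    GATED_WRITE = "gated_write"  # may write; destructive operations need approval
    ROUTE_ONLY = "route_only"    # routes work; no filesystem access


@dataclass(frozen=True)
class AgentProfile:
    name: str
    system_prompt: str              # who the agent is right now
    tools: frozenset[str]           # tool access: the only tools it can call
    safety_level: SafetyLevel       # how much oversight it needs
    context_globs: tuple[str, ...]  # context scope: which files it is shown


# Example instance; the prompt text and tool names are made up for illustration.
EXPLORE = AgentProfile(
    name="explore",
    system_prompt="You read codebases and report findings. You never modify files.",
    tools=frozenset({"read_file", "list_files", "search"}),
    safety_level=SafetyLevel.READ_ONLY,
    context_globs=("**/*",),
)
```

Making the record frozen is a deliberate choice in this sketch: a profile that cannot be mutated at runtime cannot quietly gain a tool halfway through a task.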

The Twelve Profiles

Karl Code currently runs twelve built-in profiles. Here are the seven core ones:

  • Orchestrator. The dispatcher. Reads incoming requests, decides which profile should handle them, and routes work. Does not write code or run commands directly.
  • Code. Writes and modifies source files. Has file write access, code execution, and Git tools. Cannot send messages to external services.
  • Test. Runs the test suite, interprets failures, and writes fixes for failing tests. Has execution and file write access, with writes restricted to the test directory and source files.
  • Debug. Investigates errors and stack traces. Read-only access to most of the system, plus execution for reproducing issues. Cannot make fixes — it reports findings to the Code profile.
  • Document. Writes documentation, README files, and commit messages. File write access limited to docs and markdown files.
  • Verify. Reviews code changes before they ship. Read-only access to everything, plus the ability to run linting and type checks. Its job is to find problems, not fix them.
  • Explore. Reads codebases to answer questions. Fully read-only. No file modification, no execution. The safest profile.

The remaining five are specialized variants: a planning agent for multi-step work, a security reviewer, a refactor specialist, and two project-specific profiles tuned for particular workflows.
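Several of the profiles above have write access that is limited by location, not just switched on or off. A minimal sketch of such a check follows. It assumes a repository with `tests`, `src`, and `docs` top-level directories; those names, the `WRITE_SCOPES` table, and the `can_write` helper are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical write scopes; directory names are assumptions about the repo layout.
WRITE_SCOPES = {
    "test": {"dirs": ("tests", "src"), "suffixes": ()},
    "document": {"dirs": ("docs",), "suffixes": (".md",)},
    "explore": {"dirs": (), "suffixes": ()},  # fully read-only
}


def can_write(profile: str, path: str) -> bool:
    """Return True if the profile may write to the given repo-relative path."""
    scope = WRITE_SCOPES.get(profile)
    if scope is None:
        return False  # unknown profiles get no write access
    p = PurePosixPath(path)
    if p.is_absolute() or ".." in p.parts:
        return False  # reject paths that could escape the repository
    in_dir = bool(p.parts) and p.parts[0] in scope["dirs"]
    has_suffix = p.suffix in scope["suffixes"]
    return in_dir or has_suffix
```

With this table, `can_write("document", "README.md")` is true because of the markdown suffix, while `can_write("document", "src/main.py")` is false.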

Tool Access Is a Safety Mechanism

The most important design decision is not the system prompt. It is the tool access list.

A documentation agent that cannot run shell commands rules out a whole class of accidents. It cannot wipe a directory with a stray command, execute a malicious script, or push to a remote repository. The system prompt says "write documentation." The tool access list enforces it.

This is the same principle as the middleware safety pattern: code-enforced constraints do not degrade. A prompt that says "do not run commands" is a suggestion. A tool access list that omits the execution tool is a guarantee.

Tool access scoping is the difference between asking an agent not to do something dangerous and making it physically unable to.
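A minimal sketch of enforcing the access list in code, under the assumption that every tool call passes through a single dispatch function. The tool names, the `REGISTRY` of stand-in tools, and `make_dispatcher` are hypothetical and exist only to show the shape of the idea.

```python
from __future__ import annotations

from typing import Any, Callable


class ToolNotAllowed(Exception):
    """Raised when a profile requests a tool outside its access list."""


def make_dispatcher(
    allowed: frozenset[str],
    registry: dict[str, Callable[..., Any]],
) -> Callable[..., Any]:
    # Only allowlisted tools are bound into this profile's dispatcher,
    # so a disallowed tool is not merely forbidden, it is absent.
    bound = {name: fn for name, fn in registry.items() if name in allowed}

    def dispatch(tool: str, *args: Any, **kwargs: Any) -> Any:
        if tool not in bound:
            raise ToolNotAllowed(tool)
        return bound[tool](*args, **kwargs)

    return dispatch


# Stand-in tools for illustration; real ones would touch the filesystem or shell.
REGISTRY = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_shell": lambda cmd: f"<ran {cmd}>",
}

document_dispatch = make_dispatcher(frozenset({"read_file"}), REGISTRY)
```

Here `document_dispatch("read_file", "README.md")` returns the stand-in contents, and `document_dispatch("run_shell", "ls")` raises `ToolNotAllowed` no matter what the system prompt says.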

Safety Levels and Approval Gates

Each profile has a safety level that determines how much autonomy it gets.

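Read-only profiles (Explore, Verify, Debug) operate without approval gates. They can read files and, where the profile has execution at all, run commands that do not modify anything, such as linters, type checks, or a reproduction of a failure. Then they report back. The worst they can do is produce an incorrect analysis.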

Write profiles (Code, Test, Document) operate with approval gates on destructive operations. Writing a new file is usually fine. Overwriting an existing file requires approval. Deleting a file always requires approval. Running a command that modifies the system requires approval.

Orchestration profiles have no direct file access at all. They route work and track state. They cannot touch the filesystem.

This means the risk surface of the system is not "the agent can do anything." It is "the write profiles can change files within their scope, and destructive changes happen only when a human approves." That is a much smaller surface to reason about.
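The gating rules above can be sketched as a small decision function. This is a simplification under stated assumptions: the `Op` categories and the three return values are invented for the example, creating a new file is treated as always allowed even though the text says "usually," and read-only commands are left out entirely.

```python
from enum import Enum


class Op(Enum):
    READ = "read"
    CREATE = "create"                  # write a file that does not exist yet
    OVERWRITE = "overwrite"            # replace an existing file
    DELETE = "delete"
    SYSTEM_COMMAND = "system_command"  # command that modifies the system


READ_ONLY_PROFILES = {"explore", "verify", "debug"}
WRITE_PROFILES = {"code", "test", "document"}
GATED_OPS = {Op.OVERWRITE, Op.DELETE, Op.SYSTEM_COMMAND}


def decide(profile: str, op: Op) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a requested operation."""
    if profile in READ_ONLY_PROFILES:
        return "allow" if op is Op.READ else "deny"
    if profile in WRITE_PROFILES:
        return "needs_approval" if op in GATED_OPS else "allow"
    return "deny"  # orchestration and unknown profiles: no direct file access
```

The final `return "deny"` is the important line in this sketch: anything not explicitly classified falls through to the most restrictive answer.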

Context Scope Prevents Drift

The last dimension — context scope — is the one I underestimated.

When I gave every profile the full project context, the Code agent would read blog post drafts and try to "fix" them. The Test agent would notice unrelated code style issues and try to address them. The Documentation agent would see the task queue and start reorganizing it.

Trimming context per profile fixed this. The Code agent sees source files and the current task. The Test agent sees the test suite and relevant source. The Documentation agent sees the docs directory and the changelog. Nobody sees everything, and nobody gets distracted by work that is not theirs.

Context scope is how you tell an agent what to care about. Loading irrelevant files is not neutral — it actively degrades focus.
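Trimming context can be as simple as filtering the file list before anything is loaded into the prompt. The sketch below assumes a repository laid out as `src/`, `tests/`, and `docs/` plus a `CHANGELOG.md`; the patterns and the `scoped_context` helper are hypothetical, and "relevant source" for the Test agent is simplified to all of `src/`.

```python
from fnmatch import fnmatchcase

# Hypothetical scopes; the patterns are assumptions about the repo layout.
CONTEXT_SCOPES = {
    "code": ("src/*",),
    "test": ("tests/*", "src/*"),
    "document": ("docs/*", "CHANGELOG.md"),
}


def scoped_context(profile: str, files: list[str]) -> list[str]:
    """Keep only the files this profile is meant to see, preserving input order."""
    patterns = CONTEXT_SCOPES.get(profile, ())
    return [f for f in files if any(fnmatchcase(f, p) for p in patterns)]


files = [
    "src/app.py",
    "tests/test_app.py",
    "docs/guide.md",
    "blog/draft.md",
    "CHANGELOG.md",
]
```

Given that list, `scoped_context("code", files)` yields only `["src/app.py"]`, so the blog draft never reaches the Code agent in the first place.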

The General Principle

The case for specialized profiles is the same as the case for specialized roles in any team. A developer who also writes all the documentation, runs all the tests, manages the deployment pipeline, and answers support tickets will do none of those things well. Not because they are incapable, but because context-switching has a cost.

AI agents have the same problem, with an additional twist: they cannot self-regulate focus. A human developer knows, intuitively, that writing a commit message is not the time to refactor the database layer. An LLM with a general-purpose prompt has no such intuition. Nothing in the prompt tells it which capability matters right now.

Splitting the agent into profiles solves three problems at once. It focuses the model's attention, it shrinks the tool access surface, and it clarifies the safety boundary for each task. The system prompt for each profile can be short and specific, because the profile only does one thing.

You do not need twelve profiles. Start with three: one that reads, one that writes, and one that routes. The reading agent explores and reports. The writing agent makes changes. The routing agent decides which is which. That covers most use cases.
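As a rough sketch, the three-profile starting point might look like the following. Everything here is an assumption for illustration: the tool names are invented, and the keyword match in `route` is a naive stand-in for whatever actually makes the routing decision, which in practice could be a model call.

```python
READER_TOOLS = frozenset({"read_file", "search"})
WRITER_TOOLS = READER_TOOLS | {"write_file"}

STARTER_PROFILES = {
    "reader": READER_TOOLS,
    "writer": WRITER_TOOLS,
    "router": frozenset(),  # routes only; no tools of its own
}

# Naive keyword heuristic standing in for the routing decision.
CHANGE_VERBS = ("add", "fix", "write", "update", "delete", "rename", "refactor")


def route(request: str) -> str:
    """Pick 'writer' if the request asks for a change, else 'reader'."""
    words = request.lower().split()
    return "writer" if any(w in CHANGE_VERBS for w in words) else "reader"
```

Defaulting to the reader whenever no change verb appears keeps the failure mode cheap in this sketch: a misrouted request produces a report, not an edit.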

Add more profiles when you notice that one of them is doing two jobs that keep interfering with each other. That is the signal. Not a framework, not a best-practices document — the observation that your writing agent keeps trying to test, or your test agent keeps trying to document.

Specialization is not about having more agents. It is about each agent having a clear enough identity that it does the right thing without you hovering over its shoulder.