Building an internal agent: Subagent support

Published on December 31, 2025.

Most of the extensions to our internal agent have been the direct result of running into a problem that I couldn’t elegantly solve within our current framework. Evals, compaction, and large-file handling all fit into that category. Subagents, which allow an agent to initiate other agents, are in a different category: I’ve frequently thought that we needed subagents, and then always found an alternative that felt more natural.

Eventually, I decided to implement them anyway, because it seemed like an interesting problem to reason through. Surely I would need them at some point… right? (Aside: I did, indeed, eventually use subagents to support code-driven workflows invoking LLMs.)

This is part of the Building an internal agent series.

Why subagents matter

“Subagents” is the name for allowing your agents to invoke other agents, which have their own system prompt, available tools, and context windows. Some of the reasons you’re likely to consider subagents:

Subagents provide an effective strategy for context window management. You could give a subagent access to uploaded files and ask it to extract specific data from those files, without polluting your primary agent’s context window with the files’ contents.
Subagents can support concurrent work. For example, you could allow invocation of multiple subagents at once, and then join on the completion of all of them. If your agent workflows are predominantly constrained by network IO (e.g. to model evaluation APIs), this could significantly reduce the wall-clock time to complete your workflows.
Subagents may offer some security benefits, or at least you could convince yourself that performing certain operations in subagents with less access is safer. I don’t actually believe that’s meaningfully better, but you could at least introduce friction by ensuring that retrieving external resources and accessing internal resources can only occur in mutually isolated subagents.

Of all these reasons, I think that either the first or the second will be most relevant to the majority of internal workflow developers.
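To make the concurrency point concrete, here is a minimal sketch of fanning out to several subagents and joining on all of them. It is an illustration under stated assumptions, not our harness: run_subagent is a hypothetical stand-in whose sleep simulates a network-bound model API call.

```python
import asyncio
import time


async def run_subagent(agent_name: str, prompt: str) -> str:
    """Hypothetical stand-in for a subagent run; the sleep simulates a model API call."""
    await asyncio.sleep(0.2)
    return f"{agent_name}: done ({prompt})"


async def fan_out(tasks: list[tuple[str, str]]) -> list[str]:
    """Start every subagent at once, then join on the completion of all of them."""
    return await asyncio.gather(
        *(run_subagent(name, prompt) for name, prompt in tasks)
    )


def demo() -> tuple[list[str], float]:
    """Run three simulated subagents concurrently and report the elapsed time."""
    tasks = [
        ("extract", "file_a.csv"),
        ("extract", "file_b.csv"),
        ("extract", "file_c.csv"),
    ]
    start = time.perf_counter()
    results = asyncio.run(fan_out(tasks))
    return results, time.perf_counter() - start
```

Because the simulated calls wait on IO rather than CPU, the three runs overlap and the total time is close to that of a single call rather than the sum of all three.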

How we implemented subagents

Our implementation for subagents is quite straightforward:

We define subagents in subagents/*.yaml, where each subagent has a prompt, a list of allowed tools (or the option to inherit all tools from the parent agent), and a subset of the configurable fields from our agent configuration
Each agent is configured to allow specific subagents, e.g. the planning subagent
Agents invoke subagents via the subagent(agent_name, prompt, files) tool, which allows them to decide which virtual files are accessible within the subagent, and also the user prompt passed to the subagent (the subagent already has a default system prompt within its configuration)
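The three pieces above can be sketched roughly as follows. Everything here is an assumption made for illustration: the field names, the Agent and SubagentConfig classes, the registry argument, and the run_llm placeholder are hypothetical, and the real tool is exposed to the model as subagent(agent_name, prompt, files) with the parent agent supplied by the harness rather than passed explicitly.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SubagentConfig:
    # Mirrors what one subagents/*.yaml file might hold; field names are assumptions.
    name: str
    system_prompt: str
    tools: Optional[list[str]] = None  # None means "inherit all tools from the parent"


@dataclass
class Agent:
    name: str
    tools: list[str]
    allowed_subagents: list[str]
    virtual_files: dict[str, str] = field(default_factory=dict)


def run_llm(system_prompt: str, user_prompt: str,
            tools: list[str], files: dict[str, str]) -> str:
    """Placeholder for the real model loop; it only reports what it was given."""
    return f"[{len(tools)} tools, {len(files)} files] {user_prompt}"


def subagent(parent: Agent, registry: dict[str, SubagentConfig],
             agent_name: str, prompt: str, files: list[str]) -> str:
    """Run one subagent with its own system prompt, tools, and fresh context."""
    # Each agent may only invoke the subagents it has been configured to allow.
    if agent_name not in parent.allowed_subagents:
        raise PermissionError(f"{parent.name} may not invoke {agent_name}")
    config = registry[agent_name]
    # Use the subagent's own tool list, or inherit the parent's if none is set.
    tools = parent.tools if config.tools is None else config.tools
    # Only the virtual files the parent chose to share are visible to the subagent.
    visible = {path: parent.virtual_files[path] for path in files}
    return run_llm(config.system_prompt, prompt, tools, visible)
```

The parent’s conversation history is never passed along in this sketch; the subagent sees only its own system prompt, the user prompt, and the selected files, which is what keeps the parent’s context window clean.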

This has worked fairly well. For example, it let us quickly add planning and think subagents, which the parent agent can use to refine its work. We also refactored the harness that runs agents so that top-level agents and subagents share the same code path: effectively, every agent is a subagent.

How this has worked / what next

To be totally honest, I just haven’t found subagents to be particularly important to our current workflows. However, user-facing latency is a bit of an invisible feature: it doesn’t matter at all until, at some point, it starts subtly creating undesirable user behavior (e.g. starting a different task before checking the response). Because running subagents concurrently is one way to reduce that latency, I believe that long-term this will be their biggest advantage for us.

Addendum: as alluded to in the introduction, this subagents functionality ended up being extremely useful when we introduced code-driven workflows, as it allows handing off control to the LLM for a very specific determination, before returning control to the code.
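As a rough illustration of that handoff, the sketch below keeps the loop and routing in code and asks a subagent only one narrow question per item. The ticket-triage scenario, the "triage" subagent name, the SubagentFn signature, and fake_subagent are all hypothetical, chosen so the example runs offline.

```python
from typing import Callable

# Hypothetical signature: (agent_name, prompt) -> the subagent's final text response.
SubagentFn = Callable[[str, str], str]


def triage_tickets(tickets: list[str],
                   call_subagent: SubagentFn) -> dict[str, list[str]]:
    """Code owns the loop and routing; the LLM makes one determination per ticket."""
    routed: dict[str, list[str]] = {"urgent": [], "normal": []}
    for ticket in tickets:
        # Hand control to the LLM for a single, narrowly scoped question.
        answer = call_subagent(
            "triage",
            f"Is this ticket urgent? Answer yes or no.\n\n{ticket}",
        )
        # Control is back in code: anything other than a "yes" is treated as normal.
        bucket = "urgent" if answer.strip().lower().startswith("yes") else "normal"
        routed[bucket].append(ticket)
    return routed


def fake_subagent(agent_name: str, prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs without network access."""
    return "Yes" if "outage" in prompt.lower() else "No"
```

Passing the subagent call in as a function keeps the workflow testable: the same triage_tickets code can run against a stub in tests and against the real subagent tool in production.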