The agent alternates between Thought (reasoning about what to do next) and Action (calling a tool), repeating until the task is complete or the budget is exhausted. Routing is decided by the model at runtime.
A hedge fund analyst needs to answer: "What is the 3-year revenue CAGR of the top-5 cloud infrastructure companies and how does it compare to Azure's current guidance?" A single LLM call cannot answer this — it requires web search for recent earnings, a calculator for CAGR computation, a database lookup for Azure guidance, and synthesis of results. The ReAct agent performs 6-9 tool calls per query, completes in 45 seconds, and produces a cited answer.
Without the loop, the analyst spends 2 hours per query. The key structural insight: the agent does not know at start time which tools it will need or in what order. Tool selection and sequencing are emergent — decided by the Reasoner at each iteration based on what it has already learned.
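The loop above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the `llm` callable, the tuple-based action protocol, and the single `calculator` tool are all assumptions made for the sketch. The CAGR formula itself is standard: `(end/start)^(1/years) - 1`.

```python
# Minimal Thought -> Action -> Observation loop. The `llm` callable is a
# stand-in for the Reasoner: it returns either ("act", tool, args) or
# ("done", answer). Tool names and wiring are illustrative.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate: (end/start)^(1/years) - 1."""
    return (end / start) ** (1 / years) - 1

TOOLS = {
    "calculator": lambda args: cagr(*args),
    # "web_search" and "db_lookup" would be registered the same way.
}

def react_loop(query, llm, max_iterations=10):
    observations = []                        # the Observation Buffer
    for _ in range(max_iterations):
        decision = llm(query, observations)  # Reasoner: Thought + Action
        if decision[0] == "done":
            return decision[1]               # final answer
        _, tool, args = decision
        result = TOOLS[tool](args)           # Dispatcher: real tool call
        observations.append((tool, args, result))
    return None                              # budget exhausted, no answer
```

Note that the tool sequence is nowhere in this code: it emerges from what `llm` decides at each iteration, which is the structural insight above.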
| Metric | Signal |
|---|---|
| Mean tool calls per task | Indicates task complexity and loop efficiency — high counts signal either deep tasks or poor Reasoner decision-making |
| Tool call success rate | Fraction of calls that return a valid result — failures force re-iteration and inflate cost |
| Context utilization % | Buffer fullness at task completion — approaching 100% signals context stuffing risk |
| Task completion rate within budget | Fraction of tasks that reach a final answer before hitting iteration or token limits |
| Answer quality score | End-to-end accuracy evaluated against ground truth or human judgment — the primary output metric |
| Node | What it does | What it receives | What it produces |
|---|---|---|---|
| Reasoner | Determines next action or declares task complete | Original query + observation buffer | Thought + action directive, or final answer signal |
| Tool Dispatcher | Selects and calls the appropriate tool; enforces allowed tool policy | Action directive from Reasoner | Tool call result (raw) |
| Web Search | Fetches current earnings data and analyst reports | Search query string | Ranked search results with snippets |
| Calculator | Computes CAGR and other financial metrics exactly | Revenue figures and time period | Numeric result with formula trace |
| Database Lookup | Retrieves Azure guidance and structured financial records | Structured query | Record set from internal data store |
| Observation Buffer | Appends each tool result to the running context; feeds Reasoner on next iteration | Raw tool result | Updated context window for Reasoner |
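A Dispatcher that enforces an allowed-tool policy might look like the following sketch. The registry shape (per-tool `max_calls`) and the tool names are assumptions for illustration; real policies could also cover argument schemas, data-access scopes, or compliance checks.

```python
# Sketch of the Tool Dispatcher as a policy enforcement layer: the Reasoner
# proposes, the Dispatcher decides whether the call is permitted.

ALLOWED_TOOLS = {
    "web_search": {"max_calls": 5},
    "calculator": {"max_calls": 20},
    "db_lookup":  {"max_calls": 10},
}

class ToolDispatcher:
    def __init__(self, registry, implementations):
        self.registry = registry
        self.impls = implementations
        self.calls = {}                       # per-tool call counts

    def dispatch(self, tool, args):
        if tool not in self.registry:
            raise PermissionError(f"tool {tool!r} not in allowed-tool registry")
        used = self.calls.get(tool, 0)
        if used >= self.registry[tool]["max_calls"]:
            raise RuntimeError(f"per-tool budget exhausted for {tool!r}")
        self.calls[tool] = used + 1
        return self.impls[tool](args)         # raw tool result
```

Placing the policy here rather than in the prompt means a misbehaving Reasoner cannot talk its way past it.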
| Origin of Value | Where it appears | How it is captured |
|---|---|---|
| Future Cashflow | Answer quality | Quality increases monotonically with tool access breadth. Each additional privileged tool (proprietary DB, real-time feed) widens the quality gap over competitors using only public data. |
| Governance | Tool Dispatcher | The Dispatcher is the policy enforcement layer — it defines which tools the agent may call and under what conditions. Governance is not in the Reasoner; it is in the allowed-tool registry. |
| Conditional Action | Each iteration | Every Reasoner-Dispatcher cycle is compute spend. Budget blindness — agent iterating without cost awareness — is the primary cost failure mode. |
| Risk Exposure | Tool calls and Observation Buffer | Tool hallucination (agent fabricating output instead of calling the tool) and context stuffing (buffer exceeds context window) are the two catastrophic failure vectors. |
VCM analog: Access Token. The agent's value derives from its access to tools. A Tool-Use Loop without privileged tool access is just a reasoning loop — the tools are the moat, not the reasoning.
**Non-termination.** The Reasoner never emits a Done=true signal — it continues generating actions indefinitely. This occurs when the task is underspecified, the completion criterion is ambiguous, or the model is not prompted to self-terminate. Fix: define an explicit completion condition in the system prompt, and enforce a hard iteration cap at the Dispatcher level, independent of model output.
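A hard cap enforced outside the model might look like this sketch. The `reasoner` and `dispatch` callables and the `force_final` flag are hypothetical; the point is that termination does not depend on the model choosing to stop.

```python
# Hard iteration cap enforced in the harness, not the prompt. When the cap
# is reached, the Reasoner is invoked one last time in a mode that only
# permits a final answer.

MAX_ITERATIONS = 8

def run_with_cap(query, reasoner, dispatch):
    observations = []
    for _ in range(MAX_ITERATIONS):
        decision = reasoner(query, observations)
        if decision["done"]:
            return decision["answer"]
        observations.append(dispatch(decision["tool"], decision["args"]))
    # Cap reached: force a terminal call that can only produce an answer.
    return reasoner(query, observations, force_final=True)["answer"]
```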
**Tool hallucination.** The Reasoner fabricates a plausible tool result in its Thought step rather than issuing a real tool call. The Observation Buffer receives a hallucinated observation, and subsequent reasoning compounds the error. This is undetectable from the model's output alone. Fix: require all observations to come from a verified Dispatcher response; never allow the model to self-supply observations.
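One way to make "verified Dispatcher response" mechanical is to have the Dispatcher sign every observation, so the buffer rejects anything it did not mint. The HMAC scheme below is one possible design, not a standard part of this pattern.

```python
# Sketch: observations carry a provenance tag only the Dispatcher can
# produce, so a fabricated result from the Thought step can never enter
# the Observation Buffer.
import hashlib
import hmac
import secrets

_DISPATCH_KEY = secrets.token_bytes(32)   # held by the Dispatcher only

def sign_observation(payload: str) -> tuple[str, str]:
    """Called by the Dispatcher after a real tool call."""
    tag = hmac.new(_DISPATCH_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload, tag

def append_observation(buffer: list, payload: str, tag: str) -> None:
    """Called by the buffer; rejects observations without a valid tag."""
    expected = hmac.new(_DISPATCH_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("observation did not come from the Dispatcher")
    buffer.append(payload)
```

In practice the same guarantee often falls out of the harness architecture (the model simply has no write path to the buffer), but an explicit check makes the invariant auditable.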
**Context stuffing.** After many iterations, the Observation Buffer fills the context window. The Reasoner loses access to the original query or early observations. Quality degrades silently — the model does not signal that its context is truncated. Fix: implement a summarization step that compresses older observations before appending new ones, or cap observation verbosity at the Dispatcher.
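The summarization fix can be sketched as follows. The `summarize` callable (typically another LLM call) and the character-count heuristic are assumptions; a real system would count tokens.

```python
# Sketch: when the buffer exceeds a size budget, compress all but the most
# recent observation into a single summary entry.

def append_with_compression(buffer, new_obs, summarize, max_chars=4000):
    buffer.append(new_obs)
    total = sum(len(o) for o in buffer)
    if total > max_chars and len(buffer) > 2:
        # Keep the newest observation verbatim; compress everything older.
        older, recent = buffer[:-1], buffer[-1]
        buffer[:] = [summarize("\n".join(older)), recent]
```

The trade-off is lossy memory: a bad summary can drop a fact the Reasoner needs later, which is why the newest observation is kept verbatim.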
**Budget blindness.** The agent does not track token spend or iteration count against its allocated budget. It exhausts compute before completing the task, or produces a best-effort answer without flagging incompleteness. Fix: inject a budget state variable into the Reasoner's context at each iteration; prompt it to produce a partial answer when approaching limits.
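Injecting budget state can be as simple as prepending a status line to the Reasoner's context each turn. The field names and thresholds below are illustrative.

```python
# Sketch: a budget preamble prepended to the Reasoner's context at each
# iteration, escalating to an explicit wrap-up instruction near the limit.

def budget_preamble(iteration, max_iterations, tokens_spent, token_budget):
    remaining = max_iterations - iteration
    frac = tokens_spent / token_budget
    note = ""
    if remaining <= 2 or frac > 0.8:
        note = (" You are near your budget: produce your best partial"
                " answer now and flag any incompleteness.")
    return (f"[budget] iteration {iteration}/{max_iterations}, "
            f"tokens {tokens_spent}/{token_budget} ({frac:.0%}).{note}")
```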
| Variant | Modification | When to use |
|---|---|---|
| Constrained ReAct | Hard cap on iterations and total tokens enforced at Dispatcher; agent prompted to produce best-effort answer on budget expiry | Production environments where cost and latency SLAs must be guaranteed |
| Parallel Tool-Use | Reasoner dispatches multiple tool calls simultaneously; Dispatcher fans out, collects results, Reasoner synthesizes | Independent tool calls with no ordering dependency — reduces wall-clock latency |
| Cached Tool-Use | Dispatcher memoizes tool results keyed by (tool, normalized input) within the session | Agent likely to re-query the same data (e.g., same company across multiple questions in a session) |
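The Cached Tool-Use variant is a thin wrapper around any dispatcher. The lowercase/strip normalization below is a placeholder; real systems need schema-aware key normalization (e.g. canonicalized query parameters) to avoid both cache misses and false hits.

```python
# Sketch: session-scoped memoization keyed by (tool, normalized input),
# wrapping an underlying dispatch callable.

class CachedDispatcher:
    def __init__(self, dispatch):
        self.dispatch = dispatch
        self.cache = {}

    def __call__(self, tool, raw_input: str):
        key = (tool, raw_input.strip().lower())
        if key not in self.cache:
            self.cache[key] = self.dispatch(tool, raw_input)
        return self.cache[key]
```

Caching only makes sense for idempotent, slowly-changing tools (database lookups, document fetches); real-time feeds should bypass it or carry a TTL.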
| Pattern | Relationship |
|---|---|
| 10.11 Pipeline | Use when the tool sequence is fixed and known at design time — eliminates runtime routing overhead |
| 20.23 Orchestrator-Workers | Multiple Tool-Use Loop agents coordinated by a higher-level Orchestrator — appropriate when subtasks require separate agents |
| 10.15 Evaluator-Optimizer | Add a quality gate on the final answer before delivery — the Evaluator can trigger a fresh loop if quality is insufficient |
The Tool-Use Loop is the pattern where the moat is most legible: enumerate the tools, enumerate the policies. An agent with access to a proprietary real-time financial database, an internal calculation engine, and a compliance-checked tool registry is structurally differentiated from one calling only public APIs. The Dispatcher's allowed-tool registry is the IP surface.
Audit signal: request a Dispatcher trace log. It shows every tool called, every result received, and every iteration. A firm that cannot produce this log has no observability into its agent's decision process — and cannot price errors, scope liability, or improve the system.
Red flag: observation buffers without summarization. When context stuffing is unmanaged, agent quality degrades non-linearly as task complexity increases. The system appears to work in demos (short tasks) and fails silently in production (long research tasks).