Tool Use Patterns for AI Agents Explained
Pattern 1: Single-Shot Lookup
The simplest and most common pattern. The user asks a question, the model calls one tool to get the data, and generates a response incorporating the result. Examples include checking order status, looking up a customer profile, getting the current weather, or retrieving a document. The interaction requires exactly one tool call, and the model can answer the user's question fully from that single result.
Single-shot lookup covers the majority of tool interactions in most agents. It is the fastest pattern (one model call, one tool execution, one model call with the result) and the easiest to implement and debug. If your agent handles mostly single-shot lookups, you do not need complex orchestration. A basic tool loop is sufficient.
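A basic tool loop can be sketched in a few lines. This is a minimal illustration, not a production implementation: the `model` callable, the `TOOLS` registry, and the reply shape (`tool_call` / `content` keys) are all hypothetical stand-ins for whatever model SDK and tool registry you actually use.

```python
import json

# Hypothetical tool registry: names map to plain Python functions.
TOOLS = {
    "get_order_status": lambda order_id: {"order_id": order_id, "status": "shipped"},
}

def run_tool_loop(model, messages):
    """Minimal tool loop: call the model, execute the tool call it requests,
    feed the result back, and repeat until the model answers in plain text."""
    while True:
        reply = model(messages)               # assumed to return a dict
        if reply.get("tool_call") is None:
            return reply["content"]           # final answer; loop ends
        call = reply["tool_call"]
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})

# Stub model: first requests a tool call, then answers using the result.
_replies = iter([
    {"tool_call": {"name": "get_order_status", "arguments": {"order_id": "A1"}}},
    {"tool_call": None, "content": "Order A1 has shipped."},
])
answer = run_tool_loop(lambda msgs: next(_replies),
                       [{"role": "user", "content": "Where is order A1?"}])
```

For a single-shot lookup the loop runs exactly twice: once to produce the tool call, once to produce the answer from the result.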
The key design consideration for single-shot lookup is tool result quality. The model's response quality is bounded by what the tool returns. If the tool returns a raw database row with internal field names and ID references, the model has to interpret and humanize that data. If the tool returns a clean, well-structured result with human-readable field names and resolved references, the model produces a better response with less reasoning effort.
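The difference between a raw row and a clean result can be shown with a small mapping layer. The field names (`ord_id`, `cust_fk`, `stat_cd`) and the lookup tables here are invented for illustration; the point is resolving internal codes and ID references before the model sees them.

```python
# Hypothetical internal codes and a name lookup; the shaping function turns a
# raw database row into a result the model can use without extra reasoning.
STATUS_LABELS = {0: "pending", 1: "shipped", 2: "delivered"}
CUSTOMERS = {42: "Ada Lovelace"}

def shape_order_result(row: dict) -> dict:
    """Resolve internal field names and ID references into a
    human-readable tool result."""
    return {
        "order_id": row["ord_id"],
        "customer": CUSTOMERS.get(row["cust_fk"], f"customer #{row['cust_fk']}"),
        "status": STATUS_LABELS.get(row["stat_cd"], "unknown"),
    }
```

Returning `{"status": "shipped"}` instead of `{"stat_cd": 1}` moves interpretation work out of the model and into code, where it is deterministic and testable.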
Pattern 2: Chain-of-Tools
The model calls several tools in sequence, where each call depends on the result of the previous one. The canonical example: look up a customer by email, use the customer ID to find their orders, use an order ID to check shipment status. Each step produces data that the next step needs, so the calls cannot be parallelized.
Chain-of-tools is necessary when the user's request spans multiple data domains or requires data resolution (converting a human-readable identifier into a system ID before querying). The model handles the chaining logic naturally: it sees the first result, recognizes it needs another call, generates the second call using data from the first result, and so on until it has enough information to answer.
The risk with chains is latency accumulation. Each link in the chain adds a model inference call plus a tool execution round trip. A four-step chain with 200ms tool calls and 500ms model calls takes approximately 2.8 seconds (four links at roughly 700ms each), which is noticeable. For frequently executed chains, consider building composite tools that perform the entire chain in a single call. A get_customer_order_status tool that takes an email and returns the order status directly replaces a three-tool chain with a single-shot lookup.
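A composite tool along those lines might look like the sketch below. The three backend lookups are stubs standing in for real service calls; the names and return shapes are assumptions for illustration.

```python
# Stub backend lookups; each would be its own tool in the chained version.
def find_customer(email: str) -> dict:
    return {"customer_id": "C7"}

def list_orders(customer_id: str) -> list:
    return [{"order_id": "O93"}]

def get_shipment(order_id: str) -> dict:
    return {"status": "in transit"}

def get_customer_order_status(email: str) -> dict:
    """Composite tool: collapses a three-step chain (customer -> orders ->
    shipment) into one call, so the model pays one round trip, not three."""
    customer = find_customer(email)
    orders = list_orders(customer["customer_id"])
    latest = orders[0]  # assumes orders are returned newest-first
    shipment = get_shipment(latest["order_id"])
    return {"order_id": latest["order_id"], "status": shipment["status"]}
```

The chaining logic moves from model inference into ordinary code, which is both faster and cheaper; the trade-off is a less flexible tool that answers only this one composed question.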
Pattern 3: Parallel Fan-Out
The model needs data from multiple independent sources and calls several tools simultaneously. Examples: checking inventory across three warehouses, fetching reviews from multiple platforms, gathering metrics from several monitoring services, or comparing prices across vendors. The calls are independent, so they can execute in parallel.
The model signals parallel fan-out by generating multiple tool calls in a single response turn. Your execution layer should detect multiple calls, dispatch them concurrently, and return all results together. The total latency is the duration of the slowest call rather than the sum of all calls, which is a substantial improvement when aggregating data from multiple sources.
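Concurrent dispatch can be sketched with asyncio. The `check_warehouse` coroutine is a hypothetical tool whose `sleep` stands in for network latency; the key mechanic is `asyncio.gather`, which runs the independent calls concurrently and returns results in call order.

```python
import asyncio

async def check_warehouse(name: str) -> dict:
    # Stand-in for a real network call; the sleep simulates I/O latency.
    await asyncio.sleep(0.05)
    return {"warehouse": name, "in_stock": True}

async def dispatch_parallel(warehouses: list) -> list:
    """Run independent tool calls concurrently; total latency is roughly
    the slowest call rather than the sum of all calls."""
    return await asyncio.gather(*(check_warehouse(n) for n in warehouses))

results = asyncio.run(dispatch_parallel(["east", "west", "central"]))
```

With three 50ms calls, the sequential version would take about 150ms; this version finishes in roughly 50ms.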
Fan-out followed by synthesis is a powerful combination. The model fans out to gather data from multiple sources, receives all results, and then synthesizes them into a coherent response. "Compare our pricing across competitors" might fan out to three price-checking tools, then synthesize the results into a comparison table with recommendations. The synthesis step is where the model's reasoning ability adds the most value, because raw data from multiple sources needs interpretation and context to be useful.
Pattern 4: Human-in-the-Loop
The model generates a tool call, but before execution, the system presents the intended action to the user for approval. This pattern is essential for high-impact operations: financial transactions, data deletion, sending communications, changing system configuration. The model proposes the action, the user approves or rejects, and execution proceeds only on approval.
Implementation requires a confirmation gate in the tool execution layer. When the model generates a call to a confirmation-required tool, instead of executing it, the system formats a human-readable description of the intended action and presents it to the user. The user's response (approve/reject) is sent back to the model, which either proceeds with execution (if approved) or adjusts its approach (if rejected).
The design tension in human-in-the-loop is between safety and friction. Requiring confirmation for every tool call makes the agent safe but annoying. Requiring confirmation for no tool calls makes the agent fast but potentially dangerous. The right balance depends on the risk profile of each tool. Read-only lookups almost never need confirmation. Write operations that affect user data usually should. Financial operations should always require confirmation.
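A per-tool confirmation gate can encode that risk profile directly. This is a sketch under stated assumptions: the tool names in `CONFIRMATION_REQUIRED`, and the `execute` and `ask_user` callables, are hypothetical placeholders for your execution layer and UI.

```python
# Hypothetical risk tiers: only tools listed here pause for human approval.
CONFIRMATION_REQUIRED = {"issue_refund", "delete_account", "send_email"}

def execute_with_gate(tool_name: str, args: dict, execute, ask_user) -> dict:
    """Run a tool call, pausing high-impact tools for human approval.
    `execute(name, args)` runs the tool; `ask_user(text)` returns True
    on approval and False on rejection."""
    if tool_name in CONFIRMATION_REQUIRED:
        description = f"About to run {tool_name} with {args}. Proceed?"
        if not ask_user(description):
            # A rejection goes back to the model so it can adjust course.
            return {"status": "rejected", "tool": tool_name}
    return {"status": "done", "result": execute(tool_name, args)}
```

Read-only tools pass straight through with zero added friction; only the tools you explicitly flag pay the confirmation cost.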
Pattern 5: Reactive Tool Use
The agent monitors for conditions and calls tools in response to events rather than user requests. A monitoring agent might periodically check system health and call an alerting tool when metrics exceed thresholds. A scheduling agent might check calendar conflicts when new events are proposed. A compliance agent might scan new content against policy rules.
Reactive tool use requires an event loop or polling mechanism that triggers the agent without user interaction. The model receives the event or monitoring data, reasons about whether action is needed, and calls tools to take appropriate action. This pattern is common in autonomous agents that run continuously rather than responding to individual user messages.
The key challenge with reactive tool use is preventing unnecessary actions. The model needs clear criteria for when to act and when to hold. Without explicit thresholds or decision rules in the system prompt, the model may over-react to normal variations or under-react to genuine issues. Pair reactive tool use with confirmation gates for high-impact actions, even when there is no user actively watching, by routing confirmations to a notification channel for human review.
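Explicit decision rules can live in code rather than in the prompt. A minimal sketch, assuming invented metric names and threshold values: the evaluator emits alert tool calls only for metrics that breach a threshold, so normal variation triggers nothing.

```python
# Hypothetical thresholds; explicit rules keep the agent from over-reacting
# to normal variation or under-reacting to genuine issues.
THRESHOLDS = {"cpu_pct": 90.0, "error_rate": 0.05}

def evaluate_metrics(metrics: dict) -> list:
    """Return alert tool calls only for metrics that breach a threshold;
    anything within bounds produces no action."""
    alerts = []
    for name, value in metrics.items():
        limit = THRESHOLDS.get(name)
        if limit is not None and value > limit:
            alerts.append({"tool": "send_alert",
                           "arguments": {"metric": name, "value": value}})
    return alerts
```

An event loop or scheduler would call this on each polling interval, then hand any resulting alerts through the normal tool execution layer, including confirmation gates where the action is high-impact.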
Pattern 6: Memory-Augmented Tool Use
The agent stores outcomes of tool calls in persistent memory and recalls them in future interactions to inform tool selection, avoid redundant calls, and build cumulative knowledge. This is the most advanced pattern and the one that transforms a stateless tool-using agent into a learning system.
In the memory-augmented pattern, after each tool call, the agent stores a concise observation about the outcome: what was called, what it returned, and what it means in context. Before making future tool calls, the agent recalls relevant past outcomes using semantic search, entity connections, or cognitive scoring. If the user asks about the same entity or a related topic, the agent has context from previous interactions without calling the tool again.
Adaptive Recall enables the memory-augmented pattern through its standard tool interface. The store tool persists outcome observations. The recall tool retrieves relevant past outcomes using cognitive scoring that prioritizes recent, frequent, and well-corroborated memories. The knowledge graph connects outcomes to the entities they involve, enabling the agent to retrieve all past interactions related to a specific customer, order, or system even when the current query does not mention them directly.
The memory-augmented pattern also improves tool selection. If the agent remembers that a particular query pattern requires a specific sequence of tools, it can execute that sequence more efficiently on subsequent encounters. If the agent remembers that a tool consistently fails for certain input patterns, it can avoid those patterns or preemptively use an alternative. Over time, the agent develops expertise specific to its users, its tools, and its domain.
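The store/recall cycle can be illustrated with a deliberately simple in-memory sketch. This is not Adaptive Recall's actual API; a real system adds semantic search, cognitive scoring, and the knowledge graph described above. The entity-key convention shown here is an assumption for illustration.

```python
# Minimal in-memory sketch of the store/recall cycle. A production memory
# system would persist this and rank results rather than match exactly.
MEMORY: list = []

def store(entity: str, observation: str) -> None:
    """Persist a concise observation about a tool call outcome:
    what was called, what it returned, what it means in context."""
    MEMORY.append({"entity": entity, "observation": observation})

def recall(entity: str) -> list:
    """Retrieve past observations about an entity before calling a tool,
    so redundant lookups can be skipped."""
    return [m["observation"] for m in MEMORY if m["entity"] == entity]
```

In the agent loop, the recall step runs before tool selection: if the recalled observations already answer the question, the agent skips the tool call entirely; otherwise it calls the tool and stores the new outcome.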
Choosing the Right Pattern
Most production agents use several of these patterns depending on the specific interaction. A customer support agent might use single-shot lookup for status checks, chain-of-tools for investigating issues, parallel fan-out for gathering context from multiple systems, and human-in-the-loop for processing refunds. The tool execution layer should support all patterns, and the model should be free to choose the appropriate pattern for each request.
Start with single-shot and chain patterns, which cover the majority of use cases. Add parallel fan-out when you need to aggregate data from multiple sources. Add human-in-the-loop for write operations with real consequences. Add memory-augmented tool use when you want the agent to improve over time. Each pattern adds capability and complexity, so build them incrementally rather than implementing everything at once.
Enable the memory-augmented pattern with Adaptive Recall. Store tool outcomes, recall relevant history, and build an agent that gets better with every interaction.
Get Started Free