Advanced Concepts

Tool Use & Function Calling

Implement tool-calling policies, format messages, and reason over multi-step API workflows using Chapter 15 guidance.

Estimated time
35 minutes
Difficulty
Advanced
Prerequisites
2 modules
Equation

Tool-Calling Objective

Chapter 15 describes tool calling as extending the assistant policy with an action space that includes structured function invocations. During training, tool outputs are masked and the reward focuses on whether the final answer or tool sequence solves the task.

$$
\begin{aligned}
\max_{\theta} \;\; & \mathbb{E}_{x \sim \mathcal{D}}\Big[ R\big(x, y_{1:T}, o_{1:T}\big) \Big] \\
\text{s.t. } \; & y_t \in \{\text{text tokens},\; \text{tool\_call}(f, a)\}, \quad o_t = \text{Tool}(f, a)
\end{aligned}
$$

Mask tool-output tokens when computing the loss. Chapter 15 warns that otherwise the model overfits to the observed outputs instead of learning to plan the calls that produce them.
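
This masking rule can be sketched as a per-token loss mask (a minimal illustration; real trainers derive these spans from the chat template rather than explicit role tags):

```python
# Minimal sketch of per-token loss masking for a tool-augmented transcript.
# Assumption: each token carries a role tag; in practice these spans are
# derived from the chat template during tokenisation.
def build_loss_mask(roles):
    """Return 1 for tokens trained on (assistant text and tool_call
    tokens), 0 for prompt tokens and tool outputs (the o_t above)."""
    trainable = {"assistant", "tool_call"}
    return [1 if r in trainable else 0 for r in roles]

roles = (
    ["user"] * 4           # prompt tokens: never in the loss
    + ["assistant"] * 3    # planning text: trained
    + ["tool_call"] * 5    # structured call tokens: trained
    + ["tool_output"] * 6  # observation o_t: masked out
    + ["assistant"] * 4    # final answer: trained
)
mask = build_loss_mask(roles)
```

Only the assistant's own tokens, including the structured call, contribute to the loss; the tool's observations are zeroed out.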

Intuition

Why Structured Tool Use Works

Tool use shifts the assistant from answering directly to orchestrating resources, prompts, and tools. The assistant decides when to fetch data, when to run code, and how to summarise the results for the user. Providers expose these capabilities as JSON or Python snippets, and chat templates convert them into token streams.

Chapter 15 highlights the Model Context Protocol: servers expose resources (read-only data), prompts (prebuilt workflows), and tools (actions). Clients and hosts remain decoupled, so the same tools can be reused across models by swapping the middle layer.
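
The three MCP primitives can be sketched as a plain registry (illustrative class and field names, not the official MCP SDK API):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical sketch of an MCP-style server exposing the three
# primitives described above. Names are illustrative only.
@dataclass
class MCPServer:
    resources: Dict[str, str] = field(default_factory=dict)   # read-only data
    prompts: Dict[str, str] = field(default_factory=dict)     # prebuilt workflows
    tools: Dict[str, Callable] = field(default_factory=dict)  # actions

    def call_tool(self, name, **kwargs):
        return self.tools[name](**kwargs)

server = MCPServer()
server.resources["docs/changelog"] = "v2.1: added retries"
server.prompts["summarise"] = "Summarise {resource} in two sentences."
server.tools["add"] = lambda a, b: a + b
```

Because the server owns the registry, any client can enumerate the same resources, prompts, and tools regardless of which model sits on the other side.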

Implementations should handle retries, schema validation, and masking of tool outputs. Multi-turn formatting requires splitting assistant turns into segments around each tool call.
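
The turn-splitting step can be sketched as follows (the event and message shapes are assumptions modelled on common provider JSON schemas, not a specific API):

```python
# Illustrative sketch: split one flat assistant event stream into
# messages, so each tool call closes an assistant segment and each
# tool output becomes its own 'tool' message.
def split_turn(events):
    messages, buf = [], []
    for ev in events:
        if ev["type"] in ("text", "tool_call"):
            buf.append(ev)
            if ev["type"] == "tool_call":
                messages.append({"role": "assistant", "content": buf})
                buf = []
        else:  # tool_output
            messages.append({"role": "tool", "content": [ev]})
    if buf:  # trailing assistant text after the last observation
        messages.append({"role": "assistant", "content": buf})
    return messages

events = [
    {"type": "text", "value": "Let me check the weather."},
    {"type": "tool_call", "value": {"name": "get_weather", "args": {"city": "Paris"}}},
    {"type": "tool_output", "value": "18C, cloudy"},
    {"type": "text", "value": "It is 18C and cloudy in Paris."},
]
msgs = split_turn(events)
```

The resulting alternation (assistant, tool, assistant) is what makes the masking rule above easy to apply: the `tool` messages are exactly the spans excluded from the loss.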

Analogy

Analogy: Control Room Conductor


Control room conductor

A dispatcher routes calls to specialists, records responses, and reports back. Chapter 15 treats the assistant as the conductor for external tools.

API switchboard

Operators plug into different lines based on the request. Tool-using models select resources, prompts, and tools in the Model Context Protocol.

Visualization

Tool Orchestration Lab

Explore sample transcripts, plan latency budgets, and sketch MCP-style workflows before implementing them in your own stack.

Tool call simulator

Inspect sample tool-calling transcripts from Chapter 15 (JSON, Python execution, MCP).

Interactive visualization

Tool latency planner

Estimate latency and reliability as you add more tool calls (Chapter 15).
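
The trade-off the planner explores can also be approximated analytically. A back-of-envelope model, assuming independent calls with a fixed per-attempt latency and a bounded retry budget (all parameters here are illustrative defaults, not figures from Chapter 15):

```python
# Back-of-envelope reliability/latency model for a chain of tool calls.
# Assumptions: calls are independent, each attempt succeeds with
# probability p, and each call may be retried up to `retries` times.
def workflow_stats(n_calls, p=0.95, latency_s=0.4, retries=1):
    # Probability one call succeeds within 1 + retries attempts.
    p_call = 1 - (1 - p) ** (1 + retries)
    # Expected attempts per call (truncated geometric).
    exp_attempts = sum((1 - p) ** k for k in range(1 + retries))
    return {
        "success": p_call ** n_calls,               # every call must succeed
        "expected_latency_s": n_calls * exp_attempts * latency_s,
    }
```

The model makes the chapter's trade-off concrete: end-to-end success decays geometrically in the number of calls, while retries buy reliability at the cost of expected latency.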

Interactive visualization

Tool orchestration planner

Compose Model Context Protocol style workflows with resources, prompts, and tools.

Interactive visualization

Takeaways

Operational Notes

  • Keep a consistent schema per provider; chat templates translate JSON or Python into tokens.
  • Mask tool outputs and log them separately for auditing.
  • Track end-to-end latency and success rate as you add more calls; implement retries.
  • Leverage MCP servers to reuse resources and tools across products.
  • Distinguish between reasoning tokens and tool output tokens when mixing with reasoning models.
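
The retry and schema-validation notes can be combined into one wrapper. An illustrative sketch (the validator, backoff policy, and flaky tool are hypothetical, not from Chapter 15):

```python
import time

# Illustrative retry-and-validate wrapper for a single tool call.
def call_with_retries(tool, args, validate, max_attempts=3, backoff_s=0.0):
    last_err = None
    for attempt in range(max_attempts):
        try:
            out = tool(**args)
            if validate(out):          # schema check before accepting
                return out
            last_err = ValueError(f"schema validation failed: {out!r}")
        except Exception as e:         # transient failure: retry
            last_err = e
        if backoff_s:
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    raise last_err

# Hypothetical flaky tool that fails twice, then succeeds.
calls = {"n": 0}
def flaky_tool(x):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return {"result": x * 2}

out = call_with_retries(flaky_tool, {"x": 5}, validate=lambda o: "result" in o)
```

Logging each attempt separately (success, validation failure, exception) gives the audit trail the notes above call for.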

Self-check

Tool Use Check

Verify that you understand tool schemas, MCP workflows, and latency trade-offs from Chapter 15.

  1. What is the optimisation objective when adding tool calls according to Chapter 15?

  2. Which message format detail does Chapter 15 highlight when training tool use?

  3. What does the Model Context Protocol (MCP) introduce?

  4. Why are tool outputs masked during training?

  5. What trade-off does the chapter emphasise when adding more tool calls?