Tracing in the OpenAI Responses API (what it is, what’s inside, and how to control it)
When people say “Trace” in the OpenAI ecosystem, they usually mean an end-to-end log of what the model did—its decisions, any tool calls it made, and the intermediate steps that led to the final answer. OpenAI’s docs define an agent trace as “the end-to-end log of decisions, tool calls, and reasoning steps.”
In the Responses API, you won’t typically see a single top-level field literally named "trace". Instead, you get a typed response object whose output[] items (messages, tool calls, etc.) effectively are the trace of the model’s execution for that request—and you can choose to store/retrieve it and optionally include extra detail.
What does a “Trace” include?
A practical “trace” for a Responses API run is usually composed of:
- Request configuration
  - `model`, `instructions`, `input`, `tools`, `tool_choice`, etc.
- Output items (`response.output[]`)
  - Assistant messages (e.g., `"type": "message"`)
  - Tool calls (e.g., `"type": "function_call"`, and built-in calls like web/file search)
  - Status fields and IDs you can use to correlate steps (like `call_id`)
  - Token usage (input/output/total)
- Tool inputs and tool outputs
  - For function calling, the model emits a tool call with arguments; your app then runs the tool and sends back a `function_call_output` item in a follow-up turn.
- Optional "extra details" you can include
  - Web/file search results, tool outputs, logprobs, encrypted reasoning content, etc., via `include`.
How to increase the amount of trace detail
1) Use include to request more fields
Both Create and Retrieve support an include array that can attach extra details to the returned object. The docs list includables such as:
- `"file_search_call.results"`
- `"web_search_call.results"`
- `"web_search_call.action.sources"`
- `"code_interpreter_call.outputs"`
- `"message.output_text.logprobs"`
- `"reasoning.encrypted_content"`
- (plus some image-URL includables)
2) Store responses so you can retrieve the full trace later
Responses are stored by default, and you can disable that per request.
- `store: true` → retrievable later by `response_id`
- `store: false` → not stored for later retrieval
3) Add metadata to label and filter traces/logs
You can attach up to 16 key-value pairs as metadata, which helps you query/filter objects in dashboards or your own pipeline.
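A minimal sketch of labeling a run this way, assuming the official `openai` Python SDK; the model name and the label keys (`team`, `run_id`) are placeholders, not anything the API requires:

```python
# Hedged sketch: attaching metadata so runs can be filtered later.
# The keys/values below are illustrative; up to 16 pairs are allowed.
RUN_LABELS = {
    "team": "search-quality",
    "run_id": "exp-042",
}

def build_labeled_request(prompt: str, labels: dict) -> dict:
    """Build responses.create kwargs carrying metadata labels."""
    if len(labels) > 16:
        raise ValueError("metadata supports at most 16 key-value pairs")
    return {
        "model": "gpt-4.1",  # placeholder model name
        "input": prompt,
        "metadata": labels,
    }

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()
    resp = client.responses.create(
        **build_labeled_request("Summarize our launch notes.", RUN_LABELS)
    )
    print(resp.id, resp.metadata)
```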
4) Use streaming for a “live trace”
If you set stream: true, the API emits server-sent events (SSE) like response.created, response.in_progress, and response.completed, which is essentially a real-time trace feed.
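A sketch of consuming that feed as trace log lines, assuming the official `openai` Python SDK; the rendering helper and model name are my own illustration, not part of the API:

```python
# Hedged sketch: treating streamed events as a live trace feed.
# Event type names follow the Responses API streaming events
# (response.created, response.output_text.delta, response.completed, ...).
def summarize_event(event_type: str, payload: str = "") -> str:
    """Render one stream event as a one-line trace entry."""
    if event_type == "response.output_text.delta":
        return f"[delta] {payload}"
    return f"[event] {event_type}"

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()
    stream = client.responses.create(
        model="gpt-4.1",  # placeholder model name
        input="Write a haiku about tracing.",
        stream=True,
    )
    for event in stream:
        # Only text-delta events carry a .delta payload.
        delta = getattr(event, "delta", "") or ""
        print(summarize_event(event.type, delta))
```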
How to reduce trace detail / shrink what you keep
1) Don’t store (store: false)
This is the biggest lever if your concern is retention: disable storage per request.
2) Don’t ask for extra fields
If you omit include, you’ll get a smaller, cleaner response object.
3) Limit tool usage
If your “trace” is getting huge because the model calls tools repeatedly, cap it with parameters like max_tool_calls.
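A tiny sketch of that cap in request kwargs; the model name and the `web_search` tool entry are illustrative assumptions:

```python
# Hedged sketch: capping tool usage to keep the trace small.
def build_capped_request(prompt: str, cap: int = 3) -> dict:
    """Build responses.create kwargs that limit tool invocations."""
    return {
        "model": "gpt-4.1",                 # placeholder model name
        "input": prompt,
        "tools": [{"type": "web_search"}],  # illustrative built-in tool
        "max_tool_calls": cap,              # hard cap on tool calls this run
    }
```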
4) Compact long conversations (advanced)
For long-running multi-turn sessions, /responses/compact replaces prior assistant/tool steps (and encrypted reasoning) with a single encrypted compaction item—shrinking what you carry forward.
5) If streaming, reduce bandwidth overhead
There’s a streaming option, include_obfuscation, that pads events to mitigate length-based side channels; turning it off can reduce payload overhead if you trust your network links.
Python example: get a trace-like record, and “set trace” detail
Below is a minimal example using the official Python SDK pattern from the developer quickstart.
Example A — Create + store + include extra detail, then retrieve by ID
What this gives you: a response object that contains the assistant message item(s) in output[], plus the optional included fields, and an ID you can retrieve later.
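A sketch of that flow, assuming the official `openai` SDK; the model name, prompt, and metadata label are placeholders:

```python
# Hedged sketch of Example A: create a stored response with extra detail,
# then fetch the same trace again later by ID.
CREATE_KWARGS = {
    "model": "gpt-4.1",  # placeholder model name
    "input": "Explain what a trace is in one sentence.",
    "store": True,       # the default, shown explicitly
    "include": ["message.output_text.logprobs"],
    "metadata": {"purpose": "trace-demo"},  # illustrative label
}

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()
    created = client.responses.create(**CREATE_KWARGS)
    print("response id:", created.id)

    # Later (even from another process): pull the stored trace back,
    # re-requesting the extra detail on retrieval.
    fetched = client.responses.retrieve(
        created.id,
        include=["message.output_text.logprobs"],
    )
    for item in fetched.output:
        print(item.type)  # e.g. "message", "function_call", ...
    print(fetched.usage)  # input/output/total token counts
```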
Python example: tool calling (showing tool call output + AI reply)
Function calling in Responses is typically a 2-step flow:
- Model returns a `function_call` output item (with `call_id` + JSON arguments)
- Your code runs the tool and sends back a `function_call_output` input item in a follow-up request
Example B — Weather tool (toy example)
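A sketch of the two-step flow with a toy weather tool, assuming the official `openai` SDK; the tool name, schema, and model name are illustrative:

```python
# Hedged sketch of Example B: a toy weather tool wired through the
# two-step function-calling flow.
import json

WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> dict:
    """Toy implementation; a real tool would call a weather API."""
    return {"city": city, "temp_c": 21, "conditions": "sunny"}

def run_tool_call(call_id: str, arguments: str) -> dict:
    """Execute the tool and package the result as a follow-up input item."""
    args = json.loads(arguments)
    return {
        "type": "function_call_output",
        "call_id": call_id,
        "output": json.dumps(get_weather(args["city"])),
    }

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()

    # Step 1: the model decides to call the tool.
    first = client.responses.create(
        model="gpt-4.1",  # placeholder model name
        input="What's the weather in Paris?",
        tools=[WEATHER_TOOL],
    )
    tool_outputs = [
        run_tool_call(item.call_id, item.arguments)
        for item in first.output
        if item.type == "function_call"
    ]

    # Step 2: send the tool result back, linked to the prior turn.
    second = client.responses.create(
        model="gpt-4.1",
        previous_response_id=first.id,
        input=tool_outputs,
        tools=[WEATHER_TOOL],
    )
    print(second.output_text)  # final assistant reply
```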
Example JSON: “trace output”, “tool result”, and “AI reply”
Below are illustrative examples (IDs shortened). The structure matches what you’ll see in Responses outputs: message items, function calls, and later the assistant reply.
1) Trace-like output (Step 1): model asks to call a tool
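An illustrative shape (IDs shortened, fields trimmed; not verbatim API output):

```json
{
  "id": "resp_abc123",
  "status": "completed",
  "output": [
    {
      "type": "function_call",
      "id": "fc_001",
      "call_id": "call_001",
      "name": "get_weather",
      "arguments": "{\"city\": \"Paris\"}",
      "status": "completed"
    }
  ],
  "usage": {"input_tokens": 52, "output_tokens": 17, "total_tokens": 69}
}
```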
2) Tool call result (what your code returns)
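An illustrative `function_call_output` input item, echoing the `call_id` from step 1 (values are made up):

```json
{
  "type": "function_call_output",
  "call_id": "call_001",
  "output": "{\"city\": \"Paris\", \"temp_c\": 21, \"conditions\": \"sunny\"}"
}
```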
3) Trace-like output (Step 2): final assistant message after tool output
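An illustrative final response (again trimmed and shortened; wording of the reply is made up):

```json
{
  "id": "resp_def456",
  "previous_response_id": "resp_abc123",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "id": "msg_001",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "It's currently sunny and about 21 °C in Paris."
        }
      ]
    }
  ]
}
```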