Tracing in the OpenAI Responses API (what it is, what’s inside, and how to control it)
When people say “Trace” in the OpenAI ecosystem, they usually mean an end-to-end log of what the model did—its decisions, any tool calls it made, and the intermediate steps that led to the final answer. OpenAI’s docs define an agent trace as “the end-to-end log of decisions, tool calls, and reasoning steps.”
In the Responses API, you won’t typically see a single top-level field literally named "trace". Instead, you get a typed response object whose output[] items (messages, tool calls, etc.) effectively are the trace of the model’s execution for that request—and you can choose to store/retrieve it and optionally include extra detail.
What does a “Trace” include?
A practical “trace” for a Responses API run is usually composed of:
- Request configuration
  - `model`, `instructions`, `input`, `tools`, `tool_choice`, etc.
- Output items (`response.output[]`)
  - Assistant messages (e.g., `"type": "message"`)
  - Tool calls (e.g., `"type": "function_call"`, and built-in calls like web/file search)
  - Status fields and IDs you can use to correlate steps (like `call_id`)
  - Token usage (input/output/total)
- Tool inputs and tool outputs
  - For function calling, the model emits a tool call with arguments; your app then runs the tool and sends back a `function_call_output` item in a follow-up turn.
- Optional "extra details" you can include
  - Web/file search results, tool outputs, logprobs, encrypted reasoning content, etc., via `include`.
How to increase the amount of trace detail
1) Use include to request more fields
Both Create and Retrieve support an include array that can attach extra details to the returned object. The docs list includables such as:
- `"file_search_call.results"`
- `"web_search_call.results"`
- `"web_search_call.action.sources"`
- `"code_interpreter_call.outputs"`
- `"message.output_text.logprobs"`
- `"reasoning.encrypted_content"`
- (plus some image-URL includables)
2) Store responses so you can retrieve the full trace later
Responses are stored by default, and you can disable that per request.
- `store: true` → retrievable later by `response_id`
- `store: false` → not stored for later retrieval
3) Add metadata to label and filter traces/logs
You can attach up to 16 key-value pairs as metadata, which helps you query/filter objects in dashboards or your own pipeline.
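A minimal sketch of labeling a run this way, assuming the official `openai` Python SDK; the model name and the label keys (`team`, `run_id`) are placeholders, not anything the API requires:

```python
# Hedged sketch: attaching metadata so runs can be filtered later.
# The keys/values below are illustrative; up to 16 pairs are allowed.
RUN_LABELS = {
    "team": "search-quality",
    "run_id": "exp-042",
}

def build_labeled_request(prompt: str, labels: dict) -> dict:
    """Build responses.create kwargs carrying metadata labels."""
    if len(labels) > 16:
        raise ValueError("metadata supports at most 16 key-value pairs")
    return {
        "model": "gpt-4.1",  # placeholder model name
        "input": prompt,
        "metadata": labels,
    }

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()
    resp = client.responses.create(
        **build_labeled_request("Summarize our launch notes.", RUN_LABELS)
    )
    print(resp.id, resp.metadata)
```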
4) Use streaming for a “live trace”
If you set stream: true, the API emits server-sent events (SSE) like response.created, response.in_progress, and response.completed, which is essentially a real-time trace feed.
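A sketch of consuming that feed as trace log lines, assuming the official `openai` Python SDK; the rendering helper and model name are my own illustration, not part of the API:

```python
# Hedged sketch: treating streamed events as a live trace feed.
# Event type names follow the Responses API streaming events
# (response.created, response.output_text.delta, response.completed, ...).
def summarize_event(event_type: str, payload: str = "") -> str:
    """Render one stream event as a one-line trace entry."""
    if event_type == "response.output_text.delta":
        return f"[delta] {payload}"
    return f"[event] {event_type}"

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()
    stream = client.responses.create(
        model="gpt-4.1",  # placeholder model name
        input="Write a haiku about tracing.",
        stream=True,
    )
    for event in stream:
        # Only text-delta events carry a .delta payload.
        delta = getattr(event, "delta", "") or ""
        print(summarize_event(event.type, delta))
```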
How to reduce trace detail / shrink what you keep
1) Don’t store (store: false)
This is the biggest lever if your concern is retention: disable storage per request.
2) Don’t ask for extra fields
If you omit include, you’ll get a smaller, cleaner response object.
3) Limit tool usage
If your “trace” is getting huge because the model calls tools repeatedly, cap it with parameters like max_tool_calls.
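A tiny sketch of that cap in request kwargs; the model name and the `web_search` tool entry are illustrative assumptions:

```python
# Hedged sketch: capping tool usage to keep the trace small.
def build_capped_request(prompt: str, cap: int = 3) -> dict:
    """Build responses.create kwargs that limit tool invocations."""
    return {
        "model": "gpt-4.1",                 # placeholder model name
        "input": prompt,
        "tools": [{"type": "web_search"}],  # illustrative built-in tool
        "max_tool_calls": cap,              # hard cap on tool calls this run
    }
```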
4) Compact long conversations (advanced)
For long-running multi-turn sessions, /responses/compact replaces prior assistant/tool steps (and encrypted reasoning) with a single encrypted compaction item—shrinking what you carry forward.
5) If streaming, reduce bandwidth overhead
There’s a streaming option, include_obfuscation, that pads events to mitigate length-based side channels; turning it off can reduce payload overhead if you trust your network links.
Python example: get a trace-like record, and “set trace” detail
Below is a minimal example using the official Python SDK pattern from the developer quickstart.
Example A — Create + store + include extra detail, then retrieve by ID
What this gives you: a response object that contains the assistant message item(s) in output[], plus the optional included fields, and an ID you can retrieve later.
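A sketch of that flow, assuming the official `openai` SDK; the model name, prompt, and metadata label are placeholders:

```python
# Hedged sketch of Example A: create a stored response with extra detail,
# then fetch the same trace again later by ID.
CREATE_KWARGS = {
    "model": "gpt-4.1",  # placeholder model name
    "input": "Explain what a trace is in one sentence.",
    "store": True,       # the default, shown explicitly
    "include": ["message.output_text.logprobs"],
    "metadata": {"purpose": "trace-demo"},  # illustrative label
}

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()
    created = client.responses.create(**CREATE_KWARGS)
    print("response id:", created.id)

    # Later (even from another process): pull the stored trace back,
    # re-requesting the extra detail on retrieval.
    fetched = client.responses.retrieve(
        created.id,
        include=["message.output_text.logprobs"],
    )
    for item in fetched.output:
        print(item.type)  # e.g. "message", "function_call", ...
    print(fetched.usage)  # input/output/total token counts
```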
Python example: tool calling (showing tool call output + AI reply)
Function calling in Responses is typically a 2-step flow:
- Model returns a `function_call` output item (with `call_id` + JSON arguments)
- Your code runs the tool and sends back a `function_call_output` input item in a follow-up request
Example B — Weather tool (toy example)
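A sketch of the two-step flow with a toy weather tool, assuming the official `openai` SDK; the tool name, schema, and model name are illustrative:

```python
# Hedged sketch of Example B: a toy weather tool wired through the
# two-step function-calling flow.
import json

WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> dict:
    """Toy implementation; a real tool would call a weather API."""
    return {"city": city, "temp_c": 21, "conditions": "sunny"}

def run_tool_call(call_id: str, arguments: str) -> dict:
    """Execute the tool and package the result as a follow-up input item."""
    args = json.loads(arguments)
    return {
        "type": "function_call_output",
        "call_id": call_id,
        "output": json.dumps(get_weather(args["city"])),
    }

if __name__ == "__main__":
    from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY

    client = OpenAI()

    # Step 1: the model decides to call the tool.
    first = client.responses.create(
        model="gpt-4.1",  # placeholder model name
        input="What's the weather in Paris?",
        tools=[WEATHER_TOOL],
    )
    tool_outputs = [
        run_tool_call(item.call_id, item.arguments)
        for item in first.output
        if item.type == "function_call"
    ]

    # Step 2: send the tool result back, linked to the prior turn.
    second = client.responses.create(
        model="gpt-4.1",
        previous_response_id=first.id,
        input=tool_outputs,
        tools=[WEATHER_TOOL],
    )
    print(second.output_text)  # final assistant reply
```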
Example JSON: “trace output”, “tool result”, and “AI reply”
Below are illustrative examples (IDs shortened). The structure matches what you’ll see in Responses outputs: message items, function calls, and later the assistant reply.
1) Trace-like output (Step 1): model asks to call a tool
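An illustrative shape (IDs shortened, fields trimmed; not verbatim API output):

```json
{
  "id": "resp_abc123",
  "status": "completed",
  "output": [
    {
      "type": "function_call",
      "id": "fc_001",
      "call_id": "call_001",
      "name": "get_weather",
      "arguments": "{\"city\": \"Paris\"}",
      "status": "completed"
    }
  ],
  "usage": {"input_tokens": 52, "output_tokens": 17, "total_tokens": 69}
}
```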
2) Tool call result (what your code returns)
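An illustrative `function_call_output` input item, echoing the `call_id` from step 1 (values are made up):

```json
{
  "type": "function_call_output",
  "call_id": "call_001",
  "output": "{\"city\": \"Paris\", \"temp_c\": 21, \"conditions\": \"sunny\"}"
}
```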
3) Trace-like output (Step 2): final assistant message after tool output
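An illustrative final response (again trimmed and shortened; wording of the reply is made up):

```json
{
  "id": "resp_def456",
  "previous_response_id": "resp_abc123",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "id": "msg_001",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "It's currently sunny and about 21 °C in Paris."
        }
      ]
    }
  ]
}
```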