Auto-instrumentation gives you one span per LLM call, which is useless for multi-step agents. Here is the @observe plus custom-tool-span pattern that makes Langfuse traces actually debuggable, with the asyncio gotcha the docs bury.

Langfuse @observe + Custom Spans: How to Actually Trace Multi-Step Agents

After you ship the Langfuse cost dashboard, the next problem shows up by Wednesday: a 12-step agent fails on a user's run and the trace shows one observation with no way to find the broken step. By the end of this you will wrap any Python agent in @observe, emit a custom span per tool call, and score terminal outcomes so failures are filterable.

1. Install the SDK and the OTel Exporter (2 min)

bash

pip install langfuse openinference-instrumentation-openai
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="http://localhost:3000"  # or your self-hosted URL

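The openinference-instrumentation-openai package is the part most people skip, and it is the reason LLM calls show up in the trace automatically. Installing it is not enough on its own: you also have to call OpenAIInstrumentor().instrument() once at startup. Without it, @observe gives you a parent span with no LLM telemetry inside it.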
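A misspelled or unset variable from the export lines above is easy to miss until the traces fail to appear. The preflight check below is a minimal sketch using only the standard library. The helper names `missing_langfuse_env` and `require_langfuse_env` are hypothetical, and treating all three variables as required is an assumption of this sketch rather than a documented Langfuse rule.

```python
import os
from typing import Mapping

# Variable names taken from the export lines above.
# Assumption: this sketch treats all three as required.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST")


def missing_langfuse_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_langfuse_env(env: Mapping[str, str] = os.environ) -> None:
    """Hypothetical startup guard: raise if any required variable is missing."""
    missing = missing_langfuse_env(env)
    if missing:
        raise RuntimeError("Langfuse is not configured, missing: " + ", ".join(missing))
```

Calling `require_langfuse_env()` as the first line of your entry point turns a silent gap in the dashboard into an immediate startup error.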

2. Wrap Your Agent with `@observe` and Decorate Tool Calls (5 min)

python

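import json

from langfuse import observe, get_client
from openai import OpenAI
from openinference.instrumentation.openai import OpenAIInstrumentor

langfuse = get_client()
# Installing openinference is not enough: instrument the OpenAI client once at startup
OpenAIInstrumentor().instrument()
client = OpenAI()

# TOOL_SCHEMAS (OpenAI tool definitions) and dispatch(name, args) are your own code


@observe(name="agent.run")
def run_agent(user_input: str) -> str:
    messages = [{"role": "user", "content": user_input}]
    for step in range(12):
        # Each LLM call is traced by the openinference instrumentor
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOL_SCHEMAS
        )
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            return msg.content or ""
        for call in msg.tool_calls:
            # Custom span: one per tool call, nested under agent.run
            with langfuse.start_as_current_observation(
                name="tool.call",
                as_type="span",
                input={"tool": call.function.name, "args": call.function.arguments},
            ) as span:
                result = dispatch(call.function.name, json.loads(call.function.arguments))
                span.update(output=result)
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": json.dumps(result)})
    # Step budget exhausted without a final answer: fail loudly instead of returning None
    raise RuntimeError("agent did not finish within 12 steps")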

Two things to notice. First, @observe decorates the outer function only; the openinference instrumentor patches the OpenAI client and emits LLM spans as children. Second, langfuse.start_as_current_observation is the API for custom spans. Passing as_type="span" records the tool call as a plain span nested under agent.run. Reserve as_type="generation" for actual model calls, because generations are the observations that carry model and token-usage fields.
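The agent loop refers to TOOL_SCHEMAS and dispatch without defining them. One way they could look is sketched below. The `cancel_order` tool, the `TOOLS` registry, and the error-dictionary convention are all hypothetical stand-ins for your own code. The schema follows the OpenAI function-tool format.

```python
from typing import Any, Callable


def cancel_order(order_id: str) -> dict[str, Any]:
    """Hypothetical tool: a stand-in for a real order-service call."""
    return {"order_id": order_id, "status": "cancelled"}


# Hypothetical registry mapping tool names to Python callables
TOOLS: dict[str, Callable[..., dict[str, Any]]] = {"cancel_order": cancel_order}

TOOL_SCHEMAS = [{
    "type": "function",
    "function": {
        "name": "cancel_order",
        "description": "Cancel an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]


def dispatch(name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Run a registered tool and report failures as data instead of raising."""
    tool = TOOLS.get(name)
    if tool is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return tool(**args)
    except Exception as exc:
        return {"error": f"{type(exc).__name__}: {exc}"}
```

Returning errors as a dictionary is a design choice, not a requirement: it keeps the loop running and puts the failure into the span output, where you can read it next to the arguments that caused it.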

3. Score the Final Outcome (1 min)

python

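with langfuse.start_as_current_observation(name="agent.session", as_type="span"):
    run_agent("cancel my order #4471")
    # Scoring needs an active span; BOOLEAN scores take 1 or 0
    langfuse.score_current_span(name="success", value=1, data_type="BOOLEAN")
# Or attach a user-feedback score from a thumbs-up event in the UI
langfuse.flush()  # send buffered events before a short-lived script exits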

Scores are how you stop reading traces by hand. Set a success score on every terminal span and filter the trace list for score=success=false to find the failures in seconds.
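To make failures filterable, every terminal outcome needs a score, including the runs that raise. The wrapper below is a minimal sketch of that idea. `run_and_score` is a hypothetical helper, the scoring function is injected so the sketch runs without a Langfuse connection, and counting an empty answer as a failure is an assumption you may want to change.

```python
from typing import Callable


def run_and_score(
    agent: Callable[[str], str],
    user_input: str,
    score: Callable[..., None],
) -> str:
    """Run the agent and record exactly one success score per terminal outcome."""
    try:
        answer = agent(user_input)
    except Exception as exc:
        # Failed runs get a score too, otherwise they never match the filter
        score(name="success", value=0, data_type="BOOLEAN", comment=repr(exc))
        raise
    # Assumption: an empty or whitespace-only answer counts as a failure
    ok = bool(answer and answer.strip())
    score(name="success", value=1 if ok else 0, data_type="BOOLEAN")
    return answer
```

In real use you would pass langfuse.score_current_span as `score` and call the wrapper while a span is active, for example from inside an @observe-decorated function.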

What the Docs Don't Tell You

The decorator emits a span per call, and OpenTelemetry tracks the current span in Python contextvars. That is where the asyncio trouble starts. Tasks started with asyncio.gather or anyio.create_task_group copy the context at the moment they are created, so they nest correctly only when you create them inside the @observe-decorated function. Work you hand to a thread pool through loop.run_in_executor or ThreadPoolExecutor.submit starts with an empty context, so its spans lose their parent and land in separate traces. Fix: create tasks inside the active span, prefer asyncio.to_thread (which copies the context), or wrap the callable with contextvars.copy_context().run before submitting it. This bug costs a day to find because the parent trace looks fine; it is just missing the children that ended up elsewhere.
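The propagation rules are easy to check with the standard library alone. The sketch below uses a plain ContextVar named `current_span` as a stand-in for the slot where OpenTelemetry stores the active span. It does not call Langfuse or OpenTelemetry, so treat it as an illustration of the mechanism rather than of either library's internals.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the "current span" slot; the real one lives inside OpenTelemetry
current_span = contextvars.ContextVar("current_span", default="<no parent>")


def read_parent() -> str:
    return current_span.get()


async def read_parent_async() -> str:
    return current_span.get()


async def main() -> dict[str, str]:
    current_span.set("agent.run")  # pretend @observe just opened a span
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        # gather wraps the coroutine in a task, which copies the context
        (in_task,) = await asyncio.gather(read_parent_async())
        # A bare executor call runs in a worker thread with an empty context
        bare = await loop.run_in_executor(pool, read_parent)
        # Fix: snapshot the context and run the callable inside it
        ctx = contextvars.copy_context()
        fixed = await loop.run_in_executor(pool, ctx.run, read_parent)
    # asyncio.to_thread copies the context for you
    via_to_thread = await asyncio.to_thread(read_parent)
    return {"task": in_task, "executor": bare,
            "executor_fixed": fixed, "to_thread": via_to_thread}


results = asyncio.run(main())
```

Only the `executor` entry comes back as `<no parent>`; the other three see `agent.run`.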

Next Step

Wire @observe around the three highest-traffic agents in your repo, add a tool.call span per tool dispatch, and a success score per terminal call. After 24 hours, the trace list filtered by score=success=false is the only debugging surface you need.

— Mr. Technology
