---
name: langfuse-observability
description: Instrument LLM applications with Langfuse tracing. Use when setting up Langfuse, adding observability to LLM calls, or auditing existing instrumentation.
---

*Updated 2/1/2026*
# Langfuse Observability

Instrument LLM applications with Langfuse tracing, following best practices and tailored to your use case.

## When to Use

- Setting up Langfuse in a new project
- Auditing existing Langfuse instrumentation
- Adding observability to LLM calls

## Workflow

### 1. Assess Current State

Check the project:

- Is the Langfuse SDK installed?
- What LLM frameworks are used (OpenAI SDK, LangChain, LlamaIndex, Vercel AI SDK, etc.)?
- Is there existing instrumentation?

**No integration yet:** Set up Langfuse using a framework integration if one is available. Integrations capture more context automatically and require less code than manual instrumentation.

**Integration exists:** Audit against the baseline requirements below.
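For the no-integration case, the OpenAI drop-in is the smallest possible setup. A sketch, assuming Python, the `langfuse` package installed, and `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, `LANGFUSE_HOST`, and `OPENAI_API_KEY` already set in the environment; the extra `name` kwarg is passed through by the wrapper (verify against the OpenAI integration docs for your SDK version):

```python
# Importing from langfuse.openai instead of openai is the only code change;
# model name, token usage, and the generation observation type are then
# captured automatically on every call.
from langfuse.openai import openai

client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this document."}],
    name="chat-response",  # descriptive trace name instead of an auto-generated one
)
print(response.choices[0].message.content)
```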

### 2. Verify Baseline Requirements

Every trace should have these fundamentals:

| Requirement | Check | Why |
|---|---|---|
| Model name | Is the LLM model captured? | Enables model comparison and filtering |
| Token usage | Are input/output tokens tracked? | Enables automatic cost calculation |
| Good trace names | Are names descriptive (`chat-response`, not `trace-1`)? | Makes traces findable and filterable |
| Span hierarchy | Are multi-step operations nested properly? | Shows which step is slow or failing |
| Correct observation types | Are generations marked as generations? | Enables model-specific analytics |
| Sensitive data masked | Is PII/confidential data excluded or masked? | Prevents data leakage |
| Trace input/output | Does the trace capture the full input being processed and the result as output? | Enables debugging and understanding what was processed |

Framework integrations (OpenAI, LangChain, etc.) handle model name, tokens, and observation types automatically. Prefer integrations over manual instrumentation.
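Masking is the one baseline item integrations cannot do for you, since they can't know what counts as sensitive in your domain. A minimal sketch of a regex-based masking function; the `mask` client parameter shown in the comment is an assumption about the Python SDK's wiring, so verify it against the masking docs for your version:

```python
import re

# Hypothetical masking helper: redacts email addresses and US-style SSNs
# from any string before it is sent to Langfuse. Extend the patterns to
# match whatever counts as sensitive in your domain.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def mask_pii(data, **kwargs):
    if isinstance(data, str):
        data = EMAIL_RE.sub("[EMAIL]", data)
        data = SSN_RE.sub("[SSN]", data)
    return data  # non-string payloads pass through unchanged

# Wiring (assumption -- check the masking docs for your SDK version):
#   from langfuse import Langfuse
#   langfuse = Langfuse(mask=mask_pii)

print(mask_pii("Contact jane@example.com, SSN 123-45-6789"))
```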

Docs: https://langfuse.com/docs/tracing
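When no integration fits, the decorator API can cover the same baseline by hand. A minimal sketch using v2-style `langfuse.decorators` imports (newer SDK majors relocate these, so check your installed version); `call_llm` is a hypothetical stand-in for your actual LLM call, and the token counts are illustrative:

```python
from langfuse.decorators import observe, langfuse_context

@observe(as_type="generation")  # correct observation type for model analytics
def summarize(text: str) -> str:
    langfuse_context.update_current_observation(
        model="gpt-4o-mini",                 # baseline: model name
        usage={"input": 512, "output": 64},  # baseline: token usage (illustrative)
    )
    return call_llm(text)  # hypothetical LLM call

@observe()  # parent span: nests summarize() under handle_document() in the trace tree
def handle_document(doc: str) -> str:
    langfuse_context.update_current_trace(name="doc-summary")  # descriptive trace name
    return summarize(doc)
```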

### 3. Explore Traces First

Once baseline instrumentation is working, encourage the user to explore their traces in the Langfuse UI before adding more context:

> "Your traces are now appearing in Langfuse. Take a look at a few of them—see what data is being captured, what's useful, and what's missing. This will help us decide what additional context to add."

This helps the user:

- Understand what they're already getting
- Form opinions about what's missing
- Ask better questions about what they need

### 4. Discover Additional Context Needs

Determine what additional instrumentation would be valuable. Infer from code when possible; ask only when something is unclear.

**Infer from code:**

| If you see in code... | Infer | Suggest |
|---|---|---|
| Conversation history, chat endpoints, message arrays | Multi-turn app | `session_id` |
| User authentication, `user_id` variables | User-aware app | `user_id` on traces |
| Multiple distinct endpoints/features | Multi-feature app | `feature` tag |
| Customer/tenant identifiers | Multi-tenant app | `customer_id` or tier tag |
| Feedback collection, ratings | Has user feedback | Capture as scores |

**Only ask when not obvious from code:**

- "How do you know when a response is good vs. bad?" → Determines scoring approach
- "What would you want to filter by in a dashboard?" → Surfaces non-obvious tags
- "Are there different user segments you'd want to compare?" → Customer tiers, plans, etc.

**Additions and their value:**

| Addition | Why | Docs |
|---|---|---|
| `session_id` | Groups conversations together | https://langfuse.com/docs/tracing-features/sessions |
| `user_id` | Enables user filtering and cost attribution | https://langfuse.com/docs/tracing-features/users |
| User feedback score | Enables quality filtering and trends | https://langfuse.com/docs/scores/overview |
| `feature` tag | Per-feature analytics | https://langfuse.com/docs/tracing-features/tags |
| `customer_tier` tag | Cost/quality breakdown by segment | https://langfuse.com/docs/tracing-features/tags |

These are **not** baseline requirements; only add what's relevant based on inference or user input.
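A sketch of how these additions attach to the current trace, using v2-style `langfuse.decorators` imports (newer SDK versions relocate these, so verify against your installed version); the `User` type, `generate_reply` call, and tag values are illustrative:

```python
from langfuse.decorators import observe, langfuse_context

@observe()
def handle_message(message: str, user, conversation_id: str) -> str:
    langfuse_context.update_current_trace(
        session_id=conversation_id,          # groups the conversation in the Sessions view
        user_id=user.id,                     # enables per-user filtering and cost attribution
        tags=["chat", f"tier:{user.tier}"],  # per-feature / per-segment analytics
    )
    return generate_reply(message)  # hypothetical LLM call

# Later, when the user clicks thumbs-up/down inside the traced function,
# record it as a score on the same trace:
#   langfuse_context.score_current_trace(name="user-feedback", value=1)
```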

### 5. Guide to UI

After adding context, point users to relevant UI features:

- **Traces view:** see individual requests
- **Sessions view:** see grouped conversations (if `session_id` was added)
- **Dashboard:** build filtered views using tags
- **Scores:** filter by quality metrics

## Framework Integrations

Prefer these over manual instrumentation:

| Framework | Integration | Docs |
|---|---|---|
| OpenAI SDK | Drop-in replacement | https://langfuse.com/docs/integrations/openai |
| LangChain | Callback handler | https://langfuse.com/docs/integrations/langchain |
| LlamaIndex | Callback handler | https://langfuse.com/docs/integrations/llama-index |
| Vercel AI SDK | OpenTelemetry exporter | https://langfuse.com/docs/integrations/vercel-ai-sdk |
| LiteLLM | Callback or proxy | https://langfuse.com/docs/integrations/litellm |

Full list: https://langfuse.com/docs/integrations
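For LangChain, the integration is a callback handler rather than a wrapped client. A sketch using the v2-style `langfuse.callback` import path (newer SDKs expose the handler elsewhere, so check the LangChain integration docs); `chain` is a hypothetical stand-in for any LangChain runnable:

```python
from langfuse.callback import CallbackHandler

handler = CallbackHandler()  # reads LANGFUSE_* credentials from the environment

# chain is any LangChain runnable, e.g. prompt | model | parser (hypothetical)
result = chain.invoke(
    {"question": "What does this contract say about termination?"},
    config={"callbacks": [handler]},  # every chain step is traced as a nested span
)
```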

## Always Explain Why

When suggesting additions, explain the user benefit:

> "I recommend adding `session_id` to your traces.
>
> Why: This groups messages from the same conversation together. You'll be able to see full conversation flows in the Sessions view, making it much easier to debug multi-turn interactions.
>
> Learn more: https://langfuse.com/docs/tracing-features/sessions"

## Common Mistakes

| Mistake | Problem | Fix |
|---|---|---|
| No `flush()` in scripts | Traces never sent | Call `langfuse.flush()` before exit |
| Flat traces | Can't see which step failed | Use nested spans for distinct steps |
| Generic trace names | Hard to filter | Use descriptive names: `chat-response`, `doc-summary` |
| Logging sensitive data | Data leakage risk | Mask PII before tracing |
| Manual instrumentation when an integration exists | More code, less context | Use the framework integration |
| Importing Langfuse before env vars are loaded | Langfuse initializes with missing or wrong credentials | Import Langfuse **after** loading environment variables (e.g., after `load_dotenv()`) |
| Wrong import order with OpenAI | Langfuse can't patch the OpenAI client | Import Langfuse and run its setup **before** importing the OpenAI client |
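The `flush()` and both import-order mistakes are all ordering problems. A sketch of a short-lived script entrypoint that gets the order right, assuming `python-dotenv` and the drop-in OpenAI integration; `flush_langfuse()` is the v2 drop-in's flush hook, so verify the call name against your SDK version:

```python
# 1. Load credentials BEFORE anything reads them.
from dotenv import load_dotenv
load_dotenv()  # populates LANGFUSE_* and OPENAI_API_KEY

# 2. Import the Langfuse-wrapped client AFTER the env is ready, and
#    instead of the plain `openai` import, so the client gets patched.
from langfuse.openai import openai

client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)

# 3. Short-lived scripts exit before the background sender drains its
#    queue -- flush explicitly so the trace is not lost.
openai.flush_langfuse()
```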
