Agenta OpenTelemetry Semantic Conventions
This document describes how Agenta applies domain-specific OpenTelemetry conventions to capture and analyze traces from LLM applications.
How It Works
Agenta accepts any span that follows the OpenTelemetry specification. To unlock LLM-specific features—such as nicely formatted chat messages, per-request cost and latency, and links to prompt configurations or evaluators—we add attributes under the ag.
namespace.
We support two primary instrumentation approaches:
- Agenta SDK: When using our SDK with decorators like
@ag.instrument
, we automatically handle the semantic conventions - Auto-instrumentation libraries: We provide adapters that transform data from standard auto-instrumentation libraries (which follow their own conventions) into Agenta's format
Agenta Namespace
All Agenta-specific attributes are organized under the ag
namespace to avoid conflicts with other OpenTelemetry conventions.
Core Semantic Conventions
ag.data
The ag.data
namespace contains the core execution data for each span:
ag.data.inputs
: Input parameters for the spanag.data.outputs
: Output results from the spanag.data.internals
: Internal variables and intermediate values
Data Format
Inputs and Outputs: Stored as JSON format. For chat applications using LLM spans, these are formatted as a list of messages:
{
"data": {
"inputs": {
"prompt": [
{ "role": "system", "content": "System instruction" },
{ "role": "user", "content": "User query" }
],
"functions": [...],
"tools": [...]
},
"outputs": {
"completion": [
{ "role": "assistant", "content": "Assistant response" }
]
}
}
}
Internals: User-provided internal information such as context variables, intermediate calculations, or evaluation data that aren't part of the primary inputs/outputs. These are set by the user in the SDK using ag.store_internals()
.
SDK Integration
When using the @ag.instrument
decorator in Python:
@ag.instrument
def my_function(input_param):
# Function inputs and outputs are automatically captured
# unless explicitly masked
return result
The decorator automatically captures function inputs and outputs in ag.data.inputs
and ag.data.outputs
unless you choose to mask sensitive data.
ag.meta
The ag.meta
namespace stores metadata about the span execution:
{
"meta": {
"system": "openai",
"request": {
"base_url": "https://api.openai.com/v1",
"endpoint": "/chat/completions",
"headers": {...},
"streaming": false,
"model": "gpt-4",
"max_tokens": 1000,
"temperature": 0.7,
"top_p": 0.9,
"top_k": 50
},
"response": {
"model": "gpt-4-0613"
},
"configuration": {
"prompt": {
"input_keys": ["country"],
"messages": [...],
"template_format": "curly",
"llm_config": {
"model": "mistral/mistral-tiny"
}
}
}
}
}
Standard Metadata
Auto-instrumentation maps common semantic-convention keys—e.g. gen_ai.system, gen_ai.request.*—to the structure above.
ag.meta.system
: Identifies the LLM provider or system being used (e.g., "openai", "anthropic")ag.meta.request.base_url
: The base URL for the API requestag.meta.request.endpoint
: The specific API endpoint calledag.meta.request.headers
: HTTP headers sent with the requestag.meta.request.streaming
: Boolean indicating if streaming mode was usedag.meta.request.model
: The model name requestedag.meta.request.max_tokens
: Maximum token limit for the responseag.meta.request.temperature
: Sampling temperature parameterag.meta.request.top_p
: Nucleus sampling parameterag.meta.request.top_k
: Top-k sampling parameter
Metadata is displayed in the observability overview page as contextual information to help navigate and understand span execution.
ag.refs
The ag.refs
namespace contains references to entities within the Agenta system:
{
"refs": {
"application": {
"id": "uuid"
},
"variant": {
"id": "uuid",
"version": "1"
},
"environment": {
"slug": "production",
"id": "uuid"
}
}
}
Reference Types
- Application: Links to the Agenta application
- Variant: References specific prompt variants and versions
- Environment: Links to deployment environments (production, staging, etc.)
- Test Set: References to test datasets
- Test Case: Links to individual test cases
- Evaluation Run: References to evaluation executions
- Evaluator: Links to evaluation functions
These references enable navigation within the Agenta UI and allow filtering spans by specific entities.
ag.metrics
The ag.metrics
namespace tracks performance and cost metrics:
{
"metrics": {
"acc": {
"costs": {
"total": 0.00003925
},
"tokens": {
"total": 157,
"prompt": 26,
"completion": 131
},
"duration": {
"total": 1251.157
}
},
"unit": {
"costs": {
"total": 0.00003925
},
"tokens": {
"total": 157,
"prompt": 26,
"completion": 131
}
}
}
}
Metric Types
- Accumulated (
acc
): Metrics rolled up across child spans - Unit (
unit
): Metrics specific to the individual span
Available Metrics
- Costs: Total cost in USD for LLM API calls
- Tokens: Token usage breakdown (total, prompt, completion)
- Duration: Execution time in milliseconds
Additional Agenta Attributes
ag.type
The ag.type
namespace contains type information about the span:
ag.type.node
can be workflow
, task
, tool
, embedding
, query
, completion
, chat
, rerank
.
ag.tags
The ag.tags
namespace allows you to add custom tags to spans for categorization and filtering:
# Add custom tags using the SDK
ag.add_tags({"key":"value"})
ag.version
Tracks version information for the instrumented application or component.
ag.flags
Contains system-specific flags and metadata used internally by Agenta for span processing and display.
When using auto-instrumentation libraries, most attributes are saved twice - once in their original format and once processed under the ag
namespace
Standard OpenTelemetry Attributes
In addition to Agenta-specific conventions, traces include standard OpenTelemetry attributes:
- Links: Relationships between spans
- Events: Timestamped events within spans
- Version: OpenTelemetry version information
- Status Code: Span completion status
- Start Time: Span initiation timestamp
- Span Name: Human-readable span identifier
- Span Kind: Type of span (server, client, internal, etc.)