Agenta OpenTelemetry Semantic Conventions

This document describes how Agenta applies domain-specific OpenTelemetry conventions to capture and analyze traces from LLM applications.

How It Works

Agenta accepts any span that follows the OpenTelemetry specification. To unlock LLM-specific features—such as nicely formatted chat messages, per-request cost and latency, and links to prompt configurations or evaluators—we add attributes under the ag. namespace.

info

The OTLP endpoint accepts batches up to 5 MB by default (after decompression). Larger requests return an HTTP 413 status code.

We support two primary instrumentation approaches:

Agenta SDK: When using our SDK with decorators like @ag.instrument, we automatically handle the semantic conventions
Auto-instrumentation libraries: We provide adapters that transform data from standard auto-instrumentation libraries (which follow their own conventions) into Agenta's format

Agenta Namespace

All Agenta-specific attributes are organized under the ag namespace to avoid conflicts with other OpenTelemetry conventions.

Core Semantic Conventions

ag.data

The ag.data namespace contains the core execution data for each span:

ag.data.inputs: Input parameters for the span
ag.data.outputs: Output results from the span
ag.data.internals: Internal variables and intermediate values

Data Format

Inputs and Outputs: Stored as JSON format. For chat applications using LLM spans, these are formatted as a list of messages:

{
  "data": {
    "inputs": {
      "prompt": [
        { "role": "system", "content": "System instruction" },
        { "role": "user",   "content": "User query" }
      ],
      "functions": [...],
      "tools": [...]
    },
    "outputs": {
      "completion": [
        { "role": "assistant", "content": "Assistant response" }
      ]
    }
  }
}

Internals: User-provided internal information such as context variables, intermediate calculations, or evaluation data that aren't part of the primary inputs/outputs. These are set by the user in the SDK using ag.store_internals().

SDK Integration

When using the @ag.instrument decorator in Python:

@ag.instrument
def my_function(input_param):
    # Function inputs and outputs are automatically captured
    # unless explicitly masked
    return result

The decorator automatically captures function inputs and outputs in ag.data.inputs and ag.data.outputs unless you choose to mask sensitive data.

ag.meta

The ag.meta namespace stores metadata about the span execution:

{
  "meta": {
    "system": "openai",
    "request": {
      "base_url": "https://api.openai.com/v1",
      "endpoint": "/chat/completions",
      "headers": {...},
      "streaming": false,
      "model": "gpt-4",
      "max_tokens": 1000,
      "temperature": 0.7,
      "top_p": 0.9,
      "top_k": 50
    },
    "response": {
      "model": "gpt-4-0613"
    },
    "configuration": {
      "prompt": {
        "input_keys": ["country"],
        "messages": [...],
        "template_format": "curly",
        "llm_config": {
          "model": "mistral/mistral-tiny"
        }
      }
    }
  }
}

Standard Metadata

info

Auto-instrumentation maps common semantic-convention keys—e.g. gen_ai.system, gen_ai.request.*—to the structure above.

ag.meta.system: Identifies the LLM provider or system being used (e.g., "openai", "anthropic")
ag.meta.request.base_url: The base URL for the API request
ag.meta.request.endpoint: The specific API endpoint called
ag.meta.request.headers: HTTP headers sent with the request
ag.meta.request.streaming: Boolean indicating if streaming mode was used
ag.meta.request.model: The model name requested
ag.meta.request.max_tokens: Maximum token limit for the response
ag.meta.request.temperature: Sampling temperature parameter
ag.meta.request.top_p: Nucleus sampling parameter
ag.meta.request.top_k: Top-k sampling parameter

Metadata is displayed in the observability overview page as contextual information to help navigate and understand span execution.

ag.refs

The ag.refs namespace contains references to entities within the Agenta system:

{
  "refs": {
    "application": {
      "id": "uuid"
    },
    "variant": {
      "id": "uuid", 
      "version": "1"
    },
    "environment": {
      "slug": "production",
      "id": "uuid"
    }
  }
}

Reference Types

Application: Links to the Agenta application
Variant: References specific prompt variants and versions
Environment: Links to deployment environments (production, staging, etc.)
Test Set: References to test datasets
Test Case: Links to individual test cases
Evaluation Run: References to evaluation executions
Evaluator: Links to evaluation functions

These references enable navigation within the Agenta UI and allow filtering spans by specific entities.

ag.metrics

The ag.metrics namespace tracks performance and cost metrics:

{
  "metrics": {
    "acc": {
      "costs": {
        "total": 0.00003925
      },
      "tokens": {
        "total": 157,
        "prompt": 26,
        "completion": 131
      },
      "duration": {
        "total": 1251.157
      }
    },
    "unit": {
      "costs": {
        "total": 0.00003925
      },
      "tokens": {
        "total": 157,
        "prompt": 26,
        "completion": 131
      }
    }
  }
}

Metric Types

Accumulated (acc): Metrics rolled up across child spans
Unit (unit): Metrics specific to the individual span

Available Metrics

Costs: Total cost in USD for LLM API calls
Tokens: Token usage breakdown (total, prompt, completion)
Duration: Execution time in milliseconds

Additional Agenta Attributes

ag.type

The ag.type namespace contains type information about the span:

ag.type.node can be workflow, task, tool, embedding, query, completion, chat, rerank.

ag.tags

The ag.tags namespace allows you to add custom tags to spans for categorization and filtering:

# Add custom tags using the SDK
ag.add_tags({"key":"value"})

ag.version

Tracks version information for the instrumented application or component.

ag.flags

Contains system-specific flags and metadata used internally by Agenta for span processing and display.

note

When using auto-instrumentation libraries, most attributes are saved twice - once in their original format and once processed under the ag namespace

Standard OpenTelemetry Attributes

In addition to Agenta-specific conventions, traces include standard OpenTelemetry attributes:

Links: Relationships between spans
Events: Timestamped events within spans
Version: OpenTelemetry version information
Status Code: Span completion status
Start Time: Span initiation timestamp
Span Name: Human-readable span identifier
Span Kind: Type of span (server, client, internal, etc.)

How It Works​

Agenta Namespace​

Core Semantic Conventions​

ag.data​

Data Format​

SDK Integration​

ag.meta​

Standard Metadata​

ag.refs​

Reference Types​

ag.metrics​

Metric Types​

Available Metrics​

Additional Agenta Attributes​

ag.type​

ag.tags​

ag.version​

ag.flags​

Standard OpenTelemetry Attributes​