
Quick Start: Your First LLM Application

This guide walks you through creating, evaluating, and deploying your first LLM application using Agenta's user interface. In just a few minutes, you'll have a working prompt that you can use in production.

Prefer learning by watching? We have a 3-minute video that covers the same steps.

What You'll Learn

By the end of this tutorial, you'll know how to:

  • Create and refine a prompt in the playground
  • Build a test set and evaluate your prompt against it
  • Deploy a prompt version to production
  • Fetch the deployed prompt configuration from your own code
  • Add observability to your application

Step 1: Create Your First Prompt

Let's start by creating a simple prompt that returns the capital city of any country.

Creating the Application

  1. Click "Create New Prompt" on your dashboard
  2. Select "Completion Prompt" from the options
  3. Name your prompt "get_capital"

Understanding Prompt Types

Agenta supports two main prompt types:

  • Completion Prompts: Single-turn prompts for generating one response (like summaries, translations, or factual answers)
  • Chat Prompts: Multi-turn prompts for conversations (like chatbots or interactive assistants)

For this tutorial, we're using a completion prompt since we want a single, direct answer.
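
To make the distinction concrete, here's a minimal sketch of the message payloads the two types map to. The field names follow the OpenAI chat format; the values are illustrative:

# Completion prompt: a single user turn, one response expected
completion_messages = [
    {"role": "system", "content": "You are an expert in geography"},
    {"role": "user", "content": "What is the capital of France?"},
]

# Chat prompt: the whole conversation history is sent on every turn
chat_messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "What should I visit there?"},
]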

Step 2: Test and Improve Your Prompt

Now let's experiment with the prompt in Agenta's playground to make sure it works correctly.

Initial Testing

The template comes with a basic prompt for getting country capitals. Let's test it:

  1. Go to the playground (it should open automatically after creating your prompt)
  2. Enter "France" in the input field
  3. Click "Run" to test the prompt
  4. Check the result - it should return "The capital of France is Paris"

Refining the Prompt

The current response is a full sentence, but let's say we want just the city name. We can improve this:

  1. Edit the prompt to be more specific:

    What is the capital of {{country}}? Answer with only the capital name.
  2. Change the model to "gpt-4o-mini" for better performance
  3. Run the prompt again with "France" as input
  4. Verify the result now shows just "Paris"
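
Under the hood, {{country}} is a template variable in Agenta's "curly" template format (you'll see "template_format": "curly" in the configuration JSON later in this guide). As a rough sketch of what the substitution does, assuming plain string replacement (illustrative, not the SDK's actual implementation):

# Illustrative sketch of curly-brace template substitution,
# not the SDK's actual implementation
template = "What is the capital of {{country}}? Answer with only the capital name."

def render(template: str, **variables: str) -> str:
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", value)
    return template

print(render(template, country="France"))
# -> What is the capital of France? Answer with only the capital name.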

Saving Your Changes

Now let's save this improved version:

  1. Click the "Commit" button
  2. Select "As a new variant"
  3. Name the variant "explicit-prompt"

Understanding Variants

Variants in Agenta work like branches in Git repositories, and each variant is versioned. You can use variants to experiment with different ideas (e.g., a long prompt vs. a short prompt), to compare models (e.g., gpt-4o-mini vs. gpt-4o), or to let team members work in parallel on their own variants (e.g., alex-variant1, amani-var2).

Step 3: Create a Test Set

Test sets help you evaluate your prompts consistently. Let's create one using the data point we just tested.

Adding to a Test Set

You can create a test set directly from the playground:

  1. Click the three dots next to your output result
  2. Select "Add to test set"
  3. Configure the test set in the drawer that opens:
    • Test set name: Create new and name it "capitals"
    • Input mapping: Leave "country" mapped to "country"
    • Output mapping: Leave "output" mapped to "correct_answer"

This creates a test case where:

  • Input: France
  • Expected output: Paris

You can add more countries to this test set later to ensure your prompt works consistently across different inputs.
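
Conceptually, the test set is just a small table. For example, after adding a couple more rows, "capitals" might look like this (the column names follow the mapping above; the extra rows are illustrative):

# Illustrative contents of the "capitals" test set after adding more rows
test_set = [
    {"country": "France", "correct_answer": "Paris"},
    {"country": "Japan", "correct_answer": "Tokyo"},
    {"country": "Kenya", "correct_answer": "Nairobi"},
]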

Step 4: Evaluate Your Prompt

Evaluation helps you measure how well your prompt performs against your test cases.

Running an Evaluation

  1. Go to the Evaluation page from the main navigation
  2. Click "Start new evaluation"
  3. Configure your evaluation:
    • Test set: Select "capitals" (the one we just created)
    • Variant: Select "explicit-prompt"
    • Evaluator: Select "Exact Match" (to check if output exactly matches expected result)

info

Agenta comes with a set of built-in evaluators that you can use to evaluate your prompt. You can also create your own custom evaluator using code or webhooks.
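
For reference, "Exact Match" is the simplest of these checks. Here's a minimal sketch of the comparison it performs (illustrative, not Agenta's actual implementation):

# Illustrative sketch of an exact-match check, not Agenta's implementation
def exact_match(output: str, correct_answer: str) -> bool:
    return output.strip() == correct_answer.strip()

assert exact_match("Paris", "Paris")
assert not exact_match("The capital of France is Paris", "Paris")

This is also why we refined the prompt to return only the city name: a full-sentence answer would fail an exact-match comparison against "Paris".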

Analyzing the Results

The evaluation will run in the background. Once complete, you can:

  • View overall results on the evaluation dashboard
  • Click on the evaluation to see detailed results for each test case
  • Identify any failures where the output didn't match expectations

This helps you understand how reliable your prompt is before deploying it to production.

Step 5: Deploy to Production

Once you're satisfied with your prompt's performance, it's time to deploy it.

Deploying Your Prompt

  1. Go to the Registry page to see all your prompt versions
  2. Find and select your "explicit-prompt" variant
  3. Click "Deploy" in the upper right corner of the drawer that opens
  4. Select "Production" as the target environment

Your prompt is now live and ready to use in your applications!

Step 6: Integrate with Your Application: Fetch the Prompt Configuration

info

Agenta provides two main ways to integrate with your application:

  • Use Agenta for Prompt Management: Fetch the prompt configuration and use it in your own code
  • Use Agenta as a Proxy: Use Agenta as a middleware that forwards requests to the LLM

In this guide, we'll use the first approach; the second is covered in the Integration Guide. The proxy approach's advantages are simpler integration and observability out of the box; its disadvantages are that it does not support streaming responses and adds slight latency (approximately 0.3 seconds per request).

We are going to fetch the prompt configuration and use it in our own code:

First, import and initialize the Agenta SDK:

import os

import agenta as ag
import openai
from agenta.sdk.types import PromptTemplate

# os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY"
# os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai" # only needed when self-hosting
ag.init()

# Fetch your prompt configuration from the registry
# You can also use `ag.ConfigManager.get_from_registry_async` for asynchronous applications
config = ag.ConfigManager.get_from_registry(
    app_slug="your-app-slug",
    environment_slug="production"
)

# Use the helper class `PromptTemplate` to format your prompt and convert it to OpenAI-compatible parameters
prompt = PromptTemplate(**config["prompt"]).format(country="France")
client = openai.OpenAI()
response = client.chat.completions.create(
    **prompt.to_openai_kwargs()
)
print(response.choices[0].message.content)
Here is an example of the JSON response with your prompt configuration:
{
  "prompt": {
    "messages": [
      {
        "role": "system",
        "content": "You are an expert in geography"
      },
      {
        "role": "user",
        "content": "What is the capital of {{country}}? "
      }
    ],
    "input_keys": [
      "country"
    ],
    "llm_config": {
      "model": "gpt-3.5-turbo",
      "tools": [],
      "top_p": 0.2,
      "max_tokens": 257,
      "temperature": 0.2,
      "presence_penalty": -1.7,
      "frequency_penalty": -1.5,
      "response_format": {
        "type": "json_schema",
        "json_schema": {
          "name": "MySchema",
          "schema": {
            "type": "object",
            "properties": {}
          },
          "strict": false,
          "description": "A description of the schema"
        }
      }
    },
    "template_format": "curly"
  }
}
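
Since the configuration is plain data, you can also read individual fields from it directly, for example (keys taken from the JSON above):

# Read individual fields from the fetched configuration
# (keys as shown in the JSON response above)
model = config["prompt"]["llm_config"]["model"]
input_keys = config["prompt"]["input_keys"]
print(f"Deployed model: {model}, expected inputs: {input_keys}")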

Step 7: Integrate with Your Application: Add Observability

Agenta is built on top of OpenTelemetry and is compatible with several semantic conventions in the ecosystem (OpenLLMetry, OpenInference, Logfire). It therefore comes with built-in auto-instrumentation support for all major LLM frameworks and SDKs, such as OpenAI, LiteLLM, and LangChain.

To add instrumentation to our application, we'll use the OpenLLMetry library to auto-instrument the OpenAI client.

First, we need to install the OpenLLMetry library:

pip install opentelemetry-instrumentation-openai

Then we can add auto-instrumentation to our application (the new lines are the `OpenAIInstrumentor` import and the `instrument()` call):

import os

import agenta as ag
import openai
from agenta.sdk.types import PromptTemplate
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

# os.environ["AGENTA_API_KEY"] = "YOUR_AGENTA_API_KEY"
# os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai" # only needed when self-hosting
ag.init()

# Auto-instrument the OpenAI client so each call produces a trace in Agenta
OpenAIInstrumentor().instrument()

# Fetch your prompt configuration from the registry
# You can also use `ag.ConfigManager.get_from_registry_async` for asynchronous applications
config = ag.ConfigManager.get_from_registry(
    app_slug="your-app-slug",
    environment_slug="production"
)

# Use the helper class `PromptTemplate` to format your prompt and convert it to OpenAI-compatible parameters
prompt = PromptTemplate(**config["prompt"]).format(country="France")
client = openai.OpenAI()
response = client.chat.completions.create(
    **prompt.to_openai_kwargs()
)

You can now go to the Observability page and see the generated trace. It includes the cost, latency, and inputs/outputs of the prompt.

What's Next?

Congratulations! You've successfully created, tested, deployed, and set up observability for your first LLM application.

Here are some next steps to explore:

  • Add more countries to your "capitals" test set
  • Try other built-in evaluators, or write a custom evaluator using code or webhooks
  • Read the Integration Guide to learn about using Agenta as a proxy