This tutorial guides you through deploying an LLM application in Agenta using Mistral-7B from Hugging Face.

  1. Create a Hugging Face Account

  2. Navigate to the Mistral-7B-v0.1 model page

  3. Click on the ‘Deploy’ option, and then select ‘Inference API’

  4. Create an Access Token (if you haven’t already)

  5. Initially, you might start with a simple Python script to interact with the model:

import requests

API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-v0.1"
headers = {"Authorization": "Bearer [Your_Token]"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({"inputs": "Your query here"})
print(output)

  6. Modify the script and add the Agenta SDK:
import agenta as ag
import requests

API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-v0.1"
headers = {"Authorization": "Bearer [Your_Token]"}

ag.init()
ag.config.default(
    prompt_template=ag.TextParam("Summarize the following text: {text}"),
)

@ag.entrypoint
def generate(text: str) -> str:
    prompt = ag.config.prompt_template.format(text=text)
    payload = {"inputs": prompt}
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()[0]["generated_text"]
  7. Deploy to Agenta

agenta init
agenta variant serve app.py

  8. Interact with your app either locally at http://localhost or in the cloud at https://cloud.agenta.ai/apps

  9. Create a test set and evaluate the model’s performance using Agenta’s evaluation techniques

  10. To deploy the variant, navigate to the playground, click ‘Publish’, and choose the environment to which you wish to deploy

  11. Go to the ‘Endpoints’ section, select the environment, then use the provided endpoint to send requests to the LLM app
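Steps 4 and 5 place the access token directly in the script. A minimal sketch of one alternative is to read the token from an environment variable. The variable name HF_API_TOKEN and the helper build_headers are assumptions made for illustration; neither comes from the Hugging Face or Agenta tooling.

```python
import os


def build_headers(env_var: str = "HF_API_TOKEN") -> dict:
    """Build the Authorization header from an environment variable.

    HF_API_TOKEN is a hypothetical variable name chosen for this sketch.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as headers= to requests.post in place of the hard-coded one, which keeps the token out of the source file.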
  1. Set up a Hugging Face Account
    Sign up for an account at Hugging Face.

  2. Access Mistral-7B Model
    Go to the Mistral-7B-v0.1 model page.

  3. Deploy the Model
    Choose the ‘Deploy’ option and select ‘Inference API’.

  4. Generate an Access Token
    If you haven’t already, create an Access Token on Hugging Face.

  5. Initial Python Script
    Start with a basic script to interact with Mistral-7B:

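    import requests
    
    # Hosted Inference API endpoint for Mistral-7B-v0.1
    API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-v0.1"
    # Replace [Your_Token] with the access token from step 4
    headers = {"Authorization": "Bearer [Your_Token]"}
    
    def query(payload):
        # POST the JSON payload and return the decoded JSON response
        response = requests.post(API_URL, headers=headers, json=payload)
        return response.json()
    
    output = query({"inputs": "Your query here"})
    print(output)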
    
  6. Integrate Agenta SDK
    Modify the script to include the Agenta SDK:

    import agenta as ag
    import requests
    
    API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-v0.1"
    headers = {"Authorization": "Bearer [Your_Token]"}
    
    ag.init()
    ag.config.default(
        prompt_template=ag.TextParam("Summarize the following text: {text}")
    )
    
    @ag.entrypoint
    def generate(text: str) -> str:
        prompt = ag.config.prompt_template.format(text=text)
        payload = {"inputs": prompt}
        response = requests.post(API_URL, headers=headers, json=payload)
        return response.json()[0]["generated_text"]
    
  7. Deploy with Agenta
    Execute these commands to deploy your application:

    agenta init
    agenta variant serve app.py
    
  8. Interact with Your App
    Access the playground for your app at https://cloud.agenta.ai/apps (or locally if you self-host Agenta).

    Image of Mistral-7B Interface

  9. Evaluate Model Performance
    Use Agenta’s tools to create a test set and evaluate your LLM application.

    Image of Evaluation Interface

  10. Publish Your Variant
    To deploy, go to the playground, click ‘Publish’, and choose the deployment environment.

    Image of Deployment Options

  11. Access the Endpoints
    In the ‘Endpoints’ section, select your environment and use the provided endpoint to interact with your LLM app.

    Image of Endpoint Access
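The generate function in step 6 fills the prompt template, posts it, and indexes straight into the JSON reply. The sketch below isolates the two pure steps (building the payload and reading the reply) and adds a check for a reply that is an error object instead of a list of generations. The helper names build_payload and extract_text are hypothetical, and the error shape (a dictionary with an "error" key) is an assumption about the Inference API response, not something stated in this tutorial.

```python
PROMPT_TEMPLATE = "Summarize the following text: {text}"


def build_payload(text: str, template: str = PROMPT_TEMPLATE) -> dict:
    """Fill the prompt template and wrap it in the request body used in step 6."""
    return {"inputs": template.format(text=text)}


def extract_text(data) -> str:
    """Return the generated text, or raise if the reply is not a generation list."""
    # Assumed error shape: {"error": "..."} in place of a list of generations.
    if isinstance(data, dict) and "error" in data:
        raise RuntimeError(f"Inference API error: {data['error']}")
    if not isinstance(data, list) or not data:
        raise ValueError(f"Unexpected response: {data!r}")
    return data[0]["generated_text"]
```

Inside generate, this would amount to posting build_payload(text) and returning extract_text(response.json()), so that a failed request surfaces as a readable error instead of a KeyError.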

Note that you can compare this model to other LLMs by serving other variants of the app. For instance, create a new file app_cohere.py that uses a Cohere model, and serve it in the same app with agenta variant serve app_cohere.py.