OpenAI-compatible providers

Configure OpenAI-compatible providers like Mistral, DeepSeek, or Groq with custom host and path overrides.

Configure LLM providers that expose the OpenAI Chat Completions API but do not have a first-class provider type in the AgentgatewayBackend API.

Overview

In agentgateway, you configure an OpenAI-compatible provider by setting ai.provider.openai and pointing it at the provider’s host, port, and path. Use the path field when the provider serves chat completions from a non-standard path.

Built-in OpenAI-compatible providers

The following providers expose an OpenAI-compatible chat completions endpoint. To configure one, use the ai.provider.openai shape with port: 443 and the host and path values in the table. The example later on this page uses Baseten.

| Provider | host | path |
| --- | --- | --- |
| Baseten | inference.baseten.co | /v1/chat/completions |
| Cerebras | api.cerebras.ai | /v1/chat/completions |
| Cohere | api.cohere.ai | /compatibility/v1/chat/completions |
| DeepInfra | api.deepinfra.com | /v1/openai/chat/completions |
| DeepSeek | api.deepseek.com | /v1/chat/completions |
| Fireworks AI | api.fireworks.ai | /inference/v1/chat/completions |
| Groq | api.groq.com | /openai/v1/chat/completions |
| Hugging Face | router.huggingface.co | /v1/chat/completions |
| Mistral | api.mistral.ai | /v1/chat/completions |
| OpenRouter | openrouter.ai | /api/v1/chat/completions |
| Together AI | api.together.xyz | /v1/chat/completions |
| xAI | api.x.ai | /v1/chat/completions |

If your provider is not in this list but still exposes the OpenAI Chat Completions API, use the generic endpoint template.
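As a minimal sketch of how a table row maps onto the ai.provider.openai shape, the following manifest fills in the Groq row. The resource name groq-backend and the model placeholder are illustrative assumptions. The secret name llm-provider-secret matches the secret created in the setup steps on this page.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: groq-backend                      # hypothetical name, chosen for this sketch
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: <your-groq-model>          # placeholder; use a model that Groq serves
      host: api.groq.com                  # host column from the table
      port: 443                           # HTTPS
      path: /openai/v1/chat/completions   # path column from the table
  policies:
    auth:
      secretRef:
        name: llm-provider-secret         # secret that stores the provider API key
    tls:
      sni: api.groq.com                   # SNI matches the upstream host
```

For any other row, only the host, path, and tls.sni values should need to change.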

Before you begin

Install and set up an agentgateway proxy.

Set up access to an OpenAI-compatible provider

The following steps create a generic secret and AgentgatewayBackend that you can use for any of the providers in the table. The model name in the example is illustrative; substitute a model that your provider supports.

  1. Get an API key for your provider. For example, get a Groq API key or a Mistral API key.

  2. Save the API key in an environment variable.

    export MY_API_KEY=<insert your API key>

  3. Create a Kubernetes secret to store your API key.

    kubectl apply -f- <<EOF
    apiVersion: v1
    kind: Secret
    metadata:
      name: llm-provider-secret
      namespace: agentgateway-system
    type: Opaque
    stringData:
      Authorization: $MY_API_KEY
    EOF

  4. Create an AgentgatewayBackend resource that points the openai provider at your provider’s host and path. The following example uses Baseten; for another provider, substitute the host, path, and tls.sni values from the table.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend
    metadata:
      name: llm-backend
      namespace: agentgateway-system
    spec:
      ai:
        provider:
          openai:
            model: meta-llama/Llama-3.1-8B-Instruct
          host: inference.baseten.co
          port: 443
          path: /v1/chat/completions
      policies:
        auth:
          secretRef:
            name: llm-provider-secret
        tls:
          sni: inference.baseten.co
    EOF

    Review the following table to understand this configuration.

    | Setting | Description |
    | --- | --- |
    | ai.provider.openai.model | Optional upstream model override. Omit this parameter to pass the client-provided model through. |
    | host and port | The provider’s API host and port. Use 443 for HTTPS endpoints. |
    | path | The provider’s chat completions path. Omit this parameter for providers that use the standard /v1/chat/completions path. |
    | policies.auth.secretRef | References the secret that contains your provider API key. |
    | policies.tls.sni | Enables TLS and sets the SNI value to the upstream hostname. |

  5. Create an HTTPRoute resource that routes incoming traffic to the AgentgatewayBackend.

    kubectl apply -f- <<EOF
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: llm-route
      namespace: agentgateway-system
    spec:
      parentRefs:
        - name: agentgateway-proxy
          namespace: agentgateway-system
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /llm
        backendRefs:
        - name: llm-backend
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
    EOF

  6. Send a request to verify the setup. Replace the model value with the model that you configured on the AgentgatewayBackend.

    curl "$INGRESS_GW_ADDRESS/llm" -H content-type:application/json -d '{
       "model": "<your-model>",
       "messages": [
         {
           "role": "user",
           "content": "Explain retrieval-augmented generation in one sentence."
         }
       ]
     }' | jq
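The settings table in step 4 notes that the model field is optional. As a sketch of the pass-through case, the spec below omits the model so that the model named in the client request is forwarded unchanged. Writing the provider as an empty openai: {} mapping is an assumption about how the resource accepts a provider with no overrides, and the resource name is hypothetical; check both against the AgentgatewayBackend API reference before relying on them.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: llm-backend-passthrough   # hypothetical name, chosen for this sketch
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai: {}                  # assumption: empty mapping means no model override
      host: inference.baseten.co
      port: 443
      path: /v1/chat/completions
  policies:
    auth:
      secretRef:
        name: llm-provider-secret
    tls:
      sni: inference.baseten.co
```

With a backend like this, the model value in the verification request decides which upstream model handles the call.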

Other OpenAI-compatible providers

Perplexity example

Perplexity exposes an OpenAI-compatible API for search-augmented models and uses the standard chat completions path, so you do not need to set path.

kubectl apply -f- <<EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: perplexity
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: sonar
      host: api.perplexity.ai
      port: 443
  policies:
    auth:
      secretRef:
        name: perplexity-secret
    tls:
      sni: api.perplexity.ai
EOF
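The Perplexity backend references a secret named perplexity-secret, which the setup steps on this page do not create. A minimal sketch of that secret, assuming it follows the same Authorization key layout as llm-provider-secret, might look like the following. The API key placeholder is illustrative.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: perplexity-secret          # name referenced by the Perplexity backend
  namespace: agentgateway-system
type: Opaque
stringData:
  Authorization: <your Perplexity API key>   # placeholder; replace with your key
```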

Generic OpenAI-compatible endpoint

Use this template when the provider exposes the OpenAI Chat Completions API but is not listed in the Built-in OpenAI-compatible providers table.

apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: generic-openai
  namespace: agentgateway-system
spec:
  ai:
    provider:
      openai:
        model: <upstream-model-name>
      host: api.example.com
      port: 443
      path: /v1/chat/completions
  policies:
    auth:
      secretRef:
        name: provider-secret
    tls:
      sni: api.example.com

Use the following fields to adapt the template:

| Setting | Description |
| --- | --- |
| ai.provider.openai.model | Optional upstream model override. Omit this parameter to pass the client-provided model through. |
| host and port | Required target address for the external provider endpoint. |
| path | The provider’s chat completions path. Omit this parameter for the standard /v1/chat/completions path. |
| policies.auth | Attach the provider API key secret to outbound requests. |
| policies.tls.sni | Enable TLS and set the SNI value to the upstream hostname. |
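The generic template defines only the backend. To send traffic to it, a route along the lines of the HTTPRoute in the setup steps is still needed. The sketch below assumes the same agentgateway-proxy gateway used earlier on this page; the route name generic-openai-route and the /generic path prefix are hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: generic-openai-route       # hypothetical name, chosen for this sketch
  namespace: agentgateway-system
spec:
  parentRefs:
    - name: agentgateway-proxy     # gateway from the setup steps
      namespace: agentgateway-system
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /generic        # hypothetical prefix; pick one that is unused
      backendRefs:
        - name: generic-openai     # backend defined by the generic template
          namespace: agentgateway-system
          group: agentgateway.dev
          kind: AgentgatewayBackend
```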

For self-hosted targets that already have guides, prefer the dedicated Ollama and vLLM pages.
