For the complete documentation index, see llms.txt. Markdown versions of all docs pages are available by appending .md to any docs URL.
Virtual models
Configure virtual models with weighted, failover, and conditional routing in simplified LLM mode.
Virtual models let you publish one client-facing model name and route requests across one or more internal target models.
Use llm.virtualModels[] to define the virtual entrypoint and llm.models[] as the concrete upstream targets.
Public and internal models
Use llm.models[].visibility to control whether a model is directly exposed to clients or kept as an internal target.
public: The model can be requested directly by clients and can also be used as a virtual model target.internal: The model is intended for internal routing targets and is not exposed as a direct client model.
Route selection modes
Each virtual model defines its routing strategy under routing.
The routing targets in a virtual model point to concrete llm.models[] entries.
Weighted routing
Use routing.weighted.targets to split traffic between targets with weight.
llm:
models:
- name: gpt-4o-public
visibility: public
provider: openAI
params:
model: gpt-4o
apiKey: "$OPENAI_API_KEY"
- name: gpt-4o-primary
visibility: internal
provider: openAI
params:
model: gpt-4o
apiKey: "$OPENAI_API_KEY"
- name: gpt-4o-fallback
visibility: internal
provider: openAI
params:
model: gpt-4o-mini
apiKey: "$OPENAI_API_KEY"
virtualModels:
- name: smart
routing:
weighted:
targets:
- model: gpt-4o-primary
weight: 90
- model: gpt-4o-fallback
weight: 10Failover routing
Use routing.failover.targets and priority to define ordered failover targets.
Targets with the same priority are load balanced across based on health and latency.
llm:
models:
- name: claude-primary
visibility: internal
provider: anthropic
params:
model: claude-sonnet-4-0
apiKey: "$ANTHROPIC_API_KEY"
- name: claude-backup-a
visibility: internal
provider: anthropic
params:
model: claude-3-5-haiku-20241022
apiKey: "$ANTHROPIC_API_KEY"
- name: claude-backup-b
visibility: internal
provider: anthropic
params:
model: claude-3-5-haiku-20241022
apiKey: "$ANTHROPIC_API_KEY"
virtualModels:
- name: resilient
routing:
failover:
targets:
- model: claude-primary
priority: 1
- model: claude-backup-a
priority: 2
- model: claude-backup-b
priority: 2Conditional routing
Use routing.conditional.targets and when expressions to select targets by request context.
llm:
models:
- name: openai-public
visibility: public
provider: openAI
params:
model: gpt-4o-mini
apiKey: "$OPENAI_API_KEY"
- name: openai-fast
visibility: internal
provider: openAI
params:
model: gpt-4o-mini
apiKey: "$OPENAI_API_KEY"
- name: openai-smart
visibility: internal
provider: openAI
params:
model: gpt-4o
apiKey: "$OPENAI_API_KEY"
virtualModels:
- name: adaptive
routing:
conditional:
targets:
- model: openai-fast
when: request.headers["x-tier"] == "free"
- model: openai-smart
when: request.headers["x-tier"] == "pro"