Available on all Portkey plans.
Distribute traffic across multiple LLMs to prevent any single provider from becoming a bottleneck.

Examples

{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "provider": "@openai-prod", "weight": 0.7 },
    { "provider": "@azure-prod", "weight": 0.3 }
  ]
}
Pattern            | Use Case
Between Providers  | Route to different providers; the model comes from the request
Multiple API Keys  | Distribute load across rate limits from different accounts
Cost Optimization  | Send most traffic to cheaper models, reserve premium for a portion
Gradual Migration  | Test new models with a small percentage of traffic before full rollout
The @provider-slug/model-name format automatically routes requests to the correct provider. Set up providers in the Model Catalog.
Create and use configs in your requests.
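For example, the cost-optimization pattern above might look like the sketch below. It assumes a per-target override_params field for pinning each target to a model; the provider slug and model names are placeholders, so substitute your own from the Model Catalog.

{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    {
      "provider": "@openai-prod",
      "weight": 0.8,
      "override_params": { "model": "gpt-4o-mini" }
    },
    {
      "provider": "@openai-prod",
      "weight": 0.2,
      "override_params": { "model": "gpt-4o" }
    }
  ]
}

With these weights, roughly 80% of requests go to the cheaper model and 20% to the premium one. Shifting the split toward the newer model over time gives you the gradual-migration pattern as well.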

How It Works

  1. Define targets & weights — Assign a weight to each target. Weights represent relative share of traffic.
  2. Weight normalization — Portkey normalizes weights to sum to 100%. Example: weights 5, 3, 1 become roughly 56%, 33%, and 11% (see the example config below).
  3. Request distribution — Each request routes to a target based on normalized probabilities.
  • Default weight: 1 (applied when a target's weight is unset)
  • Minimum weight: 0 (stops traffic to a target without removing it from the config)
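A minimal sketch of how normalization and zero weights work together (the provider slugs are placeholders):

{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "provider": "@openai-prod", "weight": 5 },
    { "provider": "@azure-prod", "weight": 3 },
    { "provider": "@anthropic-prod", "weight": 1 },
    { "provider": "@bedrock-prod", "weight": 0 }
  ]
}

The first three targets receive roughly 56%, 33%, and 11% of traffic; the fourth stays in the config but receives none until its weight is raised above 0.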

Considerations

  • Ensure LLMs in your list are compatible with your use case
  • Monitor usage per LLM—weight distribution affects spend
  • Each LLM has different latency and pricing