Portkey provides a robust platform to observe, govern, and manage your locally or privately hosted custom models using Triton Inference Server.
See the official Triton Inference Server documentation for more details.

Integration Steps

Step 1: Expose your Triton Server

Expose your Triton server using a tunneling service like ngrok, or make it publicly accessible. Skip this step if you’re self-hosting the Portkey Gateway.
ngrok http 8000 --host-header="localhost:8000"
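
Before adding the server to Portkey, you can confirm it is reachable. Triton serves the KServe v2 protocol and reports readiness at /v2/health/ready; a minimal check, assuming the default HTTP port 8000:

import requests

# Triton returns HTTP 200 on /v2/health/ready once the server
# is up and able to serve inference requests.
resp = requests.get("http://localhost:8000/v2/health/ready")
print("ready" if resp.status_code == 200 else f"not ready ({resp.status_code})")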
Step 2: Add to Model Catalog

  1. Go to Model Catalog → Add Provider
  2. Enable the “Local/Privately hosted provider” toggle
  3. Select Triton as the provider type
  4. Enter your Triton server URL in Custom Host, e.g. http://localhost:8000/v2/models/mymodel (use your ngrok URL instead if you tunneled in step 1; a sketch for discovering model names follows this list)
  5. Add authentication headers if needed
  6. Name your provider (e.g., my-triton)
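
If you’re unsure which model names your server has loaded, Triton’s model repository extension can list them via POST /v2/repository/index. A minimal sketch, assuming the server runs on localhost:8000 with the repository extension enabled:

import requests

# POST /v2/repository/index returns a JSON array of models,
# each with "name", "version", and "state" fields.
resp = requests.post("http://localhost:8000/v2/repository/index")
for model in resp.json():
    print(model["name"], model.get("state", ""))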

See the Complete Setup Guide for all setup options.
Step 3: Use in Your Application

from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Your Portkey API key
    provider="@my-triton"       # The provider name you set in the Model Catalog
)

response = portkey.chat.completions.create(
    model="your-model-name",    # The Triton model to route to
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
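
Portkey’s chat completions interface mirrors OpenAI’s, so streaming should work the same way. A sketch, assuming the OpenAI-style stream parameter and delta chunks:

stream = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)
for chunk in stream:
    # Each chunk carries an incremental delta, as in the OpenAI streaming format
    print(chunk.choices[0].delta.content or "", end="")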
Or use custom host directly:
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="triton",
    custom_host="http://localhost:8000/v2/models/mymodel",  # Your Triton model endpoint
    Authorization="AUTH_KEY"  # Forwarded to your Triton server, if it requires auth
)
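
Requests against this client are made exactly as above:

response = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)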

Next Steps

For complete Portkey SDK documentation, see the SDK Reference.