Portkey provides a robust platform to observe, govern, and manage your locally or privately hosted custom models using Triton Inference Server.
See the official Triton Inference Server documentation for more details.

Integration Steps

Step 1: Expose your Triton Server

Expose your Triton server using a tunneling service like ngrok, or make it publicly accessible. Skip this step if you’re self-hosting the Portkey Gateway.
ngrok http 8000 --host-header="localhost:8000"
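
Before adding the server to Portkey, you can confirm it is reachable. Triton serves the KServe v2 protocol and reports readiness at /v2/health/ready; a minimal check, assuming the default HTTP port 8000:

import requests

# Triton returns HTTP 200 on /v2/health/ready once the server
# is up and able to serve inference requests.
resp = requests.get("http://localhost:8000/v2/health/ready")
print("ready" if resp.status_code == 200 else f"not ready ({resp.status_code})")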
Step 2: Add to Model Catalog

  1. Go to Model Catalog → Add Provider
  2. Enable the “Local/Privately hosted provider” toggle
  3. Select Triton as the provider type
  4. Enter your Triton server URL in Custom Host, e.g. http://localhost:8000/v2/models/mymodel (use your ngrok URL instead if you tunneled in step 1; a sketch for discovering model names follows this list)
  5. Add authentication headers if needed
  6. Name your provider (e.g., my-triton)
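
If you’re unsure which model names your server has loaded, Triton’s model repository extension can list them via POST /v2/repository/index. A minimal sketch, assuming the server runs on localhost:8000 with the repository extension enabled:

import requests

# POST /v2/repository/index returns a JSON array of models,
# each with "name", "version", and "state" fields.
resp = requests.post("http://localhost:8000/v2/repository/index")
for model in resp.json():
    print(model["name"], model.get("state", ""))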

See the Complete Setup Guide for all setup options.
Step 3: Use in Your Application

from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Your Portkey API key
    provider="@my-triton"       # The provider name you set in the Model Catalog
)

response = portkey.chat.completions.create(
    model="your-model-name",    # The Triton model to route to
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
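
Portkey’s chat completions interface mirrors OpenAI’s, so streaming should work the same way. A sketch, assuming the OpenAI-style stream parameter and delta chunks:

stream = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)
for chunk in stream:
    # Each chunk carries an incremental delta, as in the OpenAI streaming format
    print(chunk.choices[0].delta.content or "", end="")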
Or use custom host directly:
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="triton",
    custom_host="http://localhost:8000/v2/models/mymodel",  # Your Triton model endpoint
    Authorization="AUTH_KEY"  # Forwarded to your Triton server, if it requires auth
)
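
Requests against this client are made exactly as above:

response = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)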

Next Steps

For complete Portkey SDK documentation, see the SDK Reference.