Portkey provides a robust platform to observe, govern, and manage your locally or privately hosted custom models using vLLM.
For a list of all model architectures supported by vLLM, see the vLLM documentation.

Integration Steps

1. Expose your vLLM Server

Expose your vLLM server using a tunneling service like ngrok, or make it publicly accessible. Skip this step if you’re self-hosting the Portkey Gateway.
ngrok http 8000 --host-header="localhost:8000"
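
Before adding the server to Portkey, it can help to confirm the exposed URL actually reaches vLLM’s OpenAI-compatible API. The snippet below is a minimal sketch: the ngrok URL is a placeholder for your own, and it assumes the server requires no auth header.

# List the models the exposed vLLM server is serving (also confirms reachability).
import requests

VLLM_URL = "https://your-vllm-server.ngrok-free.app"  # replace with your tunnel URL

resp = requests.get(f"{VLLM_URL}/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])  # model IDs to use in later requests
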
2. Add to Model Catalog

  1. Go to Model Catalog → Add Provider
  2. Enable “Local/Privately hosted provider” toggle
  3. Select OpenAI as the provider type (vLLM follows OpenAI API schema)
  4. Enter your vLLM server URL in Custom Host: https://your-vllm-server.ngrok-free.app
  5. Add authentication headers if needed
  6. Name your provider (e.g., my-vllm)

For all setup options, see the Complete Setup Guide.

3. Use in Your Application

from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@my-vllm"
)

response = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
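
If you want token-by-token output, the same call can be streamed. This is a minimal sketch assuming the portkey client created above and the OpenAI-style streaming interface exposed by the Portkey SDK:

# Stream the completion instead of waiting for the full response.
stream = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
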
Or use custom host directly:
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="openai",
    custom_host="https://your-vllm-server.ngrok-free.app",
    Authorization="AUTH_KEY"  # If needed
)
Important: vLLM follows the OpenAI API specification, so set the provider to openai when using a custom host directly. By default, vLLM serves its API at http://localhost:8000/v1.
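
Because vLLM exposes the OpenAI API, you can also sanity-check the server with the stock openai Python client before routing traffic through Portkey. A minimal sketch, assuming the default local address and no --api-key configured on the server (vLLM then accepts any placeholder key):

# Talk to vLLM directly with the openai client to verify the endpoint and model name.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's default OpenAI-compatible endpoint
    api_key="EMPTY"                       # placeholder; only checked if --api-key is set
)

print(client.models.list().data[0].id)  # the model name to use in Portkey requests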

Next Steps

For complete Portkey SDK documentation, see the SDK Reference.