Special Offer ✨  $50 Free Credits for All New Sign Ups 💸

The Best LLM on Every Prompt


Combine All Models for Faster, Cheaper, and Better Responses Than Any Single Model ->
trusted by engineers at
deepmind logo
amazon logo
tesla logo
twitter x logo
salesforce logo
ezdubs logo
oxford logo
mit logo
stanford logo
imperial college logo
cambridge logo
It Starts with Your Query
api

All Models, All Providers, One API

Access all LLMs across all providers with a single API key and a standard API.

import requests

url = "https://api.unify.ai/v0/inference"
headers = {
    "Authorization": "Bearer YOUR_UNIFY_KEY",
}

payload = {
    "model": "mixtral-8x7b-instruct-v0.1",
    "provider": "anyscale",
    "arguments": {
        "messages": [{
            "role": "user",
            "content": "YOUR_MESSAGE"
        }],
        "temperature": 0.5,  # sampling temperature (0 = most deterministic)
        "max_tokens": 500,
        "stream": True,
    }
}

response = requests.post(
    url, json=payload, headers=headers, stream=True
)
Available in Python, Node.js, C, PHP, Ruby, and more. Make your first request ->
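With `stream=True`, the response body arrives incrementally rather than as one payload. A minimal sketch of consuming it, assuming each streamed line is a UTF-8 text chunk (the exact wire format, e.g. SSE `data:` framing, may differ; check the API docs):

```python
def stream_text(response):
    """Yield decoded text chunks from a streaming HTTP response.

    Works with any object exposing requests' iter_lines() interface.
    """
    for line in response.iter_lines():
        if line:  # skip keep-alive blank lines
            yield line.decode("utf-8")

# Usage with the `response` from above:
# for chunk in stream_text(response):
#     print(chunk, end="", flush=True)
```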
modular

Your Query, Your Needs, Custom Routing

Set up your own cost, latency, and output-speed constraints. Define a custom quality metric. Personalize your router to your requirements.
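This is not the Unify routing API itself, but a self-contained sketch of what constraint-based routing means in practice: filter candidate endpoints by cost and latency budgets, then pick the one that maximizes a custom quality metric. All provider names and numbers below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    provider: str
    cost_per_1m_tokens: float  # USD per 1M tokens (hypothetical)
    ttft_ms: float             # time to first token, milliseconds
    tokens_per_sec: float      # output speed

def route(endpoints, max_cost=None, max_ttft_ms=None, quality=None):
    """Return the endpoint maximizing `quality` among those that
    satisfy the cost and latency constraints, or None if none do."""
    quality = quality or (lambda e: e.tokens_per_sec)  # default metric
    candidates = [
        e for e in endpoints
        if (max_cost is None or e.cost_per_1m_tokens <= max_cost)
        and (max_ttft_ms is None or e.ttft_ms <= max_ttft_ms)
    ]
    return max(candidates, key=quality) if candidates else None

# Hypothetical benchmark snapshot:
endpoints = [
    Endpoint("anyscale", 0.50, 300, 90),
    Endpoint("together.ai", 0.60, 150, 120),
    Endpoint("octoai", 0.30, 500, 70),
]
best = route(endpoints, max_cost=0.55, max_ttft_ms=400)  # -> anyscale
```

The same `route` call with a different `quality` lambda (say, lowest inter-token latency) would pick a different winner, which is the point of a personalized router.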
Throughput Scatter Graph
performance

Constantly Achieve Peak Performance

Systematically send your queries to the fastest provider, based on the very latest benchmark data for your region of the world, refreshed every 10 minutes.

Mixtral 8x7B Instruct v0.1 (Mistral): +102.56% Tokens / Sec, +406.66% TTFT, +138.37% E2E Latency, +206.85% ITL
LLaMa2 70B Chat (Meta): +83.97% Tokens / Sec, +262.96% TTFT, +95.02% E2E Latency, +155.66% ITL
Providers benchmarked: anyscale, replicate, together.ai, octoai, mistral-ai, Unify router
Unify Benchmarks Mistral Preview
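The percentages above read as relative gains of the router over a baseline provider. The arithmetic is simple; the throughput numbers below are hypothetical, chosen only to reproduce the +102.56% headline figure:

```python
def pct_gain(router_value, baseline_value):
    """Percent improvement of the routed endpoint over a baseline."""
    return (router_value - baseline_value) / baseline_value * 100.0

# e.g. a routed 79 tokens/sec against a 39 tokens/sec baseline:
round(pct_gain(79.0, 39.0), 2)  # 102.56
```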

Frequently Asked Questions

Do I need to create an account with each provider?
Do you charge anything on top of the upstream providers?
How do you determine what the best model is?