Special Offer ✨  $50 Free Credits for All New Sign Ups 💸

The Best LLM on Every Prompt


Combine All Models for Faster, Cheaper, and Better Responses Than Any Single Model ->
trusted by engineers at
deepmind logo
amazon logo
tesla logo
twitter x logo
salesforce logo
ezdubs logo
oxford logo
mit logo
stanford logo
imperial college logo
cambridge logo
It Starts with Your Query
api

All Models, All Providers, One API

Access all LLMs across all providers with a single API key and a standard API.

import requests

url = "https://api.unify.ai/v0/inference"
headers = {
    "Authorization": "Bearer YOUR_UNIFY_KEY",
}

payload = {
    "model": "mixtral-8x7b-instruct-v0.1",
    "provider": "anyscale",
    "arguments": {
        "messages": [{
            "role": "user",
            "content": "YOUR_MESSAGE"
        }],
        "temperature": 0.5,  # sampling temperature (0 = most deterministic)
        "max_tokens": 500,
        "stream": True,
    }
}

response = requests.post(
    url, json=payload, headers=headers, stream=True
)
Available in Python, Node.js, C, PHP, Ruby, and more. Make your first request ->
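With `stream=True`, the response body arrives incrementally rather than as one payload. A minimal sketch of consuming it, assuming each streamed line is a UTF-8 text chunk (the exact wire format, e.g. SSE `data:` framing, may differ; check the API docs):

```python
def stream_text(response):
    """Yield decoded text chunks from a streaming HTTP response.

    Works with any object exposing requests' iter_lines() interface.
    """
    for line in response.iter_lines():
        if line:  # skip keep-alive blank lines
            yield line.decode("utf-8")

# Usage with the `response` from above:
# for chunk in stream_text(response):
#     print(chunk, end="", flush=True)
```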
modular

Your Query, Your Needs, Custom Routing

Set up your own cost, latency, and output-speed constraints. Define a custom quality metric. Personalize your router to your requirements.
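This is not the Unify routing API itself, but a self-contained sketch of what constraint-based routing means in practice: filter candidate endpoints by cost and latency budgets, then pick the one that maximizes a custom quality metric. All provider names and numbers below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    provider: str
    cost_per_1m_tokens: float  # USD per 1M tokens (hypothetical)
    ttft_ms: float             # time to first token, milliseconds
    tokens_per_sec: float      # output speed

def route(endpoints, max_cost=None, max_ttft_ms=None, quality=None):
    """Return the endpoint maximizing `quality` among those that
    satisfy the cost and latency constraints, or None if none do."""
    quality = quality or (lambda e: e.tokens_per_sec)  # default metric
    candidates = [
        e for e in endpoints
        if (max_cost is None or e.cost_per_1m_tokens <= max_cost)
        and (max_ttft_ms is None or e.ttft_ms <= max_ttft_ms)
    ]
    return max(candidates, key=quality) if candidates else None

# Hypothetical benchmark snapshot:
endpoints = [
    Endpoint("anyscale", 0.50, 300, 90),
    Endpoint("together.ai", 0.60, 150, 120),
    Endpoint("octoai", 0.30, 500, 70),
]
best = route(endpoints, max_cost=0.55, max_ttft_ms=400)  # -> anyscale
```

The same `route` call with a different `quality` lambda (say, lowest inter-token latency) would pick a different winner, which is the point of a personalized router.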
Throughput Scatter Graph
performance

Constantly Achieve Peak Performance

Systematically send your queries to the fastest provider, based on the very latest benchmark data for your region of the world, refreshed every 10 minutes.

Mixtral 8x7B Instruct v0.1 (Mistral): +102.56% Tokens / Sec, +406.66% TTFT, +138.37% E2E Latency, +206.85% ITL
LLaMa2 70B Chat (Meta): +83.97% Tokens / Sec, +262.96% TTFT, +95.02% E2E Latency, +155.66% ITL
Providers benchmarked: anyscale, replicate, together.ai, octoai, mistral-ai, Unify router
Unify Benchmarks Mistral Preview
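The percentages above read as relative gains of the router over a baseline provider. The arithmetic is simple; the throughput numbers below are hypothetical, chosen only to reproduce the +102.56% headline figure:

```python
def pct_gain(router_value, baseline_value):
    """Percent improvement of the routed endpoint over a baseline."""
    return (router_value - baseline_value) / baseline_value * 100.0

# e.g. a routed 79 tokens/sec against a 39 tokens/sec baseline:
round(pct_gain(79.0, 39.0), 2)  # 102.56
```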

Frequently Asked Questions

Do I need to create an account with each provider?
Do you charge anything on top of the upstream providers?
How do you determine what the best model is?