Introduction
SolRouter is a unified API gateway that lets you send requests to any large language model — OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, and dozens of open-source models — without paying for a subscription. You pay only for the tokens you actually consume.
Most LLM providers lock you into monthly subscriptions or minimum spending commitments. SolRouter solves this by acting as a transparent proxy: you top up a token balance once, make API calls through a single OpenAI-compatible endpoint, and only the tokens you use are deducted from your balance in real time.
Key features
- Any model, one endpoint — GPT-4.1, Claude Sonnet 4, Gemini 2.5, Llama 4, DeepSeek R1, and more through a single `https://api.solrouter.io/v1/api` URL
- Pay per token — no subscriptions, no minimums, no seat fees. Unused balance never expires
- OpenAI-compatible — drop in as a replacement for the OpenAI SDK with a single-line change to `baseURL`
- Real-time usage tracking — see token counts, per-request costs, and full request history in your account dashboard
- Streaming support — full Server-Sent Events (SSE) streaming on every model that supports it
- Model fallback — define a priority list of models; SolRouter automatically falls back to the next one if a provider is unavailable
- Structured output — `json_object` and `json_schema` response formats for reliable machine-readable responses
- Tool / function calling — pass tool definitions and receive structured tool call responses, identical to the OpenAI format
- Vision and multimodal — send images alongside text to any model that supports visual input
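For instance, the structured output feature accepts the same `response_format` shape as the OpenAI Chat Completions API. The sketch below only builds a request body; the schema name and fields are made up for illustration:

```python
# Request body for a json_schema structured-output call (illustrative schema).
payload = {
    "model": "openai/gpt-4o-mini",
    "messages": [
        {"role": "user", "content": "Extract the city from: 'I live in Oslo.'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
}
```

The same dictionary can be passed as keyword arguments to any OpenAI-compatible client.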
How it works
```
Your app → SolRouter API → Provider (OpenAI / Anthropic / Google / ...)
                 ↓
       Token balance deducted
```
- You create an account and get an API key starting with `sr_`
- Top up your balance — credits never expire and there is no minimum amount
- Call the API exactly like you would call the OpenAI API
- Each request deducts input + output tokens from your balance at the provider's published rate
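The deduction in the last step is simple arithmetic: token counts multiplied by the provider's per-token rates. A minimal sketch, using hypothetical rates rather than any provider's actual pricing:

```python
# Hypothetical per-token rates in USD (illustrative, not real pricing).
RATES = {
    "openai/gpt-4o-mini": {
        "prompt": 0.15 / 1_000_000,      # $0.15 per million prompt tokens
        "completion": 0.60 / 1_000_000,  # $0.60 per million completion tokens
    },
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Amount deducted from the balance for a single request."""
    rate = RATES[model]
    return prompt_tokens * rate["prompt"] + completion_tokens * rate["completion"]

# A request with 312 prompt tokens and 87 completion tokens:
cost = request_cost("openai/gpt-4o-mini", 312, 87)
balance = 5.00 - cost  # remaining balance after a $5 top-up
```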
SolRouter does not modify your prompts or responses. The request is forwarded to the chosen provider and the raw response is passed back to you. The only difference from calling a provider directly is that you authenticate with a single `sr_` key instead of managing separate keys for each provider.
Supported providers
SolRouter routes requests to the following providers:
| Provider | Example models |
|---|---|
| OpenAI | openai/gpt-4.1, openai/gpt-4o, openai/o3 |
| Anthropic | anthropic/claude-opus-4, anthropic/claude-sonnet-4 |
| Google | google/gemini-2.5-pro, google/gemini-2.5-flash |
| Meta | meta-llama/llama-4-maverick, meta-llama/llama-3.3-70b-instruct |
| Mistral | mistralai/mistral-large, mistralai/codestral |
| DeepSeek | deepseek/deepseek-r1, deepseek/deepseek-r2 |
| xAI | x-ai/grok-3, x-ai/grok-3-mini |
| Cohere | cohere/command-r-plus |
| Perplexity | perplexity/sonar-pro |
The full list with context lengths, pricing, and supported modalities is available on the Models page.
Pricing model
SolRouter passes through the provider's token pricing with no markup. You pay exactly what the provider charges per token.
| What you pay | What you do not pay |
|---|---|
| Prompt tokens consumed | Monthly subscription fees |
| Completion tokens consumed | Per-seat or per-user fees |
| — | Idle time or reserved capacity |
| — | Minimum monthly spend |
Token costs are shown in the `usage.cost` field of every API response, so you always know exactly what a request cost:

```json
{
  "usage": {
    "prompt_tokens": 312,
    "completion_tokens": 87,
    "total_tokens": 399,
    "cost": 0.0000148
  }
}
```
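If you work with the raw JSON body rather than an SDK object, reading that field is a one-liner. A sketch, assuming the `cost` key appears exactly as shown above:

```python
import json

# Raw response body as returned by the API (values from the example above).
body = '{"usage": {"prompt_tokens": 312, "completion_tokens": 87, "total_tokens": 399, "cost": 0.0000148}}'

usage = json.loads(body)["usage"]
print(f"{usage['total_tokens']} tokens cost ${usage['cost']}")
```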
OpenAI compatibility
The SolRouter API is a strict superset of the OpenAI Chat Completions API. Any library, framework, or tool that targets the OpenAI API works with SolRouter by changing two values:
| Setting | OpenAI value | SolRouter value |
|---|---|---|
| Base URL | https://api.openai.com/v1 | https://api.solrouter.io/v1/api |
| API key | sk-... | sr_... |
TypeScript / JavaScript
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.solrouter.io/v1/api", // ← change this
  apiKey: process.env.SOLROUTER_API_KEY,      // ← and this
});

// Everything else stays exactly the same
const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});
```
Python
```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.solrouter.io/v1/api",  # ← change this
    api_key=os.environ["SOLROUTER_API_KEY"],     # ← and this
)

# Everything else stays exactly the same
completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Any other OpenAI-compatible library (LangChain, LlamaIndex, Vercel AI SDK, Instructor, etc.) follows the same pattern — just update `baseURL` and the API key.
Common use cases
Prototyping and exploration
No subscription means you can experiment with GPT-4o, Claude Opus, and Gemini Pro side by side without committing to multiple monthly plans. Add $5 in credits and compare outputs across models for the same prompt.
Production applications on a budget
Pay only for the tokens your users actually consume. If your application has bursty or unpredictable usage, per-token billing is usually cheaper than a flat monthly plan.
Multi-model architectures
Route different tasks to the most cost-effective model: use a small, fast model for triage and classification, a larger model for complex reasoning, and a vision model for image analysis — all through the same client and API key.
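One common way to implement that split is a simple task-to-model table. A sketch only; the task names and model choices below are illustrative, not a recommendation:

```python
# Map task types to models; one client and one key, different model per call.
MODEL_FOR_TASK = {
    "triage": "openai/gpt-4o-mini",         # small, fast, cheap classification
    "reasoning": "anthropic/claude-opus-4", # larger model for hard problems
    "vision": "google/gemini-2.5-flash",    # multimodal image input
}

def pick_model(task: str) -> str:
    """Choose a model for a task, defaulting to the cheap one."""
    return MODEL_FOR_TASK.get(task, "openai/gpt-4o-mini")
```

Each request then passes `pick_model(task)` as the `model` argument to the same client shown above.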
High-availability with model fallback
Define a fallback chain so your application stays online even if a provider has an outage:
```json
{
  "model": "openai/gpt-4o",
  "models": ["anthropic/claude-sonnet-4", "google/gemini-2.5-flash"],
  "route": "fallback",
  "messages": [{ "role": "user", "content": "Summarise this document." }]
}
```

If `openai/gpt-4o` is unavailable, SolRouter automatically retries with `anthropic/claude-sonnet-4`, then `google/gemini-2.5-flash`.
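The server-side behaviour is roughly equivalent to this client-side loop. A sketch only, since SolRouter runs the fallback for you; `call` stands in for a hypothetical request helper:

```python
def complete_with_fallback(call, models):
    """Try each model in order; return the first response that succeeds.

    `call` is any function taking a model name that either returns a
    response or raises when the provider is unavailable (a hypothetical
    stand-in for your actual request function).
    """
    last_error = None
    for model in models:
        try:
            return call(model)
        except Exception as exc:  # outage, rate limit, timeout, ...
            last_error = exc
    raise last_error
```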
Teams and agencies
Create a separate API key for each project, client, or team member — all drawing from one shared balance. Revoke any key instantly without affecting others.
Security
- API keys are never stored in plaintext — only a hashed representation is kept server-side
- Keys are scoped to your account balance — a compromised key cannot access your personal data or settings
- All traffic is encrypted over TLS 1.2 / 1.3
- Keys can be revoked instantly from the Account page
- The web interface uses short-lived JWT session tokens stored in `HttpOnly` cookies, separate from API keys
Next steps
Getting Started
- Quick Start — get your API key and make your first call in 2 minutes
- Environment Setup — securely manage API keys with `.env` files and secrets managers
- First Request — SDK, fetch, Python, and curl examples with response breakdown
- Conversations — system prompts, multi-turn history, and context window management
Authentication
- API Keys — creating, naming, and managing keys
- Security Best Practices — protect credentials, rotate keys, and respond to leaks
- Session Tokens — how the web UI authenticates with HttpOnly JWT cookies
Models
- Available Models — full catalogue with context lengths, pricing, and modalities
- Reasoning Models — when and how to use thinking models (o3, Claude, DeepSeek R1)
- Model Fallback — automatic failover chains for high availability
- Token Counting — estimate costs, understand image tokens, and use prompt caching
API
- API Reference — complete request and response schemas
- Streaming — real-time SSE responses, tool call streaming, and React patterns
- Tool Calling — let the model invoke functions in your application
- Structured Output — enforce JSON schemas for reliable machine-readable responses
- Vision & Multimodal — send images alongside text for analysis and extraction
- Errors — error codes, retry strategies, and building resilient clients
Guides