Introduction


SolRouter is a unified API gateway that lets you send requests to any large language model — OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, and dozens of open-source models — without paying for a subscription. You pay only for the tokens you actually consume.

Most LLM providers lock you into monthly subscriptions or minimum spending commitments. SolRouter solves this by acting as a transparent proxy: you top up a token balance once, make API calls through a single OpenAI-compatible endpoint, and only the tokens you use are deducted from your balance in real time.



Key features

  • Any model, one endpoint — GPT-4.1, Claude Sonnet 4, Gemini 2.5, Llama 4, DeepSeek R1 and more through a single https://api.solrouter.io/v1/api URL
  • Pay per token — no subscriptions, no minimums, no seat fees. Unused balance never expires
  • OpenAI-compatible — drop in as a replacement for the OpenAI SDK with a single line change to baseURL
  • Real-time usage tracking — see token counts, per-request costs, and full request history in your account dashboard
  • Streaming support — full Server-Sent Events (SSE) streaming on every model that supports it
  • Model fallback — define a priority list of models; SolRouter automatically falls back to the next one if a provider is unavailable
  • Structured output — json_object and json_schema response formats for reliable machine-readable responses
  • Tool / function calling — pass tool definitions and receive structured tool call responses, identical to the OpenAI format
  • Vision and multimodal — send images alongside text to any model that supports visual input

How it works

Your app  →  SolRouter API  →  Provider (OpenAI / Anthropic / Google / ...)
                ↓
        Token balance deducted
  1. You create an account and get an API key starting with sr_
  2. Top up your balance — credits never expire and there is no minimum amount
  3. Call the API exactly like you would call the OpenAI API
  4. Each request deducts input + output tokens from your balance at the provider's published rate

SolRouter does not modify your prompts or responses. The request is forwarded to the chosen provider and the raw response is passed back to you. The only difference from calling a provider directly is that you authenticate with a single sr_ key instead of managing separate keys for each provider.
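The four steps above can be sketched as a raw HTTP request using only Python's standard library. One assumption: this page gives only the base URL, so the /chat/completions path below is inferred from the OpenAI-compatible layout rather than stated here.

```python
import json
import os
from urllib.request import Request, urlopen  # urlopen is for the commented send below

# Assumed endpoint path: the page documents the base URL
# https://api.solrouter.io/v1/api; /chat/completions follows the OpenAI layout.
API_URL = "https://api.solrouter.io/v1/api/chat/completions"

payload = {
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        # One sr_ key authenticates against every provider (step 1 above)
        "Authorization": f"Bearer {os.environ.get('SOLROUTER_API_KEY', 'sr_example')}",
        "Content-Type": "application/json",
    },
)

# Sending it deducts input + output tokens from your balance (step 4):
# with urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The request and response bodies are exactly what the provider sees and returns; SolRouter only handles authentication and billing in between.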


Supported providers

SolRouter routes requests to the following providers:

| Provider | Example models |
| --- | --- |
| OpenAI | openai/gpt-4.1, openai/gpt-4o, openai/o3 |
| Anthropic | anthropic/claude-opus-4, anthropic/claude-sonnet-4 |
| Google | google/gemini-2.5-pro, google/gemini-2.5-flash |
| Meta | meta-llama/llama-4-maverick, meta-llama/llama-3.3-70b-instruct |
| Mistral | mistralai/mistral-large, mistralai/codestral |
| DeepSeek | deepseek/deepseek-r1, deepseek/deepseek-r2 |
| xAI | x-ai/grok-3, x-ai/grok-3-mini |
| Cohere | cohere/command-r-plus |
| Perplexity | perplexity/sonar-pro |

The full list with context lengths, pricing, and supported modalities is available on the Models page.


Pricing model

SolRouter passes through the provider's token pricing with no markup. You pay exactly what the provider charges per token.

| What you pay | What you do not pay |
| --- | --- |
| Prompt tokens consumed | Monthly subscription fees |
| Completion tokens consumed | Per-seat or per-user fees |
| | Idle time or reserved capacity |
| | Minimum monthly spend |

Token costs are shown in the usage.cost field of every API response so you always know exactly what a request cost:

{
  "usage": {
    "prompt_tokens": 312,
    "completion_tokens": 87,
    "total_tokens": 399,
    "cost": 0.0000148
  }
}
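Because every response carries usage.cost, keeping a running total of spend takes only a few lines. A sketch with a hypothetical record_cost helper (the usage field names are from the example above; the helper itself is our own illustration, not part of the API):

```python
# Hypothetical bookkeeping helper; the usage fields match the example above.
def record_cost(usage: dict, ledger: list) -> float:
    """Append this request's cost to the ledger and return the running total."""
    ledger.append(usage["cost"])
    return sum(ledger)

ledger = []
total = record_cost(
    {"prompt_tokens": 312, "completion_tokens": 87, "total_tokens": 399, "cost": 0.0000148},
    ledger,
)
print(f"spent so far: ${total:.7f}")  # → spent so far: $0.0000148
```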

OpenAI compatibility

The SolRouter API is a strict superset of the OpenAI Chat Completions API. Any library, framework, or tool that targets the OpenAI API works with SolRouter by changing two values:

| Setting | OpenAI value | SolRouter value |
| --- | --- | --- |
| Base URL | https://api.openai.com/v1 | https://api.solrouter.io/v1/api |
| API key | sk-... | sr_... |

TypeScript / JavaScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.solrouter.io/v1/api",  // ← change this
  apiKey: process.env.SOLROUTER_API_KEY,        // ← and this
});

// Everything else stays exactly the same
const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

Python

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.solrouter.io/v1/api",  # ← change this
    api_key=os.environ["SOLROUTER_API_KEY"],      # ← and this
)

# Everything else stays exactly the same
completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)

Any other OpenAI-compatible library (LangChain, LlamaIndex, Vercel AI SDK, Instructor, etc.) follows the same pattern — just update the base URL and the API key.
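Since the only differences from OpenAI are the base URL and the key prefix, client construction can be driven entirely by configuration. A sketch, where falling back to OPENAI_API_KEY when no SolRouter key is present is our own convention, not something SolRouter requires:

```python
import os

def client_config() -> dict:
    """Return base_url/api_key keyword arguments for the OpenAI client.

    Uses SolRouter when SOLROUTER_API_KEY is set; the OPENAI_API_KEY
    fallback is our own convention for illustration.
    """
    sr_key = os.environ.get("SOLROUTER_API_KEY")
    if sr_key:
        return {"base_url": "https://api.solrouter.io/v1/api", "api_key": sr_key}
    return {
        "base_url": "https://api.openai.com/v1",
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
    }

# client = OpenAI(**client_config())
```

This keeps application code identical in both configurations; only the environment decides where requests go.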


Common use cases

Prototyping and exploration

No subscription means you can experiment with GPT-4o, Claude Opus, and Gemini Pro side by side without committing to multiple monthly plans. Add $5 in credits and compare outputs across models for the same prompt.

Production applications on a budget

Pay only for the tokens your users actually consume. If your application has bursty or unpredictable usage, per-token billing is typically cheaper than a flat monthly plan.

Multi-model architectures

Route different tasks to the most cost-effective model: use a small, fast model for triage and classification, a larger model for complex reasoning, and a vision model for image analysis — all through the same client and API key.
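One way to express this routing is a small task-to-model map. The model IDs below come from the provider table above; the task categories are illustrative assumptions, not a SolRouter feature:

```python
# Task-to-model routing table; model IDs are from the provider list above,
# the task categories are our own illustration.
ROUTES = {
    "triage": "openai/gpt-4o-mini",          # small, fast model for classification
    "reasoning": "anthropic/claude-opus-4",  # larger model for complex reasoning
    "vision": "google/gemini-2.5-flash",     # model that accepts image input
}

def pick_model(task: str) -> str:
    """Return the model ID for a task type, defaulting to the triage model."""
    return ROUTES.get(task, ROUTES["triage"])

print(pick_model("reasoning"))  # → anthropic/claude-opus-4
```

The selected ID then goes in the model field of an ordinary chat completions request, using the same client and API key throughout.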

High-availability with model fallback

Define a fallback chain so your application stays online even if a provider has an outage:

{
  "model": "openai/gpt-4o",
  "models": ["anthropic/claude-sonnet-4", "google/gemini-2.5-flash"],
  "route": "fallback",
  "messages": [{ "role": "user", "content": "Summarise this document." }]
}

If gpt-4o is unavailable, SolRouter automatically retries with claude-sonnet-4, then gemini-2.5-flash.
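The models and route fields shown above are SolRouter-specific, so the official OpenAI SDKs have no named arguments for them. With the openai Python package they can be passed through extra_body, the SDK's standard escape hatch for provider-specific fields (a sketch; the fallback semantics are as described on this page):

```python
# SolRouter-specific routing fields go in extra_body so the OpenAI SDK
# forwards them to the API untouched.
fallback_request = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Summarise this document."}],
    "extra_body": {
        "models": ["anthropic/claude-sonnet-4", "google/gemini-2.5-flash"],
        "route": "fallback",
    },
}

# completion = client.chat.completions.create(**fallback_request)
```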

Teams and agencies

Create a separate API key for each project, client, or team member — all drawing from one shared balance. Revoke any key instantly without affecting others.


Security

  • API keys are never stored in plaintext — only a hashed representation is kept server-side
  • Keys are scoped to your account balance — a compromised key cannot access your personal data or settings
  • All traffic is encrypted over TLS 1.2 / 1.3
  • Keys can be revoked instantly from the Account page
  • The web interface uses short-lived JWT session tokens stored in HttpOnly cookies, separate from API keys

Next steps

Getting Started

  • Quick Start — get your API key and make your first call in 2 minutes
  • Environment Setup — securely manage API keys with .env files and secrets managers
  • First Request — SDK, fetch, Python, and curl examples with response breakdown
  • Conversations — system prompts, multi-turn history, and context window management

Authentication

Models

  • Available Models — full catalogue with context lengths, pricing, and modalities
  • Reasoning Models — when and how to use thinking models (o3, Claude, DeepSeek R1)
  • Model Fallback — automatic failover chains for high availability
  • Token Counting — estimate costs, understand image tokens, and use prompt caching

API

  • API Reference — complete request and response schemas
  • Streaming — real-time SSE responses, tool call streaming, and React patterns
  • Tool Calling — let the model invoke functions in your application
  • Structured Output — enforce JSON schemas for reliable machine-readable responses
  • Vision & Multimodal — send images alongside text for analysis and extraction
  • Errors — error codes, retry strategies, and building resilient clients

Guides

  • Next.js — Route Handlers, streaming, Vercel AI SDK, and Server Actions
  • Python — async client, FastAPI, streaming, tool calling, and retries
  • LangChain — chains, agents, memory, RAG, and structured output with Zod / Pydantic