Introduction


SolRouter is a unified API gateway that lets you send requests to any large language model — OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, and dozens of open-source models — without paying for a subscription. You pay only for the tokens you actually consume.

Most LLM providers lock you into monthly subscriptions or minimum spending commitments. SolRouter solves this by acting as a transparent proxy: you top up a token balance once, make API calls through a single OpenAI-compatible endpoint, and only the tokens you use are deducted from your balance in real time.



Key features

  • Any model, one endpoint — GPT-4.1, Claude Sonnet 4, Gemini 2.5, Llama 4, DeepSeek R1 and more through a single https://api.solrouter.io/v1/api URL
  • Pay per token — no subscriptions, no minimums, no seat fees. Unused balance never expires
  • OpenAI-compatible — drop in as a replacement for the OpenAI SDK with a single line change to baseURL
  • Real-time usage tracking — see token counts, per-request costs, and full request history in your account dashboard
  • Streaming support — full Server-Sent Events (SSE) streaming on every model that supports it
  • Model fallback — define a priority list of models; SolRouter automatically falls back to the next one if a provider is unavailable
  • Structured output — json_object and json_schema response formats for reliable machine-readable responses
  • Tool / function calling — pass tool definitions and receive structured tool call responses, identical to the OpenAI format
  • Vision and multimodal — send images alongside text to any model that supports visual input

How it works

Your app  →  SolRouter API  →  Provider (OpenAI / Anthropic / Google / ...)
                ↓
        Token balance deducted
  1. You create an account and get an API key starting with sr_
  2. Top up your balance — credits never expire and there is no minimum amount
  3. Call the API exactly like you would call the OpenAI API
  4. Each request deducts input + output tokens from your balance at the provider's published rate

SolRouter does not modify your prompts or responses. The request is forwarded to the chosen provider and the raw response is passed back to you. The only difference from calling a provider directly is that you authenticate with a single sr_ key instead of managing separate keys for each provider.
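The four steps above can be sketched as a raw HTTP request using only Python's standard library. One assumption: this page gives only the base URL, so the /chat/completions path below is inferred from the OpenAI-compatible layout rather than stated here.

```python
import json
import os
from urllib.request import Request, urlopen  # urlopen is for the commented send below

# Assumed endpoint path: the page documents the base URL
# https://api.solrouter.io/v1/api; /chat/completions follows the OpenAI layout.
API_URL = "https://api.solrouter.io/v1/api/chat/completions"

payload = {
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        # One sr_ key authenticates against every provider (step 1 above)
        "Authorization": f"Bearer {os.environ.get('SOLROUTER_API_KEY', 'sr_example')}",
        "Content-Type": "application/json",
    },
)

# Sending it deducts input + output tokens from your balance (step 4):
# with urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The request and response bodies are exactly what the provider sees and returns; SolRouter only handles authentication and billing in between.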


Supported providers

SolRouter routes requests to the following providers:

| Provider | Example models |
| --- | --- |
| OpenAI | openai/gpt-4.1, openai/gpt-4o, openai/o3 |
| Anthropic | anthropic/claude-opus-4, anthropic/claude-sonnet-4 |
| Google | google/gemini-2.5-pro, google/gemini-2.5-flash |
| Meta | meta-llama/llama-4-maverick, meta-llama/llama-3.3-70b-instruct |
| Mistral | mistralai/mistral-large, mistralai/codestral |
| DeepSeek | deepseek/deepseek-r1, deepseek/deepseek-r2 |
| xAI | x-ai/grok-3, x-ai/grok-3-mini |
| Cohere | cohere/command-r-plus |
| Perplexity | perplexity/sonar-pro |

The full list with context lengths, pricing, and supported modalities is available on the Models page.


Pricing model

SolRouter passes through the provider's token pricing with no markup. You pay exactly what the provider charges per token.

| What you pay | What you do not pay |
| --- | --- |
| Prompt tokens consumed | Monthly subscription fees |
| Completion tokens consumed | Per-seat or per-user fees |
| | Idle time or reserved capacity |
| | Minimum monthly spend |

Token costs are shown in the usage.cost field of every API response so you always know exactly what a request cost:

{
  "usage": {
    "prompt_tokens": 312,
    "completion_tokens": 87,
    "total_tokens": 399,
    "cost": 0.0000148
  }
}
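Because every response carries usage.cost, keeping a running total of spend takes only a few lines. A sketch with a hypothetical record_cost helper (the usage field names are from the example above; the helper itself is our own illustration, not part of the API):

```python
# Hypothetical bookkeeping helper; the usage fields match the example above.
def record_cost(usage: dict, ledger: list) -> float:
    """Append this request's cost to the ledger and return the running total."""
    ledger.append(usage["cost"])
    return sum(ledger)

ledger = []
total = record_cost(
    {"prompt_tokens": 312, "completion_tokens": 87, "total_tokens": 399, "cost": 0.0000148},
    ledger,
)
print(f"spent so far: ${total:.7f}")  # → spent so far: $0.0000148
```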

OpenAI compatibility

The SolRouter API is a strict superset of the OpenAI Chat Completions API. Any library, framework, or tool that targets the OpenAI API works with SolRouter by changing two values:

| Setting | OpenAI value | SolRouter value |
| --- | --- | --- |
| Base URL | https://api.openai.com/v1 | https://api.solrouter.io/v1/api |
| API key | sk-... | sr_... |

TypeScript / JavaScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.solrouter.io/v1/api",  // ← change this
  apiKey: process.env.SOLROUTER_API_KEY,        // ← and this
});

// Everything else stays exactly the same
const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

Python

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.solrouter.io/v1/api",  # ← change this
    api_key=os.environ["SOLROUTER_API_KEY"],      # ← and this
)

# Everything else stays exactly the same
completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)

Any other OpenAI-compatible library (LangChain, LlamaIndex, Vercel AI SDK, Instructor, etc.) follows the same pattern — just update the base URL and the API key.
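Since the only differences from OpenAI are the base URL and the key prefix, client construction can be driven entirely by configuration. A sketch, where falling back to OPENAI_API_KEY when no SolRouter key is present is our own convention, not something SolRouter requires:

```python
import os

def client_config() -> dict:
    """Return base_url/api_key keyword arguments for the OpenAI client.

    Uses SolRouter when SOLROUTER_API_KEY is set; the OPENAI_API_KEY
    fallback is our own convention for illustration.
    """
    sr_key = os.environ.get("SOLROUTER_API_KEY")
    if sr_key:
        return {"base_url": "https://api.solrouter.io/v1/api", "api_key": sr_key}
    return {
        "base_url": "https://api.openai.com/v1",
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
    }

# client = OpenAI(**client_config())
```

This keeps application code identical in both configurations; only the environment decides where requests go.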


Common use cases

Prototyping and exploration

No subscription means you can experiment with GPT-4o, Claude Opus, and Gemini Pro side by side without committing to multiple monthly plans. Add $5 in credits and compare outputs across models for the same prompt.

Production applications on a budget

Pay only for the tokens your users actually consume. If your application has bursty or unpredictable usage, per-token billing is typically cheaper than a flat monthly plan.

Multi-model architectures

Route different tasks to the most cost-effective model: use a small, fast model for triage and classification, a larger model for complex reasoning, and a vision model for image analysis — all through the same client and API key.
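One way to express this routing is a small task-to-model map. The model IDs below come from the provider table above; the task categories are illustrative assumptions, not a SolRouter feature:

```python
# Task-to-model routing table; model IDs are from the provider list above,
# the task categories are our own illustration.
ROUTES = {
    "triage": "openai/gpt-4o-mini",          # small, fast model for classification
    "reasoning": "anthropic/claude-opus-4",  # larger model for complex reasoning
    "vision": "google/gemini-2.5-flash",     # model that accepts image input
}

def pick_model(task: str) -> str:
    """Return the model ID for a task type, defaulting to the triage model."""
    return ROUTES.get(task, ROUTES["triage"])

print(pick_model("reasoning"))  # → anthropic/claude-opus-4
```

The selected ID then goes in the model field of an ordinary chat completions request, using the same client and API key throughout.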

High-availability with model fallback

Define a fallback chain so your application stays online even if a provider has an outage:

{
  "model": "openai/gpt-4o",
  "models": ["anthropic/claude-sonnet-4", "google/gemini-2.5-flash"],
  "route": "fallback",
  "messages": [{ "role": "user", "content": "Summarise this document." }]
}

If gpt-4o is unavailable, SolRouter automatically retries with claude-sonnet-4, then gemini-2.5-flash.
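The models and route fields shown above are SolRouter-specific, so the official OpenAI SDKs have no named arguments for them. With the openai Python package they can be passed through extra_body, the SDK's standard escape hatch for provider-specific fields (a sketch; the fallback semantics are as described on this page):

```python
# SolRouter-specific routing fields go in extra_body so the OpenAI SDK
# forwards them to the API untouched.
fallback_request = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Summarise this document."}],
    "extra_body": {
        "models": ["anthropic/claude-sonnet-4", "google/gemini-2.5-flash"],
        "route": "fallback",
    },
}

# completion = client.chat.completions.create(**fallback_request)
```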

Teams and agencies

Create a separate API key for each project, client, or team member — all drawing from one shared balance. Revoke any key instantly without affecting others.


Security

  • API keys are never stored in plaintext — only a hashed representation is kept server-side
  • Keys are scoped to your account balance — a compromised key cannot access your personal data or settings
  • All traffic is encrypted over TLS 1.2 / 1.3
  • Keys can be revoked instantly from the Account page
  • The web interface uses short-lived JWT session tokens stored in HttpOnly cookies, separate from API keys

Next steps

Getting Started

  • Quick Start — get your API key and make your first call in 2 minutes
  • Environment Setup — securely manage API keys with .env files and secrets managers
  • First Request — SDK, fetch, Python, and curl examples with response breakdown
  • Conversations — system prompts, multi-turn history, and context window management

Authentication

Models

  • Available Models — full catalogue with context lengths, pricing, and modalities
  • Reasoning Models — when and how to use thinking models (o3, Claude, DeepSeek R1)
  • Model Fallback — automatic failover chains for high availability
  • Token Counting — estimate costs, understand image tokens, and use prompt caching

API

  • API Reference — complete request and response schemas
  • Streaming — real-time SSE responses, tool call streaming, and React patterns
  • Tool Calling — let the model invoke functions in your application
  • Structured Output — enforce JSON schemas for reliable machine-readable responses
  • Vision & Multimodal — send images alongside text for analysis and extraction
  • Errors — error codes, retry strategies, and building resilient clients

Guides

  • Next.js — Route Handlers, streaming, Vercel AI SDK, and Server Actions
  • Python — async client, FastAPI, streaming, tool calling, and retries
  • LangChain — chains, agents, memory, RAG, and structured output with Zod / Pydantic