Quickstart
Get your first response from any LLM in under 2 minutes.
1. Create an account
Go to solrouter.io and sign up. You can register with an email and password or continue with GitHub OAuth.
2. Get an API key
Once logged in, open the Account page and navigate to API Keys. Click Create key, give it a name, and copy the key — it starts with sr_ and is shown only once.
Keep your key secret. Anyone who has it can make requests charged to your balance.
3. Top up your balance
Go to the Account → Balance section and add credits. Payments are accepted via card or crypto. Credits never expire and are deducted only when you actually use tokens.
4. Set up your environment variables
Hard-coding API keys directly in source code is a security risk. The recommended approach is to store your key in an environment variable and load it at runtime.
Creating a .env file
Create a .env file in the root of your project:
SOLROUTER_API_KEY=sr_YOUR_API_KEY
Always add .env to your .gitignore so it is never committed to version control:
echo ".env" >> .gitignore
Loading the key in Node.js
Node.js 20.6+ can load .env files natively with the --env-file flag:
node --env-file=.env index.js
For older Node.js versions, install the dotenv package:
npm install dotenv
Then load it at the top of your entry file before any other imports:
import "dotenv/config";
// or: require("dotenv").config();
const apiKey = process.env.SOLROUTER_API_KEY;
Loading the key in Python
Install python-dotenv:
pip install python-dotenv
Then load the .env file at the start of your script:
from dotenv import load_dotenv
import os
load_dotenv()
api_key = os.environ["SOLROUTER_API_KEY"]
Loading the key in Next.js
Next.js has built-in .env support — no extra packages needed. Create a .env.local file (which is already git-ignored by the Next.js default .gitignore):
SOLROUTER_API_KEY=sr_YOUR_API_KEY
Access it in server-side code (API routes, Server Actions, Route Handlers):
const apiKey = process.env.SOLROUTER_API_KEY;
Important: Never expose your key to the browser. In Next.js, only variables prefixed with NEXT_PUBLIC_ are bundled into the client. Your SOLROUTER_API_KEY must not have that prefix.
5. Make your first request
The SolRouter API is fully OpenAI-compatible. Point your existing code at https://api.solrouter.io/v1 and swap in your sr_ key — nothing else needs to change.
Using the OpenAI SDK
npm install openai
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://api.solrouter.io/v1",
apiKey: process.env.SOLROUTER_API_KEY,
});
const completion = await client.chat.completions.create({
model: "openai/gpt-4o-mini",
messages: [
{
role: "user",
content: "What is the meaning of life?",
},
],
});
console.log(completion.choices[0].message.content);
Using fetch directly
const response = await fetch("https://api.solrouter.io/v1/chat/completions", {
method: "POST",
headers: {
Authorization: `Bearer ${process.env.SOLROUTER_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "openai/gpt-4o-mini",
messages: [
{
role: "user",
content: "What is the meaning of life?",
},
],
}),
});
const data = await response.json();
console.log(data.choices[0].message.content);
Using Python
from openai import OpenAI
import os
from dotenv import load_dotenv
load_dotenv()
client = OpenAI(
base_url="https://api.solrouter.io/v1",
api_key=os.environ["SOLROUTER_API_KEY"],
)
completion = client.chat.completions.create(
model="openai/gpt-4o-mini",
messages=[
{
"role": "user",
"content": "What is the meaning of life?",
}
],
)
print(completion.choices[0].message.content)
Using curl
curl https://api.solrouter.io/v1/chat/completions \
-H "Authorization: Bearer $SOLROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "What is the meaning of life?"
}
]
}'
6. Understanding the response
Every successful call to /v1/chat/completions returns a JSON object. Here is a typical example:
{
"id": "chatcmpl-a1b2c3d4e5f6",
"object": "chat.completion",
"created": 1748000000,
"model": "openai/gpt-4o-mini",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The meaning of life is a deeply philosophical question..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 15,
"completion_tokens": 83,
"total_tokens": 98
}
}
Field-by-field breakdown
| Field | Type | Description |
|---|---|---|
| id | string | A unique identifier for this completion. Useful for logging and debugging. |
| object | string | Always "chat.completion" for non-streaming requests. Streaming chunks use "chat.completion.chunk". |
| created | number | Unix timestamp (seconds since epoch) of when the completion was generated. |
| model | string | The exact model that served the request, including the provider prefix (e.g. openai/gpt-4o-mini). |
| choices | array | An array of completion candidates. By default there is one element. Set n in your request to get multiple. |
| choices[n].index | number | Zero-based index of this choice in the array. |
| choices[n].message.role | string | Always "assistant" for responses. |
| choices[n].message.content | string \| null | The text generated by the model. null when the model calls a tool instead of responding with text. |
| choices[n].finish_reason | string | Why the model stopped. See the table below. |
| usage.prompt_tokens | number | Number of tokens consumed by your input messages. |
| usage.completion_tokens | number | Number of tokens generated by the model. |
| usage.total_tokens | number | Sum of prompt and completion tokens. This is what gets billed. |
finish_reason values
| Value | Meaning |
|---|---|
| stop | The model reached a natural stopping point or a stop sequence you specified. |
| length | The output was cut off because it hit the max_tokens limit. Increase max_tokens if you need longer responses. |
| tool_calls | The model decided to call one or more tools instead of responding with text. |
| content_filter | The output was blocked by a content policy. |
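In application code it is worth branching on finish_reason, for example to detect truncated output before showing it to a user. A minimal sketch (the helper name and messages are illustrative, not part of the API):

```typescript
type FinishReason = "stop" | "length" | "tool_calls" | "content_filter";

// Map a finish_reason to a human-readable note; useful for logging.
function describeFinish(reason: FinishReason): string {
  switch (reason) {
    case "stop":
      return "Completed normally.";
    case "length":
      return "Truncated at max_tokens; increase it for longer responses.";
    case "tool_calls":
      return "Model requested a tool call; message.content is null.";
    case "content_filter":
      return "Blocked by a content policy.";
  }
}
```

Logging this alongside usage.total_tokens makes truncation issues easy to spot in production.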
Accessing the response in TypeScript
The OpenAI SDK ships with full TypeScript types, so you get autocomplete and safety for free:
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://api.solrouter.io/v1",
apiKey: process.env.SOLROUTER_API_KEY,
});
const completion = await client.chat.completions.create({
model: "openai/gpt-4o-mini",
messages: [{ role: "user", content: "What is the meaning of life?" }],
});
const message = completion.choices[0].message.content; // string | null
const tokenCount = completion.usage?.total_tokens; // number | undefined
const reason = completion.choices[0].finish_reason; // "stop" | "length" | ...
console.log(`Response: ${message}`);
console.log(`Tokens used: ${tokenCount}`);
console.log(`Finished because: ${reason}`);
7. Pick a model
Replace openai/gpt-4o-mini with any model ID from the Models page. Model IDs follow the provider/model-name format, for example:
| Model | ID |
|---|---|
| GPT-4o | openai/gpt-4o |
| GPT-4o mini | openai/gpt-4o-mini |
| Claude Sonnet 4 | anthropic/claude-sonnet-4 |
| Claude Haiku 3.5 | anthropic/claude-haiku-3-5 |
| Gemini 2.5 Pro | google/gemini-2.5-pro |
| Gemini 2.5 Flash | google/gemini-2.5-flash |
| Llama 4 Maverick | meta-llama/llama-4-maverick |
| DeepSeek R2 | deepseek/deepseek-r2 |
Browse the full list and pricing on the Models page.
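Because every model ID takes the provider/model-name form, a small helper can split an ID into its parts for logging or routing decisions. This helper is illustrative, not part of any SDK:

```typescript
// Split a "provider/model-name" ID at the first slash.
function parseModelId(id: string): { provider: string; model: string } {
  const slash = id.indexOf("/");
  if (slash === -1) throw new Error(`Invalid model ID: ${id}`);
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```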
8. System prompts
A system prompt is a special message with role: "system" that you place at the beginning of the messages array. It lets you give the model persistent instructions that shape its personality, tone, output format, or area of expertise — without those instructions appearing in the conversation visible to the user.
System messages are supported by all major models. If a model does not natively support a system role, SolRouter automatically converts it to a leading user message so your code does not need to change.
Basic system prompt
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://api.solrouter.io/v1",
apiKey: process.env.SOLROUTER_API_KEY,
});
const completion = await client.chat.completions.create({
model: "openai/gpt-4o-mini",
messages: [
{
role: "system",
content:
"You are a concise technical assistant. Always respond in plain text. " +
"Keep answers under 3 sentences. Do not use bullet points or markdown.",
},
{
role: "user",
content: "What is a closure in JavaScript?",
},
],
});
console.log(completion.choices[0].message.content);
Practical example — customer support bot
Here is a more realistic system prompt you might use in a product:
const SYSTEM_PROMPT = `
You are Aria, a friendly and knowledgeable support assistant for Acme Corp.
Your job is to help users with questions about their orders, account settings, and product features.
Guidelines:
- Always greet users by name if they provide it.
- If you do not know the answer, say so clearly and offer to escalate to a human agent.
- Never make up order numbers, dates, or policy details.
- Keep responses short and scannable. Use bullet points for lists of steps.
- Do not discuss topics unrelated to Acme Corp products.
`.trim();
const completion = await client.chat.completions.create({
model: "anthropic/claude-haiku-3-5",
messages: [
{ role: "system", content: SYSTEM_PROMPT },
{ role: "user", content: "Hi! My order hasn't arrived yet. Order #AX-99812." },
],
});
Tips for writing effective system prompts
| Tip | Example |
|---|---|
| Assign a persona | "You are a senior data engineer specializing in SQL." |
| Specify output format | "Always return valid JSON. Never include prose outside the JSON object." |
| Set length constraints | "Limit all responses to 100 words or fewer." |
| Define what NOT to do | "Do not speculate. If uncertain, say 'I don't know'." |
| Provide context/data | Inject user preferences, product info, or relevant documents into the system prompt at request time. |
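The "specify output format" tip pairs well with defensive parsing: even when instructed to return only JSON, models sometimes wrap the answer in a markdown code fence. A sketch of a tolerant parser (the helper name is illustrative):

```typescript
const FENCE = "\x60\x60\x60"; // three backticks, escaped so this snippet renders cleanly

// Strip an optional markdown code fence (e.g. a leading "json" tag line)
// before handing the text to JSON.parse.
function parseModelJson(raw: string): unknown {
  let cleaned = raw.trim();
  if (cleaned.startsWith(FENCE)) {
    cleaned = cleaned.slice(cleaned.indexOf("\n") + 1); // drop the opening fence line
    const closing = cleaned.lastIndexOf(FENCE);
    if (closing !== -1) cleaned = cleaned.slice(0, closing); // drop the closing fence
  }
  return JSON.parse(cleaned.trim());
}
```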
9. Conversation history
LLMs are stateless — they have no memory of previous requests. To build a multi-turn conversation, you must send the full history of messages (both user turns and assistant replies) on every request.
The pattern is simple:
- Start with your system prompt and the user's first message.
- After each response, append the assistant's reply to your local messages array.
- When the user sends a new message, append it and send the whole array again.
TypeScript example
import OpenAI from "openai";
import * as readline from "readline";
const client = new OpenAI({
baseURL: "https://api.solrouter.io/v1",
apiKey: process.env.SOLROUTER_API_KEY,
});
type Message = {
role: "system" | "user" | "assistant";
content: string;
};
// Seed the conversation with a system prompt
const messages: Message[] = [
{
role: "system",
content: "You are a helpful assistant. Be concise and friendly.",
},
];
async function chat(userInput: string): Promise<string> {
// 1. Append the new user message
messages.push({ role: "user", content: userInput });
// 2. Send the full history to the API
const completion = await client.chat.completions.create({
model: "openai/gpt-4o-mini",
messages,
});
const assistantMessage = completion.choices[0].message.content ?? "";
// 3. Append the assistant's reply so the next turn has full context
messages.push({ role: "assistant", content: assistantMessage });
return assistantMessage;
}
// Simple interactive REPL for demonstration
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
});
function prompt() {
rl.question("You: ", async (input) => {
if (input.toLowerCase() === "exit") {
rl.close();
return;
}
const reply = await chat(input);
console.log(`\nAssistant: ${reply}\n`);
prompt();
});
}
console.log('Chat started. Type "exit" to quit.\n');
prompt();
Managing context length
Every model has a maximum context window — the total number of tokens it can process in a single request (prompt + completion combined). As a conversation grows, it will eventually exceed this limit.
Common strategies to handle long conversations:
| Strategy | How it works | Trade-off |
|---|---|---|
| Sliding window | Keep only the last N messages, always preserving the system prompt. | Simple but the model loses early context. |
| Summarization | When the window fills up, ask the model to summarize the conversation so far, then replace old messages with the summary. | Retains key facts but adds latency and cost. |
| Selective pruning | Remove messages that are unlikely to be relevant (e.g. tool call results after they've been acted on). | Requires domain-specific logic. |
| Bigger context window | Switch to a model with a larger context window. | Higher cost per request. |
Sliding window example
const MAX_HISTORY_MESSAGES = 20; // keep the last 20 messages (10 user/assistant exchanges)
function trimHistory(messages: Message[]): Message[] {
const [systemPrompt, ...rest] = messages;
const trimmed = rest.slice(-MAX_HISTORY_MESSAGES);
return [systemPrompt, ...trimmed];
}
// Use trimHistory before each API call:
const completion = await client.chat.completions.create({
model: "openai/gpt-4o-mini",
messages: trimHistory(messages),
});
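The summarization strategy from the table above can be sketched in the same style. The summarize callback stands in for an API call (in practice it would ask a model to condense the transcript); the threshold values and helper names are illustrative:

```typescript
// Same Message shape as in the conversation example above.
type Message = { role: "system" | "user" | "assistant"; content: string };

async function compactHistory(
  messages: Message[],
  summarize: (transcript: string) => Promise<string>,
  threshold = 30, // compact once the history exceeds this many messages
  keepRecent = 10, // always keep the most recent messages verbatim
): Promise<Message[]> {
  if (messages.length <= threshold) return messages;
  const [systemPrompt, ...rest] = messages;
  // Everything except the system prompt and the most recent messages gets summarized.
  const old = rest.slice(0, rest.length - keepRecent);
  const recent = rest.slice(-keepRecent);
  const transcript = old.map((m) => `${m.role}: ${m.content}`).join("\n");
  const summary = await summarize(transcript);
  return [
    systemPrompt,
    { role: "assistant", content: `Summary of the earlier conversation: ${summary}` },
    ...recent,
  ];
}
```

Run this before each API call in place of trimHistory; unlike the sliding window, key facts from early turns survive inside the summary message.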
10. Troubleshooting common errors
When something goes wrong, the API returns a standard HTTP status code along with a JSON body describing the error:
{
"error": {
"message": "Invalid API key provided.",
"type": "authentication_error",
"code": "invalid_api_key"
}
}
Error reference
| HTTP Status | type | Likely cause | How to fix |
|---|---|---|---|
| 401 Unauthorized | authentication_error | Missing, malformed, or revoked API key. | Check that you are sending the header Authorization: Bearer sr_YOUR_KEY and that the key has not been deleted. |
| 400 Bad Request | invalid_request_error | The request body is malformed: missing required fields, wrong types, or an invalid model ID. | Inspect the error.message field; it usually names the offending parameter. Verify the model ID against the Models page. |
| 402 Payment Required | insufficient_credits | Your account balance has run out. | Top up your balance in Account → Balance. |
| 429 Too Many Requests | rate_limit_error | You have exceeded your requests-per-minute or tokens-per-minute limit. | Implement exponential backoff and retry. Consider upgrading your plan for higher rate limits. |
| 502 Bad Gateway | upstream_error | The upstream model provider returned an error or timed out. | Retry the request. If the issue persists for a specific model, check the status page or try an equivalent model from a different provider. |
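When calling the API with fetch rather than the SDK, the table above translates into a small amount of branching. These helpers are illustrative, not part of any SDK: isRetryable flags the transient statuses, and readError pulls the message out of the JSON error body:

```typescript
// Transient statuses worth retrying: rate limits and upstream/server failures.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Extract error.message from a failed response, falling back to the status code.
async function readError(response: Response): Promise<string> {
  try {
    const body = (await response.json()) as { error?: { message?: string } };
    return body.error?.message ?? `HTTP ${response.status}`;
  } catch {
    // Body was empty or not JSON.
    return `HTTP ${response.status}`;
  }
}
```

A typical call site checks response.ok first, then either surfaces readError(response) to the user or retries when isRetryable(response.status) is true.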
Implementing retry with exponential backoff
Transient errors (especially 429 and 502) are best handled with a retry loop that waits longer between each attempt:
async function chatWithRetry(
client: OpenAI,
params: OpenAI.Chat.ChatCompletionCreateParamsNonStreaming,
maxRetries = 4,
): Promise<OpenAI.Chat.ChatCompletion> {
let attempt = 0;
while (true) {
try {
return await client.chat.completions.create(params);
} catch (err: unknown) {
const status =
err instanceof OpenAI.APIError ? err.status : undefined;
const isRetryable = status === 429 || status === 502 || status === 503;
if (!isRetryable || attempt >= maxRetries) {
throw err;
}
const delayMs = Math.min(1000 * 2 ** attempt + Math.random() * 500, 30_000);
console.warn(`Attempt ${attempt + 1} failed (${status}). Retrying in ${Math.round(delayMs)}ms...`);
await new Promise((resolve) => setTimeout(resolve, delayMs));
attempt++;
}
}
}
The OpenAI SDK also has a built-in maxRetries option that handles this automatically. Set it when constructing the client: new OpenAI({ maxRetries: 3, ... }).
Next steps
- Authentication — understand API keys, scopes, and security best practices
- Models — full model catalogue with context lengths and pricing
- API Reference — complete request and response schemas
- Streaming — real-time token-by-token responses via SSE
- Errors — error codes and how to handle them