Quickstart


Get your first response from any LLM in under 2 minutes.


1. Create an account

Go to solrouter.io and sign up. You can register with an email and password or continue with GitHub OAuth.


2. Get an API key

Once logged in, open the Account page and navigate to API Keys. Click Create key, give it a name, and copy the key — it starts with sr_ and is shown only once.

Keep your key secret. Anyone who has it can make requests charged to your balance.


3. Top up your balance

Go to the Account → Balance section and add credits. Payments are accepted via card or crypto. Credits never expire and are deducted only when you actually use tokens.


4. Set up your environment variables

Hard-coding API keys directly in source code is a security risk. The recommended approach is to store your key in an environment variable and load it at runtime.

Creating a .env file

Create a .env file in the root of your project:

SOLROUTER_API_KEY=sr_YOUR_API_KEY

Always add .env to your .gitignore so it is never committed to version control:

echo ".env" >> .gitignore

Loading the key in Node.js

Node.js 20.6+ can load .env files natively with the --env-file flag:

node --env-file=.env index.js

For older Node.js versions, install the dotenv package:

npm install dotenv

Then load it at the top of your entry file before any other imports:

import "dotenv/config";
// or: require("dotenv").config();

const apiKey = process.env.SOLROUTER_API_KEY;
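
Wherever the key is loaded, it is worth failing fast when it is missing rather than discovering a 401 on the first request. A minimal sketch (requireEnv is a hypothetical helper, not part of any SDK):

```typescript
// Hypothetical helper: fail fast at startup if a required variable is
// missing, instead of hitting an authentication error on the first request.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage: const apiKey = requireEnv("SOLROUTER_API_KEY");
```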

Loading the key in Python

Install python-dotenv:

pip install python-dotenv

Then load the .env file at the start of your script:

from dotenv import load_dotenv
import os

load_dotenv()

api_key = os.environ["SOLROUTER_API_KEY"]

Loading the key in Next.js

Next.js has built-in .env support — no extra packages needed. Create a .env.local file (which is already git-ignored by the Next.js default .gitignore):

SOLROUTER_API_KEY=sr_YOUR_API_KEY

Access it in server-side code (API routes, Server Actions, Route Handlers):

const apiKey = process.env.SOLROUTER_API_KEY;

Important: Never expose your key to the browser. In Next.js, only variables prefixed with NEXT_PUBLIC_ are bundled into the client. Your SOLROUTER_API_KEY must not have that prefix.


5. Make your first request

The SolRouter API is fully OpenAI-compatible. Point your existing code at https://api.solrouter.io/v1 and swap in your sr_ key — nothing else needs to change.

Using the OpenAI SDK

npm install openai

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.solrouter.io/v1",
  apiKey: process.env.SOLROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [
    {
      role: "user",
      content: "What is the meaning of life?",
    },
  ],
});

console.log(completion.choices[0].message.content);

Using fetch directly

const response = await fetch("https://api.solrouter.io/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.SOLROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "openai/gpt-4o-mini",
    messages: [
      {
        role: "user",
        content: "What is the meaning of life?",
      },
    ],
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);
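
Note that fetch does not throw on HTTP error statuses, so it is worth checking response.ok before reading choices; failed requests return an error object instead (see section 10). A defensive sketch (readCompletion is a hypothetical helper, typed structurally so it works with any fetch-like response):

```typescript
// Hypothetical helper: check the HTTP status before trusting the body, since
// failed requests return an { error: { message } } shape instead of choices.
// HttpResponse is a minimal structural type matching what fetch returns.
type HttpResponse = { ok: boolean; status: number; json(): Promise<any> };

async function readCompletion(response: HttpResponse): Promise<string> {
  const data = await response.json();
  if (!response.ok) {
    throw new Error(data?.error?.message ?? `HTTP ${response.status}`);
  }
  return data.choices[0].message.content;
}
```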

Using Python

from openai import OpenAI
import os
from dotenv import load_dotenv

load_dotenv()

client = OpenAI(
    base_url="https://api.solrouter.io/v1",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "What is the meaning of life?",
        }
    ],
)

print(completion.choices[0].message.content)

Using curl

curl https://api.solrouter.io/v1/chat/completions \
  -H "Authorization: Bearer $SOLROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
  }'

6. Understanding the response

Every successful call to /v1/chat/completions returns a JSON object. Here is a fully annotated example:

{
  "id": "chatcmpl-a1b2c3d4e5f6",
  "object": "chat.completion",
  "created": 1748000000,
  "model": "openai/gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The meaning of life is a deeply philosophical question..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 83,
    "total_tokens": 98
  }
}

Field-by-field breakdown

| Field | Type | Description |
|---|---|---|
| id | string | A unique identifier for this completion. Useful for logging and debugging. |
| object | string | Always "chat.completion" for non-streaming requests. Streaming chunks use "chat.completion.chunk". |
| created | number | Unix timestamp (seconds since epoch) of when the completion was generated. |
| model | string | The exact model that served the request, including the provider prefix (e.g. openai/gpt-4o-mini). |
| choices | array | An array of completion candidates. By default there is one element. Set n in your request to get multiple. |
| choices[n].index | number | Zero-based index of this choice in the array. |
| choices[n].message.role | string | Always "assistant" for responses. |
| choices[n].message.content | string \| null | The text generated by the model. null when the model calls a tool instead of responding with text. |
| choices[n].finish_reason | string | Why the model stopped. See the table below. |
| usage.prompt_tokens | number | Number of tokens consumed by your input messages. |
| usage.completion_tokens | number | Number of tokens generated by the model. |
| usage.total_tokens | number | Sum of prompt and completion tokens. This is what gets billed. |
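
Since total_tokens is what gets billed, it can be useful to accumulate the usage object across requests for per-session logging or budgeting. A minimal sketch (addUsage is a hypothetical helper):

```typescript
// Hypothetical helper: accumulate token usage across several completions.
// The Usage shape matches the usage object in the response above.
type Usage = { prompt_tokens: number; completion_tokens: number; total_tokens: number };

function addUsage(total: Usage, next: Usage): Usage {
  return {
    prompt_tokens: total.prompt_tokens + next.prompt_tokens,
    completion_tokens: total.completion_tokens + next.completion_tokens,
    total_tokens: total.total_tokens + next.total_tokens,
  };
}
```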

finish_reason values

| Value | Meaning |
|---|---|
| stop | The model reached a natural stopping point or a stop sequence you specified. |
| length | The output was cut off because it hit the max_tokens limit. Increase max_tokens if you need longer responses. |
| tool_calls | The model decided to call one or more tools instead of responding with text. |
| content_filter | The output was blocked by a content policy. |
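
In practice it pays to check finish_reason before using the content, so a truncated or filtered response is not mistaken for a complete one. A sketch (resolveContent is a hypothetical helper; tool calls would need their own handling):

```typescript
// Hypothetical helper: branch on finish_reason so a truncated or filtered
// response is not silently treated as a complete answer.
function resolveContent(finishReason: string, content: string | null): string {
  switch (finishReason) {
    case "stop":
      return content ?? "";
    case "length":
      throw new Error("Response truncated: raise max_tokens or shorten the prompt.");
    case "content_filter":
      throw new Error("Response blocked by the provider's content policy.");
    default:
      // tool_calls and any future values are out of scope for this sketch
      throw new Error(`Unhandled finish_reason: ${finishReason}`);
  }
}
```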

Accessing the response in TypeScript

The OpenAI SDK ships with full TypeScript types, so you get autocomplete and safety for free:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.solrouter.io/v1",
  apiKey: process.env.SOLROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "What is the meaning of life?" }],
});

const message = completion.choices[0].message.content; // string | null
const tokenCount = completion.usage?.total_tokens;      // number | undefined
const reason = completion.choices[0].finish_reason;     // "stop" | "length" | ...

console.log(`Response: ${message}`);
console.log(`Tokens used: ${tokenCount}`);
console.log(`Finished because: ${reason}`);

7. Pick a model

Replace openai/gpt-4o-mini with any model ID from the Models page. Model IDs follow the provider/model-name format, for example:

| Model | ID |
|---|---|
| GPT-4o | openai/gpt-4o |
| GPT-4o mini | openai/gpt-4o-mini |
| Claude Sonnet 4 | anthropic/claude-sonnet-4 |
| Claude Haiku 3.5 | anthropic/claude-haiku-3-5 |
| Gemini 2.5 Pro | google/gemini-2.5-pro |
| Gemini 2.5 Flash | google/gemini-2.5-flash |
| Llama 4 Maverick | meta-llama/llama-4-maverick |
| DeepSeek R2 | deepseek/deepseek-r2 |

Browse the full list and pricing on the Models page.


8. System prompts

A system prompt is a special message with role: "system" that you place at the beginning of the messages array. It lets you give the model persistent instructions that shape its personality, tone, output format, or area of expertise — without those instructions appearing in the conversation visible to the user.

System messages are supported by all major models. If a model does not natively support a system role, SolRouter automatically converts it to a leading user message so your code does not need to change.

Basic system prompt

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.solrouter.io/v1",
  apiKey: process.env.SOLROUTER_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [
    {
      role: "system",
      content:
        "You are a concise technical assistant. Always respond in plain text. " +
        "Keep answers under 3 sentences. Do not use bullet points or markdown.",
    },
    {
      role: "user",
      content: "What is a closure in JavaScript?",
    },
  ],
});

console.log(completion.choices[0].message.content);

Practical example — customer support bot

Here is a more realistic system prompt you might use in a product:

const SYSTEM_PROMPT = `
You are Aria, a friendly and knowledgeable support assistant for Acme Corp.
Your job is to help users with questions about their orders, account settings, and product features.

Guidelines:
- Always greet users by name if they provide it.
- If you do not know the answer, say so clearly and offer to escalate to a human agent.
- Never make up order numbers, dates, or policy details.
- Keep responses short and scannable. Use bullet points for lists of steps.
- Do not discuss topics unrelated to Acme Corp products.
`.trim();

const completion = await client.chat.completions.create({
  model: "anthropic/claude-haiku-3-5",
  messages: [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: "Hi! My order hasn't arrived yet. Order #AX-99812." },
  ],
});

Tips for writing effective system prompts

| Tip | Example |
|---|---|
| Assign a persona | "You are a senior data engineer specializing in SQL." |
| Specify output format | "Always return valid JSON. Never include prose outside the JSON object." |
| Set length constraints | "Limit all responses to 100 words or fewer." |
| Define what NOT to do | "Do not speculate. If uncertain, say 'I don't know'." |
| Provide context/data | Inject user preferences, product info, or relevant documents into the system prompt at request time. |
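
The last tip, injecting context at request time, usually means assembling the system prompt from a template per request. A minimal sketch (buildSystemPrompt and its fields are made up for illustration):

```typescript
// Hypothetical helper: build the system prompt from per-request data.
// The field names (userName, plan) are illustrative, not a real schema.
function buildSystemPrompt(userName: string, plan: string): string {
  return [
    "You are a support assistant for Acme Corp.",
    `The current user is ${userName}, on the ${plan} plan.`,
    "Tailor answers about features to what their plan includes.",
  ].join("\n");
}

// Pass the result as the system message on each request:
// { role: "system", content: buildSystemPrompt("Dana", "Pro") }
```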

9. Conversation history

LLMs are stateless — they have no memory of previous requests. To build a multi-turn conversation, you must send the full history of messages (both user turns and assistant replies) on every request.

The pattern is simple:

  1. Start with your system prompt and the user's first message.
  2. After each response, append the assistant's reply to your local messages array.
  3. When the user sends a new message, append it and send the whole array again.

TypeScript example

import OpenAI from "openai";
import * as readline from "readline";

const client = new OpenAI({
  baseURL: "https://api.solrouter.io/v1",
  apiKey: process.env.SOLROUTER_API_KEY,
});

type Message = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Seed the conversation with a system prompt
const messages: Message[] = [
  {
    role: "system",
    content: "You are a helpful assistant. Be concise and friendly.",
  },
];

async function chat(userInput: string): Promise<string> {
  // 1. Append the new user message
  messages.push({ role: "user", content: userInput });

  // 2. Send the full history to the API
  const completion = await client.chat.completions.create({
    model: "openai/gpt-4o-mini",
    messages,
  });

  const assistantMessage = completion.choices[0].message.content ?? "";

  // 3. Append the assistant's reply so the next turn has full context
  messages.push({ role: "assistant", content: assistantMessage });

  return assistantMessage;
}

// Simple interactive REPL for demonstration
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

function prompt() {
  rl.question("You: ", async (input) => {
    if (input.toLowerCase() === "exit") {
      rl.close();
      return;
    }
    const reply = await chat(input);
    console.log(`\nAssistant: ${reply}\n`);
    prompt();
  });
}

console.log('Chat started. Type "exit" to quit.\n');
prompt();

Managing context length

Every model has a maximum context window — the total number of tokens it can process in a single request (prompt + completion combined). As a conversation grows, it will eventually exceed this limit.
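
To know when you are approaching the limit, you need a token count. A rough character-based heuristic is often enough for trimming decisions; this is an approximation, not a real tokenizer (English text averages about 4 characters per token, and the ratio varies by language and model):

```typescript
// Rough heuristic, not a real tokenizer: English text averages roughly
// 4 characters per token. Use a tokenizer library when exact counts matter.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Estimate the total prompt size of a message history before sending it.
function estimateHistoryTokens(messages: { content: string }[]): number {
  return messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
}
```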

Common strategies to handle long conversations:

| Strategy | How it works | Trade-off |
|---|---|---|
| Sliding window | Keep only the last N messages, always preserving the system prompt. | Simple, but the model loses early context. |
| Summarization | When the window fills up, ask the model to summarize the conversation so far, then replace old messages with the summary. | Retains key facts but adds latency and cost. |
| Selective pruning | Remove messages that are unlikely to be relevant (e.g. tool call results after they've been acted on). | Requires domain-specific logic. |
| Larger context window | Switch to a model with a larger context window. | Higher cost per request. |

Sliding window example

const MAX_HISTORY_MESSAGES = 20; // keep the last 20 messages (10 user/assistant exchanges)

function trimHistory(messages: Message[]): Message[] {
  const [systemPrompt, ...rest] = messages;
  const trimmed = rest.slice(-MAX_HISTORY_MESSAGES);
  return [systemPrompt, ...trimmed];
}

// Use trimHistory before each API call:
const completion = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: trimHistory(messages),
});
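
The summarization strategy from the table can be sketched the same way. Here compactHistory and its summarize callback are hypothetical; in a real app, summarize would itself call the completions API with a "summarize this conversation" prompt:

```typescript
// Hypothetical sketch of the summarization strategy: once history grows past
// a threshold, collapse older messages into a single summary message while
// keeping the system prompt and the most recent turns intact.
type Message = { role: "system" | "user" | "assistant"; content: string };

async function compactHistory(
  messages: Message[],
  summarize: (transcript: string) => Promise<string>,
  keepRecent = 6,
): Promise<Message[]> {
  const [systemPrompt, ...rest] = messages;
  if (rest.length <= keepRecent) return messages; // nothing to compact yet

  const older = rest.slice(0, rest.length - keepRecent);
  const recent = rest.slice(-keepRecent);
  const transcript = older.map((m) => `${m.role}: ${m.content}`).join("\n");
  const summary = await summarize(transcript);

  return [
    systemPrompt,
    { role: "system", content: `Summary of the earlier conversation:\n${summary}` },
    ...recent,
  ];
}
```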

10. Troubleshooting common errors

When something goes wrong, the API returns a standard HTTP status code along with a JSON body describing the error:

{
  "error": {
    "message": "Invalid API key provided.",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
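
When using fetch directly, a small type guard makes it easy to branch on this shape, since error responses still parse as valid JSON. A sketch (isApiError is a hypothetical helper):

```typescript
// Hypothetical helper: a type guard for the error body shape above.
type ApiError = { error: { message: string; type: string; code: string } };

function isApiError(body: unknown): body is ApiError {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as { error?: { message?: unknown } }).error;
  return typeof err === "object" && err !== null && typeof err.message === "string";
}
```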

Error reference

| HTTP Status | type | Likely cause | How to fix |
|---|---|---|---|
| 401 Unauthorized | authentication_error | Missing, malformed, or revoked API key. | Check that you are sending the header Authorization: Bearer sr_YOUR_KEY and that the key has not been deleted. |
| 400 Bad Request | invalid_request_error | The request body is malformed: missing required fields, wrong types, or an invalid model ID. | Inspect the error.message field; it usually names the offending parameter. Verify the model ID against the Models page. |
| 402 Payment Required | insufficient_credits | Your account balance has run out. | Top up your balance in Account → Balance. |
| 429 Too Many Requests | rate_limit_error | You have exceeded your requests-per-minute or tokens-per-minute limit. | Implement exponential backoff and retry. Consider upgrading your plan for higher rate limits. |
| 502 Bad Gateway | upstream_error | The upstream model provider returned an error or timed out. | Retry the request. If the issue persists for a specific model, check the status page or try an equivalent model from a different provider. |

Implementing retry with exponential backoff

Transient errors (especially 429 and 502) are best handled with a retry loop that waits longer between each attempt:

async function chatWithRetry(
  client: OpenAI,
  params: OpenAI.Chat.ChatCompletionCreateParamsNonStreaming,
  maxRetries = 4,
): Promise<OpenAI.Chat.ChatCompletion> {
  let attempt = 0;

  while (true) {
    try {
      return await client.chat.completions.create(params);
    } catch (err: unknown) {
      const status =
        err instanceof OpenAI.APIError ? err.status : undefined;

      const isRetryable = status === 429 || status === 502 || status === 503;

      if (!isRetryable || attempt >= maxRetries) {
        throw err;
      }

      const delayMs = Math.min(1000 * 2 ** attempt + Math.random() * 500, 30_000);
      console.warn(`Attempt ${attempt + 1} failed (${status}). Retrying in ${Math.round(delayMs)}ms...`);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      attempt++;
    }
  }
}

The OpenAI SDK also has a built-in maxRetries option that handles this automatically. Set it when constructing the client: new OpenAI({ maxRetries: 3, ... }).


Next steps

  • Authentication — understand API keys, scopes, and security best practices
  • Models — full model catalogue with context lengths and pricing
  • API Reference — complete request and response schemas
  • Streaming — real-time token-by-token responses via SSE
  • Errors — error codes and how to handle them