Python Guide


This guide shows how to use SolRouter from Python for chat, streaming, structured output, tool calling, and multimodal workflows.

SolRouter exposes an OpenAI-compatible API, which means you can use any of the following:

  • the official openai Python SDK
  • plain httpx or requests
  • validation libraries like Pydantic
  • async frameworks such as FastAPI

Base URL

https://api.solrouter.io/ai

Installation and environment setup

Install the OpenAI Python SDK:

pip install openai

Set your API key in an environment variable:

export SOLROUTER_API_KEY=sr_your_api_key

Or load it from a .env file with python-dotenv:

pip install python-dotenv

Then load it in your code:

from dotenv import load_dotenv
import os

load_dotenv()

api_key = os.environ["SOLROUTER_API_KEY"]

Basic chat completion

The simplest way to call SolRouter from Python is through the OpenAI SDK.

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain what an API gateway does."},
    ],
)

print(completion.choices[0].message.content)
print(completion.usage)

What changed from a standard OpenAI setup

Only two things:

  • base_url="https://api.solrouter.io/ai"
  • your SolRouter key with the sr_ prefix

Everything else stays effectively the same.


Using httpx directly

If you prefer not to use the SDK, you can call the API with httpx.

Install it:

pip install httpx

Then send a request:

import httpx
import os

url = "https://api.solrouter.io/ai/chat/completions"

headers = {
    "Authorization": f"Bearer {os.environ['SOLROUTER_API_KEY']}",
    "Content-Type": "application/json",
}

payload = {
    "model": "anthropic/claude-sonnet-4",
    "messages": [
        {"role": "user", "content": "Write a one-sentence summary of structured output."}
    ],
}

response = httpx.post(url, headers=headers, json=payload, timeout=60.0)
response.raise_for_status()

data = response.json()

print(data["choices"][0]["message"]["content"])
print(data["usage"])

This is useful when you want full control over HTTP behavior, timeouts, retries, or custom middleware.


Inspecting the response

A typical successful response includes:

  • choices
  • message.content
  • finish_reason
  • usage

Example:

completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Say hello in one sentence."}
    ],
)

message = completion.choices[0].message.content
finish_reason = completion.choices[0].finish_reason
usage = completion.usage

print("Message:", message)
print("Finish reason:", finish_reason)
print("Prompt tokens:", usage.prompt_tokens)
print("Completion tokens:", usage.completion_tokens)
print("Total tokens:", usage.total_tokens)

If supported by the selected model and route, usage may also include a cost field in the raw response payload.


Streaming responses

Streaming is ideal for chat interfaces, long generations, and responsive UIs.

Streaming with the OpenAI SDK

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    stream=True,
    messages=[
        {"role": "user", "content": "Write a short paragraph about low-latency APIs."}
    ],
)

for chunk in stream:
    if not chunk.choices:
        continue  # some chunks (for example, a trailing usage chunk) carry no choices
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)

Streaming with httpx

import json
import os
import httpx

url = "https://api.solrouter.io/ai/chat/completions"

headers = {
    "Authorization": f"Bearer {os.environ['SOLROUTER_API_KEY']}",
    "Content-Type": "application/json",
    "Accept": "text/event-stream",
}

payload = {
    "model": "openai/gpt-4o-mini",
    "stream": True,
    "messages": [
        {"role": "user", "content": "Explain streaming in two sentences."}
    ],
}

full_text = ""

with httpx.stream("POST", url, headers=headers, json=payload, timeout=60.0) as response:
    response.raise_for_status()

    for line in response.iter_lines():
        if not line or not line.startswith("data: "):
            continue

        data = line[6:]

        if data == "[DONE]":
            break

        chunk = json.loads(data)
        delta = chunk.get("choices", [{}])[0].get("delta", {}).get("content", "")

        if delta:
            full_text += delta
            print(delta, end="", flush=True)

print("\n")
print("Final text:", full_text)

Practical streaming tips

  • always check the initial HTTP status before reading the stream
  • handle partial output gracefully
  • do not assume every network chunk contains a complete JSON object
  • expect the final usage data near the end of the stream, not the beginning

For deeper streaming behavior, see Streaming.
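httpx's iter_lines() already buffers partial lines for you, but if you ever read the raw stream yourself, the buffering the third tip calls for can be sketched as a small generator (illustrative code, not part of any SDK):

```python
import json

def iter_sse_events(raw_chunks):
    """Yield parsed JSON payloads from an SSE text stream.

    Buffers input so a JSON object split across network chunks is
    only parsed once the full "data:" line has arrived.
    """
    buffer = ""
    for raw in raw_chunks:
        buffer += raw
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            line = line.strip()
            if not line.startswith("data: "):
                continue
            data = line[len("data: "):]
            if data == "[DONE]":
                return
            yield json.loads(data)

# One JSON object deliberately split across two "network" chunks:
chunks = [
    'data: {"choices": [{"delta": {"content": "Hel',
    'lo"}}]}\n',
    "data: [DONE]\n",
]
events = list(iter_sse_events(chunks))
print(events[0]["choices"][0]["delta"]["content"])  # Hello
```

The key point: parse only complete lines, and keep the remainder in the buffer for the next chunk.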


Structured output with Pydantic

Structured output is one of the strongest Python integration patterns because it combines model generation with runtime validation.

Install Pydantic if you do not already have it (the email extra is needed for EmailStr validation):

pip install "pydantic[email]"

Example: extracting a typed contact record

from openai import OpenAI
from pydantic import BaseModel, EmailStr
import json
import os

class ContactRecord(BaseModel):
    name: str
    email: EmailStr
    company: str

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

completion = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Extract name, email, and company from: Sarah Chen, sarah@acme.io, Acme Labs"
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "contact_record",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string"},
                    "company": {"type": "string"}
                },
                "required": ["name", "email", "company"],
                "additionalProperties": False
            }
        }
    }
)

raw = completion.choices[0].message.content or "{}"
parsed = ContactRecord.model_validate(json.loads(raw))

print(parsed)

Why this pattern is strong

  • the model is guided into a predictable shape
  • your application validates the result before use
  • malformed output fails early and safely
  • the validated object can go straight into business logic

For more detail, see Structured Output.
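Failing early in practice means catching the validation error and deciding what to do before bad data spreads. A minimal sketch (using plain str for the email field so it runs without the email extra; parse_contact is an illustrative name):

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

class ContactRecord(BaseModel):
    name: str
    email: str  # plain str here so the sketch needs no email extra
    company: str

def parse_contact(raw: str) -> Optional[ContactRecord]:
    """Validate model output before it reaches business logic."""
    try:
        return ContactRecord.model_validate_json(raw)
    except ValidationError:
        return None  # fail early: log, retry the request, or surface an error

good = parse_contact('{"name": "Sarah Chen", "email": "sarah@acme.io", "company": "Acme Labs"}')
bad = parse_contact('{"name": "Sarah Chen"}')  # missing required fields

print(good.company)  # Acme Labs
print(bad)           # None
```

In production you might retry the model call with the validation error appended to the prompt instead of returning None.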


Tool calling in Python

Tool calling lets the model request a function that your Python application executes.

Example workflow

from openai import OpenAI
import json
import os

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Returns the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"],
                "additionalProperties": False
            }
        }
    }
]

def get_weather(city: str) -> dict:
    return {
        "city": city,
        "temperature_c": 18,
        "condition": "Cloudy",
        "wind_kph": 12,
    }

first = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    tools=tools,
)

assistant_message = first.choices[0].message

if not assistant_message.tool_calls:
    raise RuntimeError("Model answered directly instead of calling a tool")

tool_call = assistant_message.tool_calls[0]

args = json.loads(tool_call.function.arguments)
tool_result = get_weather(args["city"])

second = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What's the weather in Berlin?"},
        assistant_message,
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(tool_result),
        },
    ],
    tools=tools,
)

print(second.choices[0].message.content)

Best practices for Python tool calling

  • validate arguments before executing a function
  • never dynamically dispatch arbitrary tool names without checks
  • return structured JSON for tool results
  • keep tools focused and narrow
  • run tools server-side, not in public client code

For the full workflow, see Tool Calling.
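The first two best practices above can be combined in a small dispatch registry: only registered tool names are callable, and arguments are validated with Pydantic before any function runs (a sketch with illustrative names):

```python
from pydantic import BaseModel, ValidationError

class WeatherArgs(BaseModel):
    city: str

def run_get_weather(args: WeatherArgs) -> dict:
    return {"city": args.city, "temperature_c": 18}

# Explicit registry: only tool names listed here can ever be dispatched.
TOOL_REGISTRY = {
    "get_weather": (WeatherArgs, run_get_weather),
}

def dispatch_tool(name: str, raw_arguments: str) -> dict:
    entry = TOOL_REGISTRY.get(name)
    if entry is None:
        return {"error": f"unknown tool: {name}"}
    schema, handler = entry
    try:
        args = schema.model_validate_json(raw_arguments)
    except ValidationError:
        return {"error": "invalid arguments"}
    return handler(args)

print(dispatch_tool("get_weather", '{"city": "Berlin"}'))
print(dispatch_tool("drop_tables", "{}"))  # rejected: not in the registry
```

Returning a structured error instead of raising lets you feed the failure back to the model as a tool result.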


Vision and multimodal requests

Python is a great fit for extraction workflows involving images, documents, and other media.

Example with a remote image

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

completion = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract the invoice number and total from this image."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/invoice.jpg",
                        "detail": "high"
                    }
                }
            ]
        }
    ]
)

print(completion.choices[0].message.content)
print(completion.usage)

Example with a local image file

import base64
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

with open("invoice.jpg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

data_url = f"data:image/jpeg;base64,{encoded}"

completion = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Extract the invoice number and total from this image."
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": data_url,
                        "detail": "high"
                    }
                }
            ]
        }
    ]
)

print(completion.choices[0].message.content)

When to use multimodal from Python

Python is especially strong for:

  • OCR pipelines
  • document extraction
  • batch processing
  • data enrichment
  • backend automation jobs
  • ETL-style AI workflows

For modality-specific details, see Vision & Multimodal.
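In batch pipelines it helps to factor the base64 encoding shown above into a small helper that turns raw bytes into a ready-to-use content part (a sketch; image_part is an illustrative name):

```python
import base64

def image_part(data: bytes, mime: str = "image/jpeg", detail: str = "high") -> dict:
    """Build an image_url content part from raw image bytes."""
    encoded = base64.b64encode(data).decode("utf-8")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}", "detail": detail},
    }

# In a batch job you would read each file and append one part per image:
part = image_part(b"\xff\xd8\xff")  # first bytes of a JPEG, for illustration
print(part["image_url"]["url"])  # data:image/jpeg;base64,/9j/
```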


Building a reusable Python client wrapper

In a real application, it helps to centralize client setup and request defaults.

from openai import OpenAI
import os

class SolRouterClient:
    def __init__(self):
        api_key = os.environ.get("SOLROUTER_API_KEY")
        if not api_key:
            raise RuntimeError("Missing SOLROUTER_API_KEY")

        self.client = OpenAI(
            base_url="https://api.solrouter.io/ai",
            api_key=api_key,
        )

    def chat(self, prompt: str, model: str = "openai/gpt-4o-mini") -> str:
        completion = self.client.chat.completions.create(
            model=model,
            messages=[
                {"role": "user", "content": prompt}
            ],
        )
        return completion.choices[0].message.content or ""

solrouter = SolRouterClient()
print(solrouter.chat("Explain what a context window is."))

You can extend this wrapper with:

  • retries
  • logging
  • tracing
  • default system prompts
  • schema validation helpers
  • multimodal utilities
  • rate limiting

Retry strategy with httpx

Transient failures like 408, 429, 500, 502, and 503 should usually be retried with exponential backoff.

import time
import httpx
import os

def request_with_retry(payload: dict, retries: int = 3):
    url = "https://api.solrouter.io/ai/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ['SOLROUTER_API_KEY']}",
        "Content-Type": "application/json",
    }

    attempt = 0

    while True:
        response = httpx.post(url, headers=headers, json=payload, timeout=60.0)

        if response.status_code < 400:
            return response

        retryable = response.status_code in {408, 429, 500, 502, 503}

        if not retryable:
            return response

        attempt += 1
        if attempt > retries:
            return response

        delay = 0.5 * (2 ** (attempt - 1))
        time.sleep(delay)

payload = {
    "model": "openai/gpt-4o-mini",
    "messages": [
        {"role": "user", "content": "Give me three short tips for writing CLI tools."}
    ],
}

response = request_with_retry(payload)
print(response.status_code)
print(response.json())

When not to retry

Do not automatically retry:

  • malformed requests
  • authentication failures
  • insufficient balance errors
  • invalid schemas
  • model-not-found errors

For more detail, see Errors.


FastAPI integration pattern

Python teams often use SolRouter inside FastAPI services.

Install FastAPI and Uvicorn:

pip install fastapi uvicorn openai

Example FastAPI endpoint

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from openai import OpenAI
import os

app = FastAPI()

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

class ChatRequest(BaseModel):
    prompt: str

@app.post("/chat")
def chat(req: ChatRequest):
    try:
        completion = client.chat.completions.create(
            model="openai/gpt-4o-mini",
            messages=[
                {"role": "user", "content": req.prompt}
            ],
        )
        return {
            "content": completion.choices[0].message.content,
            "usage": completion.usage,
        }
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))

Run it with:

uvicorn main:app --reload

This pattern is useful when you want:

  • your own backend auth
  • server-side API key isolation
  • request logging
  • internal rate limiting
  • usage metering
  • workflow orchestration

Common mistakes

1. Forgetting base_url

If you omit the SolRouter base URL, the SDK will send requests to the default OpenAI endpoint instead of SolRouter.

Wrong:

client = OpenAI(api_key=os.environ["SOLROUTER_API_KEY"])

Correct:

client = OpenAI(
    base_url="https://api.solrouter.io/ai",
    api_key=os.environ["SOLROUTER_API_KEY"],
)

2. Not validating structured output

Always validate JSON output before using it in your business logic.

3. Treating tool arguments as trusted

Tool call arguments are model-generated and must be validated.

4. Sending unsupported modalities to text-only models

Always verify that the selected model supports image, file, audio, or video input.

5. Logging secrets accidentally

Do not log:

  • raw API keys
  • bearer headers
  • private documents
  • user-uploaded sensitive media
  • full sensitive prompts unless strictly necessary

6. Retrying non-retryable failures

Do not keep retrying malformed payloads or insufficient-balance errors.

7. Ignoring token usage

Python data pipelines can quietly become expensive if you do not track usage and cost.
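A minimal way to keep usage visible is to accumulate token counts per model as responses come back (a sketch; UsageTracker is an illustrative name):

```python
from collections import defaultdict

class UsageTracker:
    """Accumulate token counts per model so pipeline cost stays visible."""

    def __init__(self):
        self.totals = defaultdict(lambda: {"prompt": 0, "completion": 0})

    def record(self, model: str, prompt_tokens: int, completion_tokens: int) -> None:
        self.totals[model]["prompt"] += prompt_tokens
        self.totals[model]["completion"] += completion_tokens

tracker = UsageTracker()
# After each call: tracker.record(model, usage.prompt_tokens, usage.completion_tokens)
tracker.record("openai/gpt-4o-mini", 120, 45)
tracker.record("openai/gpt-4o-mini", 80, 30)
print(tracker.totals["openai/gpt-4o-mini"])  # {'prompt': 200, 'completion': 75}
```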


Recommended production pattern

A strong Python production stack often looks like this:

  • OpenAI SDK for API compatibility and convenience
  • Pydantic for validation
  • httpx where lower-level control is needed
  • FastAPI for service integration
  • environment variables for secrets
  • server-side execution only for private workflows
  • structured logging for error diagnostics
  • retry with backoff for transient failures

A practical architecture

FastAPI / worker / job runner
        ↓
  validated request payload
        ↓
  SolRouter client wrapper
        ↓
 https://api.solrouter.io/ai
        ↓
 parsed + validated response
        ↓
 database / API / UI

Minimal robust helper

from openai import OpenAI
import os
import json

class ChatFailure(Exception):
    pass

class SolRouter:
    def __init__(self):
        api_key = os.environ.get("SOLROUTER_API_KEY")
        if not api_key:
            raise RuntimeError("Missing SOLROUTER_API_KEY")

        self.client = OpenAI(
            base_url="https://api.solrouter.io/ai",
            api_key=api_key,
        )

    def chat(self, messages, model="openai/gpt-4o-mini"):
        try:
            completion = self.client.chat.completions.create(
                model=model,
                messages=messages,
            )
            return {
                "content": completion.choices[0].message.content,
                "usage": completion.usage,
            }
        except Exception as exc:
            raise ChatFailure(str(exc)) from exc

solrouter = SolRouter()

result = solrouter.chat([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain what retry with backoff means."},
])

print(json.dumps(result, default=str, indent=2))

This gives you one clean place to:

  • set defaults
  • add metrics
  • normalize exceptions
  • inject tracing
  • attach retry logic
  • standardize logging

Next steps