LangChain Guide
This guide shows how to use SolRouter with LangChain for chat applications, chains, structured extraction, agents, tool calling, streaming, and retrieval workflows.
Because SolRouter exposes an OpenAI-compatible API, LangChain integrations are straightforward: you point your model client at the SolRouter base URL and authenticate with your sr_ API key.
Base URL
https://api.solrouter.io/ai
Core integration idea
LangChain is responsible for orchestration:
- prompt templates
- memory
- retrieval
- tools
- agents
- output parsing
SolRouter is responsible for model routing:
- unified access to many model providers
- consistent API surface
- token usage accounting
- model fallback
- one authentication scheme
That means your LangChain code can stay mostly the same while you switch models by changing the model value.
Installation
Install the packages you need:
npm install @langchain/openai @langchain/core openai
If you plan to use Zod-based structured output:
npm install zod
If you want retrieval or vector workflows, add the LangChain packages relevant to your stack.
Environment setup
Store your API key in an environment variable:
SOLROUTER_API_KEY=sr_your_api_key
Load it in your app the normal way for your runtime.
Node.js example
const apiKey = process.env.SOLROUTER_API_KEY;
if (!apiKey) {
throw new Error("Missing SOLROUTER_API_KEY");
}
Keep your SolRouter API key server-side. Do not expose it in browser bundles or public frontend code.
Basic ChatOpenAI setup
For most LangChain use cases, the simplest integration is ChatOpenAI.
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
temperature: 0.2,
});
Why this works
LangChain's OpenAI-compatible chat model only needs:
- an API key
- a base URL
- a model name
SolRouter provides all three:
- an API key with the sr_ prefix
- the base URL https://api.solrouter.io/ai
- model IDs like openai/gpt-4o-mini, anthropic/claude-sonnet-4, and google/gemini-2.5-pro
First LangChain invocation
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
const response = await llm.invoke([
{
role: "system",
content: "You are a concise technical assistant.",
},
{
role: "user",
content: "Explain what a routing layer does in front of language models.",
},
]);
console.log(response.content);
This is the best starting point for:
- chat interfaces
- simple assistants
- prompt experimentation
- quick internal tools
Switching models is easy
One of the biggest benefits of using SolRouter with LangChain is that the rest of your chain code usually stays the same while the model changes.
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "anthropic/claude-sonnet-4",
});
Or:
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "google/gemini-2.5-pro",
});
Or a free model:
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "arcee-ai/trinity-mini:free",
});
This is especially useful when you want to:
- compare quality across providers
- optimize for price
- swap in reasoning models
- test fallback strategies
- route different workloads to different models
Prompt templates
LangChain prompt templates work exactly as usual.
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
const prompt = ChatPromptTemplate.fromMessages([
["system", "You are a helpful assistant for software teams."],
["human", "Explain the following concept in simple terms: {topic}"],
]);
const chain = prompt.pipe(llm);
const result = await chain.invoke({
topic: "context windows",
});
console.log(result.content);
This pattern is ideal for:
- reusable prompt logic
- strongly structured app behavior
- safer prompt composition
- cleaner chain definitions
Output parsing
LangChain output parsers fit naturally with SolRouter responses.
Simple string output
import { StringOutputParser } from "@langchain/core/output_parsers";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
const prompt = ChatPromptTemplate.fromMessages([
["system", "You are concise."],
["human", "Summarize the idea of tool calling in one paragraph."],
]);
const chain = prompt.pipe(llm).pipe(new StringOutputParser());
const output = await chain.invoke({});
console.log(output);
Why output parsers help
They are useful when you want:
- plain text normalization
- JSON extraction
- structured object parsing
- downstream validation
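For JSON extraction in particular, models sometimes wrap their JSON in prose or markdown code fences. A small defensive extractor can normalize the text before parsing. This is a hypothetical helper for illustration, not part of LangChain:

```typescript
// Extract the first JSON object from model output that may include
// surrounding prose or markdown code fences. Illustrative helper only.
function extractJson(text: string): unknown {
  // Strip a markdown code fence if one is present
  const fenced = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : text;
  // Take the outermost object braces and parse the slice between them
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end === -1 || end < start) {
    throw new Error("No JSON object found in model output");
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

For anything beyond quick experiments, prefer the schema-validated approach in the next section over hand-rolled extraction.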
Structured output with Zod
Structured extraction is one of the strongest LangChain + SolRouter patterns.
import { z } from "zod";
import { ChatOpenAI } from "@langchain/openai";
const TicketSchema = z.object({
category: z.enum(["billing", "technical", "account"]),
email: z.string().email(),
summary: z.string(),
});
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
temperature: 0,
});
const structuredLlm = llm.withStructuredOutput(TicketSchema);
const result = await structuredLlm.invoke(
"Create a support ticket from: I was billed twice for my plan. My email is alex@example.com."
);
console.log(result);
Why this pattern is powerful
You get:
- model generation
- schema guidance
- runtime validation
- typed output for your app
This is ideal for:
- support ticket extraction
- CRM enrichment
- moderation labels
- invoice parsing
- classification pipelines
If your LangChain version maps structured output through provider-native schema features, SolRouter still works because it exposes an OpenAI-compatible interface.
Tool calling with LangChain
LangChain tools work well with SolRouter because tool calling is exposed through standard chat completions behavior.
Example with a simple tool
import { z } from "zod";
import { tool } from "@langchain/core/tools";
import { ChatOpenAI } from "@langchain/openai";
const getWeather = tool(
async ({ city }) => {
return JSON.stringify({
city,
temperature_c: 18,
condition: "Cloudy",
});
},
{
name: "get_weather",
description: "Returns the current weather for a city",
schema: z.object({
city: z.string().describe("City name"),
}),
}
);
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
const llmWithTools = llm.bindTools([getWeather]);
const result = await llmWithTools.invoke(
"What's the weather in Berlin?"
);
console.log(result);
Important note
Binding tools enables the model to request them, but your full application flow still needs to:
- inspect tool calls
- execute tool logic
- pass tool results back where needed
Depending on your LangChain stack, that may be handled by:
- a simple manual loop
- an agent executor
- LangGraph
- custom orchestration code
For general tool semantics, see Tool Calling.
Manual tool execution loop
If you want explicit control, a manual loop is often the safest production pattern.
import { z } from "zod";
import { tool } from "@langchain/core/tools";
import { AIMessage, HumanMessage, ToolMessage } from "@langchain/core/messages";
import { ChatOpenAI } from "@langchain/openai";
const getWeather = tool(
async ({ city }) => {
return JSON.stringify({
city,
temperature_c: 18,
condition: "Cloudy",
});
},
{
name: "get_weather",
description: "Returns the current weather for a city",
schema: z.object({
city: z.string(),
}),
}
);
const toolsByName = {
get_weather: getWeather,
};
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
const llmWithTools = llm.bindTools([getWeather]);
const messages = [
new HumanMessage("What's the weather in Berlin?"),
];
const first = await llmWithTools.invoke(messages);
messages.push(first);
if (first.tool_calls?.length) {
for (const call of first.tool_calls) {
const toolImpl = toolsByName[call.name as keyof typeof toolsByName];
if (!toolImpl) {
throw new Error(`Unknown tool: ${call.name}`);
}
const toolResult = await toolImpl.invoke(call.args);
messages.push(
new ToolMessage({
tool_call_id: call.id ?? "",
content: typeof toolResult === "string"
? toolResult
: JSON.stringify(toolResult),
})
);
}
const final = await llmWithTools.invoke(messages);
console.log(final.content);
}
Why this pattern is good
It gives you full control over:
- authorization
- tool validation
- logging
- retries
- side effects
- security boundaries
Streaming with LangChain
LangChain streaming also works cleanly with SolRouter.
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
const stream = await llm.stream([
["human", "Write a short paragraph about resilient APIs."],
]);
for await (const chunk of stream) {
const text = typeof chunk.content === "string"
? chunk.content
: "";
process.stdout.write(text);
}
Streaming tips
- streaming is best for chat UX and long answers
- always handle partial output safely
- keep prompts concise to reduce time-to-first-token
- if you need usage accounting, capture the final aggregated response or inspect the underlying stream events, depending on your stack
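Handling partial output safely usually means accumulating chunk content defensively, since a chunk's content is not guaranteed to be a plain string. A minimal sketch, using a simulated stream in place of llm.stream:

```typescript
// Accumulate streamed chunks into a final string, skipping chunks
// whose content is not a plain string.
type Chunk = { content: unknown };

async function collectStream(stream: AsyncIterable<Chunk>): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const text = typeof chunk.content === "string" ? chunk.content : "";
    full += text; // this is also where you would flush partial text to the UI
  }
  return full;
}

// Simulated stream standing in for `await llm.stream(...)`
async function* fakeStream(): AsyncGenerator<Chunk> {
  yield { content: "Resilient APIs " };
  yield { content: { unexpected: true } }; // non-string chunk is skipped
  yield { content: "degrade gracefully." };
}
```

The same loop works unchanged against a real llm.stream result, since LangChain chunks expose a content field.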
For deeper streaming details, see Streaming.
Chains and LCEL
SolRouter works well with LangChain Expression Language (LCEL).
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "anthropic/claude-sonnet-4",
});
const prompt = ChatPromptTemplate.fromMessages([
["system", "You are an expert documentation assistant."],
["human", "Write a concise explanation of: {topic}"],
]);
const chain = prompt
.pipe(llm)
.pipe(new StringOutputParser());
const result = await chain.invoke({
topic: "fallback routing between models",
});
console.log(result);
This is a strong default pattern for:
- internal assistants
- content generation
- document enrichment
- reusable chain pipelines
Retrieval and RAG
LangChain retrieval pipelines also work normally with SolRouter because the LLM layer is still just a chat model.
Typical RAG flow
documents → embeddings/vector store → retriever → prompt template → SolRouter-backed LLM
Example pattern
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
const prompt = ChatPromptTemplate.fromTemplate(`
Answer the user's question using only the context below.
Context:
{context}
Question:
{question}
`);
const context = `
SolRouter is a unified LLM API layer that exposes an OpenAI-compatible endpoint.
It supports many provider-backed models and uses API keys with the sr_ prefix.
`;
const chain = prompt.pipe(llm);
const response = await chain.invoke({
context,
question: "How is SolRouter authenticated?",
});
console.log(response.content);
Best practices for RAG with SolRouter
- choose a fast, cost-efficient model for retrieval-heavy pipelines
- use stronger models only for difficult synthesis steps
- keep retrieved context focused
- watch token usage on long prompts
- consider fallback models for high-availability retrieval systems
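Keeping retrieved context focused can be enforced with a simple budget when assembling the {context} value. The sketch below uses a rough character-based estimate of about 4 characters per token for English text; this is an approximation for budgeting only, not an exact tokenizer:

```typescript
// Rough token estimate: ~4 characters per token for English text.
// An approximation for budgeting, not an exact tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Assemble retrieved documents into a context string, stopping
// before the estimated token budget is exceeded.
function buildContext(docs: string[], maxTokens: number): string {
  const kept: string[] = [];
  let used = 0;
  for (const doc of docs) {
    const cost = estimateTokens(doc);
    if (used + cost > maxTokens) break;
    kept.push(doc);
    used += cost;
  }
  return kept.join("\n\n");
}
```

If you already depend on a real tokenizer for your target model, prefer it over this heuristic.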
Model selection strategies
One of the biggest advantages of SolRouter in LangChain workflows is model flexibility.
Good model strategy examples
Fast cheap default
model: "openai/gpt-4o-mini"
High-quality synthesis
model: "anthropic/claude-sonnet-4"
Long-context reasoning
model: "google/gemini-2.5-pro"
Budget or experimentation
model: "arcee-ai/trinity-mini:free"
Reasoning-heavy tasks
model: "openai/o3"
Practical routing approach
Use different models for different stages:
- classification → cheap fast model
- retrieval summary → cheap fast model
- final synthesis → stronger model
- hard reasoning → reasoning model
- fallback → free or alternate-provider model
This works especially well in LangChain because the orchestration layer stays the same.
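The staged routing above can be sketched as a small lookup table that the rest of your chain code never sees. The model IDs are the examples from this guide; adjust them to your own catalog:

```typescript
type Stage =
  | "classification"
  | "retrieval-summary"
  | "synthesis"
  | "reasoning"
  | "fallback";

// Map pipeline stages to SolRouter model IDs. Orchestration code asks
// for a stage; only this table knows about concrete models.
const MODEL_BY_STAGE: Record<Stage, string> = {
  "classification": "openai/gpt-4o-mini",
  "retrieval-summary": "openai/gpt-4o-mini",
  "synthesis": "anthropic/claude-sonnet-4",
  "reasoning": "openai/o3",
  "fallback": "arcee-ai/trinity-mini:free",
};

function pickModel(stage: Stage): string {
  return MODEL_BY_STAGE[stage];
}
```

Combined with a model factory like the one later in this guide, a call such as createSolRouterChat(pickModel("synthesis")) keeps model choice out of chain definitions entirely.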
Common mistakes
1. Forgetting the base URL
Wrong:
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
model: "openai/gpt-4o-mini",
});
Correct:
const llm = new ChatOpenAI({
apiKey: process.env.SOLROUTER_API_KEY,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model: "openai/gpt-4o-mini",
});
2. Exposing the API key in frontend code
Keep the key on the server. LangChain is usually best used:
- in backend services
- in route handlers
- in server actions
- in workers
- in internal APIs
3. Treating tool arguments as trusted
Even when LangChain helps with tool orchestration, you still must validate:
- tool name
- argument shape
- authorization
- side effects
4. Using huge prompts without token awareness
RAG, tool schemas, and long histories can grow prompts quickly. Watch:
- context window limits
- latency
- usage cost
5. Assuming all models behave identically
Even with a unified API surface, model behavior still differs by:
- reasoning style
- verbosity
- tool selection tendencies
- multimodal support
- structured output reliability
You should test important workflows on the actual model you plan to ship.
6. Overusing the strongest model everywhere
LangChain orchestration often benefits from mixing models by task rather than using the most expensive model for every node in the workflow.
Best practices
Keep LangChain orchestration separate from model concerns
Let LangChain handle:
- prompts
- chains
- tools
- parsing
- retrieval
Let SolRouter handle:
- model access
- model switching
- unified credentials
- routing consistency
Use structured output whenever the result feeds code
If the output is going into:
- a database
- a queue
- a workflow engine
- another API
then prefer validated structured output over free-form text.
Keep tool execution under your control
Even with agent-style flows, your application should remain the final authority over:
- what tools execute
- what data they access
- what side effects are allowed
Use model specialization
Match the model to the task:
- cheap model for classification
- stronger model for synthesis
- reasoning model for complex analysis
- multimodal model for image or file input
Build fallback-aware systems
If a specific model is important to production reliability, design your chain or workflow so you can substitute another model when needed.
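One way to make a chain fallback-aware is a thin wrapper that tries an ordered list of models and returns the first success. A minimal sketch; invoke here stands in for any LangChain chat model's .invoke method:

```typescript
// Anything with an async invoke method, e.g. a ChatOpenAI instance.
type Invocable<T> = { invoke: (input: string) => Promise<T> };

// Try each model in order; return the first successful result.
// If every model fails, surface the last error.
async function invokeWithFallback<T>(
  models: Invocable<T>[],
  input: string
): Promise<T> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await model.invoke(input);
    } catch (err) {
      lastError = err; // log here, then move on to the next model
    }
  }
  throw lastError ?? new Error("No models configured");
}
```

In practice the list would hold SolRouter-backed clients, e.g. a primary model followed by an alternate-provider or free model.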
Minimal reusable factory
A simple model factory makes LangChain integrations cleaner.
import { ChatOpenAI } from "@langchain/openai";
export function createSolRouterChat(model: string) {
const apiKey = process.env.SOLROUTER_API_KEY;
if (!apiKey) {
throw new Error("Missing SOLROUTER_API_KEY");
}
return new ChatOpenAI({
apiKey,
configuration: {
baseURL: "https://api.solrouter.io/ai",
},
model,
});
}
Usage:
const fastLlm = createSolRouterChat("openai/gpt-4o-mini");
const strongLlm = createSolRouterChat("anthropic/claude-sonnet-4");
This makes it easier to:
- centralize configuration
- standardize logging
- swap models cleanly
- inject defaults
- reuse clients across chains
Example architecture
A strong production LangChain setup with SolRouter often looks like this:
user request
↓
LangChain prompt / chain / retriever / tools
↓
SolRouter-backed ChatOpenAI model
↓
validated output
↓
application logic
For more advanced systems:
retriever → router → structured extractor → tool executor → synthesis model
Where different nodes may use different SolRouter model IDs.
Production checklist
Before shipping a LangChain integration, make sure you:
- set baseURL to https://api.solrouter.io/ai
- keep SOLROUTER_API_KEY server-side
- validate structured outputs
- validate tool inputs
- test the chosen model for your exact workflow
- monitor token usage and latency
- choose cheaper models for cheap steps
- reserve expensive models for high-value reasoning
- handle retries for transient failures
- support model substitution where reliability matters
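Handling retries for transient failures can be as small as a backoff wrapper around any async call. A sketch with illustrative delay values:

```typescript
// Retry an async operation with exponential backoff.
// Intended for transient failures such as rate limits or timeouts.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 250
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Exponential backoff: 250ms, 500ms, 1000ms, ...
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Wrap the invocation rather than the model, e.g. withRetry(() => chain.invoke(input)), so the same helper works for any chain. For non-transient errors (bad requests, auth failures), fail fast instead of retrying.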
Next steps
- API Reference — complete SolRouter request and response schema
- Tool Calling — deeper function execution patterns
- Structured Output — schema-constrained JSON workflows
- Streaming — incremental generation and SSE details
- Vision & Multimodal — image and file-aware workflows
- Errors — retries, rate limits, and operational handling