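OpenRouter Free Model Tracker
Real-time tracking of the free models available on OpenRouter, so you can quickly find the best fit for development work and specific use cases.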
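OpenRouter hosts a large number of models, but the key question is not whether free models exist; it is which kind of free model works best for the task at hand. This page filters the list around the following scenarios:
- Coding and software engineering
- Role-play and conversational experiences
- JSON and structured output
- Long-context processing
The lists are generated from OpenRouter model metadata and filtered down to free models (prompt and completion prices both 0). The page can be refreshed automatically by a scheduled job.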
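What is OpenRouter?
OpenRouter is a unified API gateway for AI models. Through a single standardized interface it provides access to models from OpenAI, Anthropic, Google, Meta, and Mistral, as well as many open-source models.
Developers need only one API key and one endpoint instead of integrating with each provider separately. OpenRouter handles routing, fallback, and billing. Typical uses:
- Compare quality and cost across providers' models through a single API integration
- Build cost-optimized pipelines that route to free or low-cost models first
- Access models that are only offered by specific providers
- Experiment with new models without applying for multiple API keys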
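Understanding free models on OpenRouter
A free model on OpenRouter has both its input (prompt) and output (completion) price set to 0. This is not a permanent guarantee: providers can change pricing or rate limits, or withdraw the free quota, at any time.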
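Why free models matter for real projects
Free models are not just for hobbyists; many production scenarios benefit as well:
- Development and testing: run them in CI pipelines and evaluation scripts without consuming paid quota.
- Long-tail requests: route low-value or repetitive requests to free models and reserve paid quota for high-value ones.
- Fallback chains: fall back to a free model automatically when the primary model hits a rate limit or fails.
- Research and benchmarking: compare multiple models on the same dataset without cost constraints.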
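The fallback-chain idea in the list above can be sketched as follows. This is a minimal illustration, not OpenRouter's own routing: `complete_with_fallback`, `fake_call`, and both model IDs are hypothetical names, and the API call is replaced by a stand-in function so the logic can run offline.

```python
from typing import Callable, Sequence


def complete_with_fallback(
    call_model: Callable[[str, str], str],
    model_ids: Sequence[str],
    prompt: str,
) -> tuple[str, str]:
    """Try each model in order; return (model_id, reply) from the first success."""
    errors = []
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # real code should catch only rate-limit/timeout errors
            errors.append(f"{model_id}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stand-in for a real API call; the model IDs are made up for the demo.
def fake_call(model_id: str, prompt: str) -> str:
    if model_id == "primary/paid-model":
        raise TimeoutError("rate limited")
    return f"[{model_id}] echo: {prompt}"


used_model, reply = complete_with_fallback(
    fake_call, ["primary/paid-model", "backup/free-model:free"], "ping"
)
print(used_model, reply)
```

In a real pipeline `call_model` would wrap the chat completion request, and the order of `model_ids` encodes the routing preference (paid first with a free fallback, or free first for long-tail traffic).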
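Rate limits and fair use
Free models usually come with rate limits, such as requests per minute, daily token caps, or concurrent connections. Limits differ by provider and model. Before relying on a free model in production, check its limits directly on the OpenRouter model page.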
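One common way to cope with per-minute limits is to retry with exponential backoff and jitter. The sketch below assumes the client surfaces a rate-limit condition as an exception; `RateLimited` is a hypothetical stand-in for whatever error your client raises on HTTP 429, and the delay values are illustrative rather than OpenRouter recommendations.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential delay in seconds for a zero-based attempt number, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on RateLimited; re-raise after the last attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random amount up to the exponential delay.
            sleep(random.uniform(0, backoff_delay(attempt)))
    raise RuntimeError("retry loop exited unexpectedly")
```

Daily token caps are a different matter: retrying will not help once a daily quota is exhausted, so that case is better handled by falling back to another model.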
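How to access these models
All free models are accessed through the standard OpenRouter API. The base URL is https://openrouter.ai/api/v1, and the interface is compatible with the OpenAI SDK.
Basic Python setup: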
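from openai import OpenAI

# Point the OpenAI SDK at OpenRouter instead of the default OpenAI endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)
response = client.chat.completions.create(
    model="MODEL_ID_FROM_THE_TABLES_BELOW",  # placeholder: substitute a real model ID
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)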
Filter free models from the command line:
curl https://openrouter.ai/api/v1/models | \
jq '[.data[] | select(.pricing.prompt == "0" and .pricing.completion == "0")]'
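The same filter can be written in Python. The sketch below runs on a small hand-made sample shaped like the `data` array the command above reads (`pricing.prompt` and `pricing.completion` as strings); the model IDs are invented for illustration. Unlike the jq filter, which compares against the exact string "0", this version parses the prices as numbers, so a value such as "0.0" would also count as free.

```python
def is_free(model: dict) -> bool:
    """True when both the prompt and completion prices parse to zero."""
    pricing = model.get("pricing", {})
    try:
        return float(pricing["prompt"]) == 0 and float(pricing["completion"]) == 0
    except (KeyError, TypeError, ValueError):
        # Missing or malformed pricing is treated as "not free".
        return False


# Hand-made sample; the IDs are hypothetical.
sample = {"data": [
    {"id": "vendor/model-a:free", "pricing": {"prompt": "0", "completion": "0"}},
    {"id": "vendor/model-b", "pricing": {"prompt": "0.000001", "completion": "0.000002"}},
    {"id": "vendor/model-c", "pricing": {"prompt": "0"}},
]}

free_ids = [m["id"] for m in sample["data"] if is_free(m)]
print(free_ids)
```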
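How to choose the right free model
Coding tasks
Look at context length, tool-calling support, and JSON mode. Useful benchmarks: HumanEval, MBPP, LiveCodeBench. A longer context lets the model see more of the codebase at once.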
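These criteria can be turned into a simple shortlist filter. The sketch assumes each model record carries a `context_length` number and a `supported_parameters` list in which tool calling appears as "tools"; treat those field names as assumptions to verify against the live metadata. The sample records and the 100,000-token threshold are invented for illustration.

```python
def shortlist_for_coding(models: list[dict], min_context: int = 100_000) -> list[dict]:
    """Keep models with enough context and tool support, largest context first."""
    picked = [
        m for m in models
        if (m.get("context_length") or 0) >= min_context
        and "tools" in (m.get("supported_parameters") or [])
    ]
    return sorted(picked, key=lambda m: m["context_length"], reverse=True)


# Hypothetical records for the demo.
models = [
    {"id": "vendor/c", "context_length": 131072, "supported_parameters": ["tools"]},
    {"id": "vendor/b", "context_length": 200000, "supported_parameters": ["temperature"]},
    {"id": "vendor/a", "context_length": 1000000, "supported_parameters": ["tools", "response_format"]},
    {"id": "vendor/d", "context_length": 32768, "supported_parameters": ["tools"]},
]

shortlist = [m["id"] for m in shortlist_for_coding(models)]
print(shortlist)
```

A metadata filter only narrows the field; benchmark scores and a trial run on your own repository still decide the final pick.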
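JSON and structured output
Prefer models that explicitly support JSON mode or tool calling. Test them on complex nested schemas: some models produce JSON that is syntactically valid but semantically wrong.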
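A test of this kind has to check more than whether the reply parses. The sketch below uses a made-up invoice schema (`items` with integer `price_cents`, plus a `total_cents` that must equal their sum) to show the difference between a syntax check and a semantic check; the schema and the function name are hypothetical.

```python
import json


def check_invoice(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the reply passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top level is not an object"]
    items = data.get("items")
    if not isinstance(items, list) or not items:
        return ["items must be a non-empty list"]
    problems = []
    total = 0
    for i, item in enumerate(items):
        price = item.get("price_cents") if isinstance(item, dict) else None
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(price, int) and not isinstance(price, bool) and price >= 0:
            total += price
        else:
            problems.append(f"items[{i}].price_cents must be a non-negative integer")
    if not problems and data.get("total_cents") != total:
        problems.append(f"total_cents {data.get('total_cents')!r} does not match item sum {total}")
    return problems


good = '{"items": [{"price_cents": 250}, {"price_cents": 100}], "total_cents": 350}'
wrong_total = '{"items": [{"price_cents": 250}, {"price_cents": 100}], "total_cents": 300}'
print(check_invoice(good))
print(check_invoice(wrong_total))
```

The second sample is the interesting case: it parses cleanly and has every expected key, yet the numbers do not add up.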
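Long-context tasks
The context length in the tables is the maximum supported context window. In practice, quality often degrades once the input exceeds 50–70% of the nominal window. Test with your actual document lengths before committing to a model.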
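As a rough planning aid, the 50–70% rule of thumb can be turned into an input budget. The helper below is a back-of-the-envelope sketch: `safe_input_budget` and the 4,096-token reservation for the reply are assumptions for illustration, not values taken from OpenRouter.

```python
def safe_input_budget(
    context_length: int,
    fraction: float = 0.5,
    reserved_output: int = 4096,
) -> int:
    """Input tokens to allow: a fraction of the window, minus room for the reply."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return max(0, int(context_length * fraction) - reserved_output)


# A 131072-token window at the conservative 50% end of the range.
budget = safe_input_budget(131072, fraction=0.5)
print(budget)
```

Documents that exceed the budget are candidates for chunking or retrieval rather than being sent whole.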
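Role-play and conversation
Instruction-following quality and persona consistency matter most. Models trained with RLHF or DPO usually produce more natural conversational replies.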
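Production recommendations
- Pin model versions: use the full model ID, including any version suffix, so the model is not silently swapped out.
- Implement fallback: free models can hit rate limits, so build a fallback chain into your routing logic.
- Cache responses: for repeated queries with identical input, an application-level cache eliminates redundant API calls.
- Use streaming: streaming lowers time to first token and improves perceived performance.
- Monitor usage: even free models count toward the usage shown in your OpenRouter dashboard; track it continuously to spot anomalies.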
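The caching recommendation above can be sketched as a small in-memory wrapper keyed by a hash of the model ID and messages. `ResponseCache` is a hypothetical helper, the API call is replaced by a stand-in function, and a real deployment would likely add expiry and a shared store.

```python
import hashlib
import json
from typing import Callable


class ResponseCache:
    """In-memory cache keyed by a hash of (model, messages)."""

    def __init__(self, call_model: Callable[[str, list], str]):
        self._call_model = call_model
        self._store: dict[str, str] = {}
        self.hits = 0

    @staticmethod
    def _key(model: str, messages: list) -> str:
        # sort_keys makes the key independent of dict key order.
        payload = json.dumps(
            {"model": model, "messages": messages},
            sort_keys=True,
            ensure_ascii=False,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, messages: list) -> str:
        key = self._key(model, messages)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        reply = self._call_model(model, messages)
        self._store[key] = reply
        return reply


# Stand-in for the real API call; counts how often it is actually invoked.
api_calls = []

def fake_call(model: str, messages: list) -> str:
    api_calls.append(model)
    return "cached reply"

cache = ResponseCache(fake_call)
question = [{"role": "user", "content": "What is OpenRouter?"}]
cache.complete("vendor/model:free", question)
cache.complete("vendor/model:free", question)
print(len(api_calls), cache.hits)
```

Caching fits best where the same answer is acceptable on every repeat, for example deterministic extraction or classification prompts.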
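Related resources
- OpenRouter documentation: full API reference and model metadata schema
- OpenRouter model rankings: community-voted quality scores by use case
- MCP Hub: browse MCP servers that integrate with OpenRouter, for agent workflows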
Last update
2026-05-25T19:57:36.313Z
Coding
| Model | Provider | Context Length | Price (prompt/completion) | Note |
|---|---|---|---|---|
| Owl Alpha | Unknown | 1048756 | 0/0 | Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-co |
| Qwen: Qwen3 Coder 480B A35B (free) | Unknown | 1048576 | 0/0 | Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | Unknown | 262144 | 0/0 | Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo |
| NVIDIA: Nemotron 3 Nano 30B A3B (free) | Unknown | 256000 | 0/0 | NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers |
| MiniMax: MiniMax M2.5 (free) | Unknown | 204800 | 0/0 | MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex |
| Baidu Qianfan: CoBuddy (free) | Unknown | 131072 | 0/0 | CoBuddy is a code generation model from Baidu, optimized for coding tasks and AI Agent workflows. It features high infer |
| Poolside: Laguna XS.2 (free) | Unknown | 131072 | 0/0 | Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient co |
| Poolside: Laguna M.1 (free) | Unknown | 131072 | 0/0 | Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engin |
Roleplay
| Model | Provider | Context Length | Price (prompt/completion) | Note |
|---|---|---|---|---|
| Nous: Hermes 3 405B Instruct (free) | Unknown | 131072 | 0/0 | Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m |
JSON
| Model | Provider | Context Length | Price (prompt/completion) | Note |
|---|---|---|---|---|
| Qwen: Qwen3 Coder 480B A35B (free) | Unknown | 1048576 | 0/0 | Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt |
| Google: Gemma 4 31B (free) | Unknown | 262144 | 0/0 | Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. |
| NVIDIA: Nemotron 3 Nano Omni (free) | Unknown | 256000 | 0/0 | NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-age |
| Poolside: Laguna XS.2 (free) | Unknown | 131072 | 0/0 | Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient co |
| Poolside: Laguna M.1 (free) | Unknown | 131072 | 0/0 | Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engin |
Long Context
| Model | Provider | Context Length | Price (prompt/completion) | Note |
|---|---|---|---|---|
| Owl Alpha | Unknown | 1048756 | 0/0 | Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-co |
| DeepSeek: DeepSeek V4 Flash (free) | Unknown | 1048576 | 0/0 | DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B a |
| Qwen: Qwen3 Coder 480B A35B (free) | Unknown | 1048576 | 0/0 | Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt |
| Google: Gemma 4 31B (free) | Unknown | 262144 | 0/0 | Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. |
| Arcee AI: Trinity Large Thinking (free) | Unknown | 262144 | 0/0 | Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | Unknown | 262144 | 0/0 | Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo |
| NVIDIA: Nemotron 3 Nano Omni (free) | Unknown | 256000 | 0/0 | NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-age |
| Poolside: Laguna XS.2 (free) | Unknown | 131072 | 0/0 | Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient co |
| Poolside: Laguna M.1 (free) | Unknown | 131072 | 0/0 | Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engin |
| OpenAI: gpt-oss-120b (free) | Unknown | 131072 | 0/0 | gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-rea |
| Meta: Llama 3.2 3B Instruct (free) | Unknown | 131072 | 0/0 | Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language process |
| Nous: Hermes 3 405B Instruct (free) | Unknown | 131072 | 0/0 | Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m |
| NVIDIA: Nemotron Nano 12B 2 VL (free) | Unknown | 128000 | 0/0 | NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and |
| NVIDIA: Nemotron Nano 9B V2 (free) | Unknown | 128000 | 0/0 | NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified mod |
| LiquidAI: LFM2.5-1.2B-Thinking (free) | Unknown | 32768 | 0/0 | LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—whil |
