OpenRouter Free Model Tracker

Tracks the free models currently available on OpenRouter so you can quickly find one that fits your development work and use case.

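OpenRouter hosts a large number of models, but the real question is not whether free models exist; it is which kind of free model works best for the task at hand. This page narrows the list down by the following scenarios:

  • Coding and software engineering
  • Roleplay and conversational experiences
  • JSON and structured output
  • Long-context processing

The lists are generated from OpenRouter model metadata and filtered down to free models (prompt and completion prices both 0). The page can be refreshed automatically by a scheduled job.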
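The zero-price filter described above can be sketched in Python roughly as follows. This is a minimal illustration, not the page's actual generator: it assumes each metadata entry carries a `pricing` object whose `prompt` and `completion` fields are numeric strings, and the names `is_free` and `free_models` as well as the sample records are hypothetical.

```python
from typing import Any


def is_free(model: dict[str, Any]) -> bool:
    """Return True when both prompt and completion prices are zero."""
    pricing = model.get("pricing") or {}
    try:
        return float(pricing["prompt"]) == 0.0 and float(pricing["completion"]) == 0.0
    except (KeyError, TypeError, ValueError):
        # Missing or malformed pricing is treated as "not free".
        return False


def free_models(models: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the entries that pass the zero-price check."""
    return [m for m in models if is_free(m)]


# Hypothetical sample records, not real API output.
sample = [
    {"id": "vendor/model-a:free", "pricing": {"prompt": "0", "completion": "0"}},
    {"id": "vendor/model-b", "pricing": {"prompt": "0.000001", "completion": "0.000002"}},
    {"id": "vendor/model-c", "pricing": {"prompt": "0"}},  # completion price missing
]
print([m["id"] for m in free_models(sample)])
```

Treating incomplete pricing as "not free" is a deliberately conservative choice for this sketch, so that an entry with missing data never lands in a free list by accident.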

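What is OpenRouter?

OpenRouter is a unified API gateway for AI models. Through a single standardized interface it provides access to models from OpenAI, Anthropic, Google, Meta, Mistral, and many open-source projects.

Developers need only one API key and one endpoint instead of integrating with each provider separately. OpenRouter handles routing, fallback, and billing. Typical uses:

  • Comparing the quality and cost of several providers' models through a single API
  • Building cost-optimized pipelines that route to free or low-cost models first
  • Accessing models that are only offered by specific providers
  • Experimenting with new models without applying for multiple API keys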

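Understanding free models on OpenRouter

A free model on OpenRouter has both its input and output price set to 0. This is not a permanent guarantee: providers may change pricing or rate limits, or withdraw the free tier, at any time.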

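Why free models matter for real projects

Free models are not just for hobbyists. Many production scenarios benefit as well:

  • Development and testing: run them in CI pipelines and evaluation scripts without consuming paid quota.
  • Long-tail requests: route low-value or repetitive requests to free models and reserve paid quota for high-value ones.
  • Fallback chains: fall back to a free model automatically when the primary model is rate limited or unavailable.
  • Research and benchmarking: compare multiple models on the same dataset without cost constraints.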
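The fallback-chain item in the list above could look something like the sketch below. It is only an illustration under stated assumptions: `call_model` stands in for whatever function actually sends the request, the model IDs are hypothetical, and `fake_call` replaces the network so the example runs offline.

```python
from typing import Callable, Sequence


class AllModelsFailed(Exception):
    """Raised when every model in the chain has failed."""


def complete_with_fallback(
    call_model: Callable[[str, str], str],
    model_ids: Sequence[str],
    prompt: str,
) -> tuple[str, str]:
    """Try each model in order; return (model_id, reply) from the first success."""
    errors: list[str] = []
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # real code should catch only rate-limit/outage errors
            errors.append(f"{model_id}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Stand-in for a real API call so the sketch runs offline (hypothetical IDs).
def fake_call(model_id: str, prompt: str) -> str:
    if model_id == "vendor/primary":
        raise RuntimeError("429 rate limited")
    return f"[{model_id}] echo: {prompt}"


print(complete_with_fallback(fake_call, ["vendor/primary", "vendor/backup:free"], "hi"))
```

Returning the model ID alongside the reply makes it easy to log how often the chain actually falls through to the free model.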

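Rate limits and fair use

Free models usually come with rate limits, such as requests per minute, a daily token cap, or a maximum number of concurrent connections. The limits differ by provider and model. Before relying on a free model in production, check its limits directly on the OpenRouter model page.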
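One common way to cope with rate limits on the client side is to retry with exponential backoff. The sketch below is an assumption-laden illustration, not OpenRouter guidance: `RateLimited` is a hypothetical exception standing in for an HTTP 429 response, and the delay values are arbitrary defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker for an HTTP 429 style response."""


def with_backoff(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `request` on RateLimited, doubling the base wait each attempt."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # add jitter
    raise ValueError("max_attempts must be at least 1")


# Offline demo: fail twice, then succeed; record waits instead of sleeping.
calls = {"n": 0}
waits: list[float] = []


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return "ok"


print(with_backoff(flaky, sleep=waits.append), len(waits))
```

Injecting `sleep` keeps the function testable without real waiting; in production the default `time.sleep` applies.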

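How to access these models

All free models are accessed through the standard OpenRouter API. The base URL is https://openrouter.ai/api/v1, and the interface is compatible with the OpenAI SDK.

Basic Python setup: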

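from openai import OpenAI

# Point the OpenAI SDK at the OpenRouter endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)

response = client.chat.completions.create(
    model="MODEL_ID_FROM_THE_TABLES_BELOW",  # placeholder: replace with a real model ID
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)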

Filtering free models from the command line:

curl https://openrouter.ai/api/v1/models | \
  jq '[.data[] | select(.pricing.prompt == "0" and .pricing.completion == "0")]'

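How to choose the right free model

Coding tasks

Look at context length, tool-calling support, and JSON mode. Relevant benchmarks include HumanEval, MBPP, and LiveCodeBench. A longer context lets the model see more of the codebase in a single request.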
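The selection criteria above can be applied to model metadata with a small filter. This sketch assumes each entry exposes `context_length` and a `supported_parameters` list containing values such as `"tools"` or `"response_format"`; verify those field names against the live metadata before relying on them. The function name, thresholds, and sample records are hypothetical.

```python
from typing import Any


def pick_coding_candidates(
    models: list[dict[str, Any]],
    min_context: int = 128_000,
    required_params: frozenset[str] = frozenset({"tools"}),
) -> list[str]:
    """Return IDs with enough context and all required parameters, longest context first."""
    matches = [
        m for m in models
        if (m.get("context_length") or 0) >= min_context
        and required_params <= set(m.get("supported_parameters") or [])
    ]
    matches.sort(key=lambda m: m.get("context_length") or 0, reverse=True)
    return [m["id"] for m in matches]


# Hypothetical sample records, not real API output.
sample = [
    {"id": "vendor/big-coder:free", "context_length": 262144,
     "supported_parameters": ["tools", "response_format"]},
    {"id": "vendor/small-chat:free", "context_length": 32768,
     "supported_parameters": ["tools"]},
    {"id": "vendor/long-no-tools:free", "context_length": 1048576,
     "supported_parameters": ["temperature"]},
    {"id": "vendor/mid-coder:free", "context_length": 131072,
     "supported_parameters": ["tools"]},
]
print(pick_coding_candidates(sample))
print(pick_coding_candidates(sample, required_params=frozenset({"tools", "response_format"})))
```

Benchmark scores are not part of this filter; it only narrows the field before you evaluate candidates on your own code.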

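JSON and structured output

Prefer models that explicitly support JSON mode or tool calling. Test against complex nested schemas: some models produce JSON that is syntactically valid but semantically wrong.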
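To make the "syntactically valid but semantically wrong" failure concrete, here is a minimal checker that parses a reply and then verifies top-level keys and types. It is a sketch only: the `check_reply` name, the flat `required` mapping, and the sample replies are hypothetical, and a nested schema would more likely call for a full JSON Schema validator.

```python
import json
from typing import Any


def check_reply(raw: str, required: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the reply passed."""
    try:
        data: Any = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for key, expected in required.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(data[key]).__name__}"
            )
    return problems


schema = {"name": str, "tags": list, "score": float}
good = '{"name": "demo", "tags": ["a"], "score": 0.9}'
bad = '{"name": "demo", "tags": "a", "score": "high"}'  # parses, but wrong types
print(check_reply(good, schema))
print(check_reply(bad, schema))
```

The second reply parses without error yet fails two type checks, which is exactly the case a parse-only test would miss.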

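Long-context tasks

The context length in the tables is the maximum supported input. In practice, quality often degrades once the input exceeds roughly 50–70% of the advertised window. Test with documents of realistic length before committing to a model.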
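The 50–70% rule of thumb above translates into a simple token budget. The sketch below is a rough planning aid only: the four-characters-per-token ratio is an assumption that varies with tokenizer and language, and the function names are hypothetical.

```python
def usable_context(nominal_tokens: int, fraction: float = 0.5) -> int:
    """Conservative token budget: a fraction of the advertised window."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(nominal_tokens * fraction)


def fits(
    doc_chars: int,
    nominal_tokens: int,
    chars_per_token: float = 4.0,  # assumed ratio; measure with the real tokenizer
    fraction: float = 0.5,
) -> bool:
    """Rough check of whether a document stays inside the conservative budget."""
    estimated_tokens = doc_chars / chars_per_token
    return estimated_tokens <= usable_context(nominal_tokens, fraction)


print(usable_context(131072, 0.5))  # 65536
print(usable_context(131072, 0.7))
print(fits(400_000, 131072))
print(fits(200_000, 131072))
```

For a model advertising 131072 tokens, the lower bound of the rule leaves a budget of 65536 tokens, so a 400,000-character document (about 100,000 tokens under the assumed ratio) would already sit in the range where degradation is likely.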

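Roleplay and conversation

Instruction-following quality and persona consistency matter most. Models trained with RLHF or DPO usually produce more natural conversational replies.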

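Production recommendations

  1. Pin the model version: use the full model ID, including any version suffix, to avoid silent replacement.
  2. Implement fallback: free models can hit rate limits, so build a fallback chain into your routing logic.
  3. Cache responses: for repeated queries with identical input, an application-level cache eliminates redundant API calls.
  4. Use streaming: streaming lets users see the first tokens sooner, which improves perceived responsiveness.
  5. Monitor usage: even free models show up in the OpenRouter dashboard, so track usage continuously to catch anomalies.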
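The response-caching recommendation in the list above can be sketched as an in-memory cache keyed by a hash of the model ID and messages. This is an illustration under assumptions: `ResponseCache` and `fake_call` are hypothetical names, the store is a plain dictionary with no expiry, and caching only makes sense when identical input should yield a reusable answer.

```python
import hashlib
import json
from typing import Callable


class ResponseCache:
    """In-memory cache keyed by a hash of the model ID and messages."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self.hits = 0

    @staticmethod
    def _key(model: str, messages: list[dict[str, str]]) -> str:
        # sort_keys gives a stable serialization for equal inputs.
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(
        self,
        model: str,
        messages: list[dict[str, str]],
        call: Callable[[str, list], str],
    ) -> str:
        key = self._key(model, messages)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        reply = call(model, messages)  # only reached on a cache miss
        self._store[key] = reply
        return reply


# Stand-in for a real API call so the sketch runs offline.
api_calls = {"n": 0}


def fake_call(model: str, messages: list) -> str:
    api_calls["n"] += 1
    return f"reply from {model}"


cache = ResponseCache()
msgs = [{"role": "user", "content": "hello"}]
cache.get_or_call("vendor/model:free", msgs, fake_call)
cache.get_or_call("vendor/model:free", msgs, fake_call)
print(api_calls["n"], cache.hits)
```

Including the model ID in the key keeps replies from different models separate, which matters when a fallback chain may answer the same prompt with different models.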

Last update

2026-05-25T19:57:36.313Z

Coding

| Model | Provider | Context length | Price (prompt/completion) | Note |
| --- | --- | --- | --- | --- |
| Owl Alpha | Unknown | 1048756 | 0/0 | Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-co |
| Qwen: Qwen3 Coder 480B A35B (free) | Unknown | 1048576 | 0/0 | Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | Unknown | 262144 | 0/0 | Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo |
| NVIDIA: Nemotron 3 Nano 30B A3B (free) | Unknown | 256000 | 0/0 | NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers |
| MiniMax: MiniMax M2.5 (free) | Unknown | 204800 | 0/0 | MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex |
| Baidu Qianfan: CoBuddy (free) | Unknown | 131072 | 0/0 | CoBuddy is a code generation model from Baidu, optimized for coding tasks and AI Agent workflows. It features high infer |
| Poolside: Laguna XS.2 (free) | Unknown | 131072 | 0/0 | Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient co |
| Poolside: Laguna M.1 (free) | Unknown | 131072 | 0/0 | Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engin |

Roleplay

| Model | Provider | Context length | Price (prompt/completion) | Note |
| --- | --- | --- | --- | --- |
| Nous: Hermes 3 405B Instruct (free) | Unknown | 131072 | 0/0 | Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m |

JSON

| Model | Provider | Context length | Price (prompt/completion) | Note |
| --- | --- | --- | --- | --- |
| Qwen: Qwen3 Coder 480B A35B (free) | Unknown | 1048576 | 0/0 | Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt |
| Google: Gemma 4 31B (free) | Unknown | 262144 | 0/0 | Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. |
| NVIDIA: Nemotron 3 Nano Omni (free) | Unknown | 256000 | 0/0 | NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-age |
| Poolside: Laguna XS.2 (free) | Unknown | 131072 | 0/0 | Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient co |
| Poolside: Laguna M.1 (free) | Unknown | 131072 | 0/0 | Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engin |

Long Context

| Model | Provider | Context length | Price (prompt/completion) | Note |
| --- | --- | --- | --- | --- |
| Owl Alpha | Unknown | 1048756 | 0/0 | Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-co |
| DeepSeek: DeepSeek V4 Flash (free) | Unknown | 1048576 | 0/0 | DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B a |
| Qwen: Qwen3 Coder 480B A35B (free) | Unknown | 1048576 | 0/0 | Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt |
| Google: Gemma 4 31B (free) | Unknown | 262144 | 0/0 | Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. |
| Arcee AI: Trinity Large Thinking (free) | Unknown | 262144 | 0/0 | Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance |
| Qwen: Qwen3 Next 80B A3B Instruct (free) | Unknown | 262144 | 0/0 | Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo |
| NVIDIA: Nemotron 3 Nano Omni (free) | Unknown | 256000 | 0/0 | NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-age |
| Poolside: Laguna XS.2 (free) | Unknown | 131072 | 0/0 | Laguna XS.2 is the second-generation model in the XS size class from Poolside, their efficient co |
| Poolside: Laguna M.1 (free) | Unknown | 131072 | 0/0 | Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engin |
| OpenAI: gpt-oss-120b (free) | Unknown | 131072 | 0/0 | gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-rea |
| Meta: Llama 3.2 3B Instruct (free) | Unknown | 131072 | 0/0 | Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language process |
| Nous: Hermes 3 405B Instruct (free) | Unknown | 131072 | 0/0 | Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m |
| NVIDIA: Nemotron Nano 12B 2 VL (free) | Unknown | 128000 | 0/0 | NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and |
| NVIDIA: Nemotron Nano 9B V2 (free) | Unknown | 128000 | 0/0 | NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified mod |
| LiquidAI: LFM2.5-1.2B-Thinking (free) | Unknown | 32768 | 0/0 | LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—whil |