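OpenRouter Free Model Tracker

Tracks the models that are currently free on OpenRouter in real time, so you can quickly find one that fits your development work and use case.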

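OpenRouter hosts a large number of models, but the key question is not whether free models exist; it is which kind of free model works best for the task at hand. This page filters the list around the following scenarios:

Coding and software engineering
Roleplay / conversational experience
JSON and structured output
Long-context processing

The lists are generated from OpenRouter model metadata and filtered to free models (input and output price both 0). The page can be refreshed automatically by a scheduled job.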

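What is OpenRouter?

OpenRouter is a unified API gateway for AI models. Through a single standardized interface it provides access to models from OpenAI, Anthropic, Google, Meta, Mistral, and many open-source projects.

Developers need only one API key and one endpoint instead of integrating with each provider separately. OpenRouter handles routing, fallback, and billing. Typical uses:

Compare the quality and cost of models from several providers through one API
Build cost-optimized pipelines that route to free or low-cost models first
Access models that are only available from specific providers
Experiment with new models without applying for multiple API keys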

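Understanding free models on OpenRouter

Free models on OpenRouter are priced at 0 for both input and output. This is not a permanent guarantee: providers may change pricing or rate limits, or withdraw the free tier, at any time.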

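Why free models matter for real projects

Free models are not just for hobbyists; many production scenarios benefit as well:

Development and testing: run them in CI pipelines and evaluation scripts without consuming paid quota.
Long-tail requests: route low-value or repetitive requests to free models and reserve paid quota for high-value requests.
Fallback chains: fall back to a free model automatically when the primary model hits a rate limit or fails.
Research and benchmarking: compare multiple models on the same dataset without cost constraints.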
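
The fallback-chain idea can be sketched as a small helper that tries model IDs in order. This is a minimal sketch under stated assumptions, not OpenRouter's own routing: `complete_with_fallback`, the `send` callable, and the model IDs are hypothetical, and any exception is treated as a reason to move on to the next model.

```python
from typing import Callable, Sequence


def complete_with_fallback(
    send: Callable[[str, str], str],
    models: Sequence[str],
    prompt: str,
) -> tuple[str, str]:
    """Try each model ID in order; return (model_id, reply) from the first success.

    `send(model_id, prompt)` is a hypothetical caller-supplied function that
    performs the actual API request and raises on rate limits or failures.
    """
    errors: list[str] = []
    for model_id in models:
        try:
            return model_id, send(model_id, prompt)
        except Exception as exc:  # assumption: any error means "try the next model"
            errors.append(f"{model_id}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Offline demonstration with a fake sender: the primary model is "rate limited".
def fake_send(model_id: str, prompt: str) -> str:
    if model_id == "primary-model":
        raise RuntimeError("429 rate limited")
    return f"[{model_id}] echo: {prompt}"


used, reply = complete_with_fallback(fake_send, ["primary-model", "free-model"], "Hi")
print(used, reply)
```

In a real pipeline `send` would wrap the chat-completion call, and you would probably narrow the caught exception to rate-limit and availability errors so that genuine bugs still surface.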

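Rate limits and fair use

Free models usually come with rate limits, such as requests per minute, daily token caps, or concurrent connections. Limits differ by provider and model. Before relying on a free model in production, check its limits directly on the OpenRouter model page.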
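
One common way to cope with per-minute limits on the client side is to retry with exponential backoff. The sketch below is illustrative only: `RateLimited` and `with_backoff` are hypothetical names, the delays are arbitrary, and the real error type depends on the client library you use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker exception for an HTTP 429-style response."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failed attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")


# Offline demonstration: fail twice, then succeed; record waits instead of sleeping.
waits: list[float] = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited()
    return "ok"


result = with_backoff(flaky, sleep=waits.append)
print(result, waits)
```

Injecting `sleep` keeps the helper testable without real delays; in production the default `time.sleep` applies.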

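How to access these models

All free models are accessed through the standard OpenRouter API. The base URL is https://openrouter.ai/api/v1, and the interface is compatible with the OpenAI SDK.

Basic Python setup: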

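from openai import OpenAI

# Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)

# Send a single-turn chat request.
response = client.chat.completions.create(
    model="MODEL_ID_FROM_THE_TABLES_BELOW",  # placeholder: substitute a real model ID
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)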

Filter free models from the command line:

curl https://openrouter.ai/api/v1/models | \
  jq '[.data[] | select(.pricing.prompt == "0" and .pricing.completion == "0")]'
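
The same selection can be expressed in Python on an already-parsed response. This sketch mirrors the jq expression above and assumes the payload has the shape that expression implies (a `data` list whose entries carry `pricing.prompt` and `pricing.completion` as strings); the sample entries are made up for illustration.

```python
from typing import Any


def is_free(model: dict[str, Any]) -> bool:
    """True when both prompt and completion prices are the string "0"."""
    pricing = model.get("pricing", {})
    return pricing.get("prompt") == "0" and pricing.get("completion") == "0"


def free_models(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Apply the same selection as the jq expression to a parsed /models response."""
    return [m for m in payload.get("data", []) if is_free(m)]


# Made-up sample payload for illustration; IDs and prices are not real data.
sample = {
    "data": [
        {"id": "example/free-model", "pricing": {"prompt": "0", "completion": "0"}},
        {"id": "example/paid-model", "pricing": {"prompt": "0.000001", "completion": "0.000002"}},
        {"id": "example/half-free", "pricing": {"prompt": "0", "completion": "0.000002"}},
    ]
}

print([m["id"] for m in free_models(sample)])
```

Requiring both prices to be zero matters: a model with free input but paid output is excluded.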

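How to choose the right free model

Coding tasks

Look at context length, tool-calling support, and JSON mode. Benchmarks worth checking: HumanEval, MBPP, LiveCodeBench. A longer context lets the model see more of the codebase at once.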
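
These criteria can be turned into a simple shortlist filter. The sketch below assumes each model record exposes a `context_length` integer and a `supported_parameters` list containing entries such as `"tools"` and `"response_format"`; treat those field names as assumptions to verify against the actual metadata. The candidate entries are invented.

```python
from typing import Any


def shortlist(
    models: list[dict[str, Any]],
    min_context: int,
    required_params: set[str],
) -> list[str]:
    """Return IDs of models meeting the requirements, longest context first."""
    matches = [
        m for m in models
        if m.get("context_length", 0) >= min_context
        and required_params <= set(m.get("supported_parameters", []))
    ]
    matches.sort(key=lambda m: m.get("context_length", 0), reverse=True)
    return [m["id"] for m in matches]


# Made-up entries for illustration only.
candidates = [
    {"id": "example/small", "context_length": 32768,
     "supported_parameters": ["tools"]},
    {"id": "example/large", "context_length": 262144,
     "supported_parameters": ["tools", "response_format"]},
    {"id": "example/no-tools", "context_length": 1048576,
     "supported_parameters": []},
    {"id": "example/medium", "context_length": 131072,
     "supported_parameters": ["tools", "response_format"]},
]

print(shortlist(candidates, min_context=100_000,
                required_params={"tools", "response_format"}))
```

Benchmark scores are not part of this filter; they would need to come from a separate source.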

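JSON and structured output

Prefer models that explicitly support JSON mode or tool calling. Test against complex nested schemas: some models produce JSON that is syntactically valid but semantically wrong.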
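
A nested-schema test can be as small as a recursive shape check over the parsed output. The sketch below uses only the standard library; `ORDER_SCHEMA` and the helper names are hypothetical, and it checks structure and types only, not whether the values make sense.

```python
import json
from typing import Any

# Hypothetical nested schema: a Python type, a nested dict, or a one-element list.
ORDER_SCHEMA = {
    "order_id": str,
    "customer": {"name": str, "vip": bool},
    "items": [{"sku": str, "quantity": int}],
}


def conforms(value: Any, schema: Any) -> bool:
    """Recursively check that `value` has the shape described by `schema`."""
    if isinstance(schema, dict):
        return (
            isinstance(value, dict)
            and value.keys() == schema.keys()
            and all(conforms(value[k], s) for k, s in schema.items())
        )
    if isinstance(schema, list):
        return isinstance(value, list) and all(conforms(v, schema[0]) for v in value)
    if schema is int:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, schema)


def parse_and_check(raw: str, schema: Any) -> bool:
    """Return True only if `raw` parses as JSON and matches the schema."""
    try:
        return conforms(json.loads(raw), schema)
    except json.JSONDecodeError:
        return False


good = ('{"order_id": "A1", "customer": {"name": "Ada", "vip": true}, '
        '"items": [{"sku": "x", "quantity": 2}]}')
# Valid JSON syntax, but "vip" is missing and "quantity" is a string.
bad = ('{"order_id": "A1", "customer": {"name": "Ada"}, '
       '"items": [{"sku": "x", "quantity": "2"}]}')

print(parse_and_check(good, ORDER_SCHEMA), parse_and_check(bad, ORDER_SCHEMA))
```

Running a batch of model outputs through a check like this gives a rough pass rate per model before you commit to one.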

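Long-context tasks

The context length in the tables is the maximum supported input. In practice, quality usually degrades once the input exceeds 50–70% of the nominal window. Test with documents of your actual length before committing.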
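
As a rough planning aid, the 50–70% rule of thumb can be turned into a token budget. This is a back-of-the-envelope sketch with hypothetical function names; the fraction is a heuristic, and token counts for your documents must come from the tokenizer of the model you actually use.

```python
def usable_input_tokens(context_length: int, fraction: float = 0.5) -> int:
    """Conservative input budget: a fraction of the nominal context window."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return int(context_length * fraction)


def fits(document_tokens: int, context_length: int, fraction: float = 0.5) -> bool:
    """True when the document stays within the conservative budget."""
    return document_tokens <= usable_input_tokens(context_length, fraction)


# Context lengths taken from the tables on this page.
print(usable_input_tokens(131072))        # budget at 50% of a 131072-token window
print(usable_input_tokens(262144, 0.7))   # budget at 70% of a 262144-token window
print(fits(100_000, 131072))              # does a 100k-token document fit at 50%?
```

A document that fails `fits` may still work; the check only flags inputs that deserve an explicit quality test.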

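Roleplay and conversation

Instruction-following quality and persona consistency matter most. Models trained with RLHF or DPO usually produce more natural conversational replies.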

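Recommendations for production use

Pin the model version: use the full model ID, including any version suffix, to avoid silent replacement.
Implement fallback: free models can hit rate limits, so build a fallback chain into your routing logic.
Cache responses: for repeated queries with identical input, an application-level cache eliminates redundant API calls.
Use streaming: streaming reduces time to first token and improves perceived performance.
Monitor usage: even free models are counted in the OpenRouter dashboard, so keep tracking usage to spot anomalies.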
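
The caching recommendation can be illustrated with a small in-memory cache keyed by a hash of the request. This is a sketch only: `ResponseCache` and the `send` callable are hypothetical, nothing is persisted or expired, and caching like this only makes sense when identical inputs should yield the same answer.

```python
import hashlib
import json
from typing import Callable


class ResponseCache:
    """In-memory cache keyed by a hash of the model ID and the message list."""

    def __init__(self, send: Callable[[str, list[dict]], str]) -> None:
        self._send = send  # hypothetical function that performs the real API call
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, messages: list[dict]) -> str:
        # sort_keys makes the serialization stable for equal inputs.
        blob = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()

    def complete(self, model: str, messages: list[dict]) -> str:
        key = self._key(model, messages)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        reply = self._send(model, messages)
        self._store[key] = reply
        return reply


# Offline demonstration with a fake sender that records real calls.
calls: list[str] = []


def fake_send(model: str, messages: list[dict]) -> str:
    calls.append(model)
    return f"reply #{len(calls)}"


cache = ResponseCache(fake_send)
msgs = [{"role": "user", "content": "Hello"}]
first = cache.complete("example/free-model", msgs)
second = cache.complete("example/free-model", msgs)
print(first, second, len(calls))
```

Including the model ID in the key keeps answers from different models apart, which also fits the advice to pin model versions.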

Coding

Model	Provider	Context Length	Price (prompt/completion)	Note
Qwen: Qwen3 Coder 480B A35B (free)	Unknown	1048576	0/0	Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt
Poolside: Laguna XS 2.1 (free)	Unknown	262144	0/0	Laguna XS 2.1 is the latest coding agent model in the 33B-A3B category from Poolside and a step
Poolside: Laguna M.1 (free)	Unknown	262144	0/0	Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engi
Qwen: Qwen3 Next 80B A3B Instruct (free)	Unknown	262144	0/0	Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo
Cohere: North Mini Code (free)	Unknown	256000	0/0	North Mini Code is Cohere's first agentic coding model and the debut of its North family. A sparse mixture-of-experts mo
NVIDIA: Nemotron 3 Nano 30B A3B (free)	Unknown	256000	0/0	NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers

Roleplay

Model	Provider	Context Length	Price (prompt/completion)	Note
Nous: Hermes 3 405B Instruct (free)	Unknown	131072	0/0	Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m

JSON

Model	Provider	Context Length	Price (prompt/completion)	Note
Qwen: Qwen3 Coder 480B A35B (free)	Unknown	1048576	0/0	Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt
Poolside: Laguna M.1 (free)	Unknown	262144	0/0	Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engi
Google: Gemma 4 31B (free)	Unknown	262144	0/0	Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output.
NVIDIA: Nemotron 3 Nano Omni (free)	Unknown	256000	0/0	NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-age

Long Context

Model	Provider	Context Length	Price (prompt/completion)	Note
Qwen: Qwen3 Coder 480B A35B (free)	Unknown	1048576	0/0	Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt
NVIDIA: Nemotron 3 Ultra (free)	Unknown	1000000	0/0	NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters ou
Tencent: Hy3 (free)	Unknown	262144	0/0	Hy3 is a 295B-parameter Mixture-of-Experts model from Tencent (21B active, 192 experts with top-8 routing) built for rea
Poolside: Laguna M.1 (free)	Unknown	262144	0/0	Laguna M.1 is the flagship coding agent model from Poolside, optimized for complex software engi
Google: Gemma 4 31B (free)	Unknown	262144	0/0	Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output.
Qwen: Qwen3 Next 80B A3B Instruct (free)	Unknown	262144	0/0	Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo
NVIDIA: Nemotron 3 Nano Omni (free)	Unknown	256000	0/0	NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-age
OpenAI: gpt-oss-120b (free)	Unknown	131072	0/0	gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-rea
Meta: Llama 3.2 3B Instruct (free)	Unknown	131072	0/0	Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language process
Nous: Hermes 3 405B Instruct (free)	Unknown	131072	0/0	Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m
NVIDIA: Nemotron Nano 12B 2 VL (free)	Unknown	128000	0/0	NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and
NVIDIA: Nemotron Nano 9B V2 (free)	Unknown	128000	0/0	NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified mod
LiquidAI: LFM2.5-1.2B-Thinking (free)	Unknown	32768	0/0	LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—whil
