qdrant-horizontal-scaling
About
This skill helps developers diagnose and implement Qdrant horizontal scaling strategies when data exceeds single-node capacity. It provides guidance on shard configuration, node counts, and replication for fault tolerance and performance. Use it when planning capacity expansion, adding nodes, or optimizing distributed Qdrant deployments.
Quick install
Claude Code
Recommended:
npx skills add qdrant/skills -a claude-code
/plugin add https://github.com/qdrant/skills
git clone https://github.com/qdrant/skills.git ~/.claude/skills/qdrant-horizontal-scaling
Copy and paste this command in Claude Code to install the skill.
Skill documentation
What to Do When Qdrant Needs More Capacity
Vertical first: simpler operations, no network overhead, good up to ~100M vectors per node depending on dimensions and quantization.
Horizontal when: data exceeds single-node capacity, you need fault tolerance or tenant isolation, or the workload is IOPS-bound (more nodes = more independent IOPS).
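As a rough way to judge whether a dataset still fits on one node, the sketch below applies a commonly cited rule of thumb for unquantized float32 vectors held in memory: vectors x dimensions x 4 bytes x 1.5 overhead. The function name, the 1.5 overhead factor, and the example numbers are assumptions for illustration; real usage depends on quantization, payload size, and on-disk settings.

```python
def estimate_vector_ram_bytes(num_vectors: int, dimensions: int, overhead: float = 1.5) -> int:
    """Rough RAM estimate for float32 vectors kept fully in memory.

    The overhead factor is an assumed rule of thumb, not a measured value.
    """
    bytes_per_float32 = 4
    return int(num_vectors * dimensions * bytes_per_float32 * overhead)


if __name__ == "__main__":
    needed = estimate_vector_ram_bytes(100_000_000, 768)
    print(f"Estimated RAM: {needed / 1e9:.1f} GB")
```

If the estimate is well beyond what a single machine can offer even after quantization, that is one signal that horizontal scaling is worth the added complexity.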
Most basic distributed configuration
- 3 nodes, 3 shards with replication_factor: 2 for zero-downtime scaling
A minimum of 3 nodes is important for consensus and fault tolerance. With 3 nodes, you can lose 1 node without downtime. With 2 nodes, losing 1 node causes downtime for collection operations.
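The node-count reasoning above follows from majority-based consensus: the cluster keeps accepting collection operations only while more than half of its nodes are reachable. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def tolerated_node_failures(node_count: int) -> int:
    """Number of nodes that can fail while a majority (quorum) remains."""
    quorum = node_count // 2 + 1  # smallest majority of the cluster
    return node_count - quorum


if __name__ == "__main__":
    for nodes in (2, 3, 5):
        print(nodes, "nodes tolerate", tolerated_node_failures(nodes), "failure(s)")
```

This is why a 2-node cluster offers no more consensus fault tolerance than a single node.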
Replication factor of 2 means each shard has 1 replica, so you have 2 copies of data. This allows for zero-downtime scaling and maintenance. With replication_factor: 1, zero-downtime is not guaranteed even for point-level operations, and cluster maintenance requires downtime.
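The basic configuration described here can be expressed as the request body sent when creating a collection over the REST API (PUT /collections/{collection_name}). The sketch below only builds that path and body with the standard library; the collection name, vector size, and Cosine distance are placeholder assumptions, and the helper name is hypothetical.

```python
import json


def collection_create_request(collection_name: str, vector_size: int,
                              shard_number: int = 3, replication_factor: int = 2):
    """Build the path and JSON body for creating a sharded, replicated collection."""
    path = f"/collections/{collection_name}"
    body = {
        # Placeholder vector settings; use the values that match your embeddings.
        "vectors": {"size": vector_size, "distance": "Cosine"},
        "shard_number": shard_number,
        # 2 copies of every shard: the original plus 1 replica.
        "replication_factor": replication_factor,
    }
    return path, body


if __name__ == "__main__":
    path, body = collection_create_request("my_collection", 768)
    print("PUT", path)
    print(json.dumps(body, indent=2))
```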
Choosing number of shards
Shards are the unit of data distribution. More shards allow more nodes and better distribution, but add overhead. Fewer shards reduce overhead but limit horizontal scaling.
For a cluster of 3-6 nodes, the recommended shard count is 6-12. This allows for 2-4 shards per node, which balances distribution and overhead.
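One way to pick a value inside the 6-12 range is to list the shard counts that divide evenly across every cluster size you expect to run. The helper below is a hypothetical sketch of that check, not part of any Qdrant API.

```python
def even_shard_counts(node_counts, min_shards=6, max_shards=12):
    """Shard counts in [min_shards, max_shards] divisible by every node count."""
    return [
        shards
        for shards in range(min_shards, max_shards + 1)
        if all(shards % nodes == 0 for nodes in node_counts)
    ]


if __name__ == "__main__":
    # Starting at 3 nodes and planning to grow to 4 and then 6.
    print(even_shard_counts([3, 4, 6]))
```

Choosing a count that also fits the larger planned cluster sizes is a form of over-provisioning shards up front, which avoids the need to change the shard count later.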
Changing number of shards
Use when: shard count isn't evenly divisible by node count, causing uneven distribution, or when you need to rebalance.
Resharding is expensive and time-consuming; use it as a last resort when regular data distribution is not possible. It is designed to be transparent for user operations: updates and searches should still work during resharding, with a small performance impact.
The resharding operation itself, however, takes a long time and requires moving large amounts of data between nodes.
- Resharding is available in Qdrant Cloud.
- Resharding is not available for self-hosted deployments.
Better alternatives: over-provision shards initially, or spin up a new cluster with the correct config and migrate the data.
What NOT to Do
- Do not jump to horizontal before exhausting vertical (adds complexity for no gain)
- Do not set shard_number that isn't a multiple of node count (uneven distribution)
- Do not use replication_factor: 1 in production if you need fault tolerance
- Do not add nodes without rebalancing shards (use shard move API to redistribute)
- Do not scale down RAM without load testing (cache eviction causes days-long latency incidents)
- Do not hit the collection limit by using one collection per tenant (use payload partitioning)
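Two of the points above, uneven distribution and rebalancing after adding nodes, can be sketched as follows. The first helper shows the best-case number of shard replicas each node would hold; the second builds the body for a shard move request (POST /collections/{collection_name}/cluster with a move_shard operation). The helper names and the peer IDs are hypothetical; real peer IDs come from the cluster info endpoint.

```python
def replicas_per_node(shard_number: int, replication_factor: int, node_count: int):
    """Best-case spread of shard replicas across nodes, largest first."""
    total = shard_number * replication_factor
    base, extra = divmod(total, node_count)
    # 'extra' nodes must carry one more replica than the rest.
    return [base + 1] * extra + [base] * (node_count - extra)


def move_shard_request(collection_name: str, shard_id: int,
                       from_peer_id: int, to_peer_id: int):
    """Build the path and JSON body for moving one shard between peers."""
    path = f"/collections/{collection_name}/cluster"
    body = {
        "move_shard": {
            "shard_id": shard_id,
            "from_peer_id": from_peer_id,
            "to_peer_id": to_peer_id,
        }
    }
    return path, body


if __name__ == "__main__":
    print(replicas_per_node(3, 2, 3))  # multiple of node count: even spread
    print(replicas_per_node(4, 2, 3))  # not a multiple: some nodes carry more
    # Hypothetical peer IDs, for illustration only.
    print(move_shard_request("my_collection", 0, 111, 222))
```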
