scale-colony
About
This skill offers strategies for scaling distributed systems and teams by modeling them on biological colonies, using mechanisms such as budding and role differentiation. It helps identify growth phases and implement architectural transitions to avoid coordination failures as the colony expands. Use it when communication overhead exceeds productive output or when a system that worked at small scale breaks down as it grows.
Quick install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/scale-colony
Copy and paste this command into Claude Code to install this skill.
Documentation
Scale Colony
Scale distributed sys|teams|orgs → budding (split), role diff (age polyethism), growth-triggered arch transitions — maintain coord quality as colony grows beyond initial design.
Use When
- Worked @ 10 agents, breaks @ 50
- Comms overhead > productive output
- Implicit coord patterns need explicit
- Plan growth → scale proactive
- Coord fails correlate w/ size (lost msgs, dup work, unclear ownership)
- Existing sys needs split → semi-autonomous sub-colonies
In
- Required: Current size + target growth
- Required: Current coord mechanisms + stress points
- Optional: Structure (flat|hierarchical|clustered)
- Optional: Role diff already in place
- Optional: Growth timeline + constraints
- Optional: Inter-colony coord needs (if splitting)
Do
Step 1: Recognize Growth Phase
Identify scaling phase → apply right strategy.
- Classify phase:
Colony Growth Phases:
┌───────────┬──────────────┬───────────────────────────────────────────┐
│ Phase │ Size Range │ Characteristics │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Founding │ 1-7 agents │ Everyone does everything, direct comms, │
│ │ │ implicit coordination, high agility │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Growth │ 8-30 agents │ Roles emerge, some specialization, comms │
│ │ │ overhead increases, need for structure │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Maturity │ 30-100 agents│ Formal roles, layered coordination, │
│ │ │ sub-groups form, inter-group coordination │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Fission │ 100+ agents │ Colony too large for single coordination │
│ │ │ framework, must bud into sub-colonies │
└───────────┴──────────────┴───────────────────────────────────────────┘
- Stress signals:
- Comms overload: msgs/agent/day grows faster than colony size
- Decision latency: proposal→decision time ↑
- Coord failures: dup work, dropped tasks, conflicting actions ↑
- Knowledge dilution: newcomers slow to productive
- Identity loss: agents can't describe purpose consistently
- About to cross phase boundary or already crossed?
→ Clear phase ID + stress signals indicating approach|cross.
If err: phase unclear → measure 3 metrics: comm vol/agent, decision latency, coord fail rate. Plot over time. Inflection points = phase transitions. No metrics → likely Founding (where metrics not yet needed).
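A minimal sketch of the phase lookup and the comms-overload signal, in Python. The function names are hypothetical. The table lists 30 under both Growth and Maturity and 100 under both Maturity and Fission; this sketch assumes 30 → Growth and 100 → Fission.

```python
# Upper size bound per phase, taken from the phase table.
# Assumption: boundary 30 counts as Growth, boundary 100 counts as Fission.
PHASES = [(7, "Founding"), (30, "Growth"), (99, "Maturity")]


def classify_phase(agent_count: int) -> str:
    """Map colony size to a growth phase using the table's size ranges."""
    if agent_count < 1:
        raise ValueError("colony needs at least 1 agent")
    for upper, name in PHASES:
        if agent_count <= upper:
            return name
    return "Fission"


def comms_overload(sizes: list[int], msgs_per_agent: list[float]) -> bool:
    """True if msgs/agent/day grew faster than colony size over the window.

    Compares relative growth between the first and last samples only.
    """
    if len(sizes) < 2 or len(sizes) != len(msgs_per_agent):
        raise ValueError("need two or more paired samples")
    if sizes[0] <= 0 or msgs_per_agent[0] <= 0:
        raise ValueError("first sample must be positive")
    size_growth = sizes[-1] / sizes[0]
    msg_growth = msgs_per_agent[-1] / msgs_per_agent[0]
    return msg_growth > size_growth
```

Example: size 10 → 20 (2×) while msgs/agent/day 5 → 15 (3×) flags overload; 5 → 6 (1.2×) does not.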
Step 2: Role Differentiation (Age Polyethism)
Progressive specialization → roles by experience + colony needs.
- Role progression:
- Newcomers: observation, learning, simple (low autonomy, high guidance)
- Workers: standard exec, signal following (mod autonomy)
- Specialists: domain expertise, complex tasks, mentor newcomers (high autonomy)
- Foragers/Scouts: exploration, innovation, external interface (see forage-resources)
- Coordinators: inter-group comms, conflict resolution, quorum mgmt
- Role transitions:
- Triggered by experience thresholds, not appointment
- Agent completes a threshold number of tasks successfully → next role (calibrate threshold by task complexity + growth rate, e.g. 5-10 tasks for simple roles, 20-30 for specialist)
- Reverse possible (specialist → worker in new domain)
- Distribution adapts to needs:
- Growing → more newcomer slots, active mentoring
- Stable → balanced across all roles
- Threatened → more defenders, fewer scouts (see defend-colony)
- Preserve flexibility:
- No agent permanently locked
- Emergency protocols can temp reassign any agent any role
- Cross-training → cover adjacent roles
→ Roles where agents progress simple→complex, distribution reflects needs+phase.
If err: rigid silos → ↑cross-training + rotation freq. Newcomers struggle progress → mentoring insufficient — pair w/ specialist for first N tasks. Too many in one role → triggers miscalibrated — adjust by colony-wide demand.
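Threshold-triggered transitions can be sketched as below. This is an illustration, not a prescribed implementation: the `Agent` class, function names, and the exact thresholds (8 and 25, picked inside the 5-10 and 20-30 ranges) are assumptions. Foragers/Scouts and Coordinators are left out because the text gives them no threshold.

```python
from dataclasses import dataclass

# Experience ladder from the role progression above.
LADDER = ["newcomer", "worker", "specialist"]

# Hypothetical thresholds: successful tasks needed to leave each role.
PROMOTE_AFTER = {"newcomer": 8, "worker": 25}


@dataclass
class Agent:
    name: str
    role: str = "newcomer"
    successes: int = 0  # successful tasks in the current role


def record_success(agent: Agent) -> str:
    """Count one successful task; promote when the role threshold is met."""
    agent.successes += 1
    threshold = PROMOTE_AFTER.get(agent.role)
    if threshold is not None and agent.successes >= threshold:
        agent.role = LADDER[LADDER.index(agent.role) + 1]
        agent.successes = 0  # experience counter restarts in the new role
    return agent.role


def move_to_new_domain(agent: Agent) -> None:
    """Reverse transition: a specialist restarts as a worker in a new domain."""
    if agent.role == "specialist":
        agent.role = "worker"
        agent.successes = 0
```

Promotion happens as a side effect of recorded work, so no appointment step exists; the reverse transition stays explicit.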
Step 3: Restructure Coord for Scale
Adapt mechanisms from coordinate-swarm for size.
- Replace direct comms → layered signaling:
- Founding: everyone→everyone (N×N)
- Growth: cluster squads of 5-8; direct in squad, signal between
- Maturity: squads → departments; intra-squad direct, inter-squad signal, inter-dept broadcast
- Coord layers:
- Local: in squad, direct signal exchange (stigmergy)
- Regional: between squads same dept, aggregated signals
- Colony: between depts, broadcast only for colony-wide decisions
- Inter-layer interfaces:
- Each squad has 1 designated communicator who aggregates+relays
- Communicators filter noise: not every local signal relayed up
- Colony broadcasts rare → quorum, alarm escalation, major state changes
- Comms overhead budget:
- Target: each agent <20% capacity on coord
- Measure actual; exceed → add layer or split oversized squad
→ Layered coord: comms overhead grows logarithmically (not linearly) w/ size. Local = fast + direct; colony-wide = slower but functional.
If err: layers create info bottlenecks (communicators overloaded) → add redundant communicators or ↓relay freq. Layers create isolation (squads don't know others) → ↑inter-layer signal freq or cross-squad liaison roles.
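A rough sketch of why layering helps, under stated assumptions: full-mesh channels are counted as unordered pairs, every unit (squad, dept) groups at most `squad_size` members, and the default of 6 is just a value inside the 5-8 range. All names are hypothetical.

```python
import math


def full_mesh_channels(n: int) -> int:
    """Direct channels when everyone talks to everyone: n*(n-1)/2."""
    return n * (n - 1) // 2


def layers_needed(n: int, squad_size: int = 6) -> int:
    """Coordination layers if each unit groups at most `squad_size` members."""
    layers = 1
    units = math.ceil(n / squad_size)  # squads at the local layer
    while units > 1:
        units = math.ceil(units / squad_size)  # group units one level up
        layers += 1
    return layers


def over_budget(coord_hours: float, total_hours: float, budget: float = 0.20) -> bool:
    """True when coordination takes `budget` or more of an agent's capacity."""
    return coord_hours / total_hours >= budget
```

Going from 10 to 50 agents takes a full mesh from 45 to 1225 channels, while the layer count under these assumptions goes from 2 to 3.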
Step 4: Execute Budding (Fission)
Split colony → semi-autonomous sub-colonies when exceeds single-coord capacity.
- Fission triggers:
- 100+ agents (or coord layer count >3)
- Comms overhead >30% capacity despite layering
- Decision latency exceeds time-sensitive thresholds
- Subgroups have distinct identities + can operate independent
- Plan fission:
- Identify natural split lines (existing clusters, domain bounds, geo)
- Each daughter has viable role distribution (can't split all specialists into one)
- Each must have: ≥1 coordinator, sufficient workers, access to shared resources
- Define inter-colony interface: what shared, what independent
- Execute split:
- Announce plan + timeline (consensus required; see build-consensus)
- Transfer agents → daughters by existing cluster membership
- Establish inter-colony channels (lightweight, async)
- Each daughter bootstraps own local coord (inheriting from parent)
- Post-fission stabilization:
- Monitor each for viability (sustains itself?)
- Inter-colony coord minimal (quarterly sync, not daily)
- Failed daughter → reabsorb into nearest viable
→ ≥2 viable daughters semi-autonomous w/ own coord, connected by lightweight interfaces.
If err: daughters too small → fission premature; remerge + retry larger. Inter-colony coord as heavy as pre-fission → split lines wrong, too interdependent. Re-draw on natural independence.
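The trigger and viability checks can be sketched as follows. Assumptions: the size trigger fires at 100 agents or more, "sufficient workers" is modeled as a hypothetical `min_workers` parameter, and role names are plain strings. The text does not say how many triggers must fire, so the sketch only reports which ones did.

```python
from collections import Counter


def fission_triggers(
    agent_count: int,
    coord_layers: int,
    comms_overhead: float,       # fraction of capacity spent on coord
    latency_exceeded: bool,      # time-sensitive thresholds missed
    subgroups_independent: bool,  # distinct identities, can run alone
) -> list[str]:
    """Return the names of the fission triggers that currently fire."""
    fired = []
    if agent_count >= 100 or coord_layers > 3:
        fired.append("size")
    if comms_overhead > 0.30:
        fired.append("overhead")
    if latency_exceeded:
        fired.append("latency")
    if subgroups_independent:
        fired.append("independence")
    return fired


def is_viable(roles: list[str], shared_access: bool, min_workers: int = 3) -> bool:
    """Daughter needs >=1 coordinator, enough workers, shared-resource access."""
    counts = Counter(roles)
    return (
        counts["coordinator"] >= 1
        and counts["worker"] >= min_workers
        and shared_access
    )
```

Run `is_viable` on every proposed daughter before the split; a daughter made only of specialists fails the check.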
Step 5: Monitor Limits + Adapt
Continuous assess: structure matches size+needs?
- Scaling health metrics:
- Coord overhead ratio: time coord/time produce
- Decision throughput: decisions/time (↑ or steady w/ growth)
- Agent satisfaction: engagement, retention, purpose (drops on fail)
- Err rate: coord fails/time (not linear w/ growth)
- Limit indicators:
- Overhead ratio >25% → more automation or layer
- Throughput declining → governance needs revision
- Turnover spiking → cultural|structural issues
- Err rate accelerating → coord failing
- Trigger adapt:
- Phase transition → apply Step 1 strategy
- Limit reached → escalate (role diff → coord restructure → fission)
- External change (market, tech) → may need transformation (see adapt-architecture)
→ Colony monitors own health + proactively adapts before stress = failure.
If err: no metrics → lacks observability — build measurement before more structure. Metrics show problems but can't adapt → resistance cultural not technical — address human factors (fear, ownership, trust) before restructure.
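One way to turn the limit indicators into a check, as a sketch only. The `Period` fields are illustrative names. "Spiking" and "accelerating" are not quantified in the text, so `spike_factor` and the three-period comparison are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Period:
    """One measurement period (e.g. a week); field names are illustrative."""
    coord_hours: float
    productive_hours: float
    decisions: int
    coord_failures: int
    departures: int


def limit_indicators(history: list[Period], spike_factor: float = 2.0) -> list[str]:
    """Return the limit indicators hit in the latest of >= 3 periods."""
    if len(history) < 3:
        raise ValueError("need at least 3 periods to see acceleration")
    a, b, c = history[-3:]
    hits = []
    if c.coord_hours / c.productive_hours > 0.25:
        hits.append("overhead ratio >25%: add automation or a layer")
    if c.decisions < b.decisions:
        hits.append("throughput declining: revise governance")
    # Assumption: a spike is departures reaching spike_factor x last period.
    if b.departures > 0 and c.departures >= spike_factor * b.departures:
        hits.append("turnover spiking: cultural or structural issue")
    # Accelerating = the period-over-period increase is itself increasing.
    if (c.coord_failures - b.coord_failures) > (b.coord_failures - a.coord_failures) > 0:
        hits.append("error rate accelerating: coordination failing")
    return hits
```

An empty list means no limit reached; any hit feeds the escalation order (role diff → coord restructure → fission).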
Check
- Phase ID'd w/ specific stress signals
- Role diff defined w/ progressive specialization
- Coord layered for size
- Comms overhead <20-25% capacity
- Fission plan exists for >single-coord capacity
- Health metrics tracked + thresholds trigger adapt
- Daughter colonies (post-fission) viable distribution
Traps
- Scale structure pre-needed: Premature layering = overhead w/o benefit. 10-team doesn't need dept coordinators. Stress signals guide.
- Preserve founding culture at all costs: 5-agent ways break @ 50. Scaling needs evolution; nostalgia prevents adaptation.
- Fission w/o independence: Sub-colonies still depend daily → worst of both — coord overhead + separation overhead.
- Uniform role distribution: Not every sub-colony needs same ratios. Research → more scouts; production → more workers.
- Ignore remerge: Sometimes fission fails; remerge best move. Treating fission irreversible prevents recovery.
Related
- coordinate-swarm: foundational patterns this skill scales
- forage-resources: scales diff than production; role diff affects scout alloc
- build-consensus: must adapt for larger groups
- defend-colony: defense scales w/ colony
- adapt-architecture: morphic skill for structural transformation
- plan-capacity: capacity planning for growth
- conduct-retrospective: identify stress before failure
