scale-colony
정보
이 스킬은 생물 군집을 모델로 하여 출아와 역할 분화 같은 메커니즘을 활용해 분산 시스템과 팀의 확장 전략을 제공합니다. 성장 단계를 인식하고 규모 증가에 따른 조정 실패를 방지하기 위한 아키텍처 전환을 구현하는 데 도움을 줍니다. 의사소통 부담이 생산적 성과를 초과하거나, 소규모에서는 작동하던 시스템이 성장하며 고장 나는 경우에 사용하세요.
빠른 설치
Claude Code
추천npx skills add pjt222/agent-almanac -a claude-code/plugin add https://github.com/pjt222/agent-almanacgit clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/scale-colonyClaude Code에서 이 명령을 복사하여 붙여넣어 스킬을 설치하세요
문서
Scale Colony
Scale distributed sys|teams|orgs → budding (split), role diff (age polyethism), growth-triggered arch transitions — maintain coord quality as colony grows beyond initial design.
Use When
- Worked @ 10 agents, breaks @ 50
- Comms overhead > productive output
- Implicit coord patterns need explicit
- Plan growth → scale proactive
- Coord fails correlate w/ size (lost msgs, dup work, unclear ownership)
- Existing sys needs split → semi-autonomous sub-colonies
In
- Required: Current size + target growth
- Required: Current coord mechanisms + stress points
- Optional: Structure (flat|hierarchical|clustered)
- Optional: Role diff already in place
- Optional: Growth timeline + constraints
- Optional: Inter-colony coord needs (if splitting)
Do
Step 1: Recognize Growth Phase
Identify scaling phase → apply right strategy.
- Classify phase:
Colony Growth Phases:
┌───────────┬──────────────┬───────────────────────────────────────────┐
│ Phase │ Size Range │ Characteristics │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Founding │ 1-7 agents │ Everyone does everything, direct comms, │
│ │ │ implicit coordination, high agility │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Growth │ 8-30 agents │ Roles emerge, some specialization, comms │
│ │ │ overhead increases, need for structure │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Maturity │ 30-100 agents│ Formal roles, layered coordination, │
│ │ │ sub-groups form, inter-group coordination │
├───────────┼──────────────┼───────────────────────────────────────────┤
│ Fission │ 100+ agents │ Colony too large for single coordination │
│ │ │ framework, must bud into sub-colonies │
└───────────┴──────────────┴───────────────────────────────────────────┘
- Stress signals:
- Comms overload: msgs/agent/day grows faster than colony size
- Decision latency: proposal→decision time ↑
- Coord failures: dup work, dropped tasks, conflicting actions ↑
- Knowledge dilution: newcomers slow to productive
- Identity loss: agents can't describe purpose consistently
- About to cross phase boundary or already crossed?
→ Clear phase ID + stress signals indicating approach|cross.
If err: phase unclear → measure 3 metrics: comm vol/agent, decision latency, coord fail rate. Plot over time. Inflection points = phase transitions. No metrics → likely Founding (where metrics not yet needed).
Step 2: Role Differentiation (Age Polyethism)
Progressive specialization → roles by experience + colony needs.
- Role progression:
- Newcomers: observation, learning, simple (low autonomy, high guidance)
- Workers: standard exec, signal following (mod autonomy)
- Specialists: domain expertise, complex tasks, mentor newcomers (high autonomy)
- Foragers/Scouts: exploration, innovation, external interface (see
forage-resources) - Coordinators: inter-group comms, conflict resolution, quorum mgmt
- Role transitions:
- Triggered by experience thresholds, not appointment
- Agent done threshold tasks successfully → next role (calibrate by complexity + growth rate — 5-10 simple, 20-30 specialist)
- Reverse possible (specialist → worker in new domain)
- Distribution adapts to needs:
- Growing → more newcomer slots, active mentoring
- Stable → balanced across all roles
- Threatened → more defenders, fewer scouts (see
defend-colony)
- Preserve flexibility:
- No agent permanently locked
- Emergency protocols can temp reassign any agent any role
- Cross-training → cover adjacent roles
→ Roles where agents progress simple→complex, distribution reflects needs+phase.
If err: rigid silos → ↑cross-training + rotation freq. Newcomers struggle progress → mentoring insufficient — pair w/ specialist for first N tasks. Too many in one role → triggers miscalibrated — adjust by colony-wide demand.
Step 3: Restructure Coord for Scale
Adapt mechanisms from coordinate-swarm for size.
- Replace direct comms → layered signaling:
- Founding: everyone→everyone (N×N)
- Growth: cluster squads of 5-8; direct in squad, signal between
- Maturity: squads → departments; intra-squad direct, inter-squad signal, inter-dept broadcast
- Coord layers:
- Local: in squad, direct signal exchange (stigmergy)
- Regional: between squads same dept, aggregated signals
- Colony: between depts, broadcast only for colony-wide decisions
- Inter-layer interfaces:
- Each squad has 1 designated communicator who aggregates+relays
- Communicators filter noise: not every local signal relayed up
- Colony broadcasts rare → quorum, alarm escalation, major state changes
- Comms overhead budget:
- Target: each agent <20% capacity on coord
- Measure actual; exceed → add layer or split oversized squad
→ Layered coord, comms overhead grows logarithmic (not linear) w/ size. Local fast direct; colony-wide slower but functional.
If err: layers create info bottlenecks (communicators overloaded) → add redundant communicators or ↓relay freq. Layers create isolation (squads don't know others) → ↑inter-layer signal freq or cross-squad liaison roles.
Step 4: Execute Budding (Fission)
Split colony → semi-autonomous sub-colonies when exceeds single-coord capacity.
- Fission triggers:
-
100 agents (or coord layer count >3)
- Comms overhead >30% capacity despite layering
- Decision latency exceeds time-sensitive thresholds
- Subgroups have distinct identities + can operate independent
-
- Plan fission:
- Identify natural split lines (existing clusters, domain bounds, geo)
- Each daughter has viable role distribution (can't split all specialists into one)
- Each must have: ≥1 coordinator, sufficient workers, access to shared resources
- Define inter-colony interface: what shared, what independent
- Execute split:
- Announce plan + timeline (consensus required — see
build-consensus) - Transfer agents → daughters by existing cluster membership
- Establish inter-colony channels (lightweight, async)
- Each daughter bootstraps own local coord (inheriting from parent)
- Announce plan + timeline (consensus required — see
- Post-fission stabilization:
- Monitor each for viability (sustains itself?)
- Inter-colony coord minimal (quarterly sync, not daily)
- Failed daughter → reabsorb into nearest viable
→ ≥2 viable daughters semi-autonomous w/ own coord, connected by lightweight interfaces.
If err: daughters too small → fission premature; remerge + retry larger. Inter-colony coord as heavy as pre-fission → split lines wrong, too interdependent. Re-draw on natural independence.
Step 5: Monitor Limits + Adapt
Continuous assess: structure matches size+needs?
- Scaling health metrics:
- Coord overhead ratio: time coord/time produce
- Decision throughput: decisions/time (↑ or steady w/ growth)
- Agent satisfaction: engagement, retention, purpose (drops on fail)
- Err rate: coord fails/time (not linear w/ growth)
- Limit indicators:
- Overhead ratio >25% → more automation or layer
- Throughput declining → governance needs revision
- Turnover spiking → cultural|structural issues
- Err rate accelerating → coord failing
- Trigger adapt:
- Phase transition → apply Step 1 strategy
- Limit reached → escalate (role diff → coord restructure → fission)
- External change (market, tech) → may need transformation (see
adapt-architecture)
→ Colony monitors own health + proactively adapts before stress = failure.
If err: no metrics → lacks observability — build measurement before more structure. Metrics show problems but can't adapt → resistance cultural not technical — address human factors (fear, ownership, trust) before restructure.
Check
- Phase ID'd w/ specific stress signals
- Role diff defined w/ progressive specialization
- Coord layered for size
- Comms overhead <20-25% capacity
- Fission plan exists for >single-coord capacity
- Health metrics tracked + thresholds trigger adapt
- Daughter colonies (post-fission) viable distribution
Traps
- Scale structure pre-needed: Premature layering = overhead w/o benefit. 10-team doesn't need dept coordinators. Stress signals guide.
- Preserve founding culture at all costs: 5-agent ways break @ 50. Scaling needs evolution; nostalgia prevents adaptation.
- Fission w/o independence: Sub-colonies still depend daily → worst of both — coord overhead + separation overhead.
- Uniform role distribution: Not every sub-colony needs same ratios. Research → more scouts; production → more workers.
- Ignore remerge: Sometimes fission fails; remerge best move. Treating fission irreversible prevents recovery.
→
coordinate-swarm— foundational patterns this skill scalesforage-resources— scales diff than production; role diff affects scout allocbuild-consensus— must adapt for larger groupsdefend-colony— defense scales w/ colonyadapt-architecture— morphic skill for structural transformationplan-capacity— capacity planning for growthconduct-retrospective— identify stress before failure
GitHub 저장소
연관 스킬
llamaguard
기타LlamaGuard는 폭력 및 혐오 발언 등 6가지 안전 범주에서 LLM 입력과 출력을 조정하기 위한 Meta의 70-80억 파라미터 모델입니다. 94-95% 정확도를 제공하며 vLLM, Hugging Face 또는 Amazon SageMaker를 사용해 배포할 수 있습니다. 이 기술을 사용하여 AI 애플리케이션에 콘텐츠 필터링 및 안전 가드레일을 손쉽게 통합하세요.
cost-optimization
기타이 Claude Skill은 리소스 적정화, 태깅 전략, 지출 분석을 통해 개발자들이 클라우드 비용을 최적화할 수 있도록 지원합니다. AWS, Azure, GCP에서 클라우드 비용을 절감하고 비용 거버넌스를 구현하기 위한 프레임워크를 제공합니다. 인프라 비용을 분석하거나, 리소스를 적정화하거나, 예산 제약을 충족해야 할 때 사용하세요.
quantizing-models-bitsandbytes
기타이 스킬은 bitsandbytes를 사용하여 LLM을 8비트 또는 4비트 정밀도로 양자화하며, 최소한의 정확도 손실로 50-75%의 메모리 감소를 달성합니다. 제한된 GPU 메모리에서 더 큰 모델을 실행하거나 추론을 가속화하는 데 이상적이며, INT8, NF4, FP4와 같은 형식을 지원합니다. 이 스킬은 HuggingFace Transformers와 통합되어 QLoRA 학습 및 8비트 옵티마이저를 가능하게 합니다.
dispatching-parallel-agents
기타이 Claude Skill은 3개 이상의 독립적인 문제를 동시에 조사하고 해결하기 위해 다중 에이전트를 배치합니다. 공유 상태나 의존성 없이 해결 가능한 무관련 장애 시나리오에 맞게 설계되었습니다. 핵심 기능은 병렬 문제 해결로, 각 독립 문제 영역마다 하나의 에이전트를 할당하여 효율성을 극대화합니다.
