unleash-the-agents
について
このスキルは、明確な解決策が見えない複雑なクロスドメイン問題に対して、多様な仮説を生成するために複数のAIエージェントを並列起動します。単一エージェントアプローチが失敗する場合や、深い専門性よりも広範な視点の探索が必要な場合に理想的です。このプロセスは、収束分析と敵対的洗練を伴った、ランク付けされた仮説セットを出力します。
クイックインストール
Claude Code
推奨npx skills add pjt222/agent-almanac -a claude-code/plugin add https://github.com/pjt222/agent-almanacgit clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/unleash-the-agentsこのコマンドをClaude Codeにコピー&ペーストしてスキルをインストールします
ドキュメント
Unleash the Agents
Consult all available agents in parallel waves to generate diverse hypotheses for open-ended problems. Each agent reasons through its unique domain lens — a kabalist finds patterns via gematria, a martial-artist proposes conditional branching, a contemplative notices structure by sitting with the data. Convergence across independent perspectives is the primary signal that a hypothesis has merit.
When to Use
- Facing a cross-domain problem where the correct approach is unknown
- A single-agent or single-domain approach has stalled or produced no signal
- The problem benefits from genuinely diverse perspectives (not just more compute)
- You need hypothesis generation, not execution (use teams for execution)
- High-stakes decisions where missing a non-obvious angle carries real cost
Inputs
- Required: Problem brief — a clear description of the problem, 5+ concrete examples, and what counts as a solution
- Required: Verification method — how to test whether a hypothesis is correct (programmatic test, expert review, or null model comparison)
- Optional: Agent subset — specific agents to include or exclude (default: all registered agents)
- Optional: Wave size — number of agents per wave (default: 10)
- Optional: Output format — structured template for agent responses (default: hypothesis + reasoning + confidence + testable prediction)
Procedure
Step 1: Prepare the Brief
Write a problem brief that any agent can understand regardless of domain expertise. Include:
- Problem statement: What you are trying to discover or decide (1-2 sentences)
- Examples: At least 5 concrete input/output examples or data points (more is better — 3 is too few for most agents to find patterns)
- Known constraints: What you already know, what has already been tried
- Success criteria: How to recognize a correct hypothesis
- Output template: The exact format you want responses in
## Brief: [Problem Title]
**Problem**: [1-2 sentence statement]
**Examples**:
1. [Input] → [Output] (explain what's known)
2. [Input] → [Output]
3. [Input] → [Output]
4. [Input] → [Output]
5. [Input] → [Output]
**Already tried**: [List failed approaches to avoid rediscovery]
**Success looks like**: [Testable criterion]
**Respond with**:
- Hypothesis: [Your proposed mechanism in one sentence]
- Reasoning: [Why your domain expertise suggests this]
- Confidence: [low/medium/high]
- Testable prediction: [If my hypothesis is correct, then X should be true]
Got: A brief that is self-contained — an agent receiving only this text has everything needed to reason about the problem.
If fail: If you cannot articulate 5 examples or a verification method, the problem is not ready for multi-agent consultation. Narrow the scope first.
Step 2: Plan the Waves
List all available agents and divide them into waves of ~10. Ordering does not matter for the first 2 waves; for subsequent waves, inter-wave knowledge injection improves results.
# List all agents from registry
grep ' - id: ' agents/_registry.yml | sed 's/.*- id: //' | shuf
Assign agents to waves. Plan for 4 waves initially — you may not need all of them (see early stopping in Step 4).
| Wave | Agents | Brief variant |
|---|---|---|
| 1-2 | 20 agents | Standard brief |
| 3 | 10 agents + advocatus-diaboli | Brief + emerging consensus + adversarial challenge |
| 4+ | 10 agents each | Brief + "X is confirmed. Focus on edge cases and failures." |
Got: A wave assignment table with all agents allocated. Include advocatus-diaboli in Wave 3 (not later) so the adversarial pass informs subsequent waves.
If fail: If fewer than 20 agents are available, reduce to 2-3 waves. The pattern still works with as few as 10 agents, though convergence signals are weaker.
Step 3: Launch Waves
Launch each wave as parallel agents. Use sonnet model for cost efficiency (the value comes from perspective diversity, not individual depth).
Option A: TeamCreate (recommended for full unleash)
Use Claude Code's TeamCreate tool to set up a coordinated team with task tracking. TeamCreate is a deferred tool — fetch it first via ToolSearch("select:TeamCreate").
- Create the team:
TeamCreate({ team_name: "unleash-wave-1", description: "Wave 1: open-ended hypothesis generation" }) - Create a task per agent using
TaskCreatewith the brief and domain-specific framing - Spawn each agent as a teammate using the
Agenttool withteam_name: "unleash-wave-1"andsubagent_typeset to the agent's type (e.g.,kabalist,geometrist) - Assign tasks to teammates via
TaskUpdatewithowner - Monitor progress via
TaskList— teammates mark tasks completed as they finish - Between waves, shut down the current team via
SendMessage({ type: "shutdown_request" })and create the next team with the updated brief (Step 4)
This gives you built-in coordination: a shared task list tracks which agents have responded, teammates can be messaged for follow-up, and the lead manages wave transitions through task assignment.
Option B: Raw Agent spawning (simpler, for smaller runs)
For each agent in the wave, spawn it with the brief and a domain-specific framing:
Use the [agent-name] agent to analyze this problem through your domain expertise.
[Paste the brief]
Think about this from your specific perspective as a [agent-description].
[For non-technical agents: add a domain-specific framing, e.g., "What patterns
does your tradition recognize in systems that exhibit this kind of threshold behavior?"]
Respond exactly in the requested format.
Launch all agents in a wave simultaneously using the Agent tool with run_in_background: true. Wait for the wave to complete before launching the next wave (to enable inter-wave knowledge injection in Step 4).
Choosing between options
| TeamCreate | Raw Agent | |
|---|---|---|
| Best for | Tier 3 full unleash (40+ agents) | Tier 2 panel (5-10 agents) |
| Coordination | Task list, messaging, ownership | Fire-and-forget, manual collection |
| Inter-wave handoff | Task status carries over | Must track manually |
| Overhead | Higher (team setup per wave) | Lower (single tool call per agent) |
Got: Each wave returns ~10 structured responses within 2-5 minutes. Agents that fail to respond or return off-format output are noted but do not block the pipeline.
If fail: If more than 50% of a wave fails, check the brief clarity. Common cause: the output template is ambiguous, or examples are insufficient for non-domain agents to reason about.
Step 4: Inject Inter-Wave Knowledge (and Evaluate Early Stopping)
After waves 1-2, extract the emerging signal before launching the next wave.
- Scan responses from completed waves for recurring themes
- Identify the most common hypothesis family (the convergence signal)
- Check the early stopping threshold: if the top family already exceeds 3x the null model expectation after 20 agents, you have strong signal. Plan Wave 3 as an adversarial + refinement wave and consider stopping after it
- Update the brief for the next wave:
**Update from prior waves**: [N] agents independently proposed [hypothesis family].
Build on this — what explains the remaining cases where this hypothesis fails?
Do NOT simply restate this finding. Extend, challenge, or refine it.
Early stopping guidance: Not every unleash needs all agents. For well-defined problem domains (e.g., codebase analysis), convergence often stabilizes at 30-40 agents. For abstract or open-ended problems (e.g., unknown mathematical transformations), the full roster adds value because the correct domain is genuinely unpredictable. Check convergence after each wave — if the top family's count and null-model ratio have plateaued, additional waves yield diminishing returns.
This prevents rediscovery (where later waves independently re-derive what earlier waves already found) and directs later agents toward the edges of the problem.
Got: Later waves produce more nuanced, targeted hypotheses that address gaps in the emerging consensus.
If fail: If no convergence appears after 2 waves, the problem may be too unconstrained. Consider narrowing the scope or providing more examples.
Step 5: Collect and Deduplicate
After all waves complete, gather all responses into a single document. Deduplicate by grouping hypotheses into families:
- Extract all hypothesis statements
- Cluster by mechanism (not by wording — "modular arithmetic mod 94" and "cyclic group over Z_94" are the same family)
- Count independent discoveries per family
- Rank by convergence: families discovered by more agents independently rank higher
Got: A ranked list of hypothesis families with convergence counts, contributing agents, and representative testable predictions.
If fail: If every hypothesis is unique (no convergence), the signal-to-noise ratio is too low. Either the problem needs more examples, or agents need a tighter output format.
Step 6: Verify Against Null Model
Test the top hypothesis against a null model to ensure the convergence is meaningful, not an artifact of shared training data.
- Programmatic verification: If the hypothesis produces a testable formula or algorithm, run it against held-out examples
- Null model: Estimate the probability that N agents would converge on the same hypothesis family by chance (e.g., if there are K reasonable hypothesis families, random convergence probability is ~N/K)
- Threshold: Signal is meaningful if convergence exceeds 3x the null model expectation
Got: The top hypothesis family significantly exceeds chance-level convergence and/or passes programmatic verification.
If fail: If the top hypothesis fails verification, check the second-ranked family. If no family passes, the problem may require a different approach (deeper single-expert analysis, more data, or reformulated examples).
Step 7: Adversarial Refinement
Preferred timing: Wave 3, not post-synthesis. Including advocatus-diaboli in Wave 3 (alongside the inter-wave knowledge injection) is more effective than a standalone adversarial pass after all waves complete. Early challenge lets Waves 4+ refine against the critique rather than piling onto an unchallenged consensus.
If the adversarial pass was already part of Wave 3, this step becomes a final check. If not (e.g., you ran all waves without it), spawn advocatus-diaboli (or senior-researcher) now. For a structured pass, use TeamCreate to stand up a review team with both agents working in parallel against the consensus:
Here is the consensus hypothesis from [N] independent agents:
[Hypothesis]
[Supporting evidence and convergence stats]
Your job: find the strongest counterarguments. Where does this fail?
What alternative explanations are equally consistent with the evidence?
What experiment would definitively falsify this hypothesis?
Got: A set of counterarguments, edge cases, and a falsification experiment. If the hypothesis survives adversarial scrutiny, it is ready for integration. A good adversarial pass sometimes partially defends the consensus — finding that the design is better than alternatives even if imperfect.
If fail: If the adversarial agent finds a fatal flaw, feed the critique back into a targeted follow-up wave (Tier 3+ iterative mode — select 5-10 agents best positioned to address the specific critique).
Step 8: Hand Off to Teams
Unleash finds problems; teams solve them. Convert verified hypothesis families into actionable issues, then assemble focused teams to resolve each.
- Create a GitHub issue per verified hypothesis family (use the
create-github-issuesskill) - Prioritize issues by convergence strength and impact
- For each issue, assemble a small team via
TeamCreate:- If a predefined team definition in
teams/matches the problem domain, use it - If no fitting team exists, default to
opaque-team(N shapeshifters with adaptive role assignment) — it handles unknown problem shapes without requiring a custom composition - Include at least one non-technical agent (e.g.,
advocatus-diaboli,contemplative) — they catch implementation risks that technical agents miss - Use REST checkpoints between phases to prevent rushing
- If a predefined team definition in
- The pipeline is: unleash → triage → team-per-issue → resolve
Got: Each hypothesis family maps to a tracked issue with a team assigned. The unleash produced the diagnosis; the teams produce the fix.
If fail: If the team composition doesn't match the problem, reassign. Shapeshifter agents can research and design but lack write tools — the team lead must apply their code suggestions.
Validation
- All available agents were consulted (or a deliberate subset was chosen with justification)
- Responses were collected in a structured, parseable format
- Hypotheses were deduplicated and ranked by independent convergence
- The top hypothesis was verified against a null model or programmatic test
- An adversarial pass challenged the consensus
- The final hypothesis includes testable predictions and known limitations
Pitfalls
- Too few examples in the brief: Agents need 5+ examples to find patterns. With 3 examples, most agents resort to surface-level pattern matching or template echo (repeating the brief back in different words).
- No verification path: Without a way to test hypotheses, you cannot distinguish signal from noise. Convergence alone is necessary but not sufficient.
- Metaphorical responses: Domain-specialist agents (mystic, shaman, kabalist) may respond with rich metaphorical reasoning that is hard to parse programmatically. Include "Express your hypothesis as a testable formula or algorithm" in the output template.
- Rediscovery across waves: Without inter-wave knowledge injection, waves 3-7 independently rediscover what waves 1-2 already found. Always update the brief between waves.
- Over-interpreting convergence: 43% convergence on a mechanism family sounds impressive, but check the base rate. If there are only 3 plausible mechanism families, random convergence would be ~33%.
- Expecting single-family dominance: Abstract problems (pattern recognition, cryptography) tend to produce one dominant hypothesis family. Multi-dimensional problems (codebase analysis, system design) produce broader convergence across multiple valid families — this is expected and healthy, not a failure of the pattern.
- Generic framing for non-technical agents: The quality of a non-technical agent's contribution depends on how the brief frames the problem in their domain language. "What does your tradition say about systems at this threshold?" produces structural insight; a generic brief produces nothing. Invest in domain-specific framing for agents outside the problem's natural domain.
- Using this for execution: This pattern generates hypotheses, not implementations. Once you have verified hypotheses, convert them to issues and hand off to teams (Step 8). The pipeline is unleash → triage → team-per-issue.
Related Skills
forage-solutions— ant colony optimization for exploring solution spaces (complementary: narrower scope, deeper exploration)build-coherence— bee democracy for selecting among competing approaches (use after this skill to choose between top hypotheses)coordinate-reasoning— stigmergic coordination for managing information flow between agentscoordinate-swarm— broader swarm coordination patterns for distributed systemsexpand-awareness— open perception before narrowing (complementary: use as individual agent preparation)meditate— clear context noise before launching (recommended before Step 1)
GitHub リポジトリ
関連スキル
content-collections
メタこのスキルは、Content Collections(Markdown/MDXファイルを型安全なデータコレクションに変換するTypeScriptファーストのツール)の本番環境でテストされた設定を提供します。Zodバリデーションによる型安全性を実現し、ブログ、ドキュメントサイト、コンテンツ重視のVite + Reactアプリケーション構築時にご利用ください。Viteプラグインの設定、MDXコンパイルから、デプロイ最適化、スキーマバリデーションまで、すべてを網羅しています。
polymarket
メタこのスキルは、開発者がPolymarket予測市場プラットフォームを活用したアプリケーション構築を可能にします。API統合による取引や市場データの取得に加え、WebSocketを介したリアルタイムデータストリーミングにより、ライブ取引や市場活動を監視できます。取引戦略の実装や、ライブ市場更新を処理するツールの作成にご利用ください。
creating-opencode-plugins
メタこのスキルは、開発者がコマンド、ファイル、LSP操作など25種類以上のイベントタイプにフックするOpenCodeプラグインを作成することを支援します。JavaScript/TypeScriptモジュール向けに、プラグイン構造、イベントAPI仕様、および実装パターンを提供します。カスタムイベント駆動ロジックでOpenCode AIアシスタントのライフサイクルをインターセプト、監視、または拡張する必要がある場合にご利用ください。
sglang
メタSGLangは、高性能なLLMサービングフレームワークであり、RadixAttentionプレフィックスキャッシュを活用したJSON、正規表現、エージェントワークフロー向けの高速で構造化された生成を特長とします。特にプレフィックスが繰り返されるタスクにおいて、大幅に高速な推論を実現し、複雑な構造化出力やマルチターン対話に最適です。制約付きデコードが必要な場合や、広範なプレフィックス共有を伴うアプリケーションを構築する場合は、vLLMなどの代替案ではなくSGLangを選択してください。
