unleash-the-agents
About
This skill launches multiple AI agents in parallel to generate diverse hypotheses for complex, cross-domain problems where the solution space is unclear. It is ideal when standard approaches have stalled or when broad exploration is needed over deep, narrow expertise. The output is a ranked set of hypotheses refined through convergence analysis and adversarial review.
Quick Install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/unleash-the-agents
Copy and paste this command into Claude Code to install the skill.
Skill Documentation
Unleash the Agents
Consult all agents in parallel waves → diverse hypotheses for open-ended problems. Each agent reasons through unique domain lens — kabalist via gematria, martial-artist via conditional branching, contemplative by sitting w/ data. Convergence across independent perspectives = primary signal of merit.
Use When
- Cross-domain problem → correct approach unknown
- Single-agent|single-domain stalled or no signal
- Problem benefits from genuinely diverse perspectives (not more compute)
- Need hypothesis generation, not exec (use teams for exec)
- High-stakes → missing non-obvious angle costs
In
- Required: Problem brief — clear description, 5+ concrete examples, what counts as solution
- Required: Verify method — how to test hypothesis (programmatic, expert review, null model)
- Optional: Agent subset — include|exclude (default: all registered)
- Optional: Wave size — agents per wave (default: 10)
- Optional: Out format — structured template (default: hypothesis + reasoning + confidence + testable prediction)
Do
Step 1: Brief
Write brief any agent can understand regardless of domain. Include:
- Problem: Discover|decide (1-2 sent)
- Examples: 5+ concrete in/out|data points (more better — 3 too few)
- Constraints: Known + tried
- Success: Recognize correct hypothesis
- Out template: Exact format
## Brief: [Problem Title]
**Problem**: [1-2 sentence statement]
**Examples**:
1. [Input] → [Output] (explain what's known)
2. [Input] → [Output]
3. [Input] → [Output]
4. [Input] → [Output]
5. [Input] → [Output]
**Already tried**: [List failed approaches to avoid rediscovery]
**Success looks like**: [Testable criterion]
**Respond with**:
- Hypothesis: [Your proposed mechanism in one sentence]
- Reasoning: [Why your domain expertise suggests this]
- Confidence: [low/medium/high]
- Testable prediction: [If my hypothesis is correct, then X should be true]
Got: Self-contained brief — agent receiving only this has all to reason.
If err: Can't articulate 5 examples|verify method → problem not ready for multi-agent. Narrow scope first.
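The readiness rule above (5+ examples, a verify method, a success criterion) can be checked mechanically before any agent is launched. The sketch below is one possible way to do that; the `Brief` class, its field names, and `readiness_issues` are hypothetical helpers invented for illustration and are not part of the skill.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical container mirroring the brief template fields."""
    problem: str
    examples: list  # (input, output) pairs
    already_tried: list = field(default_factory=list)
    success: str = ""
    verify_method: str = ""


def readiness_issues(brief, min_examples=5):
    """Return a list of reasons the brief is not ready for a multi-agent run."""
    issues = []
    if len(brief.examples) < min_examples:
        issues.append(f"only {len(brief.examples)} examples; need {min_examples}+")
    if not brief.success.strip():
        issues.append("no success criterion")
    if not brief.verify_method.strip():
        issues.append("no verify method")
    return issues
```

An empty list means the brief meets the minimum bar; any entry is a reason to narrow scope first.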
Step 2: Plan Waves
List all agents, divide into waves of ~10. Order doesn't matter for waves 1-2; for subsequent waves → inter-wave knowledge injection improves results.
# List all agents from registry
grep ' - id: ' agents/_registry.yml | sed 's/.*- id: //' | shuf
Assign agents to waves. Plan 4 waves initially → may not need all (early stop Step 4).
| Wave | Agents | Brief variant |
|---|---|---|
| 1-2 | 20 agents | Standard brief |
| 3 | 10 agents + advocatus-diaboli | Brief + emerging consensus + adversarial challenge |
| 4+ | 10 agents each | Brief + "X is confirmed. Focus on edge cases and failures." |
Got: Wave assignment table w/ all agents allocated. advocatus-diaboli in Wave 3 (not later) → adversarial informs subsequent waves.
If err: < 20 agents → reduce to 2-3 waves. Pattern works w/ as few as 10 agents, but w/ weaker convergence signals.
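The wave assignment described above can be sketched as a shuffle-and-chunk step. This is a minimal illustration, not the skill's own tooling: `plan_waves`, its parameters, and the fallback to the last wave when fewer than three waves exist are all assumptions.

```python
import random


def plan_waves(agent_ids, wave_size=10, adversary="advocatus-diaboli", seed=None):
    """Shuffle agents, split into waves, and place the adversary in wave 3."""
    rng = random.Random(seed)
    pool = [a for a in agent_ids if a != adversary]
    rng.shuffle(pool)
    waves = [pool[i:i + wave_size] for i in range(0, len(pool), wave_size)]
    if adversary in agent_ids and waves:
        # Wave 3 is index 2; with fewer than 3 waves, fall back to the last one (assumption).
        target = min(2, len(waves) - 1)
        waves[target].append(adversary)
    return waves
```

With 40 agents plus advocatus-diaboli this yields four waves, the third holding 10 agents plus the adversary, matching the table above.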
Step 3: Launch Waves
Launch each wave parallel. sonnet model → cost efficiency (value from perspective diversity, not depth).
Option A: TeamCreate (recommended for full unleash)
TeamCreate for coordinated team w/ task tracking. Deferred tool → fetch via ToolSearch("select:TeamCreate").
- Create team: TeamCreate({ team_name: "unleash-wave-1", description: "Wave 1: open-ended hypothesis generation" })
- TaskCreate per agent → brief + domain-specific framing
- Spawn each agent as teammate via Agent w/ team_name: "unleash-wave-1" + subagent_type (e.g., kabalist, geometrist)
- Assign tasks via TaskUpdate w/ owner
- Monitor via TaskList → teammates mark completed
- Between waves → shut down via SendMessage({ type: "shutdown_request" }) + create next w/ updated brief (Step 4)
Built-in coord: shared task list tracks responses, teammates messaged for follow-up, lead manages wave transitions.
Option B: Raw Agent spawning (simpler, smaller runs)
Per agent in wave, spawn w/ brief + domain framing:
Use the [agent-name] agent to analyze this problem through your domain expertise.
[Paste the brief]
Think about this from your specific perspective as a [agent-description].
[For non-technical agents: add a domain-specific framing, e.g., "What patterns
does your tradition recognize in systems that exhibit this kind of threshold behavior?"]
Respond exactly in the requested format.
Launch all in wave simultaneous via Agent w/ run_in_background: true. Wait for wave complete before next (inter-wave knowledge injection Step 4).
Choosing options
| | TeamCreate | Raw Agent |
|---|---|---|
| Best for | Tier 3 full unleash (40+ agents) | Tier 2 panel (5-10 agents) |
| Coordination | Task list, messaging, ownership | Fire-and-forget, manual collection |
| Inter-wave handoff | Task status carries over | Must track manually |
| Overhead | Higher (team setup per wave) | Lower (single tool call per agent) |
Got: Each wave returns ~10 structured responses in 2-5 min. Failed|off-format responses are noted but don't block the wave.
If err: > 50% wave fails → check brief clarity. Common: out template ambiguous|examples insufficient for non-domain agents.
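Collecting a wave is easier if off-format responses are flagged automatically. The sketch below parses the four fields of the default response template and computes the share of a wave that failed to parse, which can then be compared against the 50% rule above. `parse_response` and `wave_failure_rate` are hypothetical names, and the line-based matching is an assumption about how agents lay out their answers.

```python
import re

FIELDS = ("Hypothesis", "Reasoning", "Confidence", "Testable prediction")


def parse_response(text):
    """Return a dict of the four template fields, or None if off-format."""
    parsed = {}
    for name in FIELDS:
        pattern = rf"^\s*-?\s*{re.escape(name)}\s*:\s*(.+)$"
        match = re.search(pattern, text, re.MULTILINE | re.IGNORECASE)
        if not match:
            return None
        parsed[name.lower()] = match.group(1).strip()
    if parsed["confidence"].lower() not in {"low", "medium", "high"}:
        return None
    return parsed


def wave_failure_rate(responses):
    """Fraction of responses in a wave that could not be parsed."""
    if not responses:
        return 0.0
    failed = sum(1 for r in responses if parse_response(r) is None)
    return failed / len(responses)
```

A rate above 0.5 is the cue to revisit the brief rather than the agents.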
Step 4: Inject Inter-Wave Knowledge (+ Eval Early Stop)
After waves 1-2, extract emerging signal before next.
- Scan responses → recurring themes
- ID most common hypothesis family (convergence signal)
- Early stop check: top family > 3x null model expectation after 20 agents → strong signal. Plan Wave 3 as adversarial + refinement, consider stop after.
- Update brief for next wave:
**Update from prior waves**: [N] agents independently proposed [hypothesis family].
Build on this — what explains the remaining cases where this hypothesis fails?
Do NOT simply restate this finding. Extend, challenge, or refine it.
Early stop guidance: Not every unleash needs all agents. Well-defined domains (codebase analysis) → convergence stabilizes 30-40 agents. Abstract|open-ended (unknown math transformations) → full roster adds value because correct domain genuinely unpredictable. Check convergence after each wave → top family count + null-model ratio plateaued → diminishing returns.
Prevents rediscovery (later waves re-deriving earlier finds) + directs later agents to edges.
Got: Later waves → more nuanced, targeted hypotheses addressing gaps in emerging consensus.
If err: No convergence after 2 waves → too unconstrained. Narrow scope or more examples.
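The early stop check can be expressed as two small functions: one for the ratio of the top family's count to the chance expectation N/K, and one that asks whether that ratio is above 3x and has stopped moving between waves. This is a sketch under assumptions: the function names and the 10% `plateau_tolerance` are invented here, and the source does not define how flat counts as "plateaued".

```python
def convergence_ratio(top_family_count, agents_consulted, plausible_families):
    """Observed top-family count divided by the chance expectation N/K."""
    expected = agents_consulted / plausible_families
    return top_family_count / expected


def should_consider_stopping(ratios, threshold=3.0, plateau_tolerance=0.1):
    """ratios: one cumulative convergence ratio per completed wave.

    True when the latest ratio exceeds the threshold and changed by no more
    than plateau_tolerance (relative) since the previous wave.
    """
    if len(ratios) < 2 or ratios[-1] <= threshold:
        return False
    return abs(ratios[-1] - ratios[-2]) <= plateau_tolerance * ratios[-2]
```

For example, 9 of 20 agents on one family with 8 plausible families gives a ratio of 9 / 2.5 = 3.6, which clears the 3x bar; whether to stop then depends on the next wave's ratio staying close to it.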
Step 5: Collect + Dedupe
After waves complete, gather responses → single doc. Dedupe by grouping into families:
- Extract all hypothesis statements
- Cluster by mechanism (not wording — "modular arithmetic mod 94" + "cyclic group over Z_94" same family)
- Count independent discoveries per family
- Rank by convergence: families w/ more independent discoveries higher
Got: Ranked list of families w/ convergence counts, contributing agents, representative testable predictions.
If err: Every hypothesis unique (no convergence) → S/N too low. Need more examples or tighter out format.
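Once each response has been assigned to a family, counting and ranking is mechanical. The sketch below assumes the mechanism-level clustering has already been done by hand or by a reviewer and arrives as (agent, family) pairs; `rank_families` and the family labels in the test are hypothetical.

```python
from collections import defaultdict


def rank_families(labeled):
    """labeled: iterable of (agent_id, family) pairs.

    Returns [(family, independent_count, agents)] sorted by convergence,
    with ties broken alphabetically by family name.
    """
    agents_by_family = defaultdict(set)
    for agent_id, family in labeled:
        # A set counts each agent at most once per family.
        agents_by_family[family].add(agent_id)
    ranked = [
        (family, len(agents), sorted(agents))
        for family, agents in agents_by_family.items()
    ]
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked
```

Using a set per family keeps a single agent that repeats itself from inflating the convergence count.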
Step 6: Verify vs Null Model
Test top hypothesis vs null model → ensure convergence meaningful, not training data artifact.
- Programmatic verify: Hypothesis produces testable formula|algo → run vs held-out examples
- Null model: Estimate chance-level convergence (N agents, K reasonable families → ~N/K agents per family at random)
- Threshold: Signal meaningful if convergence > 3x null model
Got: Top family significantly exceeds chance-level convergence and/or passes programmatic verify.
If err: Top fails verify → check 2nd-ranked. None passes → different approach (deeper single-expert, more data, reformulated examples).
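Beyond the N/K rule of thumb, one way to quantify "by chance" is a binomial tail: the probability that at least the observed number of agents land on one given family if every agent picked uniformly at random among K families. This is a sketch under strong assumptions that the source does not state: agents are independent, families are equally likely, and the family is fixed in advance. Because the top family is chosen after the fact, the result understates the true chance level. `chance_tail_probability` is a hypothetical helper.

```python
from math import comb


def chance_tail_probability(n_agents, k_families, observed):
    """P(at least `observed` of n agents pick one fixed family) under uniform choice."""
    p = 1 / k_families
    return sum(
        comb(n_agents, i) * p**i * (1 - p) ** (n_agents - i)
        for i in range(observed, n_agents + 1)
    )
```

Two observations follow from the arithmetic of the 3x rule itself. With K families the chance expectation is N/K, so 43 of 100 agents on one of 3 families is a ratio of about 43 / 33.3 ≈ 1.3, well short of 3x. And with K ≤ 3 the bar 3 × N/K is at least N, so the rule can only ever be met when more than 3 plausible families exist.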
Step 7: Adversarial Refinement
Preferred timing: Wave 3, not post-synthesis. advocatus-diaboli in Wave 3 (alongside inter-wave injection) > standalone after all waves. Early challenge → Waves 4+ refine vs critique not pile on unchallenged consensus.
If adversarial part of Wave 3, this = final check. If not (ran all waves w/o it) → spawn advocatus-diaboli (or senior-researcher) now. For a structured pass → TeamCreate a review team w/ both agents working in parallel against the consensus:
Here is the consensus hypothesis from [N] independent agents:
[Hypothesis]
[Supporting evidence and convergence stats]
Your job: find the strongest counterarguments. Where does this fail?
What alternative explanations are equally consistent with the evidence?
What experiment would definitively falsify this hypothesis?
Got: Counterarguments, edge cases, falsification experiment. Survives adversarial → ready for integration. Good adversarial sometimes partially defends consensus → design better than alts even if imperfect.
If err: Adversarial finds fatal flaw → feed critique → targeted follow-up wave (Tier 3+ iterative — 5-10 agents best for critique).
Step 8: Hand Off → Teams
Unleash finds problems; teams solve. Convert verified families → actionable issues, assemble focused teams.
- GitHub issue per verified family (use create-github-issues)
- Prioritize by convergence strength + impact
- Per issue, assemble small team via TeamCreate:
  - Predefined team in teams/ matches → use it
  - No fit → default opaque-team (N shapeshifters, adaptive role) → handles unknown shapes w/o custom comp
  - Include 1+ non-tech agent (advocatus-diaboli, contemplative) → catches implementation risks tech misses
  - REST checkpoints between phases → prevent rushing
- Pipeline: unleash → triage → team-per-issue → resolve
Got: Each family maps → tracked issue w/ team. Unleash diagnosed; teams fix.
If err: Team comp doesn't match → reassign. Shapeshifters can research+design but lack write tools → lead applies code suggestions.
Check
- All available agents consulted (or deliberate subset w/ justification)
- Responses collected in structured, parseable format
- Hypotheses deduped + ranked by independent convergence
- Top verified vs null model or programmatic test
- Adversarial pass challenged consensus
- Final hypothesis includes testable predictions + known limits
Traps
- Too few examples: Agents need 5+. W/ 3 → surface pattern matching|template echo (brief back in different words).
- No verify path: W/o test, can't distinguish signal from noise. Convergence necessary but not sufficient.
- Metaphorical responses: Domain specialists (mystic, shaman, kabalist) → rich metaphor hard to parse programmatically. Include "Express as testable formula or algorithm" in template.
- Rediscovery across waves: W/o inter-wave injection → waves 3-7 rediscover what 1-2 found. Update brief between.
- Over-interpret convergence: 43% on family sounds impressive → check base rate. 3 plausible families → random ~33%.
- Single-family dominance expectation: Abstract problems (pattern recog, crypto) → one dominant family. Multi-dim (codebase, sys design) → broader convergence across multiple valid → expected, healthy, not failure.
- Generic framing for non-tech: Quality depends on framing in domain language. "What does your tradition say about systems at threshold?" → structural insight; generic → nothing. Invest in domain-specific framing.
- Use for exec: Pattern generates hypotheses, not implementations. Verified → convert to issues + teams (Step 8). Pipeline: unleash → triage → team-per-issue.
Related
- forage-solutions: ant colony opt for solution spaces (complementary: narrower, deeper)
- build-coherence: bee democracy → select among competing approaches (after this skill to choose between top hypotheses)
- coordinate-reasoning: stigmergic coord for managing info flow
- coordinate-swarm: broader swarm coord for distributed systems
- expand-awareness: open perception before narrow (complementary: individual agent prep)
- meditate: clear ctx noise before launch (recommended before Step 1)
