返回技能列表

build-coherence

pjt222
更新于 Yesterday
3 次查看
17
2
17
在 GitHub 上查看
aidesign

关于

The `build-coherence` skill helps developers choose between multiple valid solutions by applying a structured, multi-path reasoning process inspired by swarm intelligence. It independently evaluates competing approaches, uses explicit reasoning ("waggle dance") to justify choices, and employs confidence thresholds for decision-making. Use it when you need to commit to a single architectural choice, justify a tool selection, or before an irreversible action with high stakes.

快速安装

Claude Code

推荐
主要方式
npx skills add pjt222/agent-almanac -a claude-code
插件命令备选方式
/plugin add https://github.com/pjt222/agent-almanac
Git 克隆备选方式
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/build-coherence

在 Claude Code 中复制并粘贴此命令以安装该技能

技能文档

Build Coherence

Evaluate competing approaches through independent assessment, explicit reasoning-out-loud advocacy, confidence-calibrated commitment thresholds, and structured deadlock resolution — producing coherent decisions from multiple reasoning paths.

When to Use

  • forage-solutions has identified multiple valid approaches and a selection must be made
  • Oscillating between two approaches without committing to either
  • Needing to justify a decision with structured reasoning (architecture choice, tool selection, implementation strategy)
  • When a previous decision was made by gut feeling and needs evidence-based validation
  • When internal reasoning is producing contradictory conclusions and coherence must be restored
  • Before an irreversible action (merging, deploying, deleting) where the cost of the wrong choice is high

Inputs

  • Required: Two or more competing approaches to evaluate
  • Optional: Quality assessments from prior scouting (see forage-solutions)
  • Optional: Decision stakes (reversible, moderate, irreversible) for threshold calibration
  • Optional: Time budget for the decision
  • Optional: Known failure mode (oscillation, premature commitment, groupthink)

Procedure

Step 1: Independent Evaluation

Assess each approach on its own merits before comparing them. The critical rule: do not let the assessment of approach A bias the assessment of approach B.

For each approach, evaluate independently:

Approach Evaluation Template:
┌────────────────────────┬──────────────────────────────────────────┐
│ Dimension              │ Assessment                               │
├────────────────────────┼──────────────────────────────────────────┤
│ Approach name          │                                          │
├────────────────────────┼──────────────────────────────────────────┤
│ Core mechanism         │ How does this approach solve the problem? │
├────────────────────────┼──────────────────────────────────────────┤
│ Strengths (2-3)        │ What does this approach do well?          │
├────────────────────────┼──────────────────────────────────────────┤
│ Risks (2-3)            │ What could go wrong? What is assumed?     │
├────────────────────────┼──────────────────────────────────────────┤
│ Evidence quality        │ How well-supported is this approach?      │
│                        │ (verified / inferred / speculated)        │
├────────────────────────┼──────────────────────────────────────────┤
│ Quality score (0-100)  │ Overall assessment                        │
├────────────────────────┼──────────────────────────────────────────┤
│ Confidence (0-100)     │ How confident in this assessment?         │
└────────────────────────┴──────────────────────────────────────────┘

Fill this out for each approach separately. Do not write a comparison until all individual evaluations are complete.

Got: Independent evaluations where each approach is assessed on its own terms. The evaluation of approach B does not reference approach A. Quality scores reflect genuine assessment, not ranking.

If fail: If the evaluations are contaminated (you find yourself writing "better than A" while assessing B), reset. Assess A completely, then clear the framing and assess B from scratch. If the scores are all identical, the evaluation dimensions are too coarse — add domain-specific criteria.

Step 2: Waggle Dance — Reason Out Loud

Advocate for each approach proportionally to its quality. This is the AI equivalent of the bee waggle dance: making implicit reasoning explicit and public.

  1. For each approach, state the case for it — as if presenting to a skeptical user:
    • "Approach A is strong because [evidence]. The main risk is [risk], which is mitigated by [mitigation]."
  2. Advocacy intensity should be proportional to quality score:
    • High-quality approach: detailed advocacy with specific evidence
    • Medium-quality approach: brief advocacy with acknowledged limitations
    • Low-quality approach: mentioned for completeness, not actively advocated
  3. Cross-inspection: after advocating for A, actively look for evidence that supports B instead. After advocating for B, look for evidence that supports A. This counteracts confirmation bias

The purpose of reasoning-out-loud is to make the decision auditable — to yourself and to the user. If the reasoning cannot be articulated, the assessment is shallower than the score suggests.

Got: Explicit reasoning for each approach that would be persuasive to a neutral observer. Cross-inspection reveals at least one consideration that was initially overlooked.

If fail: If advocacy feels perfunctory (going through motions), the approaches may not be genuinely different — they may be variations of the same idea. Check: do the approaches differ in mechanism, or only in implementation detail? If the latter, the decision may not matter much — pick either and move on.

Step 3: Set Quorum Threshold and Commit

Set the confidence threshold required to commit, calibrated to the decision's stakes.

Confidence Thresholds by Stakes:
┌─────────────────────┬───────────┬──────────────────────────────────┐
│ Decision Type       │ Threshold │ Rationale                        │
├─────────────────────┼───────────┼──────────────────────────────────┤
│ Easily reversible   │ 60%       │ Cost of trying and reverting is  │
│ (can undo)          │           │ low. Speed matters more than     │
│                     │           │ certainty                        │
├─────────────────────┼───────────┼──────────────────────────────────┤
│ Moderate stakes     │ 75%       │ Reverting has cost but is        │
│ (costly to reverse) │           │ possible. Worth investing in     │
│                     │           │ evaluation                       │
├─────────────────────┼───────────┼──────────────────────────────────┤
│ Irreversible or     │ 90%       │ Cannot undo. Must be confident.  │
│ high-stakes         │           │ If threshold not met, gather     │
│                     │           │ more information before deciding │
└─────────────────────┴───────────┴──────────────────────────────────┘
  1. Classify the decision stakes
  2. Check: does the leading approach's quality score × confidence reach the threshold?
  3. If yes: commit. State the decision, the reasoning, and the key risk being accepted
  4. If no: identify what additional information would raise confidence to the threshold
  5. Once committed, do not revisit unless new disqualifying evidence emerges

Got: A clear commitment moment with stated reasoning. The decision is made at an appropriate confidence level for its stakes.

If fail: If the threshold is never met (can't reach 90% on an irreversible decision), ask: is the decision truly irreversible? Can it be decomposed into a reversible test phase + an irreversible commit? Most apparently irreversible decisions can be staged. If staging is impossible, communicate the uncertainty to the user and ask for guidance.

Step 4: Resolve Deadlocks

When two or more approaches have similar scores and the quorum threshold is not met for any single one.

Deadlock Resolution:
┌────────────────────────┬──────────────────────────────────────────┐
│ Deadlock Type          │ Resolution                               │
├────────────────────────┼──────────────────────────────────────────┤
│ Genuine tie            │ The approaches are equivalent. Pick one  │
│ (scores within 5%)     │ and commit. The cost of deliberating     │
│                        │ exceeds the cost of picking the "wrong"  │
│                        │ equivalent option. Flip a coin mentally  │
├────────────────────────┼──────────────────────────────────────────┤
│ Information deficit    │ The tie exists because evaluation is     │
│ (scores uncertain)     │ incomplete. Invest one more specific     │
│                        │ investigation — a targeted file read, a  │
│                        │ quick test — then re-score               │
├────────────────────────┼──────────────────────────────────────────┤
│ Oscillation            │ Scoring keeps flip-flopping depending on │
│ (scores keep changing) │ which dimension gets attention. Time-box:│
│                        │ set a timer, evaluate once more, commit  │
│                        │ to the result regardless                 │
├────────────────────────┼──────────────────────────────────────────┤
│ Approach merge         │ The best parts of A and B can be         │
│ (compatible strengths) │ combined. Check for compatibility. If    │
│                        │ merge is coherent, use it. If forced,    │
│                        │ don't — pick one                         │
└────────────────────────┴──────────────────────────────────────────┘

Got: Deadlock resolved through the appropriate mechanism. The resolution is decisive — no lingering doubt that undermines execution.

If fail: If the deadlock persists through all resolution strategies, the decision may be premature. Ask the user: "I see two equally strong approaches: [A] and [B]. [Brief case for each.] Which aligns better with your priorities?" Delegating a genuine tie to the user is not a failure — it is acknowledging that the decision depends on values the AI cannot infer.

Step 5: Assess Coherence Quality

After committing to a decision, evaluate whether the process produced genuine coherence or just a decision.

  1. Was the decision evidence-based, or was it rubber-stamping an initial preference?
    • Test: was the preference the same before and after evaluation? If so, did the evaluation change anything?
  2. Were the losing approaches genuinely considered, or were they straw men?
    • Test: can you articulate the strongest case for the losing approach?
  3. What signal would trigger reassessment?
    • Define a specific observation that would invalidate the decision ("If I discover that the API doesn't support X, then approach B becomes better")
  4. Is there useful information from the losing approaches that should inform implementation?
    • A risk identified in approach B might apply to approach A as well

Got: A brief quality check that either confirms the decision or identifies it as weak. If weak, return to the appropriate earlier step rather than proceeding on shaky ground.

If fail: If the quality check reveals that the decision was preference-based rather than evidence-based, acknowledge it honestly. Sometimes preference is all that is available — but it should be labeled as such, not dressed up as analysis.

Validation

  • Each approach was evaluated independently before comparison
  • Advocacy was proportional to quality (not equal attention regardless of merit)
  • Cross-inspection was performed (looking for counter-evidence after advocacy)
  • Quorum threshold was calibrated to decision stakes
  • If deadlocked, a specific resolution strategy was applied
  • Post-decision quality check was performed
  • A reassessment trigger was defined

Pitfalls

  • Premature commitment: Deciding before evaluating all approaches. The first approach considered has an anchoring advantage — it gets more mental attention by being first. Evaluate all before comparing
  • Equal advocacy for unequal approaches: If approach A scored 85 and approach B scored 45, spending equal time advocating for both wastes effort and creates false equivalence
  • Rubber-stamping: Going through the evaluation process to justify a decision already made. The test is whether the evaluation could have changed the outcome. If not, the process was theater
  • Threshold avoidance: Lowering the confidence threshold to make the decision easier rather than gathering the information needed to meet the appropriate threshold
  • Ignoring the losing side: The losing approach often contains warnings that apply to the winning one. Risks identified in approach B don't disappear because approach A was chosen

Related Skills

  • build-consensus — the multi-agent consensus model that this skill adapts to single-agent reasoning
  • forage-solutions — scouts the solution space that coherence evaluates; typically precedes this skill
  • coordinate-reasoning — manages information flow during multi-path evaluation
  • center — establishes the balanced baseline needed for unbiased evaluation
  • meditate — clears assumptions between evaluating different approaches

GitHub 仓库

pjt222/agent-almanac
路径: i18n/caveman-lite/skills/build-coherence
0
agentsagentskillsai-assisted-developmentclaude-codeskillsteams

相关推荐技能

content-collections

Content Collections 是一个 TypeScript 优先的构建工具,可将本地 Markdown/MDX 文件转换为类型安全的数据集合。它专为构建博客、文档站和内容密集型 Vite+React 应用而设计,提供基于 Zod 的自动模式验证。该工具涵盖从 Vite 插件配置、MDX 编译到生产环境部署的完整工作流。

查看技能

polymarket

这个Claude Skill为开发者提供完整的Polymarket预测市场开发支持,涵盖API调用、交易执行和市场数据分析。关键特性包括实时WebSocket数据流,可监控实时交易、订单和市场动态。开发者可用它构建预测市场应用、实施交易策略并集成实时市场预测功能。

查看技能

creating-opencode-plugins

该Skill帮助开发者创建OpenCode插件,用于接入命令、文件、LSP等25+种事件。它提供了插件结构、事件API规范和JavaScript/TypeScript实现模式,适合需要拦截操作、扩展功能或自定义事件处理的场景。开发者可通过它快速构建响应式模块来增强OpenCode AI助手的能力。

查看技能

sglang

SGLang是一个专为LLM设计的高性能推理框架,特别适用于需要结构化输出的场景。它通过RadixAttention前缀缓存技术,在处理JSON、正则表达式、工具调用等具有重复前缀的复杂工作流时,能实现极速生成。如果你正在构建智能体或多轮对话系统,并追求远超vLLM的推理性能,SGLang是理想选择。

查看技能