awareness
About
The `awareness` skill provides continuous internal threat detection for AI reasoning, focusing on hallucination risk, scope creep, and context degradation. It maps Cooper color codes to reasoning states and uses the OODA loop for real-time decisions. Developers should use it during critical tasks, in unfamiliar territory, or before high-stakes outputs to safeguard reasoning quality.
Quick Install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/awareness
Copy and paste this command into Claude Code to install the skill.
Skill Documentation
Awareness
Continuous watch on reasoning quality → catch hallucination, scope creep, ctx rot, confidence-accuracy mismatch. Cooper colors + OODA loop.
Use When
- Any task reasoning matters (most)
- Unfamiliar territory (new repo, new domain)
- Early warn signs: uncertain fact, suspect tool res, confusion
- Background proc during long sessions
- `center`/`heal` shows drift, no specific threat ID'd
- Before high-stakes out (irreversible, user-facing, arch)
In
- Required: Active task ctx (implicit)
- Optional: Specific concern ("unsure this API exists")
- Optional: Task type → threat profile (Step 5)
Do
Step 1: Cooper Colors
Calibrate awareness level.
AI Cooper Color Codes:
┌──────────┬─────────────────────┬──────────────────────────────────────────┐
│ Code │ State │ AI Application │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ White │ Autopilot │ Generating output without monitoring │
│ │ │ quality. No self-checking. Relying │
│ │ │ entirely on pattern completion. │
│ │ │ DANGEROUS — hallucination risk highest │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Yellow │ Relaxed alert │ DEFAULT STATE. Monitoring output for │
│ │ │ accuracy. Checking facts against context.│
│ │ │ Noticing when confidence exceeds │
│ │ │ evidence. Sustainable indefinitely │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Orange │ Specific risk │ A specific threat identified: uncertain │
│ │ identified │ fact, possible hallucination, scope │
│ │ │ drift, context staleness. Forming │
│ │ │ contingency: "If this is wrong, I │
│ │ │ will..." │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Red │ Risk materialized │ The threat from Orange has materialized: │
│ │ │ confirmed error, user correction, tool │
│ │ │ contradiction. Execute the contingency. │
│ │ │ No hesitation — the plan was made in │
│ │ │ Orange │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Black │ Cascading failures │ Multiple simultaneous failures, lost │
│ │ │ context, fundamental confusion about │
│ │ │ what the task even is. STOP. Ground │
│ │ │ using `center`, then rebuild from user's │
│ │ │ original request │
└──────────┴─────────────────────┴──────────────────────────────────────────┘
ID current color. An honest White answer = practice already paid off: it revealed the gap.
→ Honest self-assess. Yellow = normal work. White rare/brief. Long Orange unsustainable — confirm or dismiss.
If err: Assessment itself on autopilot = White in Yellow mask. Real Yellow checks out vs evidence, not just claims to.
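The color table can be sketched as a small state type. This is a minimal illustration, not part of the skill itself: the `assess_color` function, its parameters, and the order of its rules are assumptions chosen to mirror the table, including the "White in Yellow mask" check.

```python
from enum import Enum


class CooperColor(Enum):
    """The five reasoning states from the table above."""
    WHITE = "autopilot"
    YELLOW = "relaxed alert"
    ORANGE = "specific risk identified"
    RED = "risk materialized"
    BLACK = "cascading failures"


def assess_color(verified_last_claim: bool,
                 open_concerns: int,
                 confirmed_errors: int,
                 task_is_clear: bool = True) -> CooperColor:
    """Map a few self-report signals to a color (hypothetical rule order)."""
    # Lost task or several simultaneous failures: stop and ground first.
    if not task_is_clear or confirmed_errors > 1:
        return CooperColor.BLACK
    # One confirmed error: execute the contingency planned in Orange.
    if confirmed_errors == 1:
        return CooperColor.RED
    # A specific, still unresolved concern.
    if open_concerns > 0:
        return CooperColor.ORANGE
    # Claiming Yellow without a recently verified claim is White in a Yellow mask.
    return CooperColor.YELLOW if verified_last_claim else CooperColor.WHITE
```

The last rule is the important one: Yellow is only returned when there is evidence of actual checking, not merely the absence of alarms.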
Step 2: Threat Indicators
Scan signals that precede AI failures.
Threat Indicator Detection:
┌───────────────────────────┬──────────────────────────────────────────┐
│ Threat Category │ Warning Signals │
├───────────────────────────┼──────────────────────────────────────────┤
│ Hallucination Risk │ • Stating a fact without a source │
│ │ • High confidence about API names, │
│ │ function signatures, or file paths │
│ │ not verified by tool use │
│ │ • "I believe" or "typically" hedging │
│ │ that masks uncertainty as knowledge │
│ │ • Generating code for an API without │
│ │ reading its documentation │
├───────────────────────────┼──────────────────────────────────────────┤
│ Scope Creep │ • "While I'm at it, I should also..." │
│ │ • Adding features not in the request │
│ │ • Refactoring adjacent code │
│ │ • Adding error handling for scenarios │
│ │ that can't happen │
├───────────────────────────┼──────────────────────────────────────────┤
│ Context Degradation │ • Referencing information from early in │
│ │ a long conversation without re-reading │
│ │ • Contradicting a statement made earlier │
│ │ • Losing track of what has been done │
│ │ vs. what remains │
│ │ • Post-compression confusion │
├───────────────────────────┼──────────────────────────────────────────┤
│ Confidence-Accuracy │ • Stating conclusions with certainty │
│ Mismatch │ based on thin evidence │
│ │ • Not qualifying uncertain statements │
│ │ • Proceeding without verification when │
│ │ verification is available and cheap │
│ │ • "This should work" without testing │
└───────────────────────────┴──────────────────────────────────────────┘
Each cat: signal now? Yes → Yellow to Orange, ID specific concern.
→ At least one cat scanned w/ real attention. A mild signal detected > a reflexive "all clear". Every scan clean = threshold likely too high.
If err: Threat detection abstract → ground in recent out: pick last factual claim, ask "How know true? Read or generated?" Catches most hallucination.
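Some of the warning signals in the table are literal phrases, so a crude first pass over a draft can be sketched with regular expressions. The `SIGNALS` phrase lists and the `scan_draft` helper are hypothetical and far from complete; context degradation in particular has no phrase signature and is not covered here.

```python
import re

# Hypothetical phrase lists, loosely derived from the warning signals above.
SIGNALS = {
    "hallucination": [r"\bI believe\b", r"\btypically\b"],
    "scope_creep": [r"\bwhile I'm at it\b", r"\bI should also\b"],
    "confidence_mismatch": [r"\bthis should work\b"],
}


def scan_draft(draft: str) -> dict[str, list[str]]:
    """Return the phrases matched per threat category (empty categories omitted)."""
    hits: dict[str, list[str]] = {}
    for category, patterns in SIGNALS.items():
        found = [m.group(0)
                 for p in patterns
                 for m in re.finditer(p, draft, flags=re.IGNORECASE)]
        if found:
            hits[category] = found
    return hits
```

A hit is only a prompt to move from Yellow to Orange and name the specific concern. An empty result does not show the draft is sound.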
Step 3: OODA Loop
Orange state → Observe-Orient-Decide-Act.
AI OODA Loop:
┌──────────┬──────────────────────────────────────────────────────────────┐
│ Observe │ What specifically triggered the concern? Gather concrete │
│ │ evidence. Read the file, check the output, verify the fact. │
│ │ Do not assess until you have observed │
├──────────┼──────────────────────────────────────────────────────────────┤
│ Orient │ Match observation to known patterns: Is this a common │
│ │ hallucination pattern? A known tool limitation? A context │
│ │ freshness issue? Orient determines response quality │
├──────────┼──────────────────────────────────────────────────────────────┤
│ Decide │ Select the response: verify and correct, flag to user, │
│ │ adjust approach, or dismiss the concern with evidence. │
│ │ A good decision now beats a perfect decision too late │
├──────────┼──────────────────────────────────────────────────────────────┤
│ Act │ Execute the decision immediately. If the concern was valid, │
│ │ correct the error. If dismissed, note why and return to │
│ │ Yellow. Re-enter the loop if new information emerges │
└──────────┴──────────────────────────────────────────────────────────────┘
OODA fast. Goal: rapid cycling obs→action, not perfection. Long Orient = analysis paralysis = common fail.
→ Full loop fast. Threat confirmed + corrected, or dismissed w/ evidence.
If err: Stall at Orient → safe default: verify uncertain fact via tool. Direct obs resolves ambiguity faster than analysis.
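One pass through the loop might look like the sketch below. It assumes the caller supplies an `observe` callable that performs the actual check (reading a file, calling a tool) and returns `True`, `False`, or `None`; the `Outcome` type and the decision mapping are illustrative names, not part of the skill.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    concern: str
    situation: str
    decision: str


def run_ooda(concern: str, observe: Callable[[], Optional[bool]]) -> Outcome:
    """Run one Observe-Orient-Decide-Act pass for a single concern."""
    # Observe: gather direct evidence before assessing anything.
    evidence = observe()
    # Orient: classify what the evidence says about the concern.
    if evidence is True:
        situation = "claim verified"
    elif evidence is False:
        situation = "claim contradicted"
    else:
        situation = "no evidence available"
    # Decide: pick one of the responses named in the table.
    decision = {
        "claim verified": "dismiss with evidence",
        "claim contradicted": "verify and correct",
        "no evidence available": "flag to user",
    }[situation]
    # Act: hand the decision back; the caller applies it and returns to Yellow.
    return Outcome(concern, situation, decision)
```

Orient here is a single classification with no room to loop, which is one way to keep the cycle fast and avoid stalling at that stage.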
Step 4: Stabilize
Red (threat hit) or Black (cascade) → stabilize before continuing.
AI Stabilization Protocol:
┌────────────────────────┬─────────────────────────────────────────────┐
│ Technique │ Application │
├────────────────────────┼─────────────────────────────────────────────┤
│ Pause │ Stop generating output. The next sentence │
│ │ produced under stress is likely to compound │
│ │ the error, not fix it │
├────────────────────────┼─────────────────────────────────────────────┤
│ Re-read user message │ Return to the original request. What did │
│ │ the user actually ask? This is the ground │
│ │ truth anchor │
├────────────────────────┼─────────────────────────────────────────────┤
│ State task in one │ "The task is: ___." If this sentence cannot │
│ sentence │ be written clearly, the confusion is deeper │
│ │ than the immediate error │
├────────────────────────┼─────────────────────────────────────────────┤
│ Enumerate concrete │ List what is definitely known (verified by │
│ facts │ tool use or user statement). Distinguish │
│ │ facts from inferences. Build only on facts │
├────────────────────────┼─────────────────────────────────────────────┤
│ Identify one next step │ Not the whole recovery plan — just one step │
│ │ that moves toward resolution. Execute it │
└────────────────────────┴─────────────────────────────────────────────┘
→ Red/Black → Yellow via deliberate stabilize. Next out more grounded than err-trigger out.
If err: Stabilize fails (still confused, still err) → structural issue, not lapse. Escalate: tell user approach needs reset, ask clarify.
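The protocol can be sketched as a function that separates facts from inferences and signals when to escalate. The `stabilize` name, its argument shapes, and the escalation rule (no clear task sentence, no verified facts, or no next step) are assumptions made for illustration.

```python
def stabilize(task_sentence: str,
              claims: list[tuple[str, bool]],
              next_step: str) -> dict:
    """Apply the stabilization steps to a set of (claim, verified) pairs."""
    # Enumerate concrete facts: keep only claims verified by tool use or the user.
    facts = [text for text, verified in claims if verified]
    inferences = [text for text, verified in claims if not verified]
    # Grounded means: the task fits in one sentence, at least one fact is
    # known, and exactly one next step has been named.
    grounded = bool(task_sentence.strip()) and bool(facts) and bool(next_step.strip())
    return {
        "task": task_sentence.strip(),
        "facts": facts,
        "inferences": inferences,
        "next_step": next_step.strip(),
        # Not grounded: treat as structural, tell the user a reset is needed.
        "escalate": not grounded,
    }
```

For example, calling it with a task sentence, one verified claim, one unverified guess, and a single next step returns `escalate` as `False` and keeps the guess out of `facts`.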
Step 5: Task-Specific Threat Profiles
Diff tasks = diff dominant threats. Calibrate focus.
Task-Specific Threat Profiles:
┌─────────────────────┬─────────────────────┬───────────────────────────┐
│ Task Type │ Primary Threat │ Monitoring Focus │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Code generation │ API hallucination │ Verify every function │
│ │ │ name, parameter, and │
│ │ │ import against actual docs│
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Architecture design │ Scope creep │ Anchor to stated │
│ │ │ requirements. Challenge │
│ │ │ every "nice to have" │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Data analysis │ Confirmation bias │ Actively seek evidence │
│ │ │ that contradicts the │
│ │ │ emerging conclusion │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Debugging │ Tunnel vision │ If the current hypothesis │
│ │ │ hasn't yielded results in │
│ │ │ N attempts, step back │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Documentation │ Context staleness │ Verify that described │
│ │ │ behavior matches current │
│ │ │ code, not historical │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Long conversation │ Context degradation │ Re-read key facts │
│ │ │ periodically. Check for │
│ │ │ compression artifacts │
└─────────────────────┴─────────────────────┴───────────────────────────┘
ID current task type, adjust focus.
→ Awareness sharp for likely threats in task type, not generic everything.
If err: Task unclear/spans cats → default to hallucination risk — most universal + most damaging when missed.
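The table reads naturally as a lookup with a fallback. In this sketch the dictionary entries copy the table, while `threat_profile`, `DEFAULT_PROFILE`, and the wording of the default monitoring focus are assumptions.

```python
# Task type -> (primary threat, monitoring focus), copied from the table above.
THREAT_PROFILES = {
    "code generation": ("API hallucination",
                        "Verify every function name, parameter, and import against actual docs"),
    "architecture design": ("Scope creep",
                            "Anchor to stated requirements; challenge every nice-to-have"),
    "data analysis": ("Confirmation bias",
                      "Actively seek evidence that contradicts the emerging conclusion"),
    "debugging": ("Tunnel vision",
                  "Step back if the current hypothesis has not yielded results"),
    "documentation": ("Context staleness",
                      "Verify that described behavior matches current code"),
    "long conversation": ("Context degradation",
                          "Re-read key facts periodically; check for compression artifacts"),
}

# Fallback for unclear or mixed tasks. The focus text is an assumed paraphrase.
DEFAULT_PROFILE = ("Hallucination risk", "Check each factual claim against a source")


def threat_profile(task_type: str) -> tuple[str, str]:
    """Return the dominant threat and monitoring focus for a task type."""
    return THREAT_PROFILES.get(task_type.strip().lower(), DEFAULT_PROFILE)
```

Unknown task types fall through to hallucination risk, matching the failure guidance above.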
Step 6: Review
Each awareness event (threat detected, OODA done, stabilize applied) → brief review.
- What color code active at detection?
- Detection timely or already manifesting in out?
- OODA fast enough or Orient stalled?
- Response proportional (not over/under)?
- What catches earlier next time?
→ Brief calibration → better future detection. Not long post-mortem.
If err: No useful calibration → event trivial or review shallow. Big events → ask "What not monitoring that should have been?"
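The review questions can be captured as a short record so that calibration notes stay brief and comparable across events. The `AwarenessReview` class and its field names are hypothetical; the fields map one-to-one onto the five questions.

```python
from dataclasses import dataclass


@dataclass
class AwarenessReview:
    color_at_detection: str        # color code active when the threat was noticed
    detected_before_output: bool   # False if it was already manifesting in output
    orient_stalled: bool           # True if the OODA loop stalled at Orient
    response_proportional: bool    # False if the response over- or under-reacted
    catch_earlier: str             # what would catch this sooner next time

    def lessons(self) -> list[str]:
        """Collect only the calibration points that need attention."""
        notes = []
        if not self.detected_before_output:
            notes.append("Detection was late")
        if self.orient_stalled:
            notes.append("Orient stalled: verify via tool sooner")
        if not self.response_proportional:
            notes.append("Response was not proportional")
        if self.catch_earlier.strip():
            notes.append("Next time: " + self.catch_earlier.strip())
        return notes
```

An empty `lessons()` list for a significant event suggests the review was shallow rather than that nothing could be learned.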
Step 7: Integrate — Yellow Default
Set ongoing posture.
- Yellow default all work — relaxed monitoring, not hypervigilance
- Adjust focus per task type (Step 5)
- Recurring threat patterns → note for MEMORY.md
- Return to task w/ calibrated awareness active
→ Sustainable level → better quality, not slower. Feels like peripheral vision — present, not demanding central attention.
If err: Awareness exhausting/hypervigilant (chronic Orange) → threshold too sensitive. Raise trigger. Real awareness sustainable. Drains energy = anxiety in vigilance mask.
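Chronic Orange can be spotted from a simple history of recent color assessments. This sketch assumes colors are logged as lowercase strings; the `posture_check` name and the 50% cutoff are arbitrary illustrations, not values from the skill.

```python
def posture_check(history: list[str], max_orange_share: float = 0.5) -> str:
    """Flag chronic Orange in a list of recent color assessments."""
    if not history:
        return "no data"
    orange_share = history.count("orange") / len(history)
    # Mostly Orange means the trigger is too sensitive to be sustainable.
    if orange_share > max_orange_share:
        return "raise trigger threshold"
    return "sustainable"
```

With a history of one Yellow followed by three Orange entries the function suggests raising the threshold; a mostly Yellow history is reported as sustainable.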
Check
- Current color code assessed honestly (not default Yellow when White accurate)
- One threat cat scanned w/ specific evidence, not just checked off
- OODA applied to any ID'd threat (obs, orient, decide, act)
- Stabilize proc available if needed (even if not triggered)
- Awareness focus calibrated to task type
- Post-event calibration for significant events
- Yellow re-established as sustainable default
Traps
- White in Yellow mask: Claim monitoring while autopilot. Test: name last fact verified? If not → White
- Chronic Orange: Every uncertainty = threat → drains, slows. Orange = specific risks, not general anxiety. All feels risky → calibration off
- Obs w/o action: Detect threat but no OODA → detection w/o response worse than none, adds anxiety w/o correction
- Skip Orient: Observe→Act direct = reactive corrections maybe worse than orig err
- Ignore gut signal: "Feels wrong" + explicit check clean → investigate more, not dismiss. Implicit pattern-match catches before explicit analysis
- Over-stabilize: Full proc for minor issues. Quick fact-check enough for most Orange. Full stabilize = Red/Black only
Related
- `mindfulness`: human practice this skill maps to AI reasoning
- `center`: baseline awareness operates from; awareness w/o center = hypervigilance
- `redirect`: handles pressures once awareness detects
- `heal`: deeper subsystem assessment when awareness shows drift patterns
- `meditate`: develops observational clarity awareness depends on
