observe-guidance
About
This skill guides users through systematic observation to understand systems before acting. It coaches neutral data collection, pattern recognition, and structured reporting for debugging or research. Use it when users need to gather evidence before forming conclusions or when preparing an evidence-based analysis.
Quick install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/observe-guidance
Copy and paste the command into Claude Code to install the skill.
Skill documentation
Observe (Guidance)
Coach human in field study: frame → protocol → witness → record → analyze → report. Separate fact from interpretation.
Use When
- Person wants understand system before intervene (debug by obs, not trial-error)
- Conducting research / evidence → needs structured method
- Person jumps to conclusions → needs obs discipline
- Preparing evidence-based report (not opinion)
- Team dynamics, user behavior, process effectiveness via direct obs
- After meditate-guidance cultivated attention → direct it at system
In
- Required: What to observe (system, process, behavior, codebase, team, phenomenon)
- Required: Why (debug, research, audit, curiosity, improvement)
- Optional: Time available (single vs multi-day)
- Optional: Prior attempts
- Optional: Specific Qs / hypotheses
- Optional: Recording tools (notebook, screen capture, logging, metrics)
Do
Step 1: Frame
Help set bounded frame.
- Ask what: "What system/behavior trying to understand?"
- Narrow scope: "What specific aspect interests you most?"
- Purpose: understanding / debug / improve / evidence / curiosity
- Boundaries: in/out scope → prevents endless expansion
- Hypothesis? state explicit, then set aside → "look for evidence both for + against"
- Stance:
  - Naturalist: no interfere (best for behavior)
  - Controlled: change one var, observe effect (best for debug)
  - Longitudinal: over time (best for trends)
→ Clear frame: target, scope, purpose, stance defined.
If err: can't narrow ("understand everything") → pick one entry point: "what behavior most confusing?" Already committed conclusion ("just prove X") → gently challenge: "what would disprove it?"
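The frame can also be written down as a small record so that nothing is left implicit. This is a minimal sketch, not part of the skill itself: the `Frame` and `Stance` names, the field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stance(Enum):
    NATURALIST = "naturalist"      # no interference (behavior)
    CONTROLLED = "controlled"      # change one variable, observe effect (debug)
    LONGITUDINAL = "longitudinal"  # observe over time (trends)


@dataclass
class Frame:
    """Hypothetical container for the four things Step 1 asks for."""
    target: str
    purpose: str
    stance: Stance
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)
    hypothesis: str = ""  # stated explicitly, then set aside

    def is_clear(self) -> bool:
        """Clear once target, purpose, and at least one in-scope item are set."""
        return bool(self.target.strip() and self.purpose.strip() and self.in_scope)


# Illustrative values only.
frame = Frame(
    target="checkout API latency",
    purpose="debug",
    stance=Stance.CONTROLLED,
    in_scope=["p95 latency under load"],
    out_of_scope=["frontend rendering"],
)
print(frame.is_clear())  # True
```

An empty `in_scope` list is treated as "can't narrow yet", which is the cue to pick one entry point.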
Step 2: Prep protocol
Systematic recording.
- Method by type:
  - Codebase/system: paths, line numbers, timestamps, log entries
  - Behavior/process: time-stamped notes — actor, action, context
  - Team/communication: quotes, speaker IDs, non-verbal cues
  - Natural/physical: sketches, measurements, env conditions
- Template:
Field Notes Template:
┌─────────────┬────────────────────────────────────────────────────────┐
│ Timestamp │ When the observation occurred │
├─────────────┼────────────────────────────────────────────────────────┤
│ Observation │ What was seen/heard/measured (fact only) │
├─────────────┼────────────────────────────────────────────────────────┤
│ Context │ What was happening around the observation │
├─────────────┼────────────────────────────────────────────────────────┤
│ Reaction │ Observer's response (thoughts, emotions, surprises) │
├─────────────┼────────────────────────────────────────────────────────┤
│ Hypothesis │ Tentative interpretation (kept separate from fact) │
└─────────────┴────────────────────────────────────────────────────────┘
- Stress separation: "obs row = fact. hypothesis row = interpretation. Never mix."
- Min count: "≥10 obs before any conclusion"
- Set up monitoring tools if applicable
→ Recording method ready. Person gets obs↔interpretation distinction. Prepared.
If err: too formal → simplify: "write what you see, separately what you think it means." Resist recording ("I'll remember") → unrecorded = memory bias; writing makes obs accurate.
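For anyone recording digitally, the template rows map onto a small record type. The sketch below is one possible shape, assuming Python 3.9+; `FieldNote`, `ready_to_conclude`, and the sample note are hypothetical names and data chosen for illustration. The threshold of 10 comes from the protocol above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FieldNote:
    """One row of field notes; mirrors the five template fields."""
    timestamp: datetime   # when the observation occurred
    observation: str      # what was seen/heard/measured (fact only)
    context: str = ""     # what was happening around the observation
    reaction: str = ""    # observer's response (thoughts, surprises)
    hypothesis: str = ""  # tentative interpretation, kept apart from fact


MIN_OBSERVATIONS = 10  # protocol: ">=10 obs before any conclusion"


def ready_to_conclude(notes: list[FieldNote]) -> bool:
    """True once enough non-empty factual observations are recorded."""
    factual = [n for n in notes if n.observation.strip()]
    return len(factual) >= MIN_OBSERVATIONS


# Illustrative note; the date and details are made up.
notes = [
    FieldNote(
        timestamp=datetime(2024, 1, 15, 14, 32),
        observation="System stopped responding after 47th request",
        context="Load test running, single client",
        hypothesis="Possible connection pool exhaustion",
    )
]
print(ready_to_conclude(notes))  # False: 1 observation so far
```

Keeping `observation` and `hypothesis` as separate fields makes the "never mix" rule structural rather than a matter of discipline alone.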
Step 3: Witness
Guide actual obs session.
- Remind stance: "naturalist studying new species. No interfere — just watch"
- First 5min: pure obs no recording — just attend
- After immersion: begin recording w/ template
- Coach neutral lang: instead "system crashed" → "system stopped responding 14:32 after 47th request"
- Watch interpretation creeping: "that's interpretation — record in hypothesis row"
- Note surprises: "what surprised? surprises = most valuable data"
- Check frame: "still observing what set out, or drifted?"
- Wants to intervene: "note what + why, but don't change yet — keep observing"
→ ≥5-10 concrete obs w/ specific evidence. Experiences obs vs interpret diff. Finds harder than expected.
If err: keep interpreting → exercise: "describe as if to someone never seen this. Only verifiable facts." Run out fast → too high level → zoom in: timing, ordering, edge cases, exceptions.
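Interpretation creep can sometimes be caught mechanically. The following is a rough heuristic sketch, not a substitute for coaching: the word list is a hypothetical starter set and will miss many cases and flag some legitimate wording.

```python
import re

# Hypothetical starter list of words that tend to carry interpretation.
INTERPRETIVE_WORDS = {"crashed", "broken", "because", "obviously", "should"}


def flag_interpretation(observation: str) -> list[str]:
    """Return interpretive words found in an observation, in order of appearance."""
    words = re.findall(r"[a-z']+", observation.lower())
    return [w for w in words if w in INTERPRETIVE_WORDS]


print(flag_interpretation("System crashed"))
# ['crashed']
print(flag_interpretation("System stopped responding 14:32 after 47th request"))
# []
```

A flagged word is a prompt to ask "what did you actually see?", then move the interpretation into the hypothesis row.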
Step 4: Record
Organize raw → structured.
- Review together
- Completeness: enough context for later?
- Factual accuracy: verifiable, or hidden assumptions?
- Group similar: "patterns forming?"
- Frequencies: how often?
- Absences: "what expected but not there?"
- Strong (clear evidence) vs weak (ambiguous)
→ Organized field notes cleanly separate obs from interpretation. Detailed enough another can verify.
If err: too vague ("things slow") → add specifics: "how slow? compared to what? which conditions?" Too detailed (record everything) → which relate to frame, which noise.
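Frequencies and absences from the review above can be tallied in a few lines. A minimal sketch, assuming each observation has been given a short tag during grouping; the tags, notes, and `expected_tags` set are invented for illustration.

```python
from collections import Counter

# Hypothetical tagged observations: (tag, factual note).
observations = [
    ("timeout", "Request 47 got no response within 30 s"),
    ("timeout", "Request 52 got no response within 30 s"),
    ("retry", "Client re-sent request 47 after 31 s"),
    ("timeout", "Request 60 got no response within 30 s"),
]

# What the observer expected to see, written down before the review.
expected_tags = {"timeout", "retry", "error-log-entry"}

# Frequencies: how often did each kind of observation occur?
frequencies = Counter(tag for tag, _ in observations)

# Absences: what was expected but never observed?
absences = sorted(expected_tags - set(frequencies))

print(frequencies.most_common())  # [('timeout', 3), ('retry', 1)]
print(absences)                   # ['error-log-entry']
```

Writing expectations down before counting matters: an absence is only visible against a stated expectation.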
Step 5: Analyze
Obs → structured analysis.
- Look for patterns:
  - Repetition: "happened many times — systematic?"
  - Correlation: "X always w/ Y — related?"
  - Sequence: "A always before B — A causes B?"
  - Absence: "X never in condition Z — why?"
  - Anomaly: "all follow P except this — what diff?"
- Each pattern: "alternative explanation?"
- 2-3 hypotheses
- Correlation ≠ causation: "co-occurrence ≠ proof of cause"
- Testable + what test confirms/refutes
- Confidence levels: well-supported vs speculative
→ Raw obs → structured hypotheses, data/theory separation kept. ≥1 testable hypothesis for original Q.
If err: jumps single explanation → challenge: "one possibility. another?" No patterns → too few obs → continue. Every obs same conclusion → filtering → ask: "what would contradict your theory?"
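The sequence and correlation patterns from Step 5 can be checked against recorded events before being stated as hypotheses. This is a sketch under assumptions: events are plain strings in observed order, sessions are sets of event names, and both helper functions are hypothetical. Neither result shows causation; they only say whether the pattern holds in the data collected.

```python
def always_precedes(events: list[str], a: str, b: str) -> bool:
    """True if every occurrence of b has at least one a somewhere before it.

    Vacuously True when b never occurs.
    """
    seen_a = False
    for event in events:
        if event == a:
            seen_a = True
        elif event == b and not seen_a:
            return False
    return True


def co_occurrence_rate(sessions: list[set[str]], x: str, y: str) -> float:
    """Share of sessions containing x that also contain y (0.0 if x never occurs)."""
    with_x = [s for s in sessions if x in s]
    if not with_x:
        return 0.0
    return sum(1 for s in with_x if y in s) / len(with_x)


# Illustrative data only.
events = ["deploy", "cache-miss", "timeout", "cache-miss", "timeout"]
sessions = [
    {"cache-miss", "timeout"},
    {"cache-miss"},
    {"timeout", "cache-miss"},
    {"deploy"},
]

print(always_precedes(events, "cache-miss", "timeout"))  # True
print(always_precedes(events, "timeout", "cache-miss"))  # False
print(round(co_occurrence_rate(sessions, "cache-miss", "timeout"), 2))  # 0.67
```

A rate below 1.0 is itself useful: the session where X appeared without Y is the anomaly worth a closer look.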
Step 6: Report
Communicate findings.
- Structure:
  - Context: what/when/why/conditions
  - Method: protocol, tools, duration
  - Findings: key obs w/ evidence (data, not interpretation)
  - Analysis: patterns, hypotheses, confidence
  - Recommendations: next steps (more obs, test, intervene)
  - Limitations: not covered, potential biases
- Findings in neutral lang separating fact from interpretation
- Review for hidden assumptions / unsupported claims
- Debug? translate hypotheses → concrete tests
- Report? evidence cited specifically
- Personal? summarize insights + remaining Qs
→ Clear report communicates obs/patterns/hypotheses, distinction kept. Reader can evaluate evidence independently.
If err: buries obs in interpretation → restructure: "facts one section, theories another." No confidence ("definitely because...") → calibrate: "how sure? what would change mind?"
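One way to keep facts and theories in separate sections is to render the report from a fixed section list. A minimal sketch, assuming plain-text output; `render_report` and the sample content are hypothetical, while the section names and order follow the structure above.

```python
SECTIONS = ["Context", "Method", "Findings", "Analysis", "Recommendations", "Limitations"]


def render_report(content: dict[str, list[str]]) -> str:
    """Render sections in fixed order; raise if any section is missing or empty."""
    missing = [s for s in SECTIONS if not content.get(s)]
    if missing:
        raise ValueError(f"Missing sections: {', '.join(missing)}")
    lines: list[str] = []
    for section in SECTIONS:
        lines.append(section)
        lines.extend(f"- {item}" for item in content[section])
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip()


# Illustrative content only.
report = render_report({
    "Context": ["Checkout API, load test, single client"],
    "Method": ["Naturalist stance, 45 min, field notes template"],
    "Findings": ["3 of 60 requests got no response within 30 s"],
    "Analysis": ["Hypothesis: connection pool exhaustion (speculative)"],
    "Recommendations": ["Test: raise pool size, repeat the run"],
    "Limitations": ["Single session; no server-side logs reviewed"],
})
print(report)
```

Refusing to render with an empty section is a deliberate choice here: a report without Limitations or Findings is exactly the failure the step warns about.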
Check
- Frame set before obs (not wandering)
- Recording protocol established + used consistently
- Obs as facts, separate from interpretations
- ≥5 concrete evidence-backed obs
- Patterns from analysis, not assumed
- Hypotheses testable, stated confidence
- Person experienced obs-before-interpret discipline
Traps
- Confirmation bias: only obs supporting belief. Frame must include "look for evidence against your hypothesis"
- Intervention urge: see + fix immediately → masks root cause → observe first
- Recording fatigue: detail = taxing. Breaks + realistic lengths (30-60min focused = substantial)
- Over-protocol: simple obs needs only notebook + timestamps. Protocol serves obs, not replaces it
- Obs ≠ surveillance: ethical boundaries matter. Visible behavior, no spy. People → transparency > secrecy
- Skip frame: no target → attention scatters → unfocused. Rough frame > none
Related
- observe: AI self-directed variant
- learn-guidance: obs feeds learning
- listen-guidance: focused obs of speaker; obs broader to any system
- remote-viewing-guidance: shares method adapted for non-local
- read-garden: garden obs uses similar CRV-adapted sensory protocols
