observe-guidance
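About
This skill coaches the user in systematic observation techniques for understanding a system before acting on it. It teaches neutral data collection, pattern recognition, and structured reporting for debugging and research. Use it when evidence must be gathered before drawing conclusions, or when preparing an evidence-based analysis.
Quick Install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/observe-guidance
Copy and paste the command into Claude Code to install the skill.
Documentation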
Observe (Guidance)
Coach human in field study: frame → protocol → witness → record → analyze → report. Separate fact from interpretation.
Use When
- Person wants understand system before intervene (debug by obs, not trial-error)
- Conducting research / evidence → needs structured method
- Person jumps to conclusions → needs obs discipline
- Preparing evidence-based report (not opinion)
- Team dynamics, user behavior, process effectiveness via direct obs
- After meditate-guidance cultivated attention → direct it at system
In
- Required: What to observe (system, process, behavior, codebase, team, phenomenon)
- Required: Why (debug, research, audit, curiosity, improvement)
- Optional: Time available (single vs multi-day)
- Optional: Prior attempts
- Optional: Specific Qs / hypotheses
- Optional: Recording tools (notebook, screen capture, logging, metrics)
Do
Step 1: Frame
Help set bounded frame.
- Ask what: "What system/behavior trying to understand?"
- Narrow scope: "What specific aspect interests you most?"
- Purpose: understanding / debug / improve / evidence / curiosity
- Boundaries: in/out scope → prevents endless expansion
- Hypothesis? state explicit, then set aside → "look for evidence both for + against"
- Stance:
  - Naturalist: no interfere (best for behavior)
  - Controlled: change one var, observe effect (best for debug)
  - Longitudinal: over time (best for trends)
→ Clear frame: target, scope, purpose, stance defined.
If err: can't narrow ("understand everything") → pick one entry point: "what behavior most confusing?" Already committed conclusion ("just prove X") → gently challenge: "what would disprove it?"
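One way to make the frame concrete is to write it down as a small record before observing. The sketch below is only an illustration: the `Frame` and `Stance` names, the field layout, and the sample values are assumptions, not part of the skill.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Stance(Enum):
    NATURALIST = "naturalist"      # no interference; best for behavior
    CONTROLLED = "controlled"      # change one variable; best for debugging
    LONGITUDINAL = "longitudinal"  # observe over time; best for trends


@dataclass
class Frame:
    target: str                       # what system or behavior is observed
    purpose: str                      # understanding, debug, improve, evidence, curiosity
    stance: Stance
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)
    hypothesis: Optional[str] = None  # stated explicitly, then set aside

    def missing(self) -> list[str]:
        """Return the names of frame elements that are still undefined."""
        gaps = []
        if not self.target.strip():
            gaps.append("target")
        if not self.purpose.strip():
            gaps.append("purpose")
        if not self.in_scope:
            gaps.append("scope")
        return gaps


# Hypothetical example frame for a debugging session.
frame = Frame(
    target="checkout API latency",
    purpose="debug",
    stance=Stance.CONTROLLED,
    in_scope=["request handling"],
    out_of_scope=["frontend rendering"],
    hypothesis="connection pool is exhausted",
)
```

An empty list from `frame.missing()` corresponds to the clear-frame outcome above; any name it returns is a prompt to go back and narrow.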
Step 2: Prep protocol
Systematic recording.
- Method by type:
  - Codebase/system: paths, line numbers, timestamps, log entries
  - Behavior/process: time-stamped notes (actor, action, context)
  - Team/communication: quotes, speaker IDs, non-verbal cues
  - Natural/physical: sketches, measurements, env conditions
- Template:
Field Notes Template:
┌─────────────┬────────────────────────────────────────────────────────┐
│ Timestamp │ When the observation occurred │
├─────────────┼────────────────────────────────────────────────────────┤
│ Observation │ What was seen/heard/measured (fact only) │
├─────────────┼────────────────────────────────────────────────────────┤
│ Context │ What was happening around the observation │
├─────────────┼────────────────────────────────────────────────────────┤
│ Reaction │ Observer's response (thoughts, emotions, surprises) │
├─────────────┼────────────────────────────────────────────────────────┤
│ Hypothesis │ Tentative interpretation (kept separate from fact) │
└─────────────┴────────────────────────────────────────────────────────┘
- Stress separation: "obs row = fact. hypothesis row = interpretation. Never mix."
- Min count: "≥10 obs before any conclusion"
- Set up monitoring tools if applicable
→ Recording method ready. Person gets obs↔interpretation distinction. Prepared.
If err: too formal → simplify: "write what you see, separately what you think it means." Resist recording ("I'll remember") → unrecorded = memory bias; writing makes obs accurate.
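For observations kept in code rather than on paper, the field notes template could be mirrored as a record type. This is a minimal sketch: `FieldNote`, `ready_to_conclude`, and the sample date are hypothetical; only the five rows and the ten-observation minimum come from the protocol above.

```python
from dataclasses import dataclass
from datetime import datetime

MIN_OBSERVATIONS = 10  # protocol: at least 10 observations before any conclusion


@dataclass(frozen=True)
class FieldNote:
    timestamp: datetime   # when the observation occurred
    observation: str      # what was seen, heard, or measured (fact only)
    context: str = ""     # what was happening around the observation
    reaction: str = ""    # observer's thoughts, emotions, surprises
    hypothesis: str = ""  # tentative interpretation, kept separate from fact


def ready_to_conclude(notes: list[FieldNote]) -> bool:
    """True once enough observations exist to start drawing conclusions."""
    return len(notes) >= MIN_OBSERVATIONS


# Hypothetical note; the date is made up for illustration.
note = FieldNote(
    timestamp=datetime(2024, 1, 1, 14, 32),
    observation="system stopped responding after 47th request",
    context="load test running",
    hypothesis="request limit reached",
)
```

Keeping `observation` and `hypothesis` as separate fields enforces the "never mix" rule structurally: an interpretation has nowhere to go but its own row.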
Step 3: Witness
Guide actual obs session.
- Remind stance: "naturalist studying new species. No interfere — just watch"
- First 5min: pure obs no recording — just attend
- After immersion: begin recording w/ template
- Coach neutral lang: instead "system crashed" → "system stopped responding 14:32 after 47th request"
- Watch interpretation creeping: "that's interpretation — record in hypothesis row"
- Note surprises: "what surprised? surprises = most valuable data"
- Check frame: "still observing what set out, or drifted?"
- Wants to intervene: "note what + why, but don't change yet — keep observing"
→ ≥5-10 concrete obs w/ specific evidence. Experiences obs vs interpret diff. Finds harder than expected.
If err: keep interpreting → exercise: "describe as if to someone never seen this. Only verifiable facts." Run out fast → too high level → zoom in: timing, ordering, edge cases, exceptions.
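The "watch interpretation creeping" coaching can be roughly approximated with a keyword check on each recorded observation. This is a crude heuristic sketch, not a method the skill prescribes; the word list and the `interpretive_terms` name are assumptions.

```python
import re

# Hypothetical word list: terms that often signal interpretation rather than fact.
INTERPRETIVE_TERMS = ("crashed", "because", "probably", "obviously", "should", "seems", "broken")


def interpretive_terms(text: str) -> list[str]:
    """Return interpretive terms found in text, in word-list order."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return [term for term in INTERPRETIVE_TERMS if term in words]


flagged = interpretive_terms("System crashed because of load")
neutral = interpretive_terms("System stopped responding 14:32 after 47th request")
```

A flagged term does not prove the sentence is interpretation; it only marks a line to re-read and, if needed, move to the hypothesis row.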
Step 4: Record
Organize raw → structured.
- Review together
- Completeness: enough context for later?
- Factual accuracy: verifiable, or hidden assumptions?
- Group similar: "patterns forming?"
- Frequencies: how often?
- Absences: "what expected but not there?"
- Strong (clear evidence) vs weak (ambiguous)
→ Organized field notes cleanly separate obs from interpretation. Detailed enough another can verify.
If err: too vague ("things slow") → add specifics: "how slow? compared to what? which conditions?" Too detailed (record everything) → which relate to frame, which noise.
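Grouping, frequencies, and absences can be tallied mechanically once each observation carries a category tag. A minimal sketch, assuming hypothetical tags and sample observations:

```python
from collections import Counter

# Hypothetical tagged observations: (category, factual observation text).
observations = [
    ("timeout", "request 47 got no response within 30 s"),
    ("timeout", "request 52 got no response within 30 s"),
    ("retry", "client re-sent request 47 after 31 s"),
    ("timeout", "request 60 got no response within 30 s"),
]

# Categories the observer expected to see before the session (an assumption).
expected_categories = {"timeout", "retry", "error-log"}

# Frequencies: how often each category occurred.
frequencies = Counter(category for category, _ in observations)

# Absences: expected categories that never appeared.
absences = sorted(expected_categories - set(frequencies))
```

Here `frequencies["timeout"]` is 3 and `absences` is `["error-log"]`: timeouts occurred without a matching log entry, which is itself an observation worth recording.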
Step 5: Analyze
Obs → structured analysis.
- Look for patterns:
  - Repetition: "happened many times, systematic?"
  - Correlation: "X always w/ Y, related?"
  - Sequence: "A always before B, A causes B?"
  - Absence: "X never in condition Z, why?"
  - Anomaly: "all follow P except this, what diff?"
- Each pattern: "alternative explanation?"
- 2-3 hypotheses
- Correlation ≠ causation: "co-occur ≠ proves cause"
- Testable + what test confirms/refutes
- Confidence levels: well-supported vs speculative
→ Raw obs → structured hypotheses, data/theory separation kept. ≥1 testable hypothesis for original Q.
If err: jumps single explanation → challenge: "one possibility. another?" No patterns → too few obs → continue. Every obs same conclusion → filtering → ask: "what would contradict your theory?"
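The correlation and sequence checks can be sketched over recorded sessions. The event names and both helper functions are hypothetical, and a high score from either is evidence for a hypothesis, not proof of cause.

```python
# Hypothetical sessions: each is the ordered list of event names observed in one run.
sessions = [
    ["login", "cache_miss", "slow_response"],
    ["login", "slow_response"],
    ["login", "cache_miss", "slow_response"],
    ["login"],
]


def co_occurrence(sessions, x, y):
    """Fraction of sessions containing x that also contain y (None if x never occurs)."""
    with_x = [s for s in sessions if x in s]
    if not with_x:
        return None
    return sum(1 for s in with_x if y in s) / len(with_x)


def always_before(sessions, a, b):
    """True if, in every session containing both, the first a precedes the first b."""
    both = [s for s in sessions if a in s and b in s]
    return bool(both) and all(s.index(a) < s.index(b) for s in both)


miss_then_slow = co_occurrence(sessions, "cache_miss", "slow_response")
slow_then_miss = co_occurrence(sessions, "slow_response", "cache_miss")
ordered = always_before(sessions, "cache_miss", "slow_response")
```

In this sample every cache miss comes with a slow response, but one slow response occurs without a cache miss. That asymmetry is the kind of detail that prompts the "alternative explanation?" question.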
Step 6: Report
Communicate findings.
- Structure:
  - Context: what/when/why/conditions
  - Method: protocol, tools, duration
  - Findings: key obs w/ evidence (data, not interpretation)
  - Analysis: patterns, hypotheses, confidence
  - Recommendations: next steps (more obs, test, intervene)
  - Limitations: not covered, potential biases
- Findings in neutral lang separating fact from interpretation
- Review for hidden assumptions / unsupported claims
- Debug? translate hypotheses → concrete tests
- Report? evidence cited specifically
- Personal? summarize insights + remaining Qs
→ Clear report communicates obs/patterns/hypotheses, distinction kept. Reader can evaluate evidence independently.
If err: buries obs in interpretation → restructure: "facts one section, theories another." No confidence ("definitely because...") → calibrate: "how sure? what would change mind?"
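The report structure can be kept honest with a renderer that always emits the six sections in order, so facts and theories stay in separate places. A minimal sketch; `render_report` and the placeholder text are assumptions.

```python
SECTIONS = ["Context", "Method", "Findings", "Analysis", "Recommendations", "Limitations"]


def render_report(content: dict[str, list[str]]) -> str:
    """Render the six report sections in order; empty sections are flagged, not skipped."""
    lines = []
    for section in SECTIONS:
        lines.append(section)
        items = content.get(section) or ["(not yet written)"]
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


# Hypothetical partial report: only Findings has been filled in so far.
report = render_report(
    {"Findings": ["system stopped responding 14:32 after 47th request"]}
)
```

Flagging empty sections instead of dropping them makes a missing Limitations or Method section visible to the reader.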
Check
- Frame set before obs (not wandering)
- Recording protocol established + used consistently
- Obs as facts, separate from interpretations
- ≥5 concrete evidence-backed obs
- Patterns from analysis, not assumed
- Hypotheses testable, stated confidence
- Person experienced obs-before-interpret discipline
Traps
- Confirmation bias: only obs supporting belief. Frame must include "look for evidence against your hypothesis"
- Intervention urge: see + fix immediately → masks root cause → observe first
- Recording fatigue: detail = taxing. Breaks + realistic lengths (30-60min focused = substantial)
- Over-protocol: simple obs needs only notebook + timestamps. Protocol serves obs, not replaces it
- Obs ≠ surveillance: ethical boundaries matter. Visible behavior, no spy. People → transparency > secrecy
- Skip frame: no target → attention scatters → unfocused. Rough frame > none
Related
- observe: AI self-directed variant
- learn-guidance: obs feeds learning
- listen-guidance: focused obs of speaker; observe-guidance is broader, covering any system
- remote-viewing-guidance: shares method adapted for non-local
- read-garden: garden obs uses similar CRV-adapted sensory protocols
