after-action-report

Name: after-action-report
Author: rampstackco

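About

This skill runs a structured post-event review of an incident, launch, or project, capturing the timeline, root causes, and lessons learned. It responds to terms such as "postmortem", "retrospective", or "root cause analysis (RCA)", and is best used after a feature launch or once an incident is resolved. The process focuses on producing actionable improvements rather than mere documentation.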

After-Action Report

Run a structured retrospective on a launch, incident, or completed project. Produce actionable lessons, not just a document.

This skill is for after-the-fact analysis. For active incident response, use incident-response. For planning launches, use launch-runbook.

When to use

After any incident (any severity)
After every major launch
At the end of a project (sprint retro, quarterly retro, project closeout)
When a recurring issue has happened enough times to demand investigation
When a decision didn't work out and the team wants to learn

When NOT to use

During an active incident (use incident-response)
For pre-launch planning (use launch-runbook)
For one-off bug fixes that don't merit broad analysis

Required inputs

The event being analyzed (incident, launch, project)
A timeline reconstructed from logs, chat, tickets
Participant accounts of what they observed and did
Outcomes and impact (what actually happened to users, the business)

The framework: blameless analysis

The most important principle is blamelessness. Without it, people withhold information, and retrospectives produce theatrical lessons rather than real ones.

What blameless means

Focus on systems, not individuals
Assume everyone made reasonable decisions given what they knew at the time
The question is "why was this decision reasonable to make?" not "who screwed up?"
Fixing the system means the next person in that situation succeeds where this person didn't

What blameless does not mean

No accountability (action items still have owners)
No hard truths (sometimes the system is broken in obvious ways)
No standards (some patterns of failure are individual, not systemic)
No discomfort (real reflection is uncomfortable)

The framework: 6 sections

A complete AAR covers six sections.

1. Summary

A 2 to 3 paragraph overview. Captures:

What happened
Impact (users, business, time)
Root cause (in plain language)
Top action items

This is what executives read. Anyone who reads only this section should leave with the most important information.

2. Timeline

A reconstructed timeline of events.

For incidents:

T-0: Detection
T+X: Acknowledgment
T+Y: Severity assessed, IC assigned
T+Z: Investigation began
... mitigation, communication, resolution events
T+N: Resolution declared

For launches:

Pre-launch decisions and milestones
Launch day events
Post-launch monitoring observations

For projects:

Major milestones, decisions, pivots
Both planned and emergent

The timeline is the source of truth. Disagreements about what happened get resolved here.
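As a minimal sketch of how the T+ offsets in an incident timeline might be derived from raw timestamps, the snippet below labels each event relative to detection. The function names and the sample timestamps are hypothetical and only for illustration; real entries would come from logs, chat, and tickets.

```python
from datetime import datetime, timedelta


def format_offset(delta: timedelta) -> str:
    """Render an offset from detection as T+HH:MM (T-HH:MM if before detection)."""
    sign = "-" if delta < timedelta(0) else "+"
    total_minutes = int(abs(delta.total_seconds())) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"T{sign}{hours:02d}:{minutes:02d}"


def build_timeline(detection: datetime, events: list[tuple[datetime, str]]) -> list[str]:
    """Sort events chronologically and label each one relative to detection."""
    return [f"{format_offset(when - detection)} {what}" for when, what in sorted(events)]


# Hypothetical sample data, not taken from a real incident.
detection = datetime(2024, 1, 1, 14, 0)
events = [
    (datetime(2024, 1, 1, 15, 30), "Resolution declared"),
    (datetime(2024, 1, 1, 14, 0), "Detection"),
    (datetime(2024, 1, 1, 14, 4), "Acknowledgment"),
]

for line in build_timeline(detection, events):
    print(line)
```

Sorting before labeling means entries gathered out of order from different sources still land in one chronological list.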

3. Root cause analysis

What caused this, in plain language.

Use one or both of:

Five whys. Start with the surface symptom. Ask "why?" Repeat 5 times (or until you reach a true root). Each "why" should yield a substantive answer, not a tautology.

Example:

Why did the site go down? Database connection pool exhausted.
Why was the pool exhausted? Background job opened too many connections.
Why did the background job open too many connections? Connection cleanup code didn't run on errors.
Why didn't cleanup run on errors? Original code review didn't cover error paths.
Why didn't the review cover error paths? No checklist for error handling in our review process.

The fifth why often reveals the system fix. In this case: improve the review process.
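The example chain above can be captured as data so it is easy to render and sanity-check. This is a rough sketch: the `Why` class and the helper names are hypothetical, and the repeated-answer check is only a crude stand-in for spotting tautologies, not a substitute for judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Why:
    question: str
    answer: str


# The five whys from the example above.
CHAIN = [
    Why("Why did the site go down?", "Database connection pool exhausted."),
    Why("Why was the pool exhausted?", "Background job opened too many connections."),
    Why("Why did the background job open too many connections?",
        "Connection cleanup code didn't run on errors."),
    Why("Why didn't cleanup run on errors?",
        "Original code review didn't cover error paths."),
    Why("Why didn't the review cover error paths?",
        "No checklist for error handling in our review process."),
]


def weak_steps(chain: list[Why]) -> list[int]:
    """Return 1-based positions whose answer is empty or repeats an earlier answer."""
    seen: set[str] = set()
    weak = []
    for position, step in enumerate(chain, start=1):
        normalized = step.answer.strip().lower()
        if not normalized or normalized in seen:
            weak.append(position)
        seen.add(normalized)
    return weak


def root_cause(chain: list[Why]) -> str:
    """Treat the last answer in the chain as the candidate root cause."""
    return chain[-1].answer


print(root_cause(CHAIN))
print(weak_steps(CHAIN))
```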

Causal chain. Multiple contributing factors that combined.

Factor 1: Background job opened too many connections (technical)
Factor 2: Connection limit was set too low for actual traffic (configuration)
Factor 3: No alert on connection pool saturation (monitoring)
Factor 4: Recent traffic doubled without infra capacity review (process)

No single fix addresses the incident. Multiple gaps need attention.
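One way to make "multiple gaps need attention" concrete is to group the factors by category and check which categories have no action item yet. The sketch below uses the four factors listed above; the helper names and the set of already-addressed categories are hypothetical.

```python
from collections import defaultdict

# Factors and categories copied from the causal chain above.
FACTORS = [
    ("Background job opened too many connections", "technical"),
    ("Connection limit was set too low for actual traffic", "configuration"),
    ("No alert on connection pool saturation", "monitoring"),
    ("Recent traffic doubled without infra capacity review", "process"),
]


def group_by_category(factors: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each category to the factor descriptions filed under it."""
    grouped = defaultdict(list)
    for description, category in factors:
        grouped[category].append(description)
    return dict(grouped)


def uncovered_categories(factors: list[tuple[str, str]], addressed: set[str]) -> list[str]:
    """Categories that have a contributing factor but no action item yet."""
    return sorted({category for _, category in factors} - set(addressed))


# Hypothetical: suppose action items so far only cover monitoring and process.
print(uncovered_categories(FACTORS, {"monitoring", "process"}))
```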

4. Contributing factors

Factors that didn't cause the event but made it worse, or removed safety nets that would have caught it.

Monitoring gaps
Documentation gaps
Process gaps
Tooling gaps
Knowledge gaps

Each one is a "would have been caught earlier if..." factor.

5. What went well

Real lessons require capturing successes, not just failures.

What detection worked?
What response worked?
What decisions were good?
What tools or processes performed as expected?

This is not consolation. It's calibration. Things that worked here should be reinforced and replicated.

6. Action items

Specific, owned, dated.

| Action | Owner | Due | Type |
|---|---|---|---|
| Add alert on connection pool saturation | [name] | [date] | Monitoring |
| Add error handling checklist to PR template | [name] | [date] | Process |
| Audit other background jobs for similar issue | [name] | [date] | Code |

Action item criteria:

Specific. "Improve monitoring" is not actionable. "Add alert on connection pool saturation, threshold 80%, page on-call" is.
Owned. A name. Not "the team."
Dated. A real date. Not "soon."
Sized. Roughly hours, days, or weeks of effort.
Closeable. Definition of done is clear.

Action items that don't close in their committed timeframe should re-surface in the next AAR. Patterns of unclosed actions point to deeper organizational issues.
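The five criteria can be approximated with a small checker. This is a sketch under stated assumptions: the `ActionItem` fields, the list of vague owners, and the word-count heuristic for "specific" are all hypothetical choices, and the owner name in the example is made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed placeholders that do not count as a named owner.
VAGUE_OWNERS = {"", "the team", "team", "everyone", "tbd"}
VALID_SIZES = {"hours", "days", "weeks"}


@dataclass
class ActionItem:
    action: str
    owner: str
    due: Optional[date]
    size: str        # rough effort: "hours", "days", or "weeks"
    done_when: str   # definition of done


def problems(item: ActionItem) -> list[str]:
    """List which of the five criteria the item fails (empty list means it passes)."""
    issues = []
    if len(item.action.split()) < 4:  # crude heuristic for vague wording
        issues.append("not specific")
    if item.owner.strip().lower() in VAGUE_OWNERS:
        issues.append("no named owner")
    if item.due is None:
        issues.append("no due date")
    if item.size.strip().lower() not in VALID_SIZES:
        issues.append("not sized")
    if not item.done_when.strip():
        issues.append("no definition of done")
    return issues


good = ActionItem(
    "Add alert on connection pool saturation, threshold 80%, page on-call",
    "Dana", date(2024, 2, 1), "days", "Alert fires in a staging test",
)
bad = ActionItem("Improve monitoring", "the team", None, "soon", "")

print(problems(good))
print(problems(bad))
```

A word count cannot really judge specificity; it only catches the shortest offenders such as "Improve monitoring".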

Workflow

1. Schedule the AAR

Hold the AAR within 1 to 2 weeks of the event: long enough that emotions have cooled and facts have been gathered, short enough that memories are still fresh.

For incidents, the timing is pre-decided in the response procedure. For launches, schedule it in the runbook. For projects, schedule it at project closeout.

2. Gather inputs

Before the meeting:

Reconstructed timeline (often the scribe's notes if there was one)
Logs, chat transcripts, tickets, incident updates
Individual accounts from each participant (written, before the meeting)
Impact data (users affected, duration, revenue impact, etc.)

3. Run the meeting

Typical agenda (60 to 90 minutes):

Read the summary as drafted (5 min)
Walk the timeline together. Add corrections. Resolve disagreements. (20 to 30 min)
Discuss root cause. Use five whys or causal chain. (15 to 20 min)
Discuss contributing factors. (10 min)
Discuss what went well. (10 min)
Identify action items. Owners and dates. (10 min)
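As a quick arithmetic check, the agenda items above can be summed to confirm they fit the 60 to 90 minute window. The variable and function names here are hypothetical.

```python
# (item, minimum minutes, maximum minutes), taken from the agenda above.
AGENDA = [
    ("Read the summary as drafted", 5, 5),
    ("Walk the timeline together", 20, 30),
    ("Discuss root cause", 15, 20),
    ("Discuss contributing factors", 10, 10),
    ("Discuss what went well", 10, 10),
    ("Identify action items", 10, 10),
]


def total_minutes(agenda: list[tuple[str, int, int]]) -> tuple[int, int]:
    """Return the (minimum, maximum) total meeting length in minutes."""
    shortest = sum(low for _, low, _ in agenda)
    longest = sum(high for _, _, high in agenda)
    return shortest, longest


print(total_minutes(AGENDA))  # (70, 85)
```

The items total 70 to 85 minutes, which leaves a few minutes of slack inside a 90-minute slot.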

A facilitator runs the meeting. Often the IC for an incident, or a project lead for a project. The facilitator is not the scribe.

4. Write the document

Within a few days of the meeting. The full AAR includes all 6 sections.

5. Distribute

Internal: post in a known location. Make searchable. Reference in onboarding.

For high-severity incidents: external summary may be appropriate (status page, customer email, public blog).

6. Track action items

Every action item should be tracked to closure. The next AAR re-surfaces unclosed ones.
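A minimal sketch of the re-surfacing step, assuming action items are stored with a due date and a status. The `TrackedItem` class, the status values, and the sample items are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TrackedItem:
    action: str
    owner: str
    due: date
    status: str  # assumed values: "open" or "closed"


def resurface(items: list[TrackedItem], today: date) -> list[TrackedItem]:
    """Return items that are still open past their due date, oldest first."""
    overdue = [item for item in items if item.status != "closed" and item.due < today]
    return sorted(overdue, key=lambda item: item.due)


# Hypothetical sample data.
items = [
    TrackedItem("Add alert on connection pool saturation", "Dana", date(2024, 2, 1), "closed"),
    TrackedItem("Add error handling checklist to PR template", "Lee", date(2024, 2, 15), "open"),
    TrackedItem("Audit other background jobs for similar issue", "Sam", date(2024, 4, 1), "open"),
]

for item in resurface(items, today=date(2024, 3, 1)):
    print(f"{item.due} {item.owner}: {item.action}")
```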

Failure patterns

Skipping the AAR for "small" incidents. Patterns get missed.
Naming and shaming. Real lessons get hidden when people fear blame.
Generic action items. "Improve testing" instead of specific testing change.
Action items that never close. Filed, forgotten. Same incident recurs.
Theater retrospectives. Going through the motions without genuine reflection.
Skipping "what went well." Misses calibration on what's working.
Blame externalized. "Our vendor failed." OK, what's our system for vendor risk?
Single-person AAR. One person writes the whole thing. Misses other perspectives.
AAR only for failures. Successful launches deserve AARs too. Lessons from success are valuable.
Long delays. Memories fade. Conversations cool. Get it done within 2 weeks.

Output format

A markdown document at aar-[date]-[event-name].md.
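One possible way to build that filename is sketched below. It assumes `[date]` means the event date in YYYY-MM-DD form and that `[event-name]` is a lowercase, hyphen-separated slug; the source does not specify either, and the function name is hypothetical.

```python
import re
from datetime import date


def aar_filename(event_date: date, event_name: str) -> str:
    """Build aar-[date]-[event-name].md with an ISO date and a hyphenated slug."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", event_name.lower()).strip("-")
    return f"aar-{event_date.isoformat()}-{slug}.md"


# Hypothetical event name and date.
print(aar_filename(date(2024, 1, 1), "Checkout Outage (SEV-1)"))
# aar-2024-01-01-checkout-outage-sev-1.md
```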

Structure:

# AAR: [Event name]

**Date of event:** [YYYY-MM-DD]
**AAR date:** [YYYY-MM-DD]
**Severity / scope:** [SEV-1 / Major launch / Project closeout]
**Facilitator:** [Name]
**Participants:** [Names]

## Summary
[2 to 3 paragraphs]

## Impact
- Users affected: [number, segment]
- Duration: [time]
- Revenue / business impact: [if applicable]

## Timeline
[Timestamped events]

## Root cause analysis
[Five whys or causal chain]

## Contributing factors
[List]

## What went well
[List]

## Action items
| Action | Owner | Due | Type | Status |
|---|---|---|---|---|
| | | | | |

## Lessons
[Reflections that don't fit elsewhere. Often the most quotable section.]

Reference files

references/aar-template.md - Fillable AAR template covering incidents, launches, and projects.

GitHub repository

rampstackco/claude-skills

Path: skills/after-action-report

Tags: agent-skills, ai-agents, anthropic, claude, claude-ai, claude-code

FAQ

What is the after-action-report skill?

after-action-report is a Claude Skill by rampstackco. Skills package instructions and resources that Claude loads on demand, so Claude can perform after-action-report-related tasks without extra prompting.

How do I install after-action-report?

Use the install commands on this page: add after-action-report to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does after-action-report belong to?

after-action-report is in the operations category, tagged general.

Is after-action-report free to use?

Yes. after-action-report is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
