postmortem-solo
Acerca de
Esta habilidad guía a un desarrollador a través de un análisis postmortem estructurado e individual tras un fracaso en solitario, como un plazo incumplido o un lanzamiento fallido. Impone un análisis sin culpas con límites estrictos de palabras en cinco secciones para identificar las causas raíz. El objetivo es producir uno o dos cambios concretos y accionables, en lugar de una documentación extensa.
Instalación rápida
Claude Code
Recomendadonpx skills add rockscy/solo-skills -a claude-code/plugin add https://github.com/rockscy/solo-skillsgit clone https://github.com/rockscy/solo-skills.git ~/.claude/skills/postmortem-soloCopia y pega este comando en Claude Code para instalar esta habilidad
Documentación
Postmortem Solo / 单兵复盘
When to use
- Something didn't go to plan in the last 30 days: outage, failed launch, missed deadline, customer churn spike, broken release.
- The user is the only person responsible — there's no team to interview.
- The goal is one or two concrete changes, not a 10-page report.
When NOT to use
- The incident is still active — fix it first, postmortem after.
- The user wants emotional processing, not a postmortem. Suggest a walk or a friend; this skill is for cold analysis.
- It's a routine setback (one bug in production, one slow week) — not every bump needs a postmortem.
Structure
Five sections, strict word limits:
- What happened — 3 bullets, factual, no analysis. (≤ 60 words)
- Impact — what did the failure cost? Users affected, revenue lost, hours sunk, reputation hit. (≤ 40 words)
- Timeline — bullet list with timestamps if known: trigger → detection → response → resolution. (≤ 100 words)
- Root cause(s) — apply 5 Whys but stop the moment you hit a systemic cause (process, default, missing tool), not a personal one ("I was tired"). (≤ 80 words)
- Changes — at most 2 concrete changes with owner (you), due date, and verification method. (≤ 60 words)
Total: ≤ 350 words. If you need more, you're hiding from a decision.
The "no blame, but yes accountability" rule
Solo postmortems have a trap: you're the only person who could have caused this, so it's tempting to either (a) flagellate yourself, or (b) blame externalities ("Stripe was flaky"). Neither is useful.
Correct framing: "What system / habit / default could I change so that next-month-me cannot make this mistake even if next-month-me is tired and distracted?"
This forces solutions toward process, automation, or alarms — not "be more careful next time."
Output format
# Postmortem: <one-sentence headline>
**Date:** <YYYY-MM-DD> — drafted <X> days after the incident.
## What happened
- …
- …
- …
## Impact
…
## Timeline
- HH:MM — <event>
- HH:MM — <event>
## Root cause(s)
…
## Changes
1. <change> — owner: me, due: <date>, verified by: <how I'll know it worked>
2. <change> — owner: me, due: <date>, verified by: <how>
Worked example
Input: "I deployed on Friday afternoon, broke the signup flow for ~6 hours, lost about 40 trial signups before I noticed."
Output:
# Postmortem: Friday-afternoon deploy broke signup for 6 hours
**Date:** 2026-04-30 — drafted 2 days after the incident.
## What happened
- Deployed schema migration at 16:45 local time.
- Migration removed a column the signup form still referenced.
- Signup returned 500 for ~6 hours until I checked Stripe at 22:30.
## Impact
~40 trial signups lost (Stripe shows 0 new customers in the window vs. ~7/hour baseline). Estimated revenue at risk: ~$280/mo if 10% would have converted.
## Timeline
- 16:45 — deploy pushed.
- 16:50 — first 500 (no alert; error logger was rate-limited).
- 22:30 — noticed via empty Stripe dashboard, rolled back migration in 8 min.
## Root cause(s)
The signup form references the dropped column directly, not via an abstraction. CI ran migrations against an empty test DB, so the form-vs-schema mismatch never surfaced. No deploy-time alert fires when 500 rate exceeds baseline.
## Changes
1. Add a basic uptime check that posts a Slack ping on >5x baseline 500s — owner: me, due: 2026-05-03, verified by: triggering a test 500 and confirming the ping.
2. Block deploys after 16:00 local time on Fridays (calendar reminder + git pre-push hook) — owner: me, due: 2026-05-02, verified by: trying to deploy at 17:00 Friday and getting blocked.
Anti-patterns
- "I'll be more careful next time" is not a change. It's a wish. Replace it with a system change.
- More than 2 changes per postmortem. You won't do them. Cut to two.
- Listing root causes that don't tie to a change. If a cause doesn't generate an action, omit it.
中文版
何时使用
- 最近 30 天内没按计划走:故障、发布失败、错过 deadline、客户大量流失、版本翻车。
- 用户是唯一负责人——没团队可访谈。
- 目标是一两个具体改动,不是 10 页大报告。
何时不使用
- 故障还在进行中——先修,再复盘。
- 用户想情绪处理而非复盘——建议散步或找朋友聊,这个技能是冷分析。
- 只是常规小坎坷(一个线上 bug,一周不顺)——不是每次都需要复盘。
结构
五段,严格字数限制:
- 发生了什么——3 条事实,不带分析。(≤ 60 字)
- 影响——损失了什么?(≤ 40 字)
- 时间线——触发→发现→响应→修复,带时间戳。(≤ 100 字)
- 根本原因——5 Why,但停在系统性原因(流程/默认/缺工具),不是"我累了"。(≤ 80 字)
- 改动——最多两条,要有 owner(你)、截止日、验证方式。(≤ 60 字)
总长 ≤ 350 字。再多就是在逃避决定。
"不归罪但要负责"原则
单兵复盘有个陷阱:唯一可能犯错的就是你,所以容易要么(a)自责,要么(b)甩锅外部。两者都没用。
正确框架: "我能改变什么系统/习惯/默认值,让下个月的我即使累了走神也不会再犯?"
这逼着方案朝流程、自动化、告警走——而不是"下次更小心"。
反模式
- "下次更小心"不是改动,是愿望。换成系统性改动。
- 一次复盘超过 2 条改动——你做不完。砍到两条。
- 列出不对应改动的根因——如果一个根因不会触发行动,删掉。
Repositorio GitHub
Habilidades relacionadas
llamaguard
OtroLlamaGuard es el modelo de Meta de 7-8B parámetros para moderar las entradas y salidas de LLM en seis categorías de seguridad como violencia y discurso de odio. Ofrece una precisión del 94-95% y puede implementarse usando vLLM, Hugging Face o Amazon SageMaker. Utiliza esta skill para integrar fácilmente filtrado de contenido y barreras de seguridad en tus aplicaciones de IA.
cost-optimization
OtroEsta Skill de Claude ayuda a los desarrolladores a optimizar los costes en la nube mediante el ajuste de tamaño de recursos, estrategias de etiquetado y análisis de gastos. Proporciona un marco para reducir los gastos en la nube e implementar una gobernanza de costes en AWS, Azure y GCP. Úsala cuando necesites analizar los costes de infraestructura, ajustar el tamaño de los recursos o cumplir con restricciones presupuestarias.
quantizing-models-bitsandbytes
OtroEsta habilidad cuantiza LLMs a precisión de 8 o 4 bits utilizando bitsandbytes, logrando una reducción de memoria del 50-75% con pérdida mínima de precisión. Es ideal para ejecutar modelos más grandes en memoria GPU limitada o para acelerar la inferencia, admitiendo formatos como INT8, NF4 y FP4. La habilidad se integra con HuggingFace Transformers y permite entrenamiento QLoRA y optimizadores de 8 bits.
dispatching-parallel-agents
OtroEsta Skill de Claude despliega múltiples agentes para investigar y solucionar 3 o más problemas independientes de forma concurrente. Está diseñada para escenarios que involucran fallos no relacionados que pueden resolverse sin estado compartido o dependencias. Su capacidad principal es la resolución paralela de problemas, asignando un agente por cada dominio problemático independiente para maximizar la eficiencia.
