
investigate-capa-root-cause

pjt222
Updated 2 days ago

About

This skill guides developers through structured root cause analysis and CAPA management for compliance deviations. It provides method selection (5-Why, fishbone), action design, and effectiveness verification. Use it when addressing audit findings, system deviations, or recurring issues requiring a systematic investigation.

Quick Install

Claude Code

Primary (recommended)
npx skills add pjt222/agent-almanac -a claude-code
Plugin command (alternative)
/plugin add https://github.com/pjt222/agent-almanac
Git clone (alternative)
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/investigate-capa-root-cause

Documentation

Investigate CAPA Root Cause

Structured RCA + effective corrective/preventive actions for compliance deviations.

Use When

  • Audit finding needs CAPA
  • Deviation / incident in validated sys
  • Regulatory inspection observation needs formal response
  • Data integrity anomaly needs investigation
  • Recurring issues suggest systemic root cause

In

  • Req: Description of deviation / finding / incident
  • Req: Severity (critical, major, minor)
  • Req: Evidence from audit / investigation
  • Opt: Prior related CAPAs / investigations
  • Opt: Relevant SOPs, validation docs, sys logs
  • Opt: Interview notes

Do

Step 1: Initiate

# Root Cause Investigation
## Document ID: RCA-[CAPA-ID]
## CAPA Reference: CAPA-[YYYY]-[NNN]

### 1. Trigger
| Field | Value |
|-------|-------|
| Source | [Audit finding / Deviation / Inspection observation / Monitoring alert] |
| Reference | [Finding ID, deviation ID, or observation number] |
| System | [Affected system name and version] |
| Date discovered | [YYYY-MM-DD] |
| Severity | [Critical / Major / Minor] |
| Investigator | [Name, Title] |
| Investigation deadline | [Date — per severity: Critical 15 days, Major 30 days, Minor 60 days] |

### 2. Problem Statement
[Objective, factual description of what happened, what should have happened, and the gap between the two. No blame, no assumptions.]

### 3. Immediate Containment (if required)
| Action | Owner | Completed |
|--------|-------|-----------|
| [e.g., Restrict system access pending investigation] | [Name] | [Date] |
| [e.g., Quarantine affected batch records] | [Name] | [Date] |
| [e.g., Implement manual workaround] | [Name] | [Date] |

→ Investigation initiated w/ clear problem statement + containment w/in 24h for critical findings.

If err: Containment can't be implemented immediately → escalate QA Director + document risk of delayed containment.
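The investigation deadline in the Trigger table follows directly from severity, so it can be computed instead of filled in by hand. A minimal sketch, assuming the limits are calendar days counted from the discovery date (the template does not say); `investigation_deadline` and `DEADLINE_DAYS` are hypothetical names.

```python
from datetime import date, timedelta

# Day limits copied from the Trigger table: Critical 15, Major 30, Minor 60.
DEADLINE_DAYS = {"critical": 15, "major": 30, "minor": 60}


def investigation_deadline(discovered: date, severity: str) -> date:
    """Return the investigation deadline for a finding.

    Assumption: limits are calendar days counted from the discovery date.
    """
    key = severity.strip().lower()
    if key not in DEADLINE_DAYS:
        raise ValueError(f"unknown severity: {severity!r}")
    return discovered + timedelta(days=DEADLINE_DAYS[key])


print(investigation_deadline(date(2025, 3, 1), "Major"))  # 2025-03-31
```

If the quality system counts working days instead, only the date arithmetic changes; the severity lookup stays the same.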

Step 2: Select Method

Choose based on complexity:

### Investigation Method Selection

| Method | Best For | Complexity | Output |
|--------|----------|-----------|--------|
| **5-Why Analysis** | Single-cause problems, straightforward failures | Low | Linear cause chain |
| **Fishbone (Ishikawa)** | Multi-factor problems, process failures | Medium | Cause-and-effect diagram |
| **Fault Tree Analysis** | System failures, safety-critical events | High | Boolean logic tree |

**Selected method:** [5-Why / Fishbone / Fault Tree / Combination]
**Rationale:** [Why this method is appropriate for this problem]

→ Method matches complexity — no fault tree for simple procedural, no 5-Why for complex systemic.

If err: First method doesn't reach convincing root → apply 2nd. Convergence across methods strengthens conclusion.

Step 3: Conduct RCA

Opt A: 5-Why

### 5-Why Analysis

| Level | Question | Answer | Evidence |
|-------|----------|--------|----------|
| Why 1 | Why did [the problem] occur? | [Immediate cause] | [Evidence reference] |
| Why 2 | Why did [immediate cause] occur? | [Contributing factor] | [Evidence reference] |
| Why 3 | Why did [contributing factor] occur? | [Deeper cause] | [Evidence reference] |
| Why 4 | Why did [deeper cause] occur? | [Systemic cause] | [Evidence reference] |
| Why 5 | Why did [systemic cause] occur? | [Root cause] | [Evidence reference] |

**Root cause:** [Clear statement of the fundamental cause]

Opt B: Fishbone (Ishikawa)

### Fishbone Analysis

Analyse causes across six standard categories:

| Category | Potential Causes | Confirmed? | Evidence |
|----------|-----------------|------------|----------|
| **People** | Inadequate training, unfamiliarity with SOP, staffing shortage | [Y/N] | [Ref] |
| **Process** | SOP unclear, missing step, wrong sequence | [Y/N] | [Ref] |
| **Technology** | System misconfiguration, software bug, interface failure | [Y/N] | [Ref] |
| **Materials** | Incorrect input data, wrong version of reference document | [Y/N] | [Ref] |
| **Measurement** | Wrong metric, inadequate monitoring, missed threshold | [Y/N] | [Ref] |
| **Environment** | Organisational change, regulatory change, resource constraints | [Y/N] | [Ref] |

**Contributing causes:** [List confirmed causes]
**Root cause(s):** [The fundamental cause(s) — may be more than one]

Opt C: Fault Tree

### Fault Tree Analysis

**Top event:** [The undesired event]

Level 1 (OR gate — any of these could cause the top event):
├── [Cause A]
│   Level 2 (AND gate — both needed):
│   ├── [Sub-cause A1]
│   └── [Sub-cause A2]
├── [Cause B]
│   Level 2 (OR gate):
│   ├── [Sub-cause B1]
│   └── [Sub-cause B2]
└── [Cause C]

**Minimal cut sets:** [Smallest combinations of events that cause the top event]
**Root cause(s):** [Fundamental failures identified in the tree]

→ RCA reaches fundamental cause (not symptom) w/ evidence per step.

If err: Analysis reaches only symptoms ("user made err") → push deeper. Ask: "Why could user make that err? What control should've prevented?"
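For Opt C, minimal cut sets can be derived mechanically once the tree is written down. The sketch below is one simple way to do it, not a prescribed tool: it assumes a tree encoded as nested tuples (a hypothetical encoding), expands OR gates as unions and AND gates as combinations, then drops any cut set that contains a smaller one. The example tree mirrors the template layout, with `A1`, `A2`, `B1`, `B2`, `C` as placeholder events.

```python
from itertools import product

# A node is either a basic-event name (str) or a gate: ("AND" | "OR", [children]).


def cut_sets(node):
    """Return all cut sets of a fault tree as a set of frozensets."""
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any single child's cut set is enough to trigger the gate.
        return set().union(*child_sets)
    if gate == "AND":
        # One cut set from every child must occur together.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate!r}")


def minimal_cut_sets(node):
    """Drop every cut set that strictly contains another cut set."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


tree = ("OR", [
    ("AND", ["A1", "A2"]),  # Cause A: both sub-causes needed
    ("OR", ["B1", "B2"]),   # Cause B: either sub-cause is enough
    "C",                    # Cause C: basic event
])

for cs in sorted(minimal_cut_sets(tree), key=lambda s: (len(s), sorted(s))):
    print(sorted(cs))
```

For this tree the result is `{B1}`, `{B2}`, `{C}` and `{A1, A2}`: the single-event cut sets are the first places to look for a missing control.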

Step 4: Design Corrective + Preventive Actions

Distinguish correction vs corrective vs preventive:

### CAPA Plan

| Category | Definition | Action | Owner | Deadline |
|----------|-----------|--------|-------|----------|
| **Correction** | Fix the immediate problem | [e.g., Re-enable audit trail for batch module] | [Name] | [Date] |
| **Corrective Action** | Eliminate the root cause | [e.g., Remove admin ability to disable audit trail; require change control for all audit trail configuration changes] | [Name] | [Date] |
| **Preventive Action** | Prevent recurrence in other areas | [e.g., Audit all systems for audit trail disable capability; add monitoring alert for audit trail configuration changes] | [Name] | [Date] |

### CAPA Details

**CAPA-[YYYY]-[NNN]-CA1: [Corrective Action Title]**
- **Root cause addressed:** [Specific root cause from Step 3]
- **Action description:** [Detailed description of what will be done]
- **Success criteria:** [Measurable outcome that proves the action worked]
- **Verification method:** [How effectiveness will be checked]
- **Verification date:** [When effectiveness will be verified — typically 3-6 months after implementation]

**CAPA-[YYYY]-[NNN]-PA1: [Preventive Action Title]**
- **Risk addressed:** [What recurrence or spread this prevents]
- **Action description:** [Detailed description]
- **Success criteria:** [Measurable outcome]
- **Verification method:** [How effectiveness will be checked]
- **Verification date:** [Date]

→ Every action traces to specific root, has measurable success criteria, + effectiveness verification plan.

If err: Success criteria vague ("improve compliance") → rewrite specific + measurable ("zero audit trail config changes outside change control for 6 consecutive months").
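The traceability rule above (every action maps to a root cause and carries success criteria plus a verification plan) can be checked automatically before a plan is approved. A minimal sketch, assuming the plan is held as simple records; `CapaAction` and `plan_gaps` are hypothetical names, and the check only detects empty fields, not vague wording.

```python
from dataclasses import dataclass


@dataclass
class CapaAction:
    action_id: str            # e.g. "CA1" or "PA1"
    root_cause: str           # root cause (or risk) the action addresses
    success_criteria: str
    verification_method: str
    verification_date: str    # kept as text in this sketch


def plan_gaps(root_causes, actions):
    """Return readable gaps in a CAPA plan; an empty list means none found."""
    gaps = []
    known = set(root_causes)
    for a in actions:
        if a.root_cause not in known:
            gaps.append(f"{a.action_id}: does not trace to an identified root cause")
        for name in ("success_criteria", "verification_method", "verification_date"):
            if not getattr(a, name).strip():
                gaps.append(f"{a.action_id}: missing {name}")
    covered = {a.root_cause for a in actions}
    for rc in root_causes:
        if rc not in covered:
            gaps.append(f"root cause without action: {rc}")
    return gaps


roots = ["Admins can disable audit trail without change control"]
actions = [
    CapaAction("CA1", roots[0],
               "zero audit trail config changes outside change control "
               "for 6 consecutive months",
               "quarterly config audit", "2025-09-30"),
    CapaAction("PA1", roots[0], "", "monitoring alert review", "2025-09-30"),
]
print(plan_gaps(roots, actions))  # ['PA1: missing success_criteria']
```

Whether a criterion is genuinely measurable still needs a human reviewer.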

Step 5: Verify Effectiveness

After implementation → verify actions worked:

### Effectiveness Verification

**CAPA-[YYYY]-[NNN] — Verification Record**

| CAPA Action | Verification Date | Method | Evidence | Result |
|-------------|------------------|--------|----------|--------|
| CA1: [Action] | [Date] | [Method: audit, sampling, metric review] | [Evidence reference] | [Effective / Not Effective] |
| PA1: [Action] | [Date] | [Method] | [Evidence reference] | [Effective / Not Effective] |

### Effectiveness Criteria Check
- [ ] The original problem has not recurred since CAPA implementation
- [ ] The corrective action eliminated the root cause (evidence: [reference])
- [ ] The preventive action has been applied to similar systems/processes
- [ ] No new issues were introduced by the CAPA actions

### CAPA Closure
| Field | Value |
|-------|-------|
| Closure decision | [Closed — Effective / Closed — Not Effective / Extended] |
| Closed by | [Name, Title] |
| Closure date | [YYYY-MM-DD] |
| Next review | [If recurring, when to re-check] |

→ Verification demonstrates root eliminated, not just action completed.

If err: Verification shows CAPA not effective → reopen investigation + develop revised actions. Don't close ineffective CAPA.
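The closure rule can be expressed as a small gate: close only when every action is verified effective and every checklist item is ticked, otherwise reopen. A minimal sketch; `closure_decision` and the outcome labels `close_effective` / `reopen` are hypothetical and do not replace the closure options in the template.

```python
def closure_decision(action_results, criteria_met):
    """Suggest a closure outcome with the reasons that block closure.

    action_results: dict of action id -> True if verified effective.
    criteria_met:   dict of checklist item -> True if ticked.
    """
    blockers = []
    if not action_results:
        # Completing actions is not enough; verification evidence is required.
        blockers.append("no verification results recorded")
    blockers += [f"{a}: not effective" for a, ok in action_results.items() if not ok]
    blockers += [f"criterion not met: {c}" for c, ok in criteria_met.items() if not ok]
    if blockers:
        return ("reopen", blockers)
    return ("close_effective", [])


print(closure_decision({"CA1": True, "PA1": False}, {"no recurrence": True}))
# ('reopen', ['PA1: not effective'])
```

Returning the blockers alongside the decision gives the reopened investigation a starting list.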

Step 6: Analyse Trends

### CAPA Trend Analysis

| Period | Total CAPAs | By Source | Top 3 Root Cause Categories | Recurring? |
|--------|------------|-----------|---------------------------|------------|
| Q1 20XX | [N] | Audit: [n], Deviation: [n], Monitoring: [n] | [Cat1], [Cat2], [Cat3] | [Y/N] |
| Q2 20XX | [N] | Audit: [n], Deviation: [n], Monitoring: [n] | [Cat1], [Cat2], [Cat3] | [Y/N] |

### Systemic Issues
| Issue | Frequency | Systems Affected | Recommended Action |
|-------|-----------|-----------------|-------------------|
| [e.g., Training gaps] | [N occurrences in 12 months] | [Systems] | [Systemic programme improvement] |

→ Trend analysis IDs systemic issues individual CAPAs miss.

If err: Trending reveals recurring roots despite CAPAs → CAPAs treating symptoms. Escalate to mgmt for systemic intervention.
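The trend table can be filled from a flat list of closed CAPAs. A minimal sketch, assuming each CAPA is recorded as a `(period, root_cause_category)` pair and that a category counts as recurring when it appears in at least two periods (an assumed threshold); function names are hypothetical.

```python
from collections import Counter, defaultdict


def top_categories(capas, period, n=3):
    """Return the n most common root cause categories for one period."""
    counts = Counter(cat for p, cat in capas if p == period)
    return [cat for cat, _ in counts.most_common(n)]


def recurring_categories(capas, min_periods=2):
    """Return categories that appear in at least `min_periods` periods."""
    periods_by_category = defaultdict(set)
    for period, category in capas:
        periods_by_category[category].add(period)
    return sorted(c for c, p in periods_by_category.items() if len(p) >= min_periods)


capas = [
    ("Q1", "Training"), ("Q1", "Process"), ("Q1", "Training"),
    ("Q2", "Training"), ("Q2", "Technology"),
]
print(top_categories(capas, "Q1"))   # ['Training', 'Process']
print(recurring_categories(capas))   # ['Training']
```

A category that keeps recurring after its CAPAs were closed is the signal to escalate for systemic intervention.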

Check

  • Investigation initiated w/in timeline (24h critical, 72h major)
  • Problem statement factual, no blame
  • Method appropriate for complexity
  • RCA reaches fundamental cause (not symptoms)
  • Every step supported by evidence
  • CAPAs distinguish correction, corrective, preventive
  • Each CAPA has measurable success criteria + verification plan
  • Effectiveness verified w/ evidence before closure
  • Trend analysis reviewed ≥ quarterly

Traps

  • Stop at symptom: "User made err" ≠ root. Root = why sys/process allowed err.
  • CAPA = retraining: Retraining addresses only 1 possible root (knowledge). Real root = sys design flaw / unclear SOP → retraining won't prevent recurrence.
  • Close w/o verification: Completing action ≠ verifying effectiveness. Closed CAPA w/o verification = regulatory citation waiting.
  • Blame-oriented: Focus on who made err vs what allowed err → undermines quality culture, discourages reporting.
  • No trending: Individual CAPAs seem unrelated, trending reveals systemic issues (e.g., "training" roots across multi systems = broken training prog).

Related

  • conduct-gxp-audit — audits → findings → CAPAs
  • monitor-data-integrity — monitoring detects anomalies → investigations
  • manage-change-control — CAPA-driven changes go thru change control
  • prepare-inspection-readiness — open/overdue CAPAs top inspection targets
  • design-training-program — root = training gap → improve training prog

GitHub Repository

pjt222/agent-almanac
Path: i18n/caveman-ultra/skills/investigate-capa-root-cause