plan-capacity
Über
Diese Fähigkeit prognostiziert Ressourcenbedarf anhand historischer Metriken und Wachstumsmodelle wie predict_linear. Sie identifiziert Engpässe, berechnet Kapazitätsspielraum und empfiehlt Skalierungsmaßnahmen, um eine Sättigung zu verhindern. Nutzen Sie sie vor Traffic-Spitzen, Produkteinführungen oder während Quartalsüberprüfungen, wenn die Auslastungstrends nach oben zeigen.
Schnellinstallation
Claude Code
Empfohlennpx skills add pjt222/agent-almanac -a claude-code/plugin add https://github.com/pjt222/agent-almanacgit clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/plan-capacityKopieren Sie diesen Befehl und fügen Sie ihn in Claude Code ein, um diese Fähigkeit zu installieren
Dokumentation
Plan Capacity
Forecast resource needs, prevent saturation via data-driven capacity planning.
Use When
- Before seasonal traffic spikes (holidays, sales)
- New feature launches
- Quarterly capacity reviews
- Resource util trends up
- Before budget cycles
In
- Required: Historical metrics (CPU, mem, disk, net, req/s)
- Required: Time range for trend (min 4 weeks)
- Optional: Business growth projections (user growth, launches)
- Optional: Budget constraints
Do
Step 1: Collect Historical Metrics
Query Prometheus for key resources:
# CPU usage trend over 8 weeks
avg(rate(node_cpu_seconds_total{mode!="idle"}[5m])) by (instance)
# Memory usage trend
avg(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) by (instance)
# Disk usage growth
avg(node_filesystem_size_bytes - node_filesystem_free_bytes) by (instance, device)
# Request rate growth
sum(rate(http_requests_total[5m])) by (service)
# Database connection pool usage
avg(db_connection_pool_used / db_connection_pool_max) by (instance)
Export → analyze:
# Export 8 weeks of CPU data
curl -G 'http://prometheus:9090/api/v1/query_range' \
--data-urlencode 'query=avg(rate(node_cpu_seconds_total{mode!="idle"}[5m])) by (instance)' \
--data-urlencode 'start=2024-12-15T00:00:00Z' \
--data-urlencode 'end=2025-02-09T00:00:00Z' \
--data-urlencode 'step=1h' | jq '.data.result' > cpu_8weeks.json
→ Clean time series per resource, no big gaps.
If err: missing data → forecast accuracy down. Check metric retention + scrape intervals.
Step 2: Calc Growth w/ predict_linear
Prometheus predict_linear() → forecast saturation:
# Predict when CPU will hit 80% (4 weeks ahead)
predict_linear(
avg(rate(node_cpu_seconds_total{mode!="idle"}[5m]))[8w:],
4*7*24*3600 # 4 weeks in seconds
) > 0.80
# Predict disk full date (8 weeks ahead)
predict_linear(
avg(node_filesystem_size_bytes - node_filesystem_free_bytes)[8w:],
8*7*24*3600
) > 0.95 * avg(node_filesystem_size_bytes)
# Predict memory pressure (2 weeks ahead)
predict_linear(
avg(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes)[8w:],
2*7*24*3600
) / avg(node_memory_MemTotal_bytes) > 0.90
# Predict request rate capacity breach (4 weeks ahead)
predict_linear(
sum(rate(http_requests_total[5m]))[8w:],
4*7*24*3600
) > 10000 # known capacity limit
Forecast dashboard:
{
"dashboard": {
"title": "Capacity Forecast",
"panels": [
{
"title": "CPU Saturation Forecast (4 weeks)",
"targets": [
{
"expr": "predict_linear(avg(rate(node_cpu_seconds_total{mode!=\"idle\"}[5m]))[8w:], 4*7*24*3600)",
"legendFormat": "Predicted CPU"
},
{
"expr": "0.80",
"legendFormat": "Target Threshold (80%)"
}
]
},
{
"title": "Disk Full Date",
"targets": [
{
"expr": "(avg(node_filesystem_size_bytes) - predict_linear(avg(node_filesystem_free_bytes)[8w:], 8*7*24*3600)) / avg(node_filesystem_size_bytes)",
"legendFormat": "Predicted Usage %"
}
]
}
]
}
}
→ Clear viz → when resources breach thresholds.
If err: predictions wrong (negative, wild swings) → check:
- Insufficient history (need 4+ weeks)
- Step spikes (deploys, migrations) distorting trend
- Seasonal patterns not in linear model
Step 3: Calc Current Headroom
Safety margin before saturation:
# CPU headroom (percentage remaining before 80% threshold)
(0.80 - avg(rate(node_cpu_seconds_total{mode!="idle"}[5m]))) / 0.80 * 100
# Memory headroom (bytes remaining before 90% usage)
avg(node_memory_MemAvailable_bytes) - (avg(node_memory_MemTotal_bytes) * 0.10)
# Request rate headroom (requests/sec before saturation)
10000 - sum(rate(http_requests_total[5m]))
# Time until saturation (weeks until CPU hits 80%)
(0.80 - avg(rate(node_cpu_seconds_total{mode!="idle"}[5m]))) /
deriv(avg(rate(node_cpu_seconds_total{mode!="idle"}[5m]))[8w:]) /
(7*24*3600)
Headroom summary report:
cat > capacity_headroom.md <<'EOF'
# Capacity Headroom Report (2025-02-09)
## Current Utilization
- **CPU**: 45% average (target: <80%)
- **Memory**: 62% (target: <90%)
- **Disk**: 71% (target: <95%)
- **Request Rate**: 4,200 req/s (capacity: 10,000)
## Headroom Analysis
- **CPU**: 35% headroom → ~12 weeks until saturation
- **Memory**: 28% headroom → ~16 weeks until saturation
- **Disk**: 24% headroom → ~8 weeks until full
- **Request Rate**: 5,800 req/s headroom → ~20 weeks until capacity
## Priority Actions
1. **Disk**: Implement log rotation or expand volume within 4 weeks
2. **CPU**: Plan horizontal scaling in next quarter
3. **Memory**: Monitor but no immediate action needed
EOF
→ Quantified headroom per resource w/ time-to-saturation.
If err: headroom already negative → reactive mode. Immediate scaling needed.
Step 4: Model Growth Scenarios
Factor in business projections:
# Example Python script for scenario modeling
import pandas as pd
import numpy as np
# Load historical data
df = pd.read_json('cpu_8weeks.json')
# Calculate weekly growth rate
growth_rate_weekly = df['value'].pct_change(periods=7).mean()
# Scenario 1: Current trend
weeks_ahead = 12
current_trend = df['value'].iloc[-1] * (1 + growth_rate_weekly) ** weeks_ahead
# Scenario 2: 2x user growth (marketing campaign)
accelerated_trend = df['value'].iloc[-1] * (1 + growth_rate_weekly * 2) ** weeks_ahead
# Scenario 3: New feature launch (+30% baseline)
feature_launch = (df['value'].iloc[-1] * 1.30) * (1 + growth_rate_weekly) ** weeks_ahead
print(f"Current Trend (12 weeks): {current_trend:.1%} CPU")
print(f"2x Growth Scenario: {accelerated_trend:.1%} CPU")
print(f"Feature Launch Scenario: {feature_launch:.1%} CPU")
print(f"Threshold: 80%")
→ Multiple scenarios → impact of business changes on capacity.
If err: scenarios exceed capacity → prioritize scaling before event.
Step 5: Generate Scaling Recommendations
Actionable recs:
## Capacity Scaling Plan
### Immediate Actions (Next 4 Weeks)
1. **Disk Expansion** [Priority: HIGH]
- Current: 500GB, 71% used
- Projected full date: 2025-04-01 (8 weeks)
- Action: Expand to 1TB by 2025-03-15
- Cost: $50/month additional
- Justification: 5 weeks lead time needed
2. **Log Rotation Policy** [Priority: MEDIUM]
- Current: Logs retained 90 days
- Action: Reduce to 30 days, archive to S3
- Savings: ~150GB disk space
- Cost: $5/month S3 storage
### Near-Term Actions (Next Quarter)
3. **Horizontal Scaling - API Tier** [Priority: MEDIUM]
- Current: 4 instances, 45% CPU
- Projected: 65% CPU by 2025-05-01
- Action: Add 2 instances (to 6 total)
- Cost: $400/month
- Trigger: When CPU avg exceeds 60% for 7 days
4. **Database Connection Pool** [Priority: LOW]
- Current: 50 max connections, 40% used
- Projected: 55% by Q3
- Action: Increase to 75 in Q2
- Cost: None (configuration change)
### Long-Term Planning (Next 6 Months)
5. **Migration to Auto-Scaling** [Priority: MEDIUM]
- Current: Manual scaling
- Action: Implement Kubernetes HPA (Horizontal Pod Autoscaler)
- Timeline: Q3 2025
- Benefit: Automatic response to load spikes
→ Prioritized list w/ costs, timelines, trigger conditions.
If err: recs rejected on cost → revisit thresholds or accept risk.
Step 6: Set Up Capacity Alerts
Alerts on low headroom:
# capacity_alerts.yml
groups:
- name: capacity
interval: 1h
rules:
- alert: CPUCapacityLow
expr: |
(0.80 - avg(rate(node_cpu_seconds_total{mode!="idle"}[5m]))) / 0.80 < 0.20
for: 24h
labels:
severity: warning
annotations:
summary: "CPU headroom below 20%"
description: "Current CPU headroom: {{ $value | humanizePercentage }}. Scaling needed within 4 weeks."
- alert: DiskFillForecast
expr: |
predict_linear(avg(node_filesystem_free_bytes)[8w:], 4*7*24*3600) < 0.10 * avg(node_filesystem_size_bytes)
for: 1h
labels:
severity: warning
annotations:
summary: "Disk projected to fill within 4 weeks"
description: "Expand disk volume soon."
- alert: MemoryCapacityLow
expr: |
avg(node_memory_MemAvailable_bytes) < 0.15 * avg(node_memory_MemTotal_bytes)
for: 6h
labels:
severity: warning
annotations:
summary: "Memory headroom below 15%"
→ Alerts fire before saturation → time to scale proactively.
If err: alerts fire too often (alert fatigue) or too late (reactive) → tune thresholds.
Check
- Historical metrics ≥ 8 weeks
-
predict_linear()returns sensible forecasts (no negatives) - Headroom calc'd for all critical resources
- Scenarios include business projections
- Scaling recs have costs + timelines
- Capacity alerts configured + tested
- Report reviewed w/ eng leadership + finance
Traps
- Insufficient history: Linear predictions need 4+ weeks. Less → unreliable.
- Ignore step changes: Deploys, migrations, launches → spikes distort trends. Filter or annotate.
- Linear assumption: Not all growth linear. Exponential (viral) needs other models.
- Forget lead time: Cloud provision fast, but procurement, budgets, migrations take weeks. Plan early.
- No budget alignment: Planning w/o budget buy-in → last-min scramble. Involve finance early.
→
setup-prometheus-monitoring— collect metrics used for capacity planningbuild-grafana-dashboards— viz forecasts + headroomoptimize-cloud-costs— balance capacity vs cost
GitHub Repository
Verwandte Skills
executing-plans
DesignVerwenden Sie die Fähigkeit "executing-plans", wenn Sie einen vollständigen Implementierungsplan zur Ausführung in kontrollierten Batches mit Überprüfungspunkten vorliegen haben. Sie lädt den Plan und überprüft ihn kritisch, führt dann Aufgaben in kleinen Batches (standardmäßig 3 Aufgaben) aus und meldet den Fortschritt zwischen jedem Batch zur Überprüfung durch den Architekten. Dies gewährleistet eine systematische Implementierung mit integrierten Qualitätskontrollpunkten.
requesting-code-review
DesignDiese Fähigkeit sendet einen Unteragenten für Code-Review, um Codeänderungen anhand der Anforderungen zu analysieren, bevor fortgefahren wird. Sie sollte nach dem Abschließen von Aufgaben, der Implementierung größerer Funktionen oder vor dem Zusammenführen in den Hauptzweig verwendet werden. Die Überprüfung hilft dabei, Probleme frühzeitig zu erkennen, indem die aktuelle Implementierung mit dem ursprünglichen Plan verglichen wird.
connect-mcp-server
DesignDiese Fähigkeit bietet Entwicklern eine umfassende Anleitung, um MCP-Server über HTTP-, stdio- oder SSE-Transports mit Claude Code zu verbinden. Sie behandelt Installation, Konfiguration, Authentifizierung und Sicherheit für die Integration externer Dienste wie GitHub, Notion und benutzerdefinierter APIs. Nutzen Sie sie beim Einrichten von MCP-Integrationen, bei der Konfiguration externer Tools oder bei der Arbeit mit Claude's Model Context Protocol.
web-cli-teleport
DesignDiese Fähigkeit unterstützt Entwickler bei der Wahl zwischen Claude Code Web- und CLI-Schnittstellen basierend auf Aufgabenanalysen und ermöglicht nahtloses Session-Teleporting zwischen diesen Umgebungen. Sie optimiert den Workflow, indem sie den Sitzungsstatus und Kontext beim Wechsel zwischen Web, CLI oder Mobilgeräten verwaltet. Nutzen Sie sie für komplexe Projekte, die in verschiedenen Phasen unterschiedliche Werkzeuge erfordern.
