forecast-operational-metrics
About
This skill forecasts infrastructure and application metrics such as CPU and memory using Prophet or statsmodels, for capacity planning and cost optimization. It supports visualizing forecasts in Grafana and setting alerts on predicted resource consumption. Use it when planning hardware procurement, optimizing cloud spend, or setting up proactive scaling policies based on predicted load.
Quick install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/forecast-operational-metrics
Documentation
Forecast Operational Metrics
Predict future resource usage + system metrics for capacity plan + cost optimization.
See Extended Examples for complete configuration files and templates.
Use When
- Forecast infra capacity (CPU, memory, disk, net)
- Plan hardware/cloud procurement next quarter
- Predict cost trends + optimize cloud spending
- Setup proactive scaling policies on predicted load
- Forecast user traffic for event planning
- Predict DB storage growth for backup planning
- Estimate API usage for rate limiting config
In
- Required: Historical time series (3-12mo min)
- Required: Metric type (CPU, memory, req/sec, costs, etc.)
- Required: Forecast horizon (days, weeks, months)
- Optional: Known future events (deployments, campaigns, holidays)
- Optional: Seasonality (daily, weekly, yearly)
- Optional: External regressors (marketing spend, signups)
Do
Step 1: Setup + Load Data
Install libs + prep time series.
# Create virtual environment
python -m venv venv
source venv/bin/activate
# Install forecasting libraries
pip install prophet statsmodels pandas numpy
pip install plotly matplotlib seaborn
pip install prometheus-api-client influxdb-client
pip install grafana-api
Load + prep w/ MetricsLoader:
# forecasting/data_loader.py (abbreviated)
import pandas as pd
from datetime import datetime, timedelta


class MetricsLoader:
    def __init__(self, prometheus_url: str):
        # Constructor implied by the usage below; stores the Prometheus base URL
        self.prometheus_url = prometheus_url

    def load_from_prometheus(self, query: str, lookback_days: int = 90, step: str = "1h"):
        """Load historical metrics from Prometheus."""
        # ... implementation (see EXAMPLES.md for complete code)

    def resample_and_aggregate(self, df: pd.DataFrame, freq: str = "1H"):
        """Resample time series to regular intervals."""
        # ... implementation (see EXAMPLES.md)


# Example usage
loader = MetricsLoader(prometheus_url="http://prometheus:9090")
df = loader.load_from_prometheus(
    query='avg(rate(container_cpu_usage_seconds_total[5m]))',
    lookback_days=90,
)
df_daily = loader.resample_and_aggregate(df, freq="1D")
See EXAMPLES.md Step 1 for complete MetricsLoader.
→ Time series loaded regular intervals, missing filled, ready forecast.
If err: gaps → forward-fill or interpolate, ensure lookback ≥90 days, verify tz consistency, check outliers (>5 sigma) skewing forecasts.
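The gap and outlier checks above can be sketched with the standard library alone. This is a minimal illustration, not the MetricsLoader implementation: `fill_gaps` and `flag_outliers` are hypothetical helper names, gaps are assumed to be marked as `None`, and the 5-sigma rule uses a plain mean and sample standard deviation.

```python
from statistics import mean, stdev


def fill_gaps(values):
    """Linearly interpolate interior None gaps; edge gaps copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]   # leading gap: back-fill
        elif right is None:
            filled[i] = values[left]    # trailing gap: forward-fill
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def flag_outliers(values, n_sigma=5.0):
    """Return indices of points further than n_sigma standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > n_sigma * sigma]
```

On short series a single spike inflates the standard deviation enough that it may never cross 5 sigma, so the check is only meaningful with a reasonably long history.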
Step 2: Prophet Forecasting
FB Prophet for auto seasonality detection + forecasting.
# forecasting/prophet_forecaster.py (abbreviated)
import pandas as pd
from prophet import Prophet


class ProphetForecaster:
    def __init__(self, growth: str = "linear", seasonality_mode: str = "multiplicative"):
        self.growth = growth
        self.prophet_params = {
            "growth": growth,
            "seasonality_mode": seasonality_mode,
            # ... additional parameters (see EXAMPLES.md)
        }

    def fit(self, df: pd.DataFrame, regressors=None, holidays=None):
        """Train Prophet model on historical data."""
        # ... implementation (see EXAMPLES.md)

    def forecast(self, periods: int, freq: str = "D"):
        """Generate forecast for future periods."""
        # ... implementation (see EXAMPLES.md)


# Example usage (df_daily comes from Step 1)
forecaster = ProphetForecaster(growth="linear", seasonality_mode="multiplicative")
forecaster.fit(df_daily)
forecast = forecaster.forecast(periods=30, freq="D")
# plot_forecast is not shown in this abbreviated class
forecaster.plot_forecast(forecast, save_path="results/cpu_forecast.png")
See EXAMPLES.md Step 2 for complete ProphetForecaster.
→ Forecast 30+ days w/ CI, seasonal patterns in components plot, cross-validation MAPE < 15%.
If err: unrealistic → try diff growth (linear vs logistic), seasonality missing → adjust seasonality_mode, poor accuracy (MAPE > 15%) → more data or external regressors, check data quality.
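The MAPE target above is easy to check by hand on a held-out window. A minimal sketch, assuming `actual` and `predicted` are equal-length sequences of numbers; `mape` is a hypothetical helper, and skipping zero actuals is one common convention, not something this skill prescribes.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent; lower is better."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Zero actuals make the percentage error undefined, so they are skipped
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to score")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Example: two held-out points, each off by 10%
error = mape([100.0, 200.0], [110.0, 180.0])
print(f"MAPE: {error:.1f}%  (target: below 15%)")
```

Because MAPE is an error measure, a value under 15% passes the check and larger values indicate a worse fit.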
Step 3: ARIMA/SARIMAX (Alternative)
Statsmodels for traditional time series.
# forecasting/arima_forecaster.py (abbreviated)
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX


class ARIMAForecaster:
    def __init__(self, order: tuple = (1, 1, 1), seasonal_order: tuple = (1, 1, 1, 7)):
        self.order = order
        self.seasonal_order = seasonal_order

    def fit(self, df: pd.DataFrame, exog=None):
        """Train SARIMAX model."""
        # Index by timestamp so the model receives a time-indexed series
        series = df.set_index("timestamp")["value"]
        self.model = SARIMAX(series, exog=exog, order=self.order, seasonal_order=self.seasonal_order)
        self.fitted_model = self.model.fit(disp=False)
        # ... implementation (see EXAMPLES.md)

    def forecast(self, steps: int, exog_future=None):
        """Generate forecast for future periods."""
        # ... implementation (see EXAMPLES.md)


# Auto-select parameters
# auto_arima is the helper referenced in EXAMPLES.md Step 3. df_hourly and series
# are not defined in the original snippet; they are assumed to come from Step 1.
df_hourly = loader.resample_and_aggregate(df, freq="1H")
series = df_hourly.set_index("timestamp")["value"]
best_order, best_seasonal = auto_arima(series, seasonal=True)
forecaster = ARIMAForecaster(order=best_order, seasonal_order=best_seasonal)
forecaster.fit(df_hourly)
forecast = forecaster.forecast(steps=168)  # 168 hourly steps = 7 days
See EXAMPLES.md Step 3 for complete ARIMAForecaster + auto_arima.
→ ARIMA fit optimal params, forecast w/ CI, diagnostic plots show white noise residuals.
If err: no convergence → simplify params (reduce p, q, P, Q), wrong trend → check differencing (d, D), residuals not white noise → add more AR/MA, ensure series length >2x seasonal period.
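The "residuals are white noise" check can be approximated without plotting. The sketch below computes sample autocorrelations and flags lags that fall outside the usual approximate 95% band of ±1.96/√n. The function names are hypothetical and this is a rough screen only; statsmodels also provides a Ljung-Box test (`acorr_ljungbox`) for a more formal check.

```python
from math import sqrt


def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mu = sum(residuals) / n
    centered = [r - mu for r in residuals]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("residuals are constant")
    return sum(centered[t] * centered[t - lag] for t in range(lag, n)) / denom


def significant_lags(residuals, max_lag=10):
    """Lags whose autocorrelation lies outside the approximate 95% white-noise band."""
    bound = 1.96 / sqrt(len(residuals))
    return [
        lag
        for lag in range(1, max_lag + 1)
        if abs(autocorrelation(residuals, lag)) > bound
    ]


# Example: strongly alternating residuals are clearly not white noise
residuals = [1.0, -1.0] * 50
print(significant_lags(residuals, max_lag=2))
```

With a 95% band, roughly one lag in twenty is expected to fall outside by chance even for true white noise, so a single marginal lag is not necessarily a reason to add AR/MA terms.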
Step 4: Capacity Thresholds + Alerts
Analyze forecast → predict exhaustion.
# forecasting/capacity_planning.py (abbreviated)
from datetime import datetime

import pandas as pd


class CapacityPlanner:
    def __init__(self, capacity_limit: float, warning_threshold: float = 0.8):
        self.capacity_limit = capacity_limit
        self.warning_threshold = warning_threshold

    def find_exhaustion_date(self, forecast: pd.DataFrame):
        """Find when forecast exceeds capacity limit."""
        # Rows where the point forecast reaches or exceeds the limit
        exceeded = forecast[forecast["yhat"] >= self.capacity_limit]
        # ... implementation (see EXAMPLES.md)

    def generate_capacity_report(self, forecast: pd.DataFrame):
        """Generate comprehensive capacity planning report."""
        # ... implementation (see EXAMPLES.md)


# Example usage
planner = CapacityPlanner(capacity_limit=1000, warning_threshold=0.8)
report = planner.generate_capacity_report(forecast)
print(f"Warning Date: {report['warning_date']}")
print(f"Exhaustion Date: {report['exhaustion_date']}")
# recommend_scaling_action is not shown in this abbreviated class
recommendation = planner.recommend_scaling_action(report)
See EXAMPLES.md Step 4 for complete CapacityPlanner.
→ Report shows when limits reached, recommendations w/ urgency levels, growth rates.
If err: unrealistic exhaustion date → verify capacity_limit correct, growth too high → check outliers, non-linear growth models for mature systems.
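The warning and exhaustion dates reduce to "first forecast point at or above a level". A minimal sketch under stated assumptions: the forecast is a chronologically ordered list of `(day, value)` pairs rather than a Prophet DataFrame, and `first_crossing` and `capacity_dates` are hypothetical names, not the CapacityPlanner API.

```python
from datetime import date, timedelta


def first_crossing(forecast, level):
    """Return the first day whose forecast value reaches the level, or None."""
    for day, value in forecast:
        if value >= level:
            return day
    return None


def capacity_dates(forecast, capacity_limit, warning_threshold=0.8):
    """Compute warning and exhaustion dates from a list of (day, value) pairs."""
    return {
        "warning_date": first_crossing(forecast, capacity_limit * warning_threshold),
        "exhaustion_date": first_crossing(forecast, capacity_limit),
    }


# Example: usage starts at 700 and grows by 10 units per day
start = date(2025, 1, 1)
forecast = [(start + timedelta(days=i), 700 + 10 * i) for i in range(40)]
print(capacity_dates(forecast, capacity_limit=1000))
```

Running the same check on the upper confidence bound instead of the point forecast gives an earlier, more conservative date for planning.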
Step 5: Grafana Visualization
Push forecast data → Grafana real-time monitoring.
# forecasting/grafana_integration.py (abbreviated)
from datetime import datetime

import pandas as pd
import requests


class GrafanaForecaster:
    def __init__(self, grafana_url: str, api_key: str, dashboard_uid: str = None):
        # Strip the trailing slash so API paths can be appended safely
        self.grafana_url = grafana_url.rstrip("/")
        self.api_key = api_key
        self.dashboard_uid = dashboard_uid

    def create_annotation(self, text: str, tags: list, time: datetime = None):
        """Create annotation in Grafana for forecast events."""
        # ... implementation (see EXAMPLES.md)

    def create_capacity_alert_annotation(self, capacity_report: dict):
        """Create Grafana annotation for capacity warnings."""
        # ... implementation (see EXAMPLES.md)


# Export to CSV for Grafana datasource
def export_forecast_to_csv(forecast: pd.DataFrame, output_path: str):
    """Export forecast in format compatible with Grafana CSV datasource."""
    # ... implementation (see EXAMPLES.md)


# Example usage
grafana = GrafanaForecaster(
    grafana_url="http://grafana:3000",
    api_key="YOUR_API_KEY",
    dashboard_uid="your-dashboard-uid",
)
grafana.create_capacity_alert_annotation(report)
export_forecast_to_csv(forecast, "grafana/forecasts/cpu_forecast.csv")
See EXAMPLES.md Step 5 for complete GrafanaForecaster.
→ Annotations in dashboards, capacity warnings visible as vertical markers, forecast accessible via CSV datasource.
If err: verify API key perms, check dashboard UID correct, ensure timestamps ms for annotations, test API w/ curl before integrating.
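The millisecond-timestamp requirement is a frequent source of misplaced annotations. The sketch below builds an annotation body without sending it. It is not the GrafanaForecaster implementation: the helper names are hypothetical, the field names (`time`, `tags`, `text`, `dashboardUID`) follow the Grafana annotations HTTP API as I understand it and may differ across Grafana versions, and naive datetimes are assumed to be UTC.

```python
from datetime import datetime, timezone


def to_epoch_ms(moment):
    """Convert a datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        # Assumption: naive timestamps are UTC
        moment = moment.replace(tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


def build_annotation_payload(text, tags, moment, dashboard_uid=None):
    """Build a JSON-serializable annotation body with a millisecond timestamp."""
    payload = {"time": to_epoch_ms(moment), "tags": list(tags), "text": text}
    if dashboard_uid is not None:
        payload["dashboardUID"] = dashboard_uid
    return payload


# Example: annotation for a predicted warning date
payload = build_annotation_payload(
    "Predicted CPU capacity warning",
    ["capacity", "forecast"],
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    dashboard_uid="your-dashboard-uid",
)
print(payload)
```

Passing seconds instead of milliseconds places the annotation in January 1970, which is a quick way to recognize this mistake on a dashboard.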
Step 6: Automate Generation
Scheduled jobs → forecasts regularly.
# forecasting/scheduler.py (abbreviated)
import logging
import time

import schedule

# Assumed import paths, based on the file names used in Steps 1-5
from forecasting.data_loader import MetricsLoader
from forecasting.prophet_forecaster import ProphetForecaster
from forecasting.capacity_planning import CapacityPlanner
from forecasting.grafana_integration import export_forecast_to_csv

logger = logging.getLogger(__name__)


def generate_daily_forecast():
    """Generate forecast for all monitored metrics."""
    logger.info("Starting daily forecast generation")
    metrics_config = [
        {"name": "cpu_usage", "query": "...", "capacity_limit": 0.8, "forecast_days": 30},
        {"name": "memory_usage", "query": "...", "capacity_limit": 32, "forecast_days": 30},
        {"name": "disk_usage", "query": "...", "capacity_limit": 500, "forecast_days": 90},
    ]
    loader = MetricsLoader(prometheus_url="http://prometheus:9090")
    for metric_config in metrics_config:
        # Load history, fit, forecast, then evaluate against the capacity limit
        df = loader.load_from_prometheus(query=metric_config["query"], lookback_days=90)
        forecaster = ProphetForecaster()
        forecaster.fit(df)
        forecast = forecaster.forecast(periods=metric_config["forecast_days"])
        planner = CapacityPlanner(capacity_limit=metric_config["capacity_limit"])
        report = planner.generate_capacity_report(forecast)
        export_forecast_to_csv(forecast, f"grafana/forecasts/{metric_config['name']}_forecast.csv")
        # ... (see EXAMPLES.md for complete implementation)


# Schedule daily at 2 AM
schedule.every().day.at("02:00").do(generate_daily_forecast)
while True:
    schedule.run_pending()
    time.sleep(60)
See EXAMPLES.md Step 6 for complete scheduler.
→ Forecasts daily all metrics, capacity reports logged, CSV exported, alerts sent critical warnings.
If err: verify scheduler runs continuously (systemd/supervisor), check Prometheus connectivity, ensure sufficient disk, retry logic for transient failures, monitor scheduler itself.
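The retry logic for transient failures mentioned above could look like the following. This is a sketch only: `run_with_retry` is a hypothetical helper, exponential backoff is one possible policy among several, and the `sleep` parameter exists so the delay can be replaced in tests.

```python
import time


def run_with_retry(task, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task(); on failure wait base_delay, 2*base_delay, ... and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            # Broad catch for brevity; real code would likely narrow this
            # to connection and timeout errors.
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Example: a task that fails twice before succeeding
state = {"calls": 0}


def flaky_load():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient Prometheus failure")
    return "loaded"


print(run_with_retry(flaky_load, base_delay=0.01))
```

In the scheduler, a wrapper like this would typically go around the Prometheus load call for each metric, so one unreachable query does not abort the whole daily run.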
Check
- Historical data ≥90 days continuous
- Prophet captures daily/weekly seasonality in components
- Forecast CI contains 85-95% actual in validation
- Capacity exhaustion correct known scenarios
- ARIMA residuals white noise in diagnostic
- Grafana annotations at predicted warning/exhaustion
- Automated daily w/o manual intervention
- Forecast accuracy (MAPE) < 15% validation
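The interval coverage item in this checklist can be measured directly on a validation window. A minimal sketch, assuming three equal-length sequences of actuals and lower/upper forecast bounds; `interval_coverage` is a hypothetical helper name.

```python
def interval_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    if not actual or not (len(actual) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return hits / len(actual)


# Example: three of four actuals fall inside their intervals
coverage = interval_coverage(
    actual=[10, 12, 15, 30],
    lower=[8, 10, 12, 14],
    upper=[12, 14, 18, 20],
)
print(f"Coverage: {coverage:.0%}  (target: 85-95%)")
```

Coverage well below the target suggests the intervals are too narrow, while coverage near 100% suggests they are too wide to be useful for planning.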
Traps
- Insufficient data: Need 3-12mo reliable seasonality. Avoid <60 days.
- Ignore known events: Holidays, deployments, campaigns skew → add as external regressors or holidays.
- Overconfidence long-term: Accuracy degrades beyond 30-90 days. Directional guidance not exact.
- Static capacity: Infra changes. Update capacity_limit when adding.
- Forecast anomalies: Outliers propagate. Clean data or robust methods.
- Not updating models: Stale after system changes. Retrain weekly or after significant arch.
- Ignore CI: Point forecasts misleading. Always lower/upper bounds for planning.
- Wrong seasonality period: Daily for hourly, weekly for daily. Mismatch → poor forecasts.
→
- detect-anomalies-aiops: Anomaly detection complements forecasting
- plan-capacity: Infra capacity planning workflows
- build-grafana-dashboards: Visualize forecasts + capacity trends
