Zurück zu Fähigkeiten

forecast-operational-metrics

pjt222
Aktualisiert 2 days ago
4 Ansichten
17
2
17
Auf GitHub ansehen
Designdesign

Über

Diese Fähigkeit prognostiziert Infrastruktur- und Anwendungsmetriken wie CPU und Arbeitsspeicher mithilfe von Prophet oder statsmodels für Kapazitätsplanung und Kostenoptimierung. Sie hilft Entwicklern, Ressourcenbedarf vorherzusagen, Beschaffung zu planen und proaktive Skalierungsrichtlinien einzurichten. Die Prognosen können in Grafana visualisiert werden, inklusive Warnmeldungen bei vorhergesehener Ressourcenerschöpfung.

Schnellinstallation

Claude Code

Empfohlen
Primär
npx skills add pjt222/agent-almanac -a claude-code
Plugin-BefehlAlternativ
/plugin add https://github.com/pjt222/agent-almanac
Git CloneAlternativ
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/forecast-operational-metrics

Kopieren Sie diesen Befehl und fügen Sie ihn in Claude Code ein, um diese Fähigkeit zu installieren

Dokumentation

Forecast Operational Metrics

Predict future resource use and system metrics for capacity plan and cost tune.

See Extended Examples for full config files and templates.

When Use

  • Need to forecast infra capacity needs (CPU, memory, disk, network)
  • Planning hardware/cloud resource purchase for next quarter
  • Want to predict cost trends and tune cloud spend
  • Need to set up proactive scaling policies by predicted load
  • Forecasting user traffic for event plan
  • Predicting DB storage growth for backup plan
  • Estimating API use for rate limit config

Inputs

  • Required: Historical time series metrics (3-12 months min)
  • Required: Metric type (CPU, memory, requests/sec, costs, etc.)
  • Required: Forecast horizon (days, weeks, or months ahead)
  • Optional: Known future events (deployments, marketing campaigns, holidays)
  • Optional: Seasonality info (daily, weekly, yearly patterns)
  • Optional: External regressors (e.g., marketing spend, user signups)

Steps

Step 1: Set Up Environment and Load Data

Install forecasting libs and prep time series data.

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install forecasting libraries
pip install prophet statsmodels pandas numpy
pip install plotly matplotlib seaborn
pip install prometheus-api-client influxdb-client
pip install grafana-api

Load and prep data with MetricsLoader:

# forecasting/data_loader.py (abbreviated)
import pandas as pd
from datetime import datetime, timedelta

class MetricsLoader:
    def load_from_prometheus(self, query: str, lookback_days: int = 90, step: str = "1h"):
        """Load historical metrics from Prometheus."""
        # ... implementation (see EXAMPLES.md for complete code)

    def resample_and_aggregate(self, df: pd.DataFrame, freq: str = "1H"):
        """Resample time series to regular intervals."""
        # ... implementation (see EXAMPLES.md)

# Example usage
loader = MetricsLoader(prometheus_url="http://prometheus:9090")
df = loader.load_from_prometheus(
    query='avg(rate(container_cpu_usage_seconds_total[5m]))',
    lookback_days=90,
)
df_daily = loader.resample_and_aggregate(df, freq="1D")

See EXAMPLES.md Step 1 for full MetricsLoader impl.

Got: Time series data loaded with regular intervals, missing values filled, ready for forecast.

If fail: Data gaps exist? Use forward-fill or interpolation, check lookback period has enough data (90+ days advised), check timestamp timezone consistency, look for outliers (>5 sigma) that may skew forecasts.

Step 2: Implement Prophet Forecasting

Use Facebook Prophet for auto seasonality detect and forecast.

# forecasting/prophet_forecaster.py (abbreviated)
from prophet import Prophet

class ProphetForecaster:
    def __init__(self, growth: str = "linear", seasonality_mode: str = "multiplicative"):
        self.growth = growth
        self.prophet_params = {
            "growth": growth,
            "seasonality_mode": seasonality_mode,
            # ... additional parameters (see EXAMPLES.md)
        }

    def fit(self, df: pd.DataFrame, regressors=None, holidays=None):
        """Train Prophet model on historical data."""
        # ... implementation (see EXAMPLES.md)

    def forecast(self, periods: int, freq: str = "D"):
        """Generate forecast for future periods."""
        # ... implementation (see EXAMPLES.md)

# Example usage
forecaster = ProphetForecaster(growth="linear", seasonality_mode="multiplicative")
forecaster.fit(df_daily)
forecast = forecaster.forecast(periods=30, freq="D")
forecaster.plot_forecast(forecast, save_path="results/cpu_forecast.png")

See EXAMPLES.md Step 2 for full ProphetForecaster impl.

Got: Forecast made for 30+ days ahead with confidence intervals, seasonal patterns caught in components plot, cross-validation MAPE < 15%.

If fail: Forecast looks unreal? Try different growth model (linear vs logistic), if seasonality missing tune seasonality_mode, if accuracy poor (<70% MAPE) add more historical data or external regressors, check for data quality issues.

Step 3: Implement ARIMA/SARIMAX Forecasting (Alternative)

Use statsmodels for traditional time series forecasting.

# forecasting/arima_forecaster.py (abbreviated)
from statsmodels.tsa.statespace.sarimax import SARIMAX

class ARIMAForecaster:
    def __init__(self, order: tuple = (1, 1, 1), seasonal_order: tuple = (1, 1, 1, 7)):
        self.order = order
        self.seasonal_order = seasonal_order

    def fit(self, df: pd.DataFrame, exog=None):
        """Train SARIMAX model."""
        series = df.set_index("timestamp")["value"]
        self.model = SARIMAX(series, exog=exog, order=self.order, seasonal_order=self.seasonal_order)
        self.fitted_model = self.model.fit(disp=False)
        # ... implementation (see EXAMPLES.md)

    def forecast(self, steps: int, exog_future=None):
        """Generate forecast for future periods."""
        # ... implementation (see EXAMPLES.md)

# Auto-select parameters
best_order, best_seasonal = auto_arima(series, seasonal=True)
forecaster = ARIMAForecaster(order=best_order, seasonal_order=best_seasonal)
forecaster.fit(df_hourly)
forecast = forecaster.forecast(steps=168)  # 7 days

See EXAMPLES.md Step 3 for full ARIMAForecaster impl and auto_arima function.

Got: ARIMA model fit with best params, forecast made with confidence intervals, diagnostic plots show white noise residuals.

If fail: Model not converge? Simplify params (drop p, q, P, Q), if forecast has wrong trend check differencing order (d, D), if residuals not white noise add more AR/MA terms, make sure series length >2x seasonal period.

Step 4: Identify Capacity Thresholds and Alerts

Check forecast to predict when resources will run out.

# forecasting/capacity_planning.py (abbreviated)
from datetime import datetime

class CapacityPlanner:
    def __init__(self, capacity_limit: float, warning_threshold: float = 0.8):
        self.capacity_limit = capacity_limit
        self.warning_threshold = warning_threshold

    def find_exhaustion_date(self, forecast: pd.DataFrame):
        """Find when forecast exceeds capacity limit."""
        exceeded = forecast[forecast["yhat"] >= self.capacity_limit]
        # ... implementation (see EXAMPLES.md)

    def generate_capacity_report(self, forecast: pd.DataFrame):
        """Generate comprehensive capacity planning report."""
        # ... implementation (see EXAMPLES.md)

# Example usage
planner = CapacityPlanner(capacity_limit=1000, warning_threshold=0.8)
report = planner.generate_capacity_report(forecast)
print(f"Warning Date: {report['warning_date']}")
print(f"Exhaustion Date: {report['exhaustion_date']}")
recommendation = planner.recommend_scaling_action(report)

See EXAMPLES.md Step 4 for full CapacityPlanner impl.

Got: Report shows when capacity limits will be hit, advice given with urgency levels, growth rates counted.

If fail: Exhaustion date unreal? Check capacity_limit is right, if growth rate too high look for outliers in historical data, think non-linear growth models for mature systems.

Step 5: Visualize Forecasts in Grafana

Push forecast data to Grafana for real-time watch.

# forecasting/grafana_integration.py (abbreviated)
import requests

class GrafanaForecaster:
    def __init__(self, grafana_url: str, api_key: str, dashboard_uid: str = None):
        self.grafana_url = grafana_url.rstrip("/")
        self.api_key = api_key
        self.dashboard_uid = dashboard_uid

    def create_annotation(self, text: str, tags: list, time: datetime = None):
        """Create annotation in Grafana for forecast events."""
        # ... implementation (see EXAMPLES.md)

    def create_capacity_alert_annotation(self, capacity_report: dict):
        """Create Grafana annotation for capacity warnings."""
        # ... implementation (see EXAMPLES.md)

# Export to CSV for Grafana datasource
def export_forecast_to_csv(forecast: pd.DataFrame, output_path: str):
    """Export forecast in format compatible with Grafana CSV datasource."""
    # ... implementation (see EXAMPLES.md)

# Example usage
grafana = GrafanaForecaster(
    grafana_url="http://grafana:3000",
    api_key="YOUR_API_KEY",
    dashboard_uid="your-dashboard-uid",
)
grafana.create_capacity_alert_annotation(report)
export_forecast_to_csv(forecast, "grafana/forecasts/cpu_forecast.csv")

See EXAMPLES.md Step 5 for full GrafanaForecaster impl.

Got: Forecast annotations show in Grafana dashboards, capacity warns visible as vertical markers, forecast data open via CSV datasource.

If fail: Check Grafana API key has right perms, check dashboard UID is right, ensure timestamps in milliseconds for annotations, test API with curl before tie in.

Step 6: Automate Forecast Generation

Set up scheduled jobs to make forecasts regular.

# forecasting/scheduler.py (abbreviated)
import schedule
import time

def generate_daily_forecast():
    """Generate forecast for all monitored metrics."""
    logger.info("Starting daily forecast generation")

    metrics_config = [
        {"name": "cpu_usage", "query": "...", "capacity_limit": 0.8, "forecast_days": 30},
        {"name": "memory_usage", "query": "...", "capacity_limit": 32, "forecast_days": 30},
        {"name": "disk_usage", "query": "...", "capacity_limit": 500, "forecast_days": 90},
    ]

    loader = MetricsLoader(prometheus_url="http://prometheus:9090")

    for metric_config in metrics_config:
        df = loader.load_from_prometheus(query=metric_config["query"], lookback_days=90)
        forecaster = ProphetForecaster()
        forecaster.fit(df)
        forecast = forecaster.forecast(periods=metric_config["forecast_days"])

        planner = CapacityPlanner(capacity_limit=metric_config["capacity_limit"])
        report = planner.generate_capacity_report(forecast)

        export_forecast_to_csv(forecast, f"grafana/forecasts/{metric_config['name']}_forecast.csv")
        # ... (see EXAMPLES.md for complete implementation)

# Schedule daily at 2 AM
schedule.every().day.at("02:00").do(generate_daily_forecast)

while True:
    schedule.run_pending()
    time.sleep(60)

See EXAMPLES.md Step 6 for full scheduler impl.

Got: Forecasts made daily for all metrics, capacity reports logged, CSV files exported for Grafana, alerts sent for critical capacity warns.

If fail: Check scheduler process runs continuous (use systemd/supervisor), check Prometheus connectivity, ensure enough disk space for forecast exports, add retry logic for short-term failures, set up watch for scheduler itself.

Validation

  • Historical data loaded with 90+ days of continuous metrics
  • Prophet forecast catches daily/weekly seasonality in components plot
  • Forecast confidence intervals hold 85-95% of actual values in check
  • Capacity exhaustion dates counted right for known scenarios
  • ARIMA model residuals show as white noise in diagnostic plots
  • Grafana annotations show at predicted warn/exhaustion dates
  • Auto forecast runs daily with no manual step
  • Forecast accuracy (MAPE) < 15% on check set

Pitfalls

  • Not enough historical data: Need 3-12 months for reliable seasonality detect; avoid forecasting with <60 days
  • Ignore known events: Holidays, deployments, marketing campaigns skew forecasts; add as external regressors or holidays
  • Over-confidence in long-term forecasts: Accuracy drops beyond 30-90 days; use as directional guide, not exact predictions
  • Static capacity limits: Infra changes over time; update capacity_limit when adding resources
  • Forecasting anomalies: Outliers in training data pass to forecast; clean data or use robust methods
  • Not update models: Forecasts stale after system shifts; retrain weekly or after big arch changes
  • Ignore confidence intervals: Point forecasts misleading; always use lower/upper bounds for plan
  • Wrong seasonality period: Daily for hourly data, weekly for daily data; mismatch gives poor forecasts

See Also

  • detect-anomalies-aiops - Anomaly detect complement forecast for proactive watch
  • plan-capacity - Infra capacity plan flows
  • build-grafana-dashboards - Visualize forecasts and capacity trends

GitHub Repository

pjt222/agent-almanac
Pfad: i18n/caveman/skills/forecast-operational-metrics
0
agentsagentskillsai-assisted-developmentclaude-codeskillsteams

Verwandte Skills

executing-plans

Design

Verwenden Sie die Fähigkeit "executing-plans", wenn Sie einen vollständigen Implementierungsplan zur Ausführung in kontrollierten Batches mit Überprüfungspunkten vorliegen haben. Sie lädt den Plan und überprüft ihn kritisch, führt dann Aufgaben in kleinen Batches (standardmäßig 3 Aufgaben) aus und meldet den Fortschritt zwischen jedem Batch zur Überprüfung durch den Architekten. Dies gewährleistet eine systematische Implementierung mit integrierten Qualitätskontrollpunkten.

Skill ansehen

requesting-code-review

Design

Diese Fähigkeit sendet einen Unteragenten für Code-Review, um Codeänderungen anhand der Anforderungen zu analysieren, bevor fortgefahren wird. Sie sollte nach dem Abschließen von Aufgaben, der Implementierung größerer Funktionen oder vor dem Zusammenführen in den Hauptzweig verwendet werden. Die Überprüfung hilft dabei, Probleme frühzeitig zu erkennen, indem die aktuelle Implementierung mit dem ursprünglichen Plan verglichen wird.

Skill ansehen

connect-mcp-server

Design

Diese Fähigkeit bietet Entwicklern eine umfassende Anleitung, um MCP-Server über HTTP-, stdio- oder SSE-Transports mit Claude Code zu verbinden. Sie behandelt Installation, Konfiguration, Authentifizierung und Sicherheit für die Integration externer Dienste wie GitHub, Notion und benutzerdefinierter APIs. Nutzen Sie sie beim Einrichten von MCP-Integrationen, bei der Konfiguration externer Tools oder bei der Arbeit mit Claude's Model Context Protocol.

Skill ansehen

web-cli-teleport

Design

Diese Fähigkeit unterstützt Entwickler bei der Wahl zwischen Claude Code Web- und CLI-Schnittstellen basierend auf Aufgabenanalysen und ermöglicht nahtloses Session-Teleporting zwischen diesen Umgebungen. Sie optimiert den Workflow, indem sie den Sitzungsstatus und Kontext beim Wechsel zwischen Web, CLI oder Mobilgeräten verwaltet. Nutzen Sie sie für komplexe Projekte, die in verschiedenen Phasen unterschiedliche Werkzeuge erfordern.

Skill ansehen