SKILL·375F79

configure-log-aggregation

Name: configure-log-aggregation
Author: pjt222

Updated 1 month ago

About

This skill sets up centralized log aggregation using Loki/Promtail or ELK Stack for parsing, label extraction, and retention policies. It's used when you need to consolidate logs from multiple services into a searchable system, replace local log files, or correlate logs with metrics and traces. The configuration enables structured logging and cross-service analysis for production troubleshooting.

Quick Install

Claude Code

Primary (recommended)

npx skills add pjt222/agent-almanac -a claude-code

Alternative: plugin command

/plugin add https://github.com/pjt222/agent-almanac

Alternative: git clone

git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/configure-log-aggregation

Copy and paste one of these commands in Claude Code to install this skill.

Documentation

name: configure-log-aggregation
description: >
  Set up centralized log aggregation with Loki and Promtail (or the ELK Stack) for log parsing, label extraction, retention policies, and correlation with metrics. Use when consolidating logs from multiple services into a searchable system, replacing local log files with centralized queryable storage, correlating logs with metrics and traces, implementing structured logging with label extraction, or troubleshooting production incidents that need cross-service log analysis.
locale: ja
source_locale: en
source_commit: 6f65f316
translator: claude-opus-4-6
translation_date: 2026-03-16
license: MIT
allowed-tools: Read Write Edit Bash Grep Glob
metadata:
  author: Philipp Thoss
  version: "1.0"
  domain: observability
  complexity: intermediate
  language: multi
  tags: loki, promtail, logging, elk, log-aggregation

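Configure Log Aggregation

Implement centralized log collection, parsing, and querying with Loki/Promtail or the ELK Stack to gain operational visibility.

When to Use

Consolidating logs from multiple services or hosts into a searchable system
Replacing local log files with centralized, queryable log storage
Correlating logs with metrics and traces for full observability
Implementing structured logging by extracting labels from unstructured logs
Setting retention policies for log data based on storage and compliance needs
Troubleshooting production incidents that require cross-service log analysis

Inputs

Required: Log sources (application logs, system logs, container logs)
Required: Log format patterns (JSON, plain text, syslog, and so on)
Optional: Label extraction rules for structured queries
Optional: Retention and compaction policies
Optional: Existing log shipper configuration (Fluentd, Filebeat, Promtail)

Procedure

See the extended examples (EXAMPLES.md) for complete configuration files and templates.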

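Step 1: Choose a Log Aggregation Stack

Choose between Loki (Prometheus-style) and ELK (Elasticsearch-based) based on your requirements.

Loki advantages:

Lightweight and designed for Kubernetes and cloud-native environments
Label-based indexing (similar to Prometheus) that reduces storage overhead
Native integration with Grafana for unified dashboards
Horizontal scalability using object storage (S3, GCS)
Lower resource consumption than Elasticsearch

ELK advantages:

Full-text search across all log content, not just labels
Rich query DSL and aggregations
Mature ecosystem with Beats and Logstash plugins
Well suited to compliance and audit logs that need deep historical search

This guide focuses on Loki + Promtail (recommended for most modern setups).

Decision criteria: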

Use Loki if:
- You want label-based queries similar to Prometheus
- Storage costs are a concern (Loki indexes only labels)
- You already use Grafana for metrics
- Kubernetes/container-native deployment

Use ELK if:
- You need full-text search across all log content
- You have complex log parsing and enrichment requirements
- You require advanced analytics and aggregations
- Legacy systems with existing Logstash pipelines

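Expected result: A clear choice is made based on requirements, and the team downloads the appropriate installation artifacts.

On failure:

Benchmark storage requirements: Loki typically needs far less storage than Elasticsearch for the same logs (roughly 10x less is the estimate given here), so measure with a sample of your own data
Evaluate query patterns: full-text search needs versus label filtering
Consider operational overhead: ELK requires more tuning and resources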

Step 2: Deploy Loki

Install and configure Loki with an appropriate storage backend.

Docker Compose deployment (docker-compose.yml):

version: '3.8'

services:
  loki:
    image: grafana/loki:2.9.0
    # Fixed name so that commands such as `docker logs loki` resolve
    container_name: loki
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yml:/etc/loki/local-config.yaml
      - loki-data:/loki
    command: -config.file=/etc/loki/local-config.yaml
    restart: unless-stopped

  promtail:
    image: grafana/promtail:2.9.0
    container_name: promtail
    ports:
      - "9080:9080"  # exposes the Promtail metrics endpoint to the host
    volumes:
      - ./promtail-config.yml:/etc/promtail/config.yml
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    command: -config.file=/etc/promtail/config.yml
    restart: unless-stopped
    depends_on:
      - loki

volumes:
  loki-data:

Loki configuration (loki-config.yml):

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

# ... (see EXAMPLES.md for the complete configuration)

For production with S3 storage:

storage_config:
  aws:
    s3: s3://us-east-1/my-loki-bucket
    s3forcepathstyle: true
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
    shared_store: s3

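Expected result: Loki starts successfully, the health check at http://localhost:3100/ready passes, and logs are stored according to the retention policy.

On failure:

Check the Loki logs: docker logs loki (or docker compose logs loki if the container has a different name)
Confirm that the storage directory exists and is writable
Test the configuration syntax against your own file: docker run --rm -v "$(pwd)/loki-config.yml:/etc/loki/local-config.yaml" grafana/loki:2.9.0 -config.file=/etc/loki/local-config.yaml -verify-config
Confirm that retention settings do not exceed available disk capacity
For S3: check IAM permissions and bucket access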
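
The readiness check above can be scripted so that later steps only run once Loki answers. This is a minimal sketch using only the standard library, assuming Loki is reachable at http://localhost:3100 as in the Compose file; the helper names `loki_is_ready` and `wait_until_ready` are illustrative and not part of any Loki tooling.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def loki_is_ready(base_url: str = "http://localhost:3100", timeout: float = 2.0) -> bool:
    """Return True when GET <base_url>/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer (HTTPError is a URLError).
        return False


def wait_until_ready(check: Callable[[], bool], attempts: int = 30, delay: float = 1.0) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_ready(loki_is_ready)` after `docker compose up -d` returns True once the endpoint responds, or False after the attempts run out. Passing the check as a callable keeps the retry loop testable without a running Loki.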

Step 3: Configure Promtail for Log Shipping

Configure Promtail to scrape logs and forward them to Loki with label extraction.

Promtail configuration (promtail-config.yml):

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml
# ... (see EXAMPLES.md for the complete configuration)

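Key Promtail concepts:

Scrape configs: Define log sources and how to discover them
Pipeline stages: Transform and label logs before sending them to Loki
Relabel configs: Dynamic labeling based on metadata
Positions file: Tracks read offsets to avoid reprocessing logs

Expected result: Promtail scrapes the configured log files, labels are applied correctly, and logs are viewable in Loki through LogQL queries.

On failure:

Check the Promtail logs: docker logs promtail
Confirm that file paths are accessible: docker exec promtail ls /var/log
Test regex patterns independently against sample log lines
Monitor Promtail metrics: curl http://localhost:9080/metrics | grep promtail (port 9080 must be published from the container)
Check the positions file for progress: docker exec promtail cat /tmp/positions.yaml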
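
Regex patterns can be checked against sample lines before they go into a Promtail pipeline stage. The sketch below is a stand-in under stated assumptions: the log line format and the `LINE_PATTERN` expression are hypothetical, and Python's `re` engine is not identical to the Go RE2 engine that Promtail uses, although both accept the `(?P<name>...)` named-group syntax.

```python
import re
from typing import Optional

# Hypothetical pattern for lines such as:
#   2024-01-15T10:30:00Z ERROR auth user_id=12345 authentication failed
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)


def extract_fields(line: str) -> Optional[dict]:
    """Return the named groups for a matching line, or None when it does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


samples = [
    "2024-01-15T10:30:00Z ERROR auth user_id=12345 authentication failed",
    "not a structured line",
]
for sample in samples:
    # Prints the extracted fields for the first sample and None for the second.
    print(extract_fields(sample))
```

Of the extracted fields, only bounded ones such as level and service are good label candidates; values like user_id are better left in the log line.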

Step 4: Query Logs with LogQL

Learn the LogQL syntax for filtering and aggregating logs.

Basic queries:

# All logs from a job
{job="app"}

# Logs with specific label values
{job="app", level="error"}

# Regex filter on log line content
{job="app"} |~ "authentication failed"

# Case-insensitive regex
{job="app"} |~ "(?i)error"

# Line filter (doesn't parse, just includes/excludes)
{job="app"} |= "user"  # Contains "user"
{job="app"} != "debug" # Doesn't contain "debug"

Parsing and filtering:

# JSON parsing
{job="app"} | json | level="error"

# Regex parsing with named groups
{job="app"} | regexp "user_id=(?P<user_id>\\d+)" | user_id="12345"

# Logfmt parsing (key=value format)
{job="app"} | logfmt | level="error", service="auth"

# Pattern parsing
{job="nginx"} | pattern `<ip> - <user> [<timestamp>] "<method> <path> <protocol>" <status> <size>` | status >= 500

Aggregations (metrics from logs):

# Count log lines per level
sum by (level) (count_over_time({job="app"}[5m]))

# Rate of error logs
rate({job="app", level="error"}[5m])

# Bytes processed per service
sum by (service) (bytes_over_time({job="app"}[1h]))

# Average request duration from logs
avg_over_time({job="app"} | json | unwrap duration [5m])

# Top 10 error messages (assumes JSON logs; parse first so message exists as a label to group by)
topk(10, sum by (message) (count_over_time({level="error"} | json [1h])))
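
To make the first aggregation concrete, here is a rough Python model of what `sum by (level) (count_over_time({job="app"}[5m]))` computes at a single evaluation time: it counts, per level, the entries whose timestamps fall in the trailing five-minute window. The sample entries are made up, the window boundary handling is an assumption, and real Loki evaluates this per stream and per query step.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_by_level(entries, eval_time, window):
    """Count entries per level with eval_time - window < timestamp <= eval_time."""
    start = eval_time - window
    counts = Counter()
    for timestamp, level in entries:
        if start < timestamp <= eval_time:
            counts[level] += 1
    return dict(counts)


now = datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc)
entries = [
    (now - timedelta(minutes=1), "error"),
    (now - timedelta(minutes=2), "info"),
    (now - timedelta(minutes=4), "error"),
    (now - timedelta(minutes=9), "error"),  # outside the 5-minute window
]
print(count_by_level(entries, now, timedelta(minutes=5)))  # {'error': 2, 'info': 1}
```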

Filtering by extracted fields:

# Find specific trace in logs
{job="app"} | json | trace_id="abc123def456"

# HTTP 5xx errors from nginx
{job="nginx"} | pattern `<_> "<_> <_> <_>" <status> <_>` | status >= 500

# Failed authentication attempts (label filter regexes are fully anchored, so wrap the text in .*)
{job="app"} | json | message=~".*authentication failed.*" | user_id != ""

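Use these patterns to build Grafana Explore queries or dashboard panels.

Expected result: Queries return the expected log lines, filtering works correctly, and aggregations produce metrics from logs.

On failure:

Debug queries interactively in Grafana Explore
Check label names: curl http://localhost:3100/loki/api/v1/labels
Check label values: curl http://localhost:3100/loki/api/v1/label/{label_name}/values
Simplify the query: start with a basic label selector and add filters incrementally
Check the time range: logs may not exist in the selected window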
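
The same LogQL can be sent to Loki's HTTP API, which helps when debugging outside Grafana. The following sketch only builds the request URL for the `/loki/api/v1/query_range` endpoint and decodes it again to confirm the encoding; the base URL matches the Compose setup, and the helper name `build_query_range_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_query_range_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a query_range URL; start and end are Unix timestamps in nanoseconds."""
    params = {
        "query": logql,
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
    }
    # urlencode percent-encodes braces, quotes, and pipes in the LogQL expression.
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


url = build_query_range_url(
    "http://localhost:3100",
    '{job="app"} |= "error"',
    start_ns=1_705_312_800_000_000_000,
    end_ns=1_705_316_400_000_000_000,
)
print(url)
# Decoding the URL again should give back the original expression.
print(parse_qs(urlparse(url).query)["query"][0])
```

The resulting URL can be passed to curl or fetched with `urllib.request.urlopen`; the response body is JSON.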

Step 5: Integrate Logs with Metrics and Traces

Correlate logs with Prometheus metrics and distributed traces for unified observability.

Add trace IDs to logs (application instrumentation):

# Python with OpenTelemetry
import logging
from opentelemetry import trace

logger = logging.getLogger(__name__)

def handle_request():
    span = trace.get_current_span()
    trace_id = span.get_span_context().trace_id

    logger.info(
        "Processing request",
        extra={"trace_id": format(trace_id, "032x")}
    )

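// Go with OpenTelemetry
import (
    "context"

    "go.opentelemetry.io/otel/trace"
    "go.uber.org/zap"
)

// logger is assumed to be configured elsewhere; NewExample is a stand-in.
var logger = zap.NewExample()

func handleRequest(ctx context.Context) {
    // Read the active span from the context and render its trace ID as hex.
    span := trace.SpanFromContext(ctx)
    traceID := span.SpanContext().TraceID().String()

    logger.Info("Processing request",
        zap.String("trace_id", traceID),
    )
}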
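
For a trace_id passed through `extra` in the Python example to be queryable with `| json` in LogQL, it has to appear in the emitted log line, which the default `logging` format does not do. One option is a JSON formatter, sketched here with the standard library only; the class name `JsonTraceFormatter` and the output field names are assumptions, not something this skill prescribes.

```python
import io
import json
import logging


class JsonTraceFormatter(logging.Formatter):
    """Render each record as one JSON object, including trace_id when present."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed through `extra=` become attributes on the record.
        trace_id = getattr(record, "trace_id", None)
        if trace_id is not None:
            payload["trace_id"] = trace_id
        return json.dumps(payload)


stream = io.StringIO()  # stands in for stdout or a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonTraceFormatter())

logger = logging.getLogger("app.demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("Processing request", extra={"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"})
print(stream.getvalue().strip())
```

With lines in this shape, a query such as `{job="app"} | json | trace_id="..."` can filter on the field without turning trace_id into an indexed label.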

Configure Grafana data links (metrics to logs):

Field configuration for a Prometheus panel:

{
  "fieldConfig": {
    "defaults": {
      "links": [
        {
          "title": "View Logs",
          "url": "/explore?left={\"datasource\":\"Loki\",\"queries\":[{\"refId\":\"A\",\"expr\":\"{job=\\\"app\\\",instance=\\\"${__field.labels.instance}\\\"} |= `${__field.labels.trace_id}`\"}],\"range\":{\"from\":\"${__from}\",\"to\":\"${__to}\"}}",
          "targetBlank": false
        }
      ]
    }
  }
}

Configure Grafana data links (logs to traces):

YAML configuration for the Loki data source:

datasources:
  - name: Loki
    type: loki
    url: http://loki:3100
    jsonData:
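      derivedFields:
        # datasourceUid must match the uid of your Tempo data source ("tempo" is a placeholder)
        - datasourceUid: tempo
          matcherRegex: "trace_id=(\\w+)"
          name: TraceID
          # $$ escapes $ so that provisioning does not expand it as an environment variable
          url: "$${__value.raw}"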

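Correlate logs in Grafana Explore:

Query a metric in Prometheus
Click a data point
Select "View Logs" from the context menu
The Loki query is pre-filled with the relevant labels and time range
Click a trace ID in the logs
The Tempo trace view opens with the full distributed trace

Expected result: Clicking a metric opens the related logs, trace IDs in logs link to the trace view, and metrics, logs, and traces can be navigated from a single pane.

On failure:

Confirm that the trace ID format matches the derived field regex
Confirm that the trace_id label is extracted by the Promtail pipeline
Confirm that a Tempo data source is configured in Grafana
Test the URL encoding of complex filter expressions
Validate data link URLs in an incognito/private browser window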
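
Whether the derived field regex matches depends on how the application writes the trace ID. The sketch below tries the `trace_id=(\w+)` expression from the data source configuration against a logfmt-style line and a JSON line. The sample lines and the JSON alternative are hypothetical, and Python's `re` module only approximates the regex engine Grafana applies.

```python
import re

# Same expression as matcherRegex in the data source YAML (YAML's "\\w" is the regex \w).
LOGFMT_MATCHER = re.compile(r"trace_id=(\w+)")
# Hypothetical alternative for JSON-formatted lines.
JSON_MATCHER = re.compile(r'"trace_id":\s*"(\w+)"')

TRACE_ID = "4bf92f3577b34da6a3ce929d0e0e4736"
logfmt_line = f'level=info trace_id={TRACE_ID} msg="Processing request"'
json_line = f'{{"level": "info", "trace_id": "{TRACE_ID}", "message": "Processing request"}}'


def first_group(pattern, line):
    """Return the first capture group, or None when the pattern does not match."""
    match = pattern.search(line)
    return match.group(1) if match else None


print(first_group(LOGFMT_MATCHER, logfmt_line))  # the trace ID
print(first_group(LOGFMT_MATCHER, json_line))    # None: JSON has "trace_id": "...", not trace_id=
print(first_group(JSON_MATCHER, json_line))      # the trace ID
```

If the application logs JSON, the matcherRegex in the data source needs a JSON-aware expression like the second one; otherwise no trace link appears in Explore.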

Step 6: Configure Log Retention and Compaction

Configure retention policies and compaction to manage storage costs.

Global and per-tenant retention (in the Loki configuration):

limits_config:
  retention_period: 720h  # Global default: 30 days

  # Per-tenant retention (requires multi-tenancy enabled)
  per_tenant_override_config: /etc/loki/overrides.yaml

# overrides.yaml
overrides:
  production:
    retention_period: 2160h  # 90 days for production
  staging:
    retention_period: 360h   # 15 days for staging
  development:
    retention_period: 168h   # 7 days for dev
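
The retention periods above are written in hours, so 30 days becomes 720h. A small sketch of that arithmetic, plus a rough storage estimate, follows; both helper names are hypothetical, and the estimate ignores compression and the delay before the compactor deletes expired chunks.

```python
def days_to_retention_period(days: int) -> str:
    """Convert whole days to the hour-based duration string used in the config above."""
    if days <= 0:
        raise ValueError("days must be positive")
    return f"{days * 24}h"


def steady_state_storage_gb(daily_ingest_gb: float, retention_days: int) -> float:
    """Rough upper bound on stored data once retention is deleting old chunks."""
    return daily_ingest_gb * retention_days


for env, days in {"production": 90, "staging": 15, "development": 7}.items():
    # Prints 2160h, 360h, and 168h, matching overrides.yaml above.
    print(env, days_to_retention_period(days))
```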

Retention by stream label (requires the compactor):

compactor:
  working_directory: /loki/compactor
  shared_store: filesystem
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
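# ... (see EXAMPLES.md for the complete configuration)

When multiple retention_stream rules match a stream, the priority field decides which one applies: the rule with the highest priority value wins.

Compression settings: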

chunk_store_config:
  chunk_cache_config:
    enable_fifocache: true
    fifocache:
      max_size_bytes: 1GB
      ttl: 24h
# ... (see EXAMPLES.md for the complete configuration)

Monitoring retention:

# Check chunk stats
curl http://localhost:3100/loki/api/v1/status/chunks | jq

# Check compactor metrics
curl http://localhost:3100/metrics | grep loki_compactor

# Verify deleted chunks (matches every retention metric the shipper exposes)
curl http://localhost:3100/metrics | grep loki_boltdb_shipper_retention

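Expected result: Old logs are deleted automatically according to the retention policy, storage usage stabilizes, and compaction reduces index size.

On failure:

If retention is not working, enable the compactor in the Loki configuration
Check the compactor logs: docker logs loki 2>&1 | grep compactor
Confirm retention_enabled: true in the compactor block (retention_deletes_enabled: true applies only when the legacy table manager handles retention)
Monitor disk usage inside the container: docker exec loki du -sh /loki/
For S3: confirm that bucket lifecycle policies do not conflict with Loki retention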

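Validation

Common Pitfalls

High-cardinality labels: Using unbounded label values (user IDs, request IDs) makes the index explode. Use fixed labels (level, service, env) and keep variable values in the log line.
Missing log parsing: Shipping raw logs without label extraction limits query capability. Always parse structured logs (JSON, logfmt), or use regex for unstructured logs.
Incorrect timestamp parsing: Mismatched timestamp formats cause logs to arrive out of order or be rejected. Test timestamp parsing with sample logs.
Retention not working: The compactor must be enabled with retention_enabled: true to delete old data. retention_deletes_enabled: true matters only when the legacy table manager handles retention.
Ingestion rate limits: The default limit (ingestion_rate_mb: 4, that is 4 MB/s per tenant) can be too low for high-volume systems. Adjust ingestion_rate_mb and ingestion_burst_size_mb.
Query timeouts: Broad queries over long time ranges can time out. Use more specific label selectors and shorter time windows.
Log duplication: Multiple Promtail instances scraping the same logs create duplicates. Use unique labels or coordinate positions files.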
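
The high-cardinality pitfall can be checked before labels reach Loki by counting distinct values per label in a sample of label sets. This is a minimal sketch: the sample streams are made up, and the threshold of 10 is an arbitrary illustration, not a Loki limit.

```python
from collections import defaultdict


def label_cardinality(streams):
    """Map each label name to the number of distinct values seen across streams."""
    values = defaultdict(set)
    for labels in streams:
        for name, value in labels.items():
            values[name].add(value)
    return {name: len(seen) for name, seen in values.items()}


def high_cardinality_labels(streams, threshold):
    """Return label names with more than `threshold` distinct values, sorted by name."""
    counts = label_cardinality(streams)
    return sorted(name for name, count in counts.items() if count > threshold)


# Made-up label sets: user_id is unbounded, level and service are not.
streams = [
    {"service": "auth", "level": "info" if i % 2 else "error", "user_id": str(i)}
    for i in range(100)
]
print(label_cardinality(streams))                      # {'service': 1, 'level': 2, 'user_id': 100}
print(high_cardinality_labels(streams, threshold=10))  # ['user_id']
```

Any label this flags is a candidate for moving out of the label set and into the log line, where it can still be filtered after parsing.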

GitHub Repository

pjt222/agent-almanac

Path: i18n/ja/skills/configure-log-aggregation

