configure-log-aggregation

Name: configure-log-aggregation
Author: pjt222

Updated 1 month ago

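Category: Design (tags: ai, design)

About

This skill sets up centralized log collection using Loki/Promtail or the ELK stack, handling log parsing, label extraction, and retention policies. It is designed to consolidate logs from multiple services into a searchable system and correlate them with metrics and traces. Use it when replacing local log files with centralized storage or when troubleshooting incidents that require cross-service analysis.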

Documentation

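Configure Log Aggregation

Set up centralized log collection, parsing, and querying with Loki/Promtail or ELK for operational visibility.

When to Use

Consolidating logs from multiple services or hosts into a searchable system
Replacing local log files with centralized, queryable storage
Correlating logs with metrics and traces for full observability
Implementing structured logging with label extraction for analysis
Setting log retention policies based on storage and compliance requirements
Investigating production incidents that require cross-service log analysis

Inputs

Required: Log sources (application logs, system logs, container logs)
Required: Log formats (JSON, plain text, syslog, etc.)
Optional: Label extraction rules for structured queries
Optional: Retention and compaction policies
Optional: Existing log shipper configuration (Fluentd, Filebeat, Promtail)

Procedure

See Extended Examples for complete configurations and templates.

Step 1: Choose a Log Aggregation Stack

Choose between Loki (Prometheus-style) and ELK (Elasticsearch-based).

Loki advantages:

Lightweight, designed for Kubernetes and cloud-native environments
Label-based indexing (like Prometheus) reduces storage overhead
Native Grafana integration
Scales horizontally with object storage (S3, GCS)
Lower resource consumption than Elasticsearch

ELK advantages:

Full-text search across all log content (not just labels)
Rich query DSL and aggregations
Mature ecosystem including Beats and Logstash plugins
Well suited to compliance and audit use cases that need deep historical search

This guide focuses on Loki + Promtail (the better fit for most new deployments).

Decision criteria: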

Use Loki if:
- You want label-based queries similar to Prometheus
- Storage costs are a concern (Loki indexes only labels)
- You already use Grafana for metrics
- Kubernetes/container-native deployment

Use ELK if:
- You need full-text search across all log content
- You have complex log parsing and enrichment requirements
- You require advanced analytics and aggregations
- Legacy systems with existing Logstash pipelines

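Expected: The team makes a clear choice based on its requirements and downloads the appropriate installation artifacts.

On failure:

Benchmark storage requirements: Loki typically needs roughly 10x less storage than Elasticsearch
Evaluate query patterns: full-text search vs. label filtering
Consider operational overhead: ELK requires more tuning and resources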
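
The decision criteria above can be condensed into a small helper. This is a deliberately simplified sketch: the `LoggingRequirements` fields and the `choose_stack` function are hypothetical names introduced for illustration, and a real decision would also weigh cost and team experience.

```python
from dataclasses import dataclass


@dataclass
class LoggingRequirements:
    """Hypothetical checklist; the field names are illustrative only."""
    needs_full_text_search: bool = False
    needs_advanced_analytics: bool = False
    has_logstash_pipelines: bool = False


def choose_stack(req: LoggingRequirements) -> str:
    """Return "ELK" if any ELK-specific criterion applies, otherwise "Loki"."""
    elk_reasons = (
        req.needs_full_text_search,
        req.needs_advanced_analytics,
        req.has_logstash_pipelines,
    )
    return "ELK" if any(elk_reasons) else "Loki"


print(choose_stack(LoggingRequirements(needs_full_text_search=True)))
```

Defaulting to Loki mirrors the guide's own preference for new deployments; any single ELK-specific requirement flips the result.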

Step 2: Deploy Loki

Install and configure Loki with an appropriate storage backend.

Docker Compose deployment (docker-compose.yml):

version: '3.8'

services:
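  loki:
    image: grafana/loki:2.9.0
    container_name: loki  # fixed name so that `docker logs loki` works
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yml:/etc/loki/local-config.yaml
      - loki-data:/loki
    command: -config.file=/etc/loki/local-config.yaml
    restart: unless-stopped

  promtail:
    image: grafana/promtail:2.9.0
    container_name: promtail  # fixed name so that `docker exec promtail` works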
    volumes:
      - ./promtail-config.yml:/etc/promtail/config.yml
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    command: -config.file=/etc/promtail/config.yml
    restart: unless-stopped
    depends_on:
      - loki

volumes:
  loki-data:

Loki configuration (loki-config.yml):

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

# ... (see EXAMPLES.md for complete configuration)

Production with S3 storage:

storage_config:
  aws:
    s3: s3://us-east-1/my-loki-bucket
    s3forcepathstyle: true
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
    shared_store: s3

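Expected: Loki starts successfully, the health check passes at http://localhost:3100/ready, and logs are stored according to the retention policy.

On failure:

Check Loki logs: docker logs loki
Verify that the storage directories exist and are writable
Test config syntax (mounting your own file so that it is the one being checked): docker run --rm -v "$(pwd)/loki-config.yml:/etc/loki/local-config.yaml" grafana/loki:2.9.0 -config.file=/etc/loki/local-config.yaml -verify-config
Ensure retention settings do not exceed disk capacity
For S3: verify IAM permissions and bucket access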
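
The readiness check can be automated in a deploy script. The sketch below polls the `/ready` endpoint named above using only the standard library; the function name, retry count, and delay are assumptions chosen for illustration, and the `opener` and `sleep` parameters exist only so the loop can be exercised without a running Loki.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://localhost:3100/ready", attempts=10, delay=3.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll Loki's readiness endpoint; return True once it answers HTTP 200."""
    for attempt in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet, or the server returned an error status
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until_ready()` right after `docker compose up -d` gives Loki time to finish starting before later steps send it logs.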

Step 3: Configure Promtail to Ship Logs

Set up Promtail to scrape logs and forward them to Loki with labels extracted.

Promtail configuration (promtail-config.yml):

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml
# ... (see EXAMPLES.md for complete configuration)

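Key Promtail concepts:

Scrape configs: define the log sources and how they are discovered
Pipeline stages: transform and label logs before they are sent
Relabel configs: set labels dynamically based on metadata
Positions file: tracks read offsets to avoid reprocessing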
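
To make the positions-file idea concrete, here is a minimal sketch of offset tracking. It is not Promtail's implementation: Promtail keeps its positions in YAML, whereas this illustration uses a JSON file, and `read_new_lines` is a hypothetical helper name.

```python
import json
import os


def read_new_lines(log_path, positions_path):
    """Return lines appended since the last call, persisting the byte offset."""
    positions = {}
    if os.path.exists(positions_path):
        with open(positions_path, encoding="utf-8") as f:
            positions = json.load(f)

    offset = positions.get(log_path, 0)
    if offset > os.path.getsize(log_path):
        offset = 0  # the file was truncated or rotated, so start over

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Consume only complete lines; a partial trailing line is left for next time.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    positions[log_path] = offset + end

    with open(positions_path, "w", encoding="utf-8") as f:
        json.dump(positions, f)
    return lines
```

Because the offset survives restarts, a second call on an unchanged file returns nothing, which is the behaviour that prevents duplicate shipping.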

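Expected: Promtail scrapes the configured log files, labels are applied correctly, and logs are visible in Loki via LogQL.

On failure:

Check Promtail logs: docker logs promtail
Verify that file paths are reachable: docker exec promtail ls /var/log
Test regular expressions in isolation against sample logs
Check Promtail metrics: curl http://localhost:9080/metrics | grep promtail (port 9080 must be published in docker-compose.yml for this to work from the host)
Check positions file progress: docker exec promtail cat /tmp/positions.yaml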
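
Testing a regular expression against sample logs can be done outside Promtail. The pattern and sample line below are hypothetical; substitute the expression from your own pipeline stage. Promtail uses Go's RE2 syntax, and this pattern relies only on constructs (including `(?P<name>...)` groups) that behave the same way in Python.

```python
import re

# Hypothetical pattern for lines shaped like "<timestamp> <LEVEL> <message>".
PATTERN = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def extract_fields(line):
    """Return the named capture groups for a log line, or None if it does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


sample = "2024-01-15T10:30:00Z ERROR authentication failed for user_id=12345"
print(extract_fields(sample))
```

A `None` result for a line you expected to match usually points to a spacing or escaping difference between the pattern and the real log format.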

Step 4: Query Logs with LogQL

Learn LogQL syntax to filter and aggregate logs.

Basic queries:

# All logs from a job
{job="app"}

# Logs with specific label values
{job="app", level="error"}

# Regex filter on log line content
{job="app"} |~ "authentication failed"

# Case-insensitive regex
{job="app"} |~ "(?i)error"

# Line filter (doesn't parse, just includes/excludes)
{job="app"} |= "user"  # Contains "user"
{job="app"} != "debug" # Doesn't contain "debug"

Parsing and filtering:

# JSON parsing
{job="app"} | json | level="error"

# Regex parsing with named groups
{job="app"} | regexp "user_id=(?P<user_id>\\d+)" | user_id="12345"

# Logfmt parsing (key=value format)
{job="app"} | logfmt | level="error", service="auth"

# Pattern parsing
{job="nginx"} | pattern `<ip> - <user> [<timestamp>] "<method> <path> <protocol>" <status> <size>` | status >= 500

Aggregations (metrics from logs):

# Count log lines per level
sum by (level) (count_over_time({job="app"}[5m]))

# Rate of error logs
rate({job="app", level="error"}[5m])

# Bytes processed per service
sum by (service) (bytes_over_time({job="app"}[1h]))

# Average request duration from logs
avg_over_time({job="app"} | json | unwrap duration [5m])

# Top 10 error messages (parse JSON first so that message is available as a label)
topk(10, sum by (message) (count_over_time({level="error"} | json [1h])))
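
As an aid to reading the aggregations above, the sketch below mimics what `count_over_time` and `rate` compute for a single stream: the number of entries in a time window, and that count divided by the window length in seconds. It is an illustration, not Loki's implementation, and the exact treatment of window boundaries here is an assumption.

```python
def count_over_time(timestamps, now, window_seconds):
    """Count entries whose timestamp falls in the window (now - window, now]."""
    start = now - window_seconds
    return sum(1 for ts in timestamps if start < ts <= now)


def rate(timestamps, now, window_seconds):
    """Per-second rate: entries in the window divided by the window length."""
    return count_over_time(timestamps, now, window_seconds) / window_seconds


# Hypothetical entry timestamps in seconds, evaluated with a 5-minute window.
entries = [0, 100, 200, 290, 300]
print(count_over_time(entries, now=300, window_seconds=300))
print(rate(entries, now=300, window_seconds=300))
```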

Filtering on extracted fields:

# Find specific trace in logs
{job="app"} | json | trace_id="abc123def456"

# HTTP 5xx errors from nginx
{job="nginx"} | pattern `<_> "<_> <_> <_>" <status> <_>` | status >= 500

# Failed authentication attempts (label-filter regexes are fully anchored, hence the .* on both sides)
{job="app"} | json | message=~".*authentication failed.*" | user_id != ""

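Use these patterns to build Grafana Explore queries or dashboards.

Expected: Queries return the expected logs, filters work correctly, and aggregations produce metrics from logs.

On failure:

Use Grafana Explore to debug queries interactively
Check label names: curl http://localhost:3100/loki/api/v1/labels
Verify label values: curl http://localhost:3100/loki/api/v1/label/{label_name}/values
Simplify the query: start with a basic label selector, then add filters incrementally
Check the time range: the logs may not fall within the selected window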
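
Queries can also be run outside Grafana through Loki's HTTP API, which helps when isolating time-range problems. This sketch builds a `/loki/api/v1/query_range` request with explicit nanosecond timestamps; the helper names and default values are assumptions, and the base URL matches the Docker Compose setup in this guide.

```python
import json
import time
import urllib.parse
import urllib.request


def build_query_url(logql, minutes=15, limit=100,
                    base_url="http://localhost:3100", now_ns=None):
    """Build a query_range URL covering the last `minutes` minutes."""
    end_ns = time.time_ns() if now_ns is None else now_ns
    start_ns = end_ns - minutes * 60 * 10**9
    params = urllib.parse.urlencode(
        {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{base_url}/loki/api/v1/query_range?{params}"


def run_query(logql, **kwargs):
    """Execute the query against Loki and return the decoded JSON response."""
    with urllib.request.urlopen(build_query_url(logql, **kwargs), timeout=10) as resp:
        return json.load(resp)


print(build_query_url('{job="app"} |= "error"', minutes=5))
```

Printing the URL before calling `run_query` makes it easy to confirm that the start and end values cover the period in which the logs were written.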

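Step 5: Integrate Logs with Metrics and Traces

Correlate logs with Prometheus metrics and distributed traces for unified observability.

Add trace IDs to logs (application instrumentation):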

# Python with OpenTelemetry
import logging
from opentelemetry import trace

logger = logging.getLogger(__name__)

def handle_request():
    span = trace.get_current_span()
    trace_id = span.get_span_context().trace_id

    logger.info(
        "Processing request",
        extra={"trace_id": format(trace_id, "032x")}
    )

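// Go with OpenTelemetry
import (
    "context"

    "go.opentelemetry.io/otel/trace"
    "go.uber.org/zap"
)

// logger is assumed to be a *zap.Logger initialized elsewhere in the package.
func handleRequest(ctx context.Context) {
    // Read the active span from the context and render its trace ID as hex.
    span := trace.SpanFromContext(ctx)
    traceID := span.SpanContext().TraceID().String()

    logger.Info("Processing request",
        zap.String("trace_id", traceID),
    )
}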
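
With Python's standard `logging` module, values passed through `extra` only reach the output if the formatter writes them. The sketch below is one possible JSON formatter that emits `trace_id` so that a query such as `{job="app"} | json | trace_id="..."` can find it; the class name and the chosen field names are assumptions, not part of OpenTelemetry or Loki.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        }
        # trace_id exists on the record only when passed via extra={"trace_id": ...}.
        trace_id = getattr(record, "trace_id", None)
        if trace_id is not None:
            payload["trace_id"] = trace_id
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Processing request", extra={"trace_id": "abc123def456"})
```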

Configure Grafana data links from metrics to logs:

In the Prometheus panel's field configuration:

{
  "fieldConfig": {
    "defaults": {
      "links": [
        {
          "title": "View Logs",
          "url": "/explore?left={\"datasource\":\"Loki\",\"queries\":[{\"refId\":\"A\",\"expr\":\"{job=\\\"app\\\",instance=\\\"${__field.labels.instance}\\\"} |= `${__field.labels.trace_id}`\"}],\"range\":{\"from\":\"${__from}\",\"to\":\"${__to}\"}}",
          "targetBlank": false
        }
      ]
    }
  }
}

Configure Grafana data links from logs to traces:

In the Loki datasource configuration:

datasources:
  - name: Loki
    type: loki
    url: http://loki:3100
    jsonData:
      derivedFields:
        - datasourceUid: tempo  # UID of the Tempo data source; "tempo" is an assumed value
          matcherRegex: "trace_id=(\\w+)"
          name: TraceID
          url: "$${__value.raw}"

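Correlate logs in Grafana Explore:

Query a metric in Prometheus
Click a data point
Choose "View Logs" from the context menu
The Loki query is filled in automatically with the relevant labels and time range
Click a trace ID in the logs
The Tempo trace view opens and shows the full distributed trace

Expected: Clicking a metric opens the related logs, trace IDs in logs link to the trace view, and metrics, logs, and traces can be inspected in a single pane.

On failure:

Verify that the trace ID format matches the derivedFields regex
Check that trace_id is extracted by the Promtail pipeline
Ensure the Tempo datasource is configured in Grafana
Test the URL encoding of complex filter expressions
Verify data link URLs in an incognito/private browser window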
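
A quick way to verify the derivedFields regex is to run it against real log lines. The sketch below uses the `trace_id=(\w+)` pattern from the datasource configuration and two hypothetical sample lines; it shows that the pattern matches logfmt-style output but not JSON output, where the key is quoted and followed by a colon. The alternative JSON pattern is an assumption offered for illustration.

```python
import re

MATCHER = re.compile(r"trace_id=(\w+)")  # same pattern as matcherRegex

# Hypothetical sample lines in two common formats.
logfmt_line = 'level=info msg="Processing request" trace_id=abc123def456'
json_line = '{"level": "info", "message": "Processing request", "trace_id": "abc123def456"}'


def find_trace_id(line, matcher=MATCHER):
    """Return the captured trace ID, or None when the pattern does not match."""
    match = matcher.search(line)
    return match.group(1) if match else None


print(find_trace_id(logfmt_line))  # abc123def456
print(find_trace_id(json_line))    # None

# A possible pattern for JSON-formatted lines (an assumption, adjust to your format).
JSON_MATCHER = re.compile(r'"trace_id":\s*"(\w+)"')
print(find_trace_id(json_line, JSON_MATCHER))  # abc123def456
```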

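Step 6: Set Up Log Retention and Compaction

Configure retention policies and compaction to manage storage costs.

Global and per-tenant retention (in the Loki config):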

limits_config:
  retention_period: 720h  # Global default: 30 days

  # Per-tenant retention (requires multi-tenancy enabled)
  per_tenant_override_config: /etc/loki/overrides.yaml

# overrides.yaml
overrides:
  production:
    retention_period: 2160h  # 90 days for production
  staging:
    retention_period: 360h   # 15 days for staging
  development:
    retention_period: 168h   # 7 days for dev

Retention by stream labels (requires the compactor):

compactor:
  working_directory: /loki/compactor
  shared_store: filesystem
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
# ... (see EXAMPLES.md for complete configuration)

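Priority determines which rule applies when multiple rules match a stream; in Loki, the matching rule with the highest priority value wins.

Compaction settings: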

chunk_store_config:
  chunk_cache_config:
    enable_fifocache: true
    fifocache:
      max_size_bytes: 1GB
      ttl: 24h
# ... (see EXAMPLES.md for complete configuration)

Monitor retention:

# Check chunk stats
curl http://localhost:3100/loki/api/v1/status/chunks | jq

# Check compactor metrics
curl http://localhost:3100/metrics | grep loki_compactor

# Verify deleted chunks
curl http://localhost:3100/metrics | grep loki_boltdb_shipper_retention_deleted

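Expected: Old logs are deleted automatically according to the retention policy, storage usage stabilizes, and compaction reduces index size.

On failure:

Retention not working: enable the compactor in the Loki config
Check compactor logs: docker logs loki 2>&1 | grep compactor
Verify retention_enabled: true and retention_deletes_enabled: true
Check disk usage inside the container: docker exec loki du -sh /loki/
For S3: check that bucket lifecycle policies do not conflict with Loki retention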
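
Before settling on a retention period, it helps to estimate how much storage it implies. The sketch below is simple arithmetic under stated assumptions: the daily ingest volume and compression ratio are hypothetical inputs you would measure for your own workload, and index overhead is ignored.

```python
def retention_days(retention_hours):
    """Convert a retention period in hours (as written in the Loki config) to days."""
    return retention_hours / 24


def estimated_storage_gb(daily_ingest_gb, retention_hours, compression_ratio=1.0):
    """Rough footprint: daily volume times retention days, divided by compression."""
    return daily_ingest_gb * retention_days(retention_hours) / compression_ratio


# 720h is the global default used in this guide; 10 GB/day is a made-up figure.
print(retention_days(720))
print(estimated_storage_gb(daily_ingest_gb=10, retention_hours=720))
```

Comparing the estimate with the available disk or bucket budget shows early whether a 90-day production retention is realistic.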

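Validation

Common Pitfalls

High-cardinality labels: unbounded label values (user IDs, request IDs) cause the index to explode. Use a fixed set of labels (level, service, env) and keep variable values in the log line.
No log parsing: shipping raw logs without label extraction limits what can be queried. Always parse structured logs (JSON, logfmt), or use regular expressions for unstructured ones.
Timestamp parsing errors: mismatched timestamp formats cause out-of-order or rejected logs. Test timestamp parsing against sample logs.
Retention not working: the compactor must be enabled for old logs to be deleted. Check retention_enabled: true and retention_deletes_enabled: true.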
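Ingestion rate limits: the default limit (ingestion_rate_mb defaults to 4 MB/s per tenant in Loki 2.9) may be too low for high-volume workloads. Tune ingestion_rate_mb and ingestion_burst_size_mb.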
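Query timeouts: broad queries over long periods may time out. Use specific label selectors and shorter time windows.
Duplicate logs: multiple Promtail instances scraping the same logs produce duplicates. Use unique labels or coordinate through the positions file.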
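
The high-cardinality pitfall can be checked with a few lines of code. The sketch below takes a mapping from label name to its observed values, for example collected from the `/loki/api/v1/label/{label_name}/values` endpoint mentioned earlier, and flags labels with too many distinct values. The function name and the threshold of 100 are assumptions; pick a limit that suits your deployment.

```python
def high_cardinality_labels(label_values, threshold=100):
    """Return {label: distinct_value_count} for labels exceeding the threshold."""
    counts = {label: len(set(values)) for label, values in label_values.items()}
    return {label: n for label, n in counts.items() if n > threshold}


# Hypothetical data: a bounded label and an unbounded one.
observed = {
    "level": ["info", "warn", "error"],
    "user_id": [str(i) for i in range(500)],
}
print(high_cardinality_labels(observed))
```

Any label reported here is a candidate for moving out of the label set and into the log line itself.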

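Related Skills

correlate-observability-signals - Unified debugging across metrics, logs, and traces via trace IDs
build-grafana-dashboards - Visualize log-derived metrics and build log panels for dashboards
setup-prometheus-monitoring - Metrics provide context for investigating logs during incidents
instrument-distributed-tracing - Add trace IDs to logs to correlate them with distributed traces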

GitHub Repository

pjt222/agent-almanac

Path: i18n/wenyan-ultra/skills/configure-log-aggregation

Topics: agents, agentskills, ai-assisted-development, claude-code, skills, teams

FAQ

What is the configure-log-aggregation skill?

configure-log-aggregation is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can perform configure-log-aggregation-related tasks without extra prompting.

How do I install configure-log-aggregation?

Use the install commands on this page: add configure-log-aggregation to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does configure-log-aggregation belong to?

configure-log-aggregation is in the Design category, tagged ai and design.

Is configure-log-aggregation free to use?

Yes. configure-log-aggregation is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
