
setup-service-mesh

pjt222
Updated 2 days ago

About

This skill automates the deployment and configuration of a service mesh (Istio or Linkerd) in a Kubernetes environment. It enables secure service-to-service communication with mTLS, advanced traffic management, and observability without requiring application code changes. Use it when your microservices need encrypted communication, fine-grained traffic control like canary releases, or consistent circuit-breaking and retry policies.

Quick install

Claude Code

Recommended
Primary method
npx skills add pjt222/agent-almanac -a claude-code
Plugin command (alternative)
/plugin add https://github.com/pjt222/agent-almanac
Git clone (alternative)
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/setup-service-mesh

Copy and paste this command into Claude Code to install the skill

Skill documentation

Setup Service Mesh

Deploy+configure mesh → secure svc-to-svc + advanced traffic mgmt.

Use When

  • Microservices arch needs encrypted svc-to-svc
  • Fine traffic ctrl (canary, A/B, splitting)
  • Observability across all svc interactions w/o app changes
  • Enforce security policies (mTLS, authz) at infra level
  • Impl circuit break, retries, timeouts consistent
  • Distributed tracing + svc dependency mapping

In

  • Required: K8s cluster w/ admin
  • Required: Mesh choice (Istio|Linkerd)
  • Required: Namespace(s) to enable
  • Optional: Monitoring stack (Prometheus, Grafana, Jaeger)
  • Optional: Custom traffic mgmt reqs
  • Optional: CA config for mTLS

Do

See Extended Examples for complete config + templates.

Step 1: Install Control Plane

Istio:

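curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.20.2 sh -
# Put the downloaded istioctl on PATH
export PATH="$PWD/istio-1.20.2/bin:$PATH"
# Istio ships no "production" profile; "default" is the recommended base for prod
istioctl install --set profile=default -y
kubectl get pods -n istio-system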

Linkerd:

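curl -sL https://run.linkerd.io/install | sh
# Put the linkerd CLI on PATH
export PATH="$HOME/.linkerd2/bin:$PATH"
linkerd check --pre
# Linkerd 2.12+: CRDs are installed separately, before the control plane
linkerd install --crds | kubectl apply -f -
linkerd install --ha | kubectl apply -f -
linkerd check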

Mesh config w/ resource limits + tracing:

# service-mesh-config.yaml (abbreviated)
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: default
  meshConfig:
    enableTracing: true
  components:
    pilot:
      k8s:
        resources: { requests: { cpu: 500m, memory: 2Gi } }
# See EXAMPLES.md Step 1 for complete configuration

→ Control plane pods running in istio-system|linkerd ns. istioctl version|linkerd version shows matching client+server.

If err:

  • Cluster has resources (≥4 CPU, 8GB RAM prod)
  • K8s ver compat (check mesh docs)
  • Logs: kubectl logs -n istio-system -l app=istiod (Istio) or kubectl logs -n linkerd -l linkerd.io/control-plane-component=destination (Linkerd)
  • Conflicting CRDs: kubectl get crd | grep -E 'istio|linkerd'
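
The "control plane pods running" check can be scripted. A minimal sketch, assuming the default `kubectl get pods --no-headers` column order (NAME READY STATUS ...); `not_running` is a hypothetical helper name, not part of kubectl, istioctl, or linkerd:

```shell
#!/bin/sh
# Hypothetical helper: print names of pods whose STATUS is neither
# Running nor Completed. Reads `kubectl get pods --no-headers` on stdin.
not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage against a live cluster (not executed here):
#   kubectl get pods -n istio-system --no-headers | not_running
#   kubectl get pods -n linkerd --no-headers | not_running
```

Empty output means every control plane pod reports Running or Completed. Any name printed is a pod worth describing first.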

Step 2: Auto Sidecar Injection

Istio:

# Label namespace for automatic injection
kubectl label namespace default istio-injection=enabled
kubectl get namespace -L istio-injection

Linkerd:

# Annotate namespace for injection
kubectl annotate namespace default linkerd.io/inject=enabled

Test:

# test-deployment.yaml (abbreviated)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
spec:
  replicas: 2
  selector:
    matchLabels: { app: test-app }
  template:
    metadata:
      labels: { app: test-app }
    spec:
      containers:
      - name: app
        image: nginx:alpine
# See EXAMPLES.md Step 2 for complete test deployment

kubectl apply -f test-deployment.yaml
kubectl get pods -n default
# Expect 2/2 containers (app + proxy)

→ New pods 2/2 (app + sidecar). Describe shows istio-proxy|linkerd-proxy. Logs show successful proxy startup.

If err:

  • Labels|annotations: kubectl get ns default -o yaml
  • Webhook active: kubectl get mutatingwebhookconfiguration
  • Inject logs: kubectl logs -n istio-system -l app=istiod (Istio; the injector has run inside istiod since 1.5)
  • Manual inject test: kubectl get deploy test-app -o yaml | istioctl kube-inject -f - | kubectl apply -f -
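
The 2/2 check can be automated across a namespace. A minimal sketch, assuming single-container app pods (so a meshed pod reports 2 containers in the READY column); `missing_sidecar` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print pods whose READY column (e.g. "1/1") lists
# fewer than 2 containers, i.e. likely no sidecar was injected.
# Reads `kubectl get pods --no-headers` on stdin.
missing_sidecar() {
  awk '{ split($2, ready, "/"); if (ready[2] + 0 < 2) print $1 }'
}

# Usage against a live cluster (not executed here):
#   kubectl get pods -n default --no-headers | missing_sidecar
```

Pods listed here were most likely created before the namespace was labeled or annotated. A rollout restart recreates them with the proxy.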

Step 3: mTLS Policy

Istio:

# mtls-policy.yaml (abbreviated)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
# See EXAMPLES.md Step 3 for per-namespace and permissive mode examples

Linkerd:

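# Linkerd enforces mTLS by default for meshed pods
linkerd viz tap deploy/test-app -n default
# Check for tls=true on each request line (needs the viz extension from Step 5)
linkerd viz edges deployment -n default
# SECURED column should show √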

Apply + verify:

kubectl apply -f mtls-policy.yaml
# Istio: verify mTLS status (istioctl authn tls-check was removed in Istio 1.5)
istioctl x describe pod $(kubectl get pod -n default -l app=test-app -o jsonpath='{.items[0].metadata.name}') -n default

→ All meshed conns mTLS enabled. Istio describe reports workload mTLS mode STRICT. Linkerd tap shows tls=true all conns. No TLS errs in logs.

If err:

  • Cert issuance: kubectl get certificates -A (cert-manager)
  • CA healthy: kubectl logs -n istio-system -l app=istiod | grep -i cert
  • PERMISSIVE first → STRICT
  • Svcs w/o sidecars: kubectl get pods --all-namespaces -o json | jq '.items[] | select(.spec.containers | length == 1) | .metadata.name'
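
The PERMISSIVE first → STRICT migration can reuse one manifest with only the mode changed. A minimal sketch: `peer_auth_manifest` is a hypothetical helper that prints a namespace-scoped PeerAuthentication, and the namespace `default` in the usage lines is an assumption carried over from the earlier steps:

```shell
#!/bin/sh
# Hypothetical helper: print a namespace-scoped PeerAuthentication.
# Usage: peer_auth_manifest NAMESPACE PERMISSIVE|STRICT
peer_auth_manifest() {
  ns=$1
  mode=$2
  case "$mode" in
    PERMISSIVE|STRICT) ;;
    *) echo "mode must be PERMISSIVE or STRICT" >&2; return 1 ;;
  esac
  cat <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: $ns
spec:
  mtls:
    mode: $mode
EOF
}

# Migration against a live cluster (not executed here):
#   peer_auth_manifest default PERMISSIVE | kubectl apply -f -
#   ...verify every workload has a sidecar, then:
#   peer_auth_manifest default STRICT | kubectl apply -f -
```

Because both manifests share the name `default`, the second apply updates the first policy in place instead of creating a competing one.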

Step 4: Traffic Mgmt Rules

# traffic-management.yaml (abbreviated)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
spec:
  http:
  - match:
    - uri: { prefix: /api/v2 }
    route:
    - destination: { host: api-service, subset: v2 }
      weight: 10
    - destination: { host: api-service, subset: v1 }
      weight: 90
    retries: { attempts: 3, perTryTimeout: 2s }
# See EXAMPLES.md Step 4 for complete routing, circuit breaker, and gateway configs

Linkerd traffic split:

# Linkerd 2.12+: TrafficSplit requires the linkerd-smi extension
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
spec:
  service: api-service
  backends:
  - service: api-service-v1
    weight: 900
  - service: api-service-v2
    weight: 100

Apply + test:

kubectl apply -f traffic-management.yaml
# Test traffic distribution
for i in {1..100}; do curl -s http://api.example.com/api/v2 | grep version; done | sort | uniq -c
# Monitor: istioctl dashboard kiali or linkerd viz dashboard

→ Splits per weights. Circuit breaker trips after consecutive errs. Retries on transient. Kiali|Linkerd dashboard shows flow viz.
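
Weights are relative, so the Istio 10/90 and the TrafficSplit 100/900 describe the same 10% canary. A minimal sketch of the arithmetic behind the 100-request curl test. `expected_hits` is a hypothetical helper, and real counts will scatter around these values because routing is probabilistic:

```shell
#!/bin/sh
# Hypothetical helper: requests expected for one backend.
# Usage: expected_hits TOTAL_REQUESTS BACKEND_WEIGHT SUM_OF_WEIGHTS
expected_hits() {
  echo $(( $1 * $2 / $3 ))
}

expected_hits 100 10 100     # Istio subset v2: weight 10 of 100
expected_hits 100 100 1000   # TrafficSplit api-service-v2: weight 100 of 1000
```

Both calls print 10. A `uniq -c` tally far from that over several runs points at a subset or label mismatch, not at chance.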

If err:

  • Dest hosts resolve: kubectl get svc -n production
  • Subset labels match pod: kubectl get pods -n production --show-labels
  • Pilot logs: kubectl logs -n istio-system -l app=istiod
  • Test w/o circuit breaker first → add incrementally
  • istioctl analyze -n production

Step 5: Observability Integration

Install addons:

# Istio: Prometheus, Grafana, Kiali, Jaeger
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/prometheus.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/grafana.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/kiali.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/jaeger.yaml

# Linkerd
linkerd viz install | kubectl apply -f -
linkerd jaeger install | kubectl apply -f -

Custom metrics + dashboards:

# service-monitor.yaml (abbreviated)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: istio-mesh-metrics
spec:
  selector: { matchLabels: { app: istiod } }
  endpoints:
  - port: http-monitoring
    interval: 30s
# See EXAMPLES.md Step 5 for Grafana dashboards and telemetry config

Access:

istioctl dashboard grafana  # or: linkerd viz dashboard
istioctl dashboard kiali
istioctl dashboard jaeger

→ Dashboards show topology, request rates, latency percentiles, err rates. Distributed traces in Jaeger. Prometheus scraping mesh metrics. Custom metrics in queries.

If err:

  • Prometheus scraping: kubectl get servicemonitor -A
  • Addon pods running: kubectl get pods -n istio-system
  • Telemetry config: istioctl proxy-config log <pod-name> -n <namespace>
  • Mesh config has tracing: kubectl get configmap istio -n istio-system -o yaml | grep -A 5 enableTracing
  • Port conflicts if port-forward fails

Step 6: Validate + Monitor Mesh Health

# Istio validation
istioctl analyze --all-namespaces
istioctl verify-install
istioctl proxy-status

# Linkerd validation
linkerd check
linkerd viz check
linkerd diagnostics policy -n <namespace> svc/<service> <port>

# Check sidecar container readiness (config sync status: istioctl proxy-status above)
kubectl get pods -n production -o json | \
  jq '.items[] | {name: .metadata.name, proxy: .status.containerStatuses[] | select(.name=="istio-proxy").ready}'

# Monitor control plane health
kubectl get pods -n istio-system -w
kubectl top pods -n istio-system

Health check + alerts:

#!/bin/bash
# mesh-health-check.sh (abbreviated)
echo "=== Service Mesh Health Check ==="
kubectl get pods -n istio-system
istioctl analyze --all-namespaces
# See EXAMPLES.md Step 6 for complete health check script and alert configs

→ All checks pass no warns. Proxy-status all synced. mTLS check confirms encryption. Metrics show traffic. Control plane stable, low resource use.

If err:

  • Address istioctl analyze output
  • Proxy logs per pod: kubectl logs <pod> -c istio-proxy -n <namespace>
  • Net policies not blocking mesh
  • Control plane logs: kubectl logs -n istio-system deploy/istiod --tail=100
  • Restart problematic: kubectl rollout restart deploy/<deployment> -n <namespace>
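
Out-of-sync proxies can be pulled out of `istioctl proxy-status` for alerting. A minimal sketch, assuming the output keeps a header row and marks out-of-sync config with the word STALE (column layout differs between Istio versions); `stale_proxies` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the NAME column of every proxy-status row
# that contains STALE. Reads `istioctl proxy-status` output on stdin.
stale_proxies() {
  awk 'NR > 1 && /STALE/ { print $1 }'
}

# Usage against a live cluster (not executed here):
#   istioctl proxy-status | stale_proxies
```

Empty output means no proxy reports stale config. Names printed are candidates for the proxy logs and rollout restart above.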

Check

  • Control plane pods running healthy (istiod|linkerd-destination)
  • Sidecars injected all app pods (2/2)
  • mTLS enabled+functioning (istioctl x describe|tap verified)
  • Traffic rules route correctly (curl tests)
  • Circuit breaker trips on repeated fails (fault inject)
  • Observability dashboards show metrics (Grafana|Kiali|Linkerd Viz)
  • Distributed traces in Jaeger
  • No warnings from istioctl analyze|linkerd check
  • Proxy sync status all in sync
  • Svc-to-svc encrypted (logs|dashboards verified)

Traps

  • Resource exhaustion: Mesh adds 100-200MB/pod for sidecars. Cluster needs capacity. Set limits in inject config.
  • Config conflicts: Multi VirtualServices same host = undefined behavior. Single VS per host w/ multi match conditions.
  • Cert expiration: mTLS auto-rotate but CA root managed. Monitor expiry: kubectl get certificate -A + alerts.
  • Sidecar not injected: Pods pre-label won't have sidecars. Recreate: kubectl rollout restart deploy/<name> -n <namespace>.
  • DNS issues: Mesh intercepts DNS. Use FQ names (service.namespace.svc.cluster.local) cross-ns.
  • Port naming req: Istio needs named ports protocol-name pattern (http-web, tcp-db). Unnamed → TCP passthrough.
  • Gradual rollout req: Don't enable STRICT mTLS immediate prod. PERMISSIVE during migration → verify all meshed → STRICT.
  • Observability overhead: 100% tracing sampling = perf issues. Use 1-10% prod: sampling: 1.0 in mesh config (value is a percentage, so 1.0 = 1%).
  • Gateway vs VS confusion: Gateway = ingress (LB), VS = routing. Both needed for external.
  • Ver compat: Mesh ver compat w/ K8s. Istio supports n-1 minor; Linkerd typically last 3 K8s vers.
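
The resource-exhaustion trap can be sized before enabling injection. A minimal sketch using the 100-200MB per-pod range quoted above. `sidecar_overhead_mb` is a hypothetical helper and the pod count of 50 is an illustrative assumption:

```shell
#!/bin/sh
# Hypothetical helper: total extra memory (MB) for sidecars.
# Usage: sidecar_overhead_mb POD_COUNT MB_PER_SIDECAR
sidecar_overhead_mb() {
  echo $(( $1 * $2 ))
}

sidecar_overhead_mb 50 100   # low end of the quoted range
sidecar_overhead_mb 50 200   # high end of the quoted range
```

For 50 pods this prints 5000 and 10000, i.e. roughly 5-10GB of extra cluster memory to budget.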

Related

  • configure-ingress-networking: Gateway complements mesh ingress
  • deploy-to-kubernetes: app deploy patterns w/ mesh
  • setup-prometheus-monitoring: Prometheus integ for mesh metrics
  • manage-kubernetes-secrets: cert mgmt for mTLS
  • enforce-policy-as-code: OPA policies alongside mesh authz

GitHub repository

pjt222/agent-almanac
Path: i18n/caveman-ultra/skills/setup-service-mesh