MCP HubMCP Hub
スキル一覧に戻る

moai-domain-devops

modu-ai
更新日 Today
57 閲覧
424
78
424
GitHubで表示
ドキュメントai

について

このClaudeスキルは、Kubernetes 1.31、Docker 27.x、Terraform 1.9を使用したクラウドネイティブインフラストラクチャによるエンタープライズDevOps機能を提供します。GitHub Actionsによる自動CI/CDパイプライン、ArgoCDを介したGitOpsデプロイ、PrometheusとGrafanaによる包括的な監視を実現します。本番環境レベルのコンテナオーケストレーション、インフラストラクチャとしてのコード、および監視スタックの実装にこのスキルをご利用ください。

クイックインストール

Claude Code

推奨
プラグインコマンド推奨
/plugin add https://github.com/modu-ai/moai-adk
Git クローン代替
git clone https://github.com/modu-ai/moai-adk.git ~/.claude/skills/moai-domain-devops

このコマンドをClaude Codeにコピー&ペーストしてスキルをインストールします

ドキュメント

Enterprise DevOps Architect - Production-Grade v4.0

Technology Stack (2025 Stable)

  • Kubernetes 1.31.x (container orchestration)
  • Docker 27.x (container runtime)
  • GitHub Actions (CI/CD automation)
  • Terraform 1.9.x (infrastructure as code)
  • Prometheus 2.55.x (monitoring & observability)
  • Grafana 11.x (visualization dashboards)
  • ArgoCD 2.13.x (GitOps deployments)

Level 1: Quick Reference

Core DevOps Patterns

Multi-Stage Docker Build:

# syntax=docker/dockerfile:1
ARG NODE_VERSION=20
ARG ALPINE_VERSION=3.21

# Stage 1: Dependencies
FROM node:${NODE_VERSION}-alpine${ALPINE_VERSION} AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force

# Stage 2: Build
FROM node:${NODE_VERSION}-alpine${ALPINE_VERSION} AS build
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm run test

# Stage 3: Production
FROM node:${NODE_VERSION}-alpine${ALPINE_VERSION}@sha256:1e7902618558e51428d31e6c06c2531e3170417018a45148a1f3d7305302b211
WORKDIR /app

# Security: Non-root user
RUN addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001
COPY --from=deps --chown=nodejs:nodejs /app/node_modules ./node_modules
COPY --from=build --chown=nodejs:nodejs /app/dist ./dist
COPY --chown=nodejs:nodejs package*.json ./

USER nodejs
EXPOSE 3000
ENV NODE_ENV=production
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s CMD node healthcheck.js
CMD ["node", "dist/index.js"]

Kubernetes Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
      containers:
      - name: web-app
        image: myapp:v1.0.0@sha256:abc123...
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 30
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10

GitHub Actions CI/CD:

name: Production CI/CD

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [18, 20, 22]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - run: npm ci
      - run: npm test -- --coverage

  build:
    name: Build and Push
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

Terraform Infrastructure:

terraform {
  required_version = ">= 1.9.0"
  required_providers {
    aws = { source = "hashicorp/aws", version = "~> 5.0" }
  }
  
  backend "s3" {
    bucket = "terraform-state-prod"
    key = "production/terraform.tfstate"
    region = "us-west-2"
    encrypt = true
    dynamodb_table = "terraform-locks"
  }
}

provider "aws" {
  region = var.aws_region
  default_tags {
    tags = {
      Environment = var.environment
      ManagedBy = "Terraform"
      Project = var.project_name
    }
  }
}

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"
  name = "${var.project_name}-vpc"
  cidr = "10.0.0.0/16"
  azs = ["us-west-2a", "us-west-2b", "us-west-2c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
  enable_nat_gateway = true
  enable_dns_hostnames = true
  enable_dns_support = true
}

Level 2: Core Implementation

Advanced Patterns

Horizontal Pod Autoscaler:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Prometheus Alert Rules:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: web-app-alerts
  namespace: production
spec:
  groups:
  - name: web-app.rules
    interval: 30s
    rules:
    - alert: HighErrorRate
      expr: |
        sum(rate(http_requests_total{job="web-app",status=~"5.."}[5m])) 
        / sum(rate(http_requests_total{job="web-app"}[5m])) > 0.05
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "High error rate detected"
    
    - alert: HighLatency
      expr: |
        histogram_quantile(0.95, 
          sum(rate(http_request_duration_seconds_bucket{job="web-app"}[5m])) by (le)
        ) > 1
      for: 10m
      labels:
        severity: warning

ArgoCD GitOps:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app
  namespace: argocd
spec:
  project: production
  source:
    repoURL: https://github.com/org/web-app-k8s
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
    - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

Level 3: Advanced Integration

Enterprise Production Patterns

External Secrets:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
    template:
      engineVersion: v2
      data:
        url: "postgresql://{{ .username }}:{{ .password }}@{{ .host }}:{{ .port }}/{{ .database }}"
  dataFrom:
  - extract:
      key: production/database

Helm Chart:

# Chart.yaml
apiVersion: v2
name: web-app
description: Production-grade web application
version: 1.0.0
dependencies:
  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami

# values.yaml
replicaCount: 3
image:
  repository: myapp
  tag: "v1.0.0"
service:
  type: ClusterIP
  port: 80
  targetPort: 8080
resources:
  requests: { cpu: 100m, memory: 128Mi }
  limits: { cpu: 500m, memory: 512Mi }
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

Blue-Green Deployment:

#!/bin/bash
set -e
NAMESPACE="production"
SERVICE="web-app"
NEW_VERSION="green"

echo "Validating $NEW_VERSION deployment..."
kubectl rollout status deployment/web-app-$NEW_VERSION -n $NAMESPACE

echo "Running smoke tests..."
kubectl run smoke-test --rm -i --restart=Never \
  --image=curlimages/curl -- \
  http://web-app-$NEW_VERSION.$NAMESPACE.svc.cluster.local/health

echo "Switching traffic to $NEW_VERSION..."
kubectl patch service $SERVICE -n $NAMESPACE \
  -p "{\"spec\":{\"selector\":{\"version\":\"$NEW_VERSION\"}}}"

echo "Monitor for 10 minutes..."
sleep 600

echo "Scale down old version..."
kubectl scale deployment/web-app-$([ "$NEW_VERSION" = "blue" ] && echo "green" || echo "blue") \
  -n $NAMESPACE --replicas=0

Level 4: Reference & Integration

Best Practices Summary

Container Security:

  • ✅ Use minimal base images (Alpine, distroless)
  • ✅ Pin image digests for reproducibility
  • ✅ Run as non-root user
  • ✅ Scan images with Trivy/Snyk
  • ✅ Multi-stage builds reduce attack surface

Kubernetes Production:

  • ✅ Resource requests/limits for all containers
  • ✅ Liveness and readiness probes
  • ✅ HPA for automatic scaling
  • ✅ PodDisruptionBudget for availability
  • ✅ Network policies for security

CI/CD Optimization:

  • ✅ Matrix builds for multi-version testing
  • ✅ Caching strategies (Docker layers, npm/pip)
  • ✅ Security scanning at every stage
  • ✅ Parallel jobs for faster feedback
  • ✅ Workflow timeouts (30 minutes recommended)

Infrastructure as Code:

  • ✅ Remote state with locking (S3 + DynamoDB)
  • ✅ State encryption with KMS
  • ✅ Modular architecture for reusability
  • ✅ Input validation with constraints
  • ✅ Version pinning for providers

Monitoring & Observability:

  • ✅ Scrape intervals: 15-30 seconds for most apps
  • ✅ Consistent label naming, avoid high cardinality
  • ✅ Alert thresholds based on SLIs/SLOs
  • ✅ Template variables in dashboards
  • ✅ OpenTelemetry for distributed tracing

GitOps Workflow:

  • ✅ Git as single source of truth
  • ✅ Automated sync with self-healing
  • ✅ Declarative configuration (YAML in Git)
  • ✅ RBAC for access control
  • ✅ Audit trail via Git history

Related Skills

  • Skill("moai-security-backend") for security patterns
  • Skill("moai-essentials-perf") for performance optimization
  • Skill("moai-domain-cloud") for cloud architecture

Version: 4.0.0 Enterprise
Last Updated: 2025-11-13
Status: Production Ready
Tech Stack: Kubernetes 1.31, Docker 27.x, Terraform 1.9, Prometheus 2.55, Grafana 11.x

GitHub リポジトリ

modu-ai/moai-adk
パス: .claude/skills/moai-domain-devops
agentic-aiagentic-codingagentic-workflowclaudeclaudecodevibe-coding

関連スキル

evaluating-llms-harness

テスト

This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.

スキルを見る

sglang

メタ

SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.

スキルを見る

cloudflare-turnstile

メタ

This skill provides comprehensive guidance for implementing Cloudflare Turnstile as a CAPTCHA-alternative bot protection system. It covers integration for forms, login pages, API endpoints, and frameworks like React/Next.js/Hono, while handling invisible challenges that maintain user experience. Use it when migrating from reCAPTCHA, debugging error codes, or implementing token validation and E2E tests.

スキルを見る

langchain

メタ

LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.

スキルを見る