Back to Skills

kubernetes-deployment

aj-geddes
Updated Today
21 views
7
7
View on GitHub
Otherai

About

This skill enables developers to deploy and manage containerized applications on Kubernetes using production-ready configurations. It handles scaling, rolling updates, resource management, and multi-environment deployments. Use it for orchestrating microservices with built-in best practices for health checks, service discovery, and security policies.

Documentation

Kubernetes Deployment

Overview

Master Kubernetes deployments for managing containerized applications at scale, including multi-container services, resource allocation, health checks, and rolling deployment strategies.

When to Use

  • Container orchestration and management
  • Multi-environment deployments (dev, staging, prod)
  • Auto-scaling microservices
  • Rolling updates and blue-green deployments
  • Service discovery and load balancing
  • Resource quota and limit management
  • Pod networking and security policies

Implementation Examples

1. Complete Deployment with Resource Management

# kubernetes-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  namespace: production
  labels:
    app: api-service
    version: v1
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
        version: v1
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
    spec:
      # Service account for RBAC
      serviceAccountName: api-service-sa

      # Security context
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000

      # Pod scheduling
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - api-service
                topologyKey: kubernetes.io/hostname

      # Pod termination grace period
      terminationGracePeriodSeconds: 30

      # Init containers
      initContainers:
        - name: wait-for-db
          image: busybox:1.35
          command: ['sh', '-c', 'until nc -z postgres-service 5432; do echo waiting for db; sleep 2; done']

      containers:
        - name: api-service
          image: myrepo/api-service:1.2.3
          imagePullPolicy: IfNotPresent

          # Ports
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: metrics
              containerPort: 9090
              protocol: TCP

          # Environment variables
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: api-secrets
                  key: database-url
            - name: LOG_LEVEL
              valueFrom:
                configMapKeyRef:
                  name: api-config
                  key: log-level
            - name: REPLICA_NUM
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name

          # Resource requests and limits
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"

          # Liveness probe
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3

          # Readiness probe
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 2

          # Volume mounts
          volumeMounts:
            - name: config
              mountPath: /etc/config
              readOnly: true
            - name: cache
              mountPath: /var/cache
            - name: logs
              mountPath: /var/log

          # Security context
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL

      # Volumes
      volumes:
        - name: config
          configMap:
            name: api-config
        - name: cache
          emptyDir:
            sizeLimit: 1Gi
        - name: logs
          emptyDir:
            sizeLimit: 2Gi

---
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: production
spec:
  type: ClusterIP
  selector:
    app: api-service
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
    - name: metrics
      port: 9090
      targetPort: 9090
      protocol: TCP

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
  namespace: production
data:
  log-level: "INFO"
  max-connections: "100"

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-service-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

2. Deployment Script

#!/bin/bash
# deploy-k8s.sh - Deploy to Kubernetes cluster

set -euo pipefail

NAMESPACE="${1:-production}"
DEPLOYMENT="${2:-api-service}"
IMAGE="${3:-myrepo/api-service:latest}"

echo "Deploying $DEPLOYMENT to namespace $NAMESPACE..."

# Check cluster connectivity
kubectl cluster-info

# Create namespace if not exists
kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -

# Apply configuration
kubectl apply -f kubernetes-deployment.yaml -n "$NAMESPACE"

# Wait for rollout
echo "Waiting for deployment to rollout..."
kubectl rollout status deployment/"$DEPLOYMENT" -n "$NAMESPACE" --timeout=5m

# Verify pods are running
echo "Verification:"
kubectl get pods -n "$NAMESPACE" -l "app=$DEPLOYMENT"

# Check service
kubectl get svc -n "$NAMESPACE" -l "app=$DEPLOYMENT"

echo "Deployment complete!"

3. Service Account and RBAC

apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service-sa
  namespace: production

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: api-service-role
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: api-service-rolebinding
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: api-service-role
subjects:
  - kind: ServiceAccount
    name: api-service-sa
    namespace: production

Deployment Patterns

Rolling Update

  • Gradually replace old pods with new ones
  • Zero downtime deployments
  • Automatic rollback on failure

Blue-Green

  • Maintain two identical environments
  • Switch traffic instantly
  • Easier rollback capability

Canary

  • Deploy to subset of users first
  • Monitor metrics before full rollout
  • Reduce risk of bad deployments

Best Practices

✅ DO

  • Use resource requests and limits
  • Implement health checks (liveness, readiness)
  • Use ConfigMaps for configuration
  • Apply security context restrictions
  • Use service accounts and RBAC
  • Implement pod anti-affinity
  • Use namespaces for isolation
  • Enable pod security policies

❌ DON'T

  • Use latest image tags in production
  • Run containers as root
  • Set unlimited resource usage
  • Skip readiness probes
  • Deploy without resource limits
  • Mix configurations in container images
  • Use default service accounts

Resources

Quick Install

/plugin add https://github.com/aj-geddes/useful-ai-prompts/tree/main/kubernetes-deployment

Copy and paste this command in Claude Code to install this skill

GitHub 仓库

aj-geddes/useful-ai-prompts
Path: skills/kubernetes-deployment

Related Skills

sglang

Meta

SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.

View skill

evaluating-llms-harness

Testing

This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.

View skill

llamaguard

Other

LlamaGuard is Meta's 7-8B parameter model for moderating LLM inputs and outputs across six safety categories like violence and hate speech. It offers 94-95% accuracy and can be deployed using vLLM, Hugging Face, or Amazon SageMaker. Use this skill to easily integrate content filtering and safety guardrails into your AI applications.

View skill

langchain

Meta

LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.

View skill