返回技能列表

enforce-policy-as-code

pjt222
更新于 2 days ago
8 次查看
17
2
17
在 GitHub 上查看
ai

关于

This skill implements policy-as-code enforcement in Kubernetes using OPA Gatekeeper or Kyverno to validate and mutate resources against organizational policies. It covers admission control, audit mode, and CI/CD integration for shift-left validation. Use it to enforce configuration standards, prevent security misconfigurations, and ensure compliance before deployment.

快速安装

Claude Code

推荐
主要方式
npx skills add pjt222/agent-almanac -a claude-code
插件命令备选方式
/plugin add https://github.com/pjt222/agent-almanac
Git 克隆备选方式
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/enforce-policy-as-code

在 Claude Code 中复制并粘贴此命令以安装该技能

技能文档

Enforce Policy as Code

Set declarative policy enforcement with OPA Gatekeeper or Kyverno for Kubernetes resource check and mutate.

When Use

  • Enforce org rules for resource config (labels, annotations, limits)
  • Block security mis-configs (privileged containers, host namespaces, bad images)
  • Ensure compliance rules met before resource deploy
  • Standard resource name and meta
  • Auto-fix through mutation policy
  • Audit existing cluster resources vs policy with no block
  • Wire policy check into CI/CD for shift-left

Inputs

  • Required: Kubernetes cluster with admin access
  • Required: Pick policy engine (OPA Gatekeeper or Kyverno)
  • Required: List of policies to enforce (security, compliance, ops)
  • Optional: Existing resources to audit
  • Optional: Exempt patterns for specific namespaces or resources
  • Optional: CI/CD config for pre-deploy check

Steps

See Extended Examples for full config files and templates.

Step 1: Install Policy Engine

Deploy OPA Gatekeeper or Kyverno as admission controller.

For OPA Gatekeeper:

# Install Gatekeeper using Helm
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm repo update

# Install with audit enabled
helm install gatekeeper gatekeeper/gatekeeper \
  --namespace gatekeeper-system \
  --create-namespace \
  --set audit.replicas=2 \
  --set replicas=3 \
  --set validatingWebhookFailurePolicy=Fail \
  --set auditInterval=60

# Verify installation
kubectl get pods -n gatekeeper-system
kubectl get crd | grep gatekeeper

# Check webhook configuration
kubectl get validatingwebhookconfigurations gatekeeper-validating-webhook-configuration -o yaml

For Kyverno:

# Install Kyverno using Helm
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

# Install with HA setup
helm install kyverno kyverno/kyverno \
  --namespace kyverno \
  --create-namespace \
  --set replicaCount=3 \
  --set admissionController.replicas=3 \
  --set backgroundController.replicas=2 \
  --set cleanupController.replicas=2

# Verify installation
kubectl get pods -n kyverno
kubectl get crd | grep kyverno

# Check webhook configurations
kubectl get validatingwebhookconfigurations kyverno-resource-validating-webhook-cfg
kubectl get mutatingwebhookconfigurations kyverno-resource-mutating-webhook-cfg

Make namespace excludes:

# gatekeeper-config.yaml
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  match:
    - excludedNamespaces:
      - kube-system
      - kube-public
      - kube-node-lease
      - gatekeeper-system
      processes:
      - audit
      - webhook
  validation:
    traces:
      - user: system:serviceaccount:gatekeeper-system:gatekeeper-admin
        kind:
          group: ""
          version: v1
          kind: Namespace

Got: Policy engine pods running with many replicas. CRDs in (ConstraintTemplate, Constraint for Gatekeeper; ClusterPolicy, Policy for Kyverno). Validating/mutating webhooks on. Audit controller running.

If fail:

  • Check pod logs: kubectl logs -n gatekeeper-system -l app=gatekeeper --tail=50
  • Check webhook endpoints reach: kubectl get endpoints -n gatekeeper-system
  • Look for port clash or cert issue in webhook logs
  • Cluster needs enough resources (policy engines need ~500MB per replica)
  • Review RBAC: kubectl auth can-i create constrainttemplates --as=system:serviceaccount:gatekeeper-system:gatekeeper-admin

Step 2: Define Constraint Templates and Policies

Make reusable policy templates and specific constraints.

OPA Gatekeeper Constraint Template:

# required-labels-template.yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
  annotations:
# ... (see EXAMPLES.md for complete configuration)

Kyverno ClusterPolicy:

# kyverno-policies.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-labels
  annotations:
# ... (see EXAMPLES.md for complete configuration)

Apply policies:

# Apply Gatekeeper templates and constraints
kubectl apply -f required-labels-template.yaml

# Apply Kyverno policies
kubectl apply -f kyverno-policies.yaml

# Verify constraint/policy status
kubectl get constraints
kubectl get clusterpolicies

# Check for any policy errors
kubectl describe k8srequiredlabels require-app-labels
kubectl describe clusterpolicy require-labels

Got: ConstraintTemplates/ClusterPolicies made OK. Constraints show status "True" for enforce. No errors in policy definitions. Webhook start checking new resources vs policies.

If fail:

  • Check Rego syntax (Gatekeeper): use opa test local or check constraint status
  • Check policy YAML: kubectl apply --dry-run=client -f policy.yaml
  • Review constraint status: kubectl get constraint -o yaml | grep -A 10 status
  • Test with simple policy first, add more later
  • Check match rules (kinds, namespaces) right

Step 3: Test Policy Enforcement

Check policies block bad resources and allow good ones.

Make test manifests:

# test-non-compliant.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-no-labels
  namespace: production
# ... (see EXAMPLES.md for complete configuration)

Test policies:

# Attempt to create non-compliant resource (should fail)
kubectl apply -f test-non-compliant.yaml
# Expected: Error with policy violation message

# Create compliant resource (should succeed)
kubectl apply -f test-compliant.yaml
# Expected: deployment.apps/test-compliant created

# Test with dry-run for validation
kubectl apply -f test-non-compliant.yaml --dry-run=server
# Shows policy violations without actually creating resource

# Clean up
kubectl delete -f test-compliant.yaml

Test with policy reporting (Kyverno):

# Check policy reports
kubectl get policyreports -A
kubectl get clusterpolicyreports

# View detailed report
kubectl get policyreport -n production -o yaml

# Check policy rule results
kubectl get policyreport -n production -o jsonpath='{.items[0].results}' | jq .

Got: Bad resources blocked with clear violation msg. Good resources made OK. Policy reports show pass/fail. Dry-run check works no resource made.

If fail:

  • Check if policy in audit mode vs enforce: validationFailureAction: audit
  • Check webhook handling req: kubectl logs -n gatekeeper-system -l app=gatekeeper
  • Look for namespace excludes that exempt test namespace
  • Test webhook reach: kubectl run test --rm -it --image=busybox --restart=Never
  • Review webhook failure policy (Ignore vs Fail)

Step 4: Implement Mutation Policies

Set auto-fix through mutation.

Gatekeeper mutation:

# gatekeeper-mutations.yaml
apiVersion: mutations.gatekeeper.sh/v1beta1
kind: Assign
metadata:
  name: add-default-labels
spec:
# ... (see EXAMPLES.md for complete configuration)

Kyverno mutation policies:

# kyverno-mutations.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-labels
spec:
# ... (see EXAMPLES.md for complete configuration)

Apply and test mutations:

# Apply mutation policies
kubectl apply -f gatekeeper-mutations.yaml
# OR
kubectl apply -f kyverno-mutations.yaml

# Test mutation with a deployment
# ... (see EXAMPLES.md for complete configuration)

Got: Mutations auto add labels, resources, or change images. Deployed resources show mutated values. Mutations logged in policy engine logs. No errors during mutation.

If fail:

  • Check mutation webhook on: kubectl get mutatingwebhookconfiguration
  • Check mutation policy syntax: JSON paths and conditions
  • Review logs: kubectl logs -n kyverno deploy/kyverno-admission-controller
  • Test mutations not clash (many mutations on same field)
  • Make mutation happen before validation (order matters)

Step 5: Enable Audit Mode and Reporting

Set audit to spot violations in existing resources with no block.

Gatekeeper audit:

# Audit runs automatically based on auditInterval setting
# Check audit results
kubectl get constraints -o json | \
  jq '.items[] | {name: .metadata.name, violations: .status.totalViolations}'

# Get detailed violation information
# ... (see EXAMPLES.md for complete configuration)

Kyverno audit and reporting:

# Generate policy reports for existing resources
kubectl create job --from=cronjob/kyverno-cleanup-controller -n kyverno manual-report-gen

# View policy reports
kubectl get policyreport -A
kubectl get clusterpolicyreport
# ... (see EXAMPLES.md for complete configuration)

Make dashboard for policy compliance:

# prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: policy-alerts
  namespace: monitoring
# ... (see EXAMPLES.md for complete configuration)

Got: Audit spot violations in existing resources with no block on deploy. Policy reports made with pass/fail counts. Violations export for review. Metrics out for watch. Alerts fire on rising violations.

If fail:

  • Check audit controller running: kubectl get pods -n gatekeeper-system -l gatekeeper.sh/operation=audit
  • Check audit interval in install
  • Review audit logs for errors: kubectl logs -n gatekeeper-system -l gatekeeper.sh/operation=audit
  • RBAC perms must let read all resource types for audit
  • Check CRD status field filling: kubectl get constraint -o yaml | grep -A 20 status

Step 6: Integrate with CI/CD Pipeline

Add pre-deploy policy check for shift-left enforce.

CI/CD integration script:

#!/bin/bash
# validate-policies.sh

set -e

echo "=== Policy Validation for CI/CD ==="
# ... (see EXAMPLES.md for complete configuration)

GitHub Actions workflow:

# .github/workflows/policy-validation.yaml
name: Policy Validation

on:
  pull_request:
    paths:
# ... (see EXAMPLES.md for complete configuration)

Pre-commit hook:

#!/bin/bash
# .git/hooks/pre-commit

# Validate Kubernetes manifests against policies
if git diff --cached --name-only | grep -E 'manifests/.*\.yaml$'; then
  echo "Validating Kubernetes manifests against policies..."
# ... (see EXAMPLES.md for complete configuration)

Got: CI/CD pipeline check manifests before deploy. Policy violations fail pipeline with clear msg. Policy reports pinned to PR. Pre-commit hooks catch violations early. Devs know policy issues before cluster.

If fail:

  • Check CLI tools in PATH
  • Check kubeconfig creds good for fetching policies
  • Test policy check local first: kyverno apply policy.yaml --resource manifest.yaml
  • Policies synced from cluster must be full
  • Review policy CLI logs for specific check errors

Validation

  • Policy engine pods running with HA config
  • Validating and mutating webhooks on and reachable
  • Constraint templates and policies made with no errors
  • Bad resources blocked with clear violation msg
  • Good resources deploy OK
  • Mutation policies auto fix resources
  • Audit mode spots violations in existing resources
  • Policy reports made and open
  • Metrics out for policy compliance watch
  • CI/CD pipeline checks manifests pre-deploy
  • Pre-commit hooks stop policy violations
  • Namespace excludes set right

Pitfalls

  • Webhook Failure Policy: failurePolicy: Fail blocks all resources if webhook down. Use Ignore for low-risk policies, but know security impact. Test webhook up before enforce.

  • Too Strict Initial Policies: Start with enforce mode on strict policies break existing workloads. Begin with audit mode, review violations, talk with teams, then enforce step by step.

  • Missing Resource Specs: Policies must set API groups, versions, kinds right. Use kubectl api-resources to find exact values. Wildcards (*) easy but can cause speed issues.

  • Mutation Order: Mutations run before validations. Make mutations not clash and validations know mutated values. Test mutation+validation together.

  • Namespace Excludes: Exempt system namespaces needed, but watch not over-exempt. Review excludes often as policies grow.

  • Rego Complex (Gatekeeper): Complex Rego policies hard to debug. Start simple, test with opa test local, add logging with trace(), use gator for offline tests.

  • Speed Hit: Policy check adds lag to admission. Keep policies tight, use right match rules, watch webhook lag metrics.

  • Policy Clash: Many policies changing same field cause issues. Team up policies across teams, use policy libs for common patterns, test mixes.

  • Background Scan: Background audit scans full cluster. Can be heavy on big clusters. Tune audit interval by cluster size and policy count.

  • Version Compat: Policy CRD versions change. Gatekeeper v3 uses v1beta1 constraints, Kyverno v1.11 uses kyverno.io/v1. Check docs for your version.

See Also

  • manage-kubernetes-secrets - Secret check policies
  • security-audit-codebase - Complement security scan
  • deploy-to-kubernetes - App deploy with policy check
  • setup-service-mesh - Service mesh authz policies complement admission policies
  • configure-api-gateway - Gateway policies work with admission policies
  • implement-gitops-workflow - GitOps with policy check in pipeline

GitHub 仓库

pjt222/agent-almanac
路径: i18n/caveman/skills/enforce-policy-as-code
0
agentsagentskillsai-assisted-developmentclaude-codeskillsteams

相关推荐技能

content-collections

Content Collections 是一个 TypeScript 优先的构建工具,可将本地 Markdown/MDX 文件转换为类型安全的数据集合。它专为构建博客、文档站和内容密集型 Vite+React 应用而设计,提供基于 Zod 的自动模式验证。该工具涵盖从 Vite 插件配置、MDX 编译到生产环境部署的完整工作流。

查看技能

polymarket

这个Claude Skill为开发者提供完整的Polymarket预测市场开发支持,涵盖API调用、交易执行和市场数据分析。关键特性包括实时WebSocket数据流,可监控实时交易、订单和市场动态。开发者可用它构建预测市场应用、实施交易策略并集成实时市场预测功能。

查看技能

creating-opencode-plugins

该Skill帮助开发者创建OpenCode插件,用于接入命令、文件、LSP等25+种事件。它提供了插件结构、事件API规范和JavaScript/TypeScript实现模式,适合需要拦截操作、扩展功能或自定义事件处理的场景。开发者可通过它快速构建响应式模块来增强OpenCode AI助手的能力。

查看技能

sglang

SGLang是一个专为LLM设计的高性能推理框架,特别适用于需要结构化输出的场景。它通过RadixAttention前缀缓存技术,在处理JSON、正则表达式、工具调用等具有重复前缀的复杂工作流时,能实现极速生成。如果你正在构建智能体或多轮对话系统,并追求远超vLLM的推理性能,SGLang是理想选择。

查看技能