HyperShift AWS Provider
About
The HyperShift AWS Provider skill enables developers to deploy HyperShift clusters on AWS infrastructure using the `/hcp:generate aws` command. It handles AWS-specific requirements including STS credentials, IAM roles, and VPC configuration, and it provides implementation guidance for setting up clusters with proper security and regional best practices.
Quick Install
Claude Code
Recommended: copy and paste this command in Claude Code to install this skill:
/plugin add https://github.com/openshift-eng/ai-helpers
Or clone the repository manually:
git clone https://github.com/openshift-eng/ai-helpers.git ~/.claude/skills/"HyperShift AWS Provider"
Documentation
This skill provides implementation guidance for creating HyperShift clusters on AWS, handling AWS-specific requirements including STS credentials, IAM roles, VPC configuration, and regional best practices.
When to Use This Skill
This skill is automatically invoked by the /hcp:generate aws command to guide the AWS provider cluster creation process.
Prerequisites
- AWS CLI configured with appropriate credentials
- HyperShift operator installed and configured
- STS credentials file for the target AWS account
- IAM role with required permissions for HyperShift
- Pull secret for accessing OpenShift images
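The command-line side of these prerequisites can be checked before anything else runs. The sketch below is illustrative only: `missing_tools` is a hypothetical helper name, and the tool list in the usage comment (`aws`, `hypershift`, `oc`, `jq`) is an assumption drawn from the commands used later in this document.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that is not found on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (tool list assumed from the commands used in this guide):
#   missing_tools aws hypershift oc jq || echo "install the tools listed above"
```

This only confirms that the binaries exist; it does not verify that credentials or the HyperShift operator are configured.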
AWS Provider Overview
AWS Provider Peculiarities
- Requires AWS credentials (STS): A valid STS credentials file must be available
- Region selection affects availability zones: Each region offers a different set of AZs
- Instance types vary by region: Not all instance types are available in every region
- VPC CIDR must not conflict: The range must not overlap with existing infrastructure
- IAM roles: Can be auto-created, or pre-existing roles can be used
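The CIDR constraint above can be checked mechanically. The following is a minimal sketch for IPv4 only, assuming both ranges are written as `a.b.c.d/len`; `ip_to_int` and `cidrs_overlap` are hypothetical helper names, not part of HyperShift or the AWS CLI. Two ranges overlap exactly when they agree on the bits covered by the shorter of the two prefixes.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed (exit 0) if two IPv4 CIDR ranges overlap.
cidrs_overlap() {
    a_ip=${1%/*}; a_len=${1#*/}
    b_ip=${2%/*}; b_len=${2#*/}
    # Compare both networks under the shorter (less specific) prefix.
    if [ "$a_len" -lt "$b_len" ]; then len=$a_len; else len=$b_len; fi
    mask=$(( (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$a_ip") & $mask )) -eq $(( $(ip_to_int "$b_ip") & $mask )) ]
}

# Example:
#   cidrs_overlap 10.0.0.0/16 10.0.128.0/20 && echo "ranges conflict"
```

The existing ranges to compare against would have to come from your own inventory of VPCs and peered networks.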
Common AWS Configurations
Development Environment:
- Single replica control plane (cost-effective)
- m5.large instances (balanced performance/cost)
- 2 availability zones (basic redundancy)
- Basic networking (public endpoints)
Production Environment:
- Highly available control plane
- m5.xlarge+ instances (better performance)
- 3+ availability zones (high availability)
- Custom VPC configuration
- KMS encryption enabled
Cost-Optimized Environment:
- Single NAT gateway
- Smaller instance types
- Minimal replicas
- Spot instances (where applicable)
Implementation Steps
Step 1: Analyze Cluster Description
Parse the natural language description for AWS-specific requirements:
Environment Type Detection:
- Development: "dev", "development", "testing", "demo", "sandbox"
- Production: "prod", "production", "critical", "enterprise"
- Cost-Optimized: "cheap", "cost", "minimal", "budget", "demo"
Performance Indicators:
- High Performance: "performance", "fast", "high-compute", "intensive"
- Standard: Default moderate configuration
- Minimal: "small", "minimal", "basic", "simple"
Security/Compliance:
- FIPS: "fips", "compliance", "security", "regulated"
- Private: "private", "isolated", "secure", "internal"
Special Requirements:
- Multi-AZ: "highly available", "ha", "multi-zone", "resilient"
- Single-AZ: "single zone", "simple", "minimal"
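The keyword matching for environment type can be sketched as a small shell function. This is an illustration under stated assumptions, not the skill's actual implementation: `classify_description` is a hypothetical name, production keywords are assumed to win over development keywords, development is assumed as the fallback, and cost focus is reported separately because "demo" appears in both the Development and Cost-Optimized lists.

```shell
#!/bin/sh
# Hypothetical helper: classify a cluster description using the Step 1 keywords.
# Prints "environment=<production|development> cost_focus=<yes|no>".
classify_description() {
    # Lowercase, then turn punctuation into spaces so keywords match whole words.
    desc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9-' ' ')
    environment=""
    cost_focus="no"
    for word in $desc; do
        case "$word" in
            prod|production|critical|enterprise)
                environment="production" ;;
            dev|development|testing|demo|sandbox)
                # Assumption: production keywords win if both kinds appear.
                [ "$environment" = "production" ] || environment="development" ;;
        esac
        case "$word" in
            cheap|cost|minimal|budget|demo)
                cost_focus="yes" ;;
        esac
    done
    # Assumption: fall back to development when no keyword matches.
    printf 'environment=%s cost_focus=%s\n' "${environment:-development}" "$cost_focus"
}

# Example:
#   classify_description "cheap AWS cluster for testing our new microservice"
```

Matching whole words rather than substrings avoids false hits such as "prod" inside "product". Multi-word phrases like "highly available" would need a separate substring check.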
Step 2: Apply AWS Provider Defaults
Required Parameters:
--region: AWS region (default: us-east-1)
--pull-secret: Path to pull secret file
--release-image: OpenShift release image
--sts-creds: REQUIRED - Path to STS credentials file
--role-arn: REQUIRED - ARN of the IAM role to assume
--base-domain: REQUIRED - Base domain for the cluster
Smart Defaults by Environment:
Development Environment:
--instance-type m5.large
--node-pool-replicas 2
--control-plane-availability-policy SingleReplica
--endpoint-access Public
--root-volume-size 120
--zones <auto-selected: 2 zones based on region>
Production Environment:
--instance-type m5.xlarge
--node-pool-replicas 3
--control-plane-availability-policy HighlyAvailable
--endpoint-access PublicAndPrivate
--root-volume-size 120
--auto-repair
--zones <auto-selected: 3+ zones based on region>
Cost-Optimized Environment:
--instance-type m5.large
--node-pool-replicas 2
--control-plane-availability-policy SingleReplica
--endpoint-access Public
--root-volume-size 120
--zones <auto-selected: 2 zones (minimal redundancy)>
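The defaults listed above map naturally onto a lookup by environment type. The sketch below assumes a hypothetical function name, `defaults_for_environment`, and leaves out `--zones` because zone selection depends on the region. The flag values are copied from the lists above, with `--auto-repair` written as a bare switch to match the generated commands in Step 5.

```shell
#!/bin/sh
# Hypothetical helper: print the default flags for an environment type.
# --zones is omitted on purpose: it depends on the chosen region.
defaults_for_environment() {
    case "$1" in
        production)
            printf '%s %s %s %s %s %s\n' \
                '--instance-type m5.xlarge' \
                '--node-pool-replicas 3' \
                '--control-plane-availability-policy HighlyAvailable' \
                '--endpoint-access PublicAndPrivate' \
                '--root-volume-size 120' \
                '--auto-repair'
            ;;
        development|cost-optimized)
            # The two environments share the same defaults in the lists above.
            printf '%s %s %s %s %s\n' \
                '--instance-type m5.large' \
                '--node-pool-replicas 2' \
                '--control-plane-availability-policy SingleReplica' \
                '--endpoint-access Public' \
                '--root-volume-size 120'
            ;;
        *)
            echo "unknown environment: $1" >&2
            return 1
            ;;
    esac
}

# Example:
#   defaults_for_environment production
```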
Step 3: Interactive Parameter Collection
Required Information Collection:
- Cluster Name
  🔹 **Cluster Name**: What would you like to name your cluster?
  - Must be DNS-compatible (lowercase, hyphens allowed)
  - Used for AWS resource naming
  - Example: dev-cluster, prod-app, demo-env
- AWS Region
  🔹 **AWS Region**: Which AWS region should host your cluster?
  - Consider latency to your users
  - Verify desired instance types are available
  - [Press Enter for default: us-east-1]
  Popular regions:
  - us-east-1 (N. Virginia) - Largest service availability
  - us-west-2 (Oregon) - West coast, latest services
  - eu-west-1 (Ireland) - Europe
  - ap-southeast-1 (Singapore) - Asia Pacific
- STS Credentials
  🔹 **STS Credentials**: Path to your AWS STS credentials file?
  - Required for AWS authentication
  - Generate using: aws sts get-session-token
  - Example: /home/user/.aws/sts-creds.json
  - Format: {"AccessKeyId": "...", "SecretAccessKey": "...", "SessionToken": "..."}
- IAM Role ARN
  🔹 **IAM Role ARN**: ARN of the IAM role for HyperShift?
  - Role must have required HyperShift permissions
  - Example: arn:aws:iam::123456789012:role/hypershift-operator-role
  - See: https://hypershift.openshift.io/aws-setup/
- Base Domain
  🔹 **Base Domain**: What base domain should be used for cluster DNS?
  - Must be a domain you control in Route53
  - Used for cluster API and application routes
  - Example: example.com, clusters.mycompany.com
- Pull Secret
  🔹 **Pull Secret**: Path to your OpenShift pull secret file?
  - Required for accessing OpenShift container images
  - Download from: https://console.redhat.com/openshift/install/pull-secret
  - Example: /home/user/pull-secret.json
- OpenShift Version
  🔹 **OpenShift Version**: Which OpenShift version do you want to use?
  **Check supported versions**: https://amd64.ocp.releases.ci.openshift.org/
  - Enter release image URL: quay.io/openshift-release-dev/ocp-release:X.Y.Z-multi
  - [Press Enter for default: quay.io/openshift-release-dev/ocp-release:4.18.0-multi]
Optional Configuration (based on description analysis):
- Instance Type (if performance requirements detected)
  🔹 **Instance Type**: Select instance type based on your performance needs:
  - m5.large (2 vCPU, 8GB RAM) - Development, light workloads
  - m5.xlarge (4 vCPU, 16GB RAM) - Production, balanced workloads
  - m5.2xlarge (8 vCPU, 32GB RAM) - High-performance workloads
  - c5.xlarge (4 vCPU, 8GB RAM) - Compute-optimized
  - [Press Enter for default based on environment type]
- Node Pool Replicas
  🔹 **Node Pool Replicas**: How many worker nodes do you need?
  - Minimum: 2 (for basic redundancy)
  - Production recommended: 3+
  - [Press Enter for default based on environment type]
- Availability Zones (auto-selected, but confirmed)
  🔹 **Availability Zones**: Detected region: us-east-1
  Auto-selecting zones for optimal distribution:
  - Development: us-east-1a, us-east-1b (2 zones)
  - Production: us-east-1a, us-east-1b, us-east-1c (3 zones)
  Modify zone selection? [y/N]
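Answers collected in this step can be validated before they reach the command line. As one example, the sketch below checks the cluster name. It assumes "DNS-compatible" means a single DNS label (lowercase letters, digits, and hyphens, no leading or trailing hyphen, at most 63 characters); the function name `is_valid_cluster_name` is hypothetical, and HyperShift may enforce stricter limits.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the name looks like a valid DNS label.
# Assumption: lowercase letters, digits, hyphens; no leading or trailing
# hyphen; at most 63 characters.
is_valid_cluster_name() {
    name=$1
    # Reject empty names and names longer than 63 characters.
    if [ -z "$name" ] || [ "${#name}" -gt 63 ]; then
        return 1
    fi
    case "$name" in
        *[![:lower:][:digit:]-]*) return 1 ;;  # disallowed character present
        -*|*-)                    return 1 ;;  # leading or trailing hyphen
    esac
    return 0
}

# Example:
#   is_valid_cluster_name "dev-cluster" || echo "please choose another name"
```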
Step 4: Advanced Configuration (Conditional)
For FIPS Compliance (if detected):
🔹 **FIPS Mode**: Enable FIPS mode for compliance?
- Required for government/regulated workloads
- May impact performance
- [yes/no] [Press Enter for default: no]
For High-Performance Workloads:
🔹 **Root Volume Size**: Increase root volume size?
- Default: 120GB
- High-performance workloads: 200GB+
- [Press Enter for default: 120]
For Production Environments:
🔹 **Auto-Repair**: Enable automatic node repair?
- Automatically replaces unhealthy nodes
- Recommended for production
- [yes/no] [Press Enter for default: yes for production]
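The yes/no answers from these prompts translate directly into optional flags. A minimal sketch, assuming a hypothetical helper named `advanced_flags` that takes the FIPS answer, the auto-repair answer, and an optional root volume size in GB (defaulting to 120 as stated above):

```shell
#!/bin/sh
# Hypothetical helper: turn the Step 4 answers into command-line flags.
# Usage: advanced_flags <fips: yes|no> <auto-repair: yes|no> [root-volume-size]
advanced_flags() {
    fips=$1
    auto_repair=$2
    root_volume_size=${3:-120}   # default taken from the prompt above
    flags="--root-volume-size $root_volume_size"
    if [ "$fips" = "yes" ]; then
        flags="$flags --fips"
    fi
    if [ "$auto_repair" = "yes" ]; then
        flags="$flags --auto-repair"
    fi
    printf '%s\n' "$flags"
}

# Example:
#   advanced_flags yes yes 200
```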
Step 5: Generate Command
Basic AWS Cluster Command:
hypershift create cluster aws \
--name <cluster-name> \
--namespace <cluster-name>-ns \
--region <region> \
--instance-type <instance-type> \
--pull-secret <pull-secret-path> \
--node-pool-replicas <replica-count> \
--zones <zone-list> \
--control-plane-availability-policy <policy> \
--sts-creds <sts-creds-path> \
--role-arn <role-arn> \
--base-domain <base-domain> \
--release-image <release-image>
Development Configuration Example:
hypershift create cluster aws \
--name dev-cluster \
--namespace dev-cluster-ns \
--region us-east-1 \
--instance-type m5.large \
--pull-secret /path/to/pull-secret.json \
--node-pool-replicas 2 \
--zones us-east-1a,us-east-1b \
--control-plane-availability-policy SingleReplica \
--endpoint-access Public \
--root-volume-size 120 \
--sts-creds /path/to/sts-creds.json \
--role-arn arn:aws:iam::123456789012:role/hypershift-role \
--base-domain example.com \
--release-image quay.io/openshift-release-dev/ocp-release:4.18.0-multi
Production Configuration Example:
hypershift create cluster aws \
--name production-cluster \
--namespace production-cluster-ns \
--region us-west-2 \
--instance-type m5.xlarge \
--pull-secret /path/to/pull-secret.json \
--node-pool-replicas 3 \
--zones us-west-2a,us-west-2b,us-west-2c \
--control-plane-availability-policy HighlyAvailable \
--endpoint-access PublicAndPrivate \
--root-volume-size 120 \
--auto-repair \
--sts-creds /path/to/sts-creds.json \
--role-arn arn:aws:iam::123456789012:role/hypershift-prod-role \
--base-domain clusters.company.com \
--release-image quay.io/openshift-release-dev/ocp-release:4.18.0-multi
FIPS-Enabled Configuration:
hypershift create cluster aws \
--name compliance-cluster \
--namespace compliance-cluster-ns \
--region us-gov-east-1 \
--instance-type m5.xlarge \
--pull-secret /path/to/pull-secret.json \
--node-pool-replicas 3 \
--zones us-gov-east-1a,us-gov-east-1b,us-gov-east-1c \
--control-plane-availability-policy HighlyAvailable \
--fips \
--sts-creds /path/to/sts-creds.json \
--role-arn arn:aws-us-gov:iam::123456789012:role/hypershift-fips-role \
--base-domain secure.gov.example.com \
--release-image quay.io/openshift-release-dev/ocp-release:4.18.0-multi
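Assembling the command from collected values can be sketched as a function that prints the command rather than running it, so it can be reviewed first. The function name `build_create_command` and the environment variable names (`CLUSTER_NAME`, `PULL_SECRET`, and so on) are assumptions made for illustration; only flags that already appear in the examples above are used.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a create command from variables.
# Variable names are illustrative. Required values abort with an error if unset.
build_create_command() {
    : "${CLUSTER_NAME:?}" "${PULL_SECRET:?}" "${STS_CREDS:?}" \
      "${ROLE_ARN:?}" "${BASE_DOMAIN:?}"
    printf 'hypershift create cluster aws \\\n'
    printf '  --name %s \\\n' "$CLUSTER_NAME"
    printf '  --namespace %s-ns \\\n' "$CLUSTER_NAME"
    printf '  --region %s \\\n' "${REGION:-us-east-1}"
    printf '  --pull-secret %s \\\n' "$PULL_SECRET"
    printf '  --sts-creds %s \\\n' "$STS_CREDS"
    printf '  --role-arn %s \\\n' "$ROLE_ARN"
    printf '  --base-domain %s \\\n' "$BASE_DOMAIN"
    printf '  --release-image %s\n' \
        "${RELEASE_IMAGE:-quay.io/openshift-release-dev/ocp-release:4.18.0-multi}"
}

# Example:
#   CLUSTER_NAME=dev-cluster PULL_SECRET=/path/to/pull-secret.json \
#   STS_CREDS=/path/to/sts-creds.json \
#   ROLE_ARN=arn:aws:iam::123456789012:role/hypershift-role \
#   BASE_DOMAIN=example.com build_create_command
```

Printing instead of executing keeps the review step in the user's hands, which matches the copy-and-paste flow described in Step 7.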
Step 6: Pre-Flight Validation
Provide validation commands:
## Pre-Flight Checks
Before creating the cluster, verify your setup:
1. **AWS Credentials:**
aws sts get-caller-identity
2. **STS Credentials File:**
jq . /path/to/sts-creds.json
3. **IAM Role Access:**
aws iam get-role --role-name hypershift-role
4. **Route53 Domain:**
aws route53 list-hosted-zones --query "HostedZones[?Name=='example.com.']"
5. **Region Availability:**
aws ec2 describe-availability-zones --region us-east-1
6. **Instance Type Availability:**
aws ec2 describe-instance-type-offerings --location-type availability-zone --filters Name=instance-type,Values=m5.large --region us-east-1
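The STS credentials file check can also run without `jq`. The sketch below assumes the file uses the three keys shown in the Step 3 format example (`AccessKeyId`, `SecretAccessKey`, `SessionToken`) and only confirms that those key names are present. It does not validate the JSON or the credentials themselves, and `check_sts_creds_file` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: confirm the STS credentials file is readable and
# mentions the three expected keys. This is a presence check only.
check_sts_creds_file() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    for key in AccessKeyId SecretAccessKey SessionToken; do
        if ! grep -q "\"$key\"" "$file"; then
            echo "missing key: $key" >&2
            return 1
        fi
    done
    return 0
}

# Example:
#   check_sts_creds_file /path/to/sts-creds.json && echo "file looks complete"
```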
Step 7: Post-Generation Instructions
Next Steps:
## Next Steps
1. **Verify prerequisites are met:**
- AWS credentials configured
- STS credentials file exists and is valid
- IAM role has required permissions
- Base domain exists in Route53
2. **Run the generated command:**
Copy and paste the command above
3. **Monitor cluster creation:**
kubectl get hostedcluster -n <cluster-namespace>
kubectl get nodepool -n <cluster-namespace>
4. **Check AWS resources:**
- EC2 instances in AWS console
- Load balancers created
- VPC and networking resources
5. **Access cluster when ready:**
hypershift create kubeconfig --name <cluster-name> --namespace <cluster-namespace> > <cluster-name>-kubeconfig
export KUBECONFIG=<cluster-name>-kubeconfig
oc get nodes
Error Handling
Invalid AWS Credentials
Scenario: AWS credentials are invalid or expired.
Action:
AWS credentials validation failed.
Please check:
1. AWS CLI configuration: aws configure list
2. STS credentials file validity
3. IAM permissions
Regenerate STS credentials:
aws sts get-session-token --duration-seconds 3600
IAM Role Not Found
Scenario: Specified IAM role doesn't exist or can't be assumed.
Action:
IAM role "arn:aws:iam::123456789012:role/hypershift-role" not found or inaccessible.
Please verify:
1. Role exists: aws iam get-role --role-name hypershift-role
2. Role has required permissions
3. Trust relationship allows your account to assume the role
See HyperShift AWS setup guide: https://hypershift.openshift.io/aws-setup/
Region/Zone Issues
Scenario: Instance type not available in selected region/zones.
Action:
Instance type "m5.large" not available in zone "us-east-1f".
Checking alternative zones in us-east-1:
✅ us-east-1a (available)
✅ us-east-1b (available)
❌ us-east-1f (not available)
Suggested zones: us-east-1a,us-east-1b
Would you like me to update the command?
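Picking the suggested zones can be sketched as a filter over the list of zones that offer the instance type. The helper below, with the hypothetical name `suggest_zones`, reads zone names from standard input (separated by tabs, spaces, or newlines), sorts them, and prints the first N as a comma-separated list. Choosing the alphabetically first zones is an assumption made for the sketch, not a HyperShift rule.

```shell
#!/bin/sh
# Hypothetical helper: read zone names on stdin, print the first N
# (sorted, de-duplicated) as a comma-separated list.
suggest_zones() {
    count=$1
    tr -s ' \t' '\n\n' | sed '/^$/d' | sort -u | head -n "$count" | paste -sd, -
}

# Example, feeding in the zones that actually offer the instance type:
#   aws ec2 describe-instance-type-offerings \
#       --location-type availability-zone \
#       --filters Name=instance-type,Values=m5.large \
#       --region us-east-1 \
#       --query 'InstanceTypeOfferings[].Location' --output text \
#     | suggest_zones 2
```

Because the input already contains only zones where the instance type is offered, an unavailable zone never appears in the suggestion.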
Route53 Domain Issues
Scenario: Base domain not found in Route53 or not accessible.
Action:
Base domain "example.com" not found in Route53.
Please ensure:
1. Domain exists in Route53: aws route53 list-hosted-zones
2. Account has access to the hosted zone
3. Domain spelling is correct
Alternative: Use a subdomain you control (e.g., clusters.mydomain.com)
Resource Limits
Scenario: AWS account limits would be exceeded.
Action:
AWS service limits may be exceeded:
- EC2 instances: Current: 18/20, Requested: 5 more
- Elastic IPs: Current: 4/5, Requested: 2 more
Consider:
1. Request limit increases via AWS Support
2. Choose smaller instance types
3. Reduce node count
4. Clean up unused resources
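The limit check in the message above is simple arithmetic: a request exceeds the limit when current usage plus the requested amount is greater than the quota. A minimal sketch, assuming the three numbers have already been gathered by other means and using the hypothetical name `would_exceed_limit`:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if current + requested > limit.
# Usage: would_exceed_limit <current> <limit> <requested>
would_exceed_limit() {
    [ $(( $1 + $3 )) -gt "$2" ]
}

# Example with the figures from the message above (18 of 20 in use, 5 more):
#   would_exceed_limit 18 20 5 && echo "EC2 instance limit would be exceeded"
```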
Best Practices
Cost Optimization
- Right-size instances: Don't over-provision for development
- Use Spot instances: Where appropriate for non-critical workloads
- Monitor resource usage: Regularly review AWS costs
- Clean up unused clusters: Delete development clusters when not needed
Security
- Least privilege IAM: Use minimal required permissions
- STS credentials: Use short-lived credentials when possible
- Private networking: Use PublicAndPrivate endpoints for production
- KMS encryption: Enable for sensitive workloads
High Availability
- Multi-AZ deployment: Use 3+ availability zones for production
- Instance distribution: Spread nodes across zones
- Auto-repair: Enable for automatic recovery
- Monitoring: Set up CloudWatch monitoring
Network Planning
- VPC design: Plan CIDR ranges carefully
- Subnet strategy: Use public/private subnet design
- Load balancer: Configure appropriate load balancer types
- DNS: Ensure proper Route53 configuration
Anti-Patterns to Avoid
❌ Using root AWS credentials
Never use root account credentials for HyperShift
✅ Use IAM roles and STS credentials
❌ Single availability zone for production
--zones us-east-1a # Single point of failure
✅ Use multiple zones: --zones us-east-1a,us-east-1b,us-east-1c
❌ Over-provisioning for development
--instance-type m5.8xlarge --node-pool-replicas 10 # Expensive for dev
✅ Use appropriate sizing: --instance-type m5.large --node-pool-replicas 2
❌ Ignoring region-specific limitations
Choosing regions without checking instance type availability
✅ Verify instance types and services are available in target region
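The single-zone anti-pattern is easy to catch before a command is run. The sketch below uses two hypothetical helpers, `count_zones` and `check_zone_redundancy`, and assumes the threshold of three zones for production that this document recommends.

```shell
#!/bin/sh
# Hypothetical helper: count comma-separated zones,
# e.g. "us-east-1a,us-east-1b" gives 2.
count_zones() {
    old_ifs=$IFS; IFS=,
    set -- $1            # split the list on commas
    IFS=$old_ifs
    echo "$#"
}

# Hypothetical helper: warn and fail when a production cluster has fewer
# than 3 zones (threshold assumed from the recommendation in this document).
check_zone_redundancy() {
    environment=$1
    zones=$2
    n=$(count_zones "$zones")
    if [ "$environment" = "production" ] && [ "$n" -lt 3 ]; then
        echo "warning: production cluster with only $n zone(s)"
        return 1
    fi
    return 0
}

# Example:
#   check_zone_redundancy production us-east-1a || echo "add more zones"
```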
Example Workflows
Startup Development Environment
Input: "cheap AWS cluster for testing our new microservice"
Analysis:
- Environment: Development
- Cost focus: High priority
- Scale: Minimal
Generated Command:
hypershift create cluster aws \
--name dev-microservice \
--namespace dev-microservice-ns \
--region us-east-1 \
--instance-type m5.large \
--node-pool-replicas 2 \
--control-plane-availability-policy SingleReplica \
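--endpoint-access Public \
--pull-secret <pull-secret-path> \
--sts-creds <sts-creds-path> \
--role-arn <role-arn> \
--base-domain <base-domain> \
--release-image <release-image>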
Enterprise Production
Input: "highly available AWS production cluster for customer-facing applications"
Analysis:
- Environment: Production
- Availability: High priority
- Scale: Enterprise
Generated Command:
hypershift create cluster aws \
--name prod-customer-apps \
--namespace prod-customer-apps-ns \
--region us-west-2 \
--instance-type m5.xlarge \
--node-pool-replicas 5 \
--zones us-west-2a,us-west-2b,us-west-2c \
--control-plane-availability-policy HighlyAvailable \
--endpoint-access PublicAndPrivate \
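--auto-repair \
--pull-secret <pull-secret-path> \
--sts-creds <sts-creds-path> \
--role-arn <role-arn> \
--base-domain <base-domain> \
--release-image <release-image>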
See Also
GitHub Repository
