flox-cuda
About
flox-cuda is a Claude Skill for managing CUDA and GPU development environments using Flox. It helps developers set up NVIDIA CUDA toolkits, GPU computing libraries, and deep learning libraries such as cuDNN, with CPU fallbacks for non-Linux platforms. Use this skill to search, install, and verify CUDA packages, which are available only on Linux.
Skill Documentation
Flox CUDA Development Guide
Prerequisites & Authentication
- Sign up for early access at https://flox.dev
- Authenticate with flox auth login
- Linux-only: CUDA packages only work on ["aarch64-linux", "x86_64-linux"]
- All CUDA packages are prefixed with flox-cuda/ in the catalog
- No macOS support: use Metal alternatives on Darwin
Core Commands
# Search for CUDA packages
flox search cudatoolkit --all | grep flox-cuda
flox search nvcc --all | grep 12_8
# Show available versions
flox show flox-cuda/cudaPackages.cudatoolkit
# Install CUDA packages
flox install flox-cuda/cudaPackages_12_8.cuda_nvcc
flox install flox-cuda/cudaPackages.cuda_cudart
# Verify installation
nvcc --version
nvidia-smi
Package Discovery
# Search for CUDA toolkit
flox search cudatoolkit --all | grep flox-cuda
# Search for specific versions
flox search nvcc --all | grep 12_8
# Show all available versions
flox show flox-cuda/cudaPackages.cudatoolkit
# Search for CUDA libraries
flox search libcublas --all | grep flox-cuda
flox search cudnn --all | grep flox-cuda
Essential CUDA Packages
| Package Pattern | Purpose | Example |
|---|---|---|
| cudaPackages_X_Y.cudatoolkit | Main CUDA Toolkit | cudaPackages_12_8.cudatoolkit |
| cudaPackages_X_Y.cuda_nvcc | NVIDIA CUDA compiler (nvcc) | cudaPackages_12_8.cuda_nvcc |
| cudaPackages.cuda_cudart | CUDA Runtime API | cuda_cudart |
| cudaPackages_X_Y.libcublas | Linear algebra (cuBLAS) | cudaPackages_12_8.libcublas |
| cudaPackages_X_Y.libcufft | Fast Fourier Transform (cuFFT) | cudaPackages_12_8.libcufft |
| cudaPackages_X_Y.libcurand | Random number generation (cuRAND) | cudaPackages_12_8.libcurand |
| cudaPackages_X_Y.cudnn_9_11 | Deep neural networks (cuDNN) | cudaPackages_12_8.cudnn_9_11 |
| cudaPackages_X_Y.nccl | Multi-GPU communication (NCCL) | cudaPackages_12_8.nccl |
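For example, a consistent CUDA 12.8 core set can be installed straight from the CLI. This is a sketch following the patterns above; confirm each attribute actually exists in the catalog with flox search before installing.
# Install a matching CUDA 12.8 compiler, runtime, and toolkit
flox install flox-cuda/cudaPackages_12_8.cuda_nvcc
flox install flox-cuda/cudaPackages.cuda_cudart
flox install flox-cuda/cudaPackages_12_8.cudatoolkit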
Critical: Conflict Resolution
CUDA packages have LICENSE file conflicts requiring explicit priorities:
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_12_8.cuda_nvcc"
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
cuda_nvcc.priority = 1 # Highest priority
cuda_cudart.pkg-path = "flox-cuda/cudaPackages.cuda_cudart"
cuda_cudart.systems = ["aarch64-linux", "x86_64-linux"]
cuda_cudart.priority = 2
cudatoolkit.pkg-path = "flox-cuda/cudaPackages_12_8.cudatoolkit"
cudatoolkit.systems = ["aarch64-linux", "x86_64-linux"]
cudatoolkit.priority = 3 # Lower for LICENSE conflicts
gcc.pkg-path = "gcc"
gcc-unwrapped.pkg-path = "gcc-unwrapped" # For libstdc++
gcc-unwrapped.priority = 5
CUDA Version Selection
CUDA 12.x (Current)
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_12_8.cuda_nvcc"
cuda_nvcc.priority = 1
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
cudatoolkit.pkg-path = "flox-cuda/cudaPackages_12_8.cudatoolkit"
cudatoolkit.priority = 3
cudatoolkit.systems = ["aarch64-linux", "x86_64-linux"]
CUDA 11.x (Legacy Support)
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_11_8.cuda_nvcc"
cuda_nvcc.priority = 1
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
cudatoolkit.pkg-path = "flox-cuda/cudaPackages_11_8.cudatoolkit"
cudatoolkit.priority = 3
cudatoolkit.systems = ["aarch64-linux", "x86_64-linux"]
Cross-Platform GPU Development
Dual CUDA/CPU packages for portability (Linux gets CUDA, macOS gets CPU fallback):
[install]
## CUDA packages (Linux only)
cuda-pytorch.pkg-path = "flox-cuda/python3Packages.torch"
cuda-pytorch.systems = ["x86_64-linux", "aarch64-linux"]
cuda-pytorch.priority = 1
## Non-CUDA fallback packages (macOS)
pytorch.pkg-path = "python313Packages.pytorch"
pytorch.systems = ["x86_64-darwin", "aarch64-darwin"]
pytorch.priority = 6 # Lower priority
GPU Detection Pattern
Dynamic CPU/GPU package installation in hooks:
setup_gpu_packages() {
  venv="$FLOX_ENV_CACHE/venv"
  # Create the venv on first activation (same pattern as the hooks below)
  [ ! -d "$venv" ] && uv venv "$venv" --python python3
  if [ ! -f "$FLOX_ENV_CACHE/.deps_installed" ]; then
    # CUDA wheels only work on NVIDIA hardware, so detect NVIDIA specifically
    if lspci 2>/dev/null | grep -i 'nvidia' > /dev/null; then
      echo "NVIDIA GPU detected, installing CUDA packages"
      uv pip install --python "$venv/bin/python" \
        torch torchvision --index-url https://download.pytorch.org/whl/cu129
    else
      echo "No NVIDIA GPU detected, installing CPU packages"
      uv pip install --python "$venv/bin/python" \
        torch torchvision --index-url https://download.pytorch.org/whl/cpu
    fi
    touch "$FLOX_ENV_CACHE/.deps_installed"
  fi
}
setup_gpu_packages
Complete CUDA Environment Examples
Basic CUDA Development
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_12_8.cuda_nvcc"
cuda_nvcc.priority = 1
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
cuda_cudart.pkg-path = "flox-cuda/cudaPackages.cuda_cudart"
cuda_cudart.priority = 2
cuda_cudart.systems = ["aarch64-linux", "x86_64-linux"]
gcc.pkg-path = "gcc"
gcc-unwrapped.pkg-path = "gcc-unwrapped"
gcc-unwrapped.priority = 5
[vars]
CUDA_VERSION = "12.8"
CUDA_HOME = "$FLOX_ENV"
[hook]
echo "CUDA $CUDA_VERSION environment ready"
echo "nvcc: $(nvcc --version | grep release)"
Deep Learning with PyTorch
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_12_8.cuda_nvcc"
cuda_nvcc.priority = 1
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
cuda_cudart.pkg-path = "flox-cuda/cudaPackages.cuda_cudart"
cuda_cudart.priority = 2
cuda_cudart.systems = ["aarch64-linux", "x86_64-linux"]
libcublas.pkg-path = "flox-cuda/cudaPackages_12_8.libcublas"
libcublas.priority = 2
libcublas.systems = ["aarch64-linux", "x86_64-linux"]
cudnn.pkg-path = "flox-cuda/cudaPackages_12_8.cudnn_9_11"
cudnn.priority = 2
cudnn.systems = ["aarch64-linux", "x86_64-linux"]
python313Full.pkg-path = "python313Full"
uv.pkg-path = "uv"
gcc-unwrapped.pkg-path = "gcc-unwrapped"
gcc-unwrapped.priority = 5
[vars]
CUDA_VERSION = "12.8"
PYTORCH_CUDA_ALLOC_CONF = "max_split_size_mb:128"
[hook]
setup_pytorch_cuda() {
  venv="$FLOX_ENV_CACHE/venv"
  if [ ! -d "$venv" ]; then
    uv venv "$venv" --python python3
  fi
  if [ -f "$venv/bin/activate" ]; then
    source "$venv/bin/activate"
  fi
  if [ ! -f "$FLOX_ENV_CACHE/.deps_installed" ]; then
    uv pip install --python "$venv/bin/python" \
      torch torchvision torchaudio \
      --index-url https://download.pytorch.org/whl/cu129
    touch "$FLOX_ENV_CACHE/.deps_installed"
  fi
}
setup_pytorch_cuda
TensorFlow with CUDA
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_12_8.cuda_nvcc"
cuda_nvcc.priority = 1
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
cuda_cudart.pkg-path = "flox-cuda/cudaPackages.cuda_cudart"
cuda_cudart.priority = 2
cuda_cudart.systems = ["aarch64-linux", "x86_64-linux"]
cudnn.pkg-path = "flox-cuda/cudaPackages_12_8.cudnn_9_11"
cudnn.priority = 2
cudnn.systems = ["aarch64-linux", "x86_64-linux"]
python313Full.pkg-path = "python313Full"
uv.pkg-path = "uv"
[hook]
setup_tensorflow() {
  venv="$FLOX_ENV_CACHE/venv"
  [ ! -d "$venv" ] && uv venv "$venv" --python python3
  [ -f "$venv/bin/activate" ] && source "$venv/bin/activate"
  if [ ! -f "$FLOX_ENV_CACHE/.tf_installed" ]; then
    # Quote the extra so the shell does not try to glob-expand the brackets
    uv pip install --python "$venv/bin/python" "tensorflow[and-cuda]"
    touch "$FLOX_ENV_CACHE/.tf_installed"
  fi
}
setup_tensorflow
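After activation, a quick sanity check that TensorFlow can see the GPU (a sketch; it assumes the venv created by the hook above):
# An empty list means TensorFlow fell back to CPU
"$FLOX_ENV_CACHE/venv/bin/python" -c \
  "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"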
Multi-GPU Development
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_12_8.cuda_nvcc"
cuda_nvcc.priority = 1
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
nccl.pkg-path = "flox-cuda/cudaPackages_12_8.nccl"
nccl.priority = 2
nccl.systems = ["aarch64-linux", "x86_64-linux"]
libcublas.pkg-path = "flox-cuda/cudaPackages_12_8.libcublas"
libcublas.priority = 2
libcublas.systems = ["aarch64-linux", "x86_64-linux"]
[vars]
CUDA_VISIBLE_DEVICES = "0,1,2,3" # Expose GPUs 0-3; adjust to your machine
NCCL_DEBUG = "INFO"
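Before launching multi-GPU jobs, it can help to inspect how the GPUs are connected and which NCCL build PyTorch bundles. This is a sketch; the Python check assumes a venv with PyTorch installed as in the earlier examples.
# Show GPU interconnect topology (NVLink vs PCIe)
nvidia-smi topo -m
# Print the NCCL version bundled with PyTorch, if present in the venv
"$FLOX_ENV_CACHE/venv/bin/python" -c \
  "import torch; print(torch.cuda.nccl.version())"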
Modular CUDA Environments
Base CUDA Environment
# team/cuda-base
[install]
cuda_nvcc.pkg-path = "flox-cuda/cudaPackages_12_8.cuda_nvcc"
cuda_nvcc.priority = 1
cuda_nvcc.systems = ["aarch64-linux", "x86_64-linux"]
cuda_cudart.pkg-path = "flox-cuda/cudaPackages.cuda_cudart"
cuda_cudart.priority = 2
cuda_cudart.systems = ["aarch64-linux", "x86_64-linux"]
gcc.pkg-path = "gcc"
gcc-unwrapped.pkg-path = "gcc-unwrapped"
gcc-unwrapped.priority = 5
[vars]
CUDA_VERSION = "12.8"
CUDA_HOME = "$FLOX_ENV"
CUDA Math Libraries
# team/cuda-math
[include]
environments = [{ remote = "team/cuda-base" }]
[install]
libcublas.pkg-path = "flox-cuda/cudaPackages_12_8.libcublas"
libcublas.priority = 2
libcublas.systems = ["aarch64-linux", "x86_64-linux"]
libcufft.pkg-path = "flox-cuda/cudaPackages_12_8.libcufft"
libcufft.priority = 2
libcufft.systems = ["aarch64-linux", "x86_64-linux"]
libcurand.pkg-path = "flox-cuda/cudaPackages_12_8.libcurand"
libcurand.priority = 2
libcurand.systems = ["aarch64-linux", "x86_64-linux"]
CUDA Debugging Tools
# team/cuda-debug
[install]
cuda-gdb.pkg-path = "flox-cuda/cudaPackages_12_8.cuda-gdb"
cuda-gdb.systems = ["aarch64-linux", "x86_64-linux"]
nsight-systems.pkg-path = "flox-cuda/cudaPackages_12_8.nsight-systems"
nsight-systems.systems = ["aarch64-linux", "x86_64-linux"]
[vars]
CUDA_LAUNCH_BLOCKING = "1" # Synchronous kernel launches for debugging
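Typical usage once the tools are installed (a sketch; it assumes the packages expose the standard cuda-gdb and nsys executables, and ./my_cuda_app is a placeholder for your own binary):
# Interactive debugging of a CUDA binary
cuda-gdb ./my_cuda_app
# Profile a run and write a report that Nsight Systems can open
nsys profile -o my_report ./my_cuda_app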
Layer for Development
# Base CUDA environment
flox activate -r team/cuda-base
# Add debugging tools when needed
flox activate -r team/cuda-base -- flox activate -r team/cuda-debug
Testing CUDA Installation
Verify CUDA Compiler
nvcc --version
Check GPU Availability
nvidia-smi
Compile Test Program
cat > hello_cuda.cu << 'EOF'
#include <stdio.h>
__global__ void hello() {
    printf("Hello from GPU!\n");
}

int main() {
    hello<<<1,1>>>();
    cudaDeviceSynchronize();
    return 0;
}
EOF
nvcc hello_cuda.cu -o hello_cuda
./hello_cuda
Test PyTorch CUDA
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")
if torch.cuda.is_available():
    print(f"GPU name: {torch.cuda.get_device_name(0)}")
Best Practices
Always Use Priority Values
CUDA packages have predictable LICENSE conflicts; assign explicit priorities to each one.
Version Consistency
Use specific versions (e.g., _12_8) for reproducibility. Don't mix CUDA versions.
Modular Design
Split base CUDA, math libs, and debugging into separate environments for flexibility
Test Compilation
Verify nvcc hello.cu -o hello works after setup
Platform Constraints
Always include systems = ["aarch64-linux", "x86_64-linux"]
Memory Management
Set appropriate CUDA memory allocator configs:
[vars]
PYTORCH_CUDA_ALLOC_CONF = "max_split_size_mb:128"
CUDA_LAUNCH_BLOCKING = "0" # Async by default
Common CUDA Gotchas
CUDA Toolkit ≠ Complete Toolkit
The cudatoolkit package doesn't include all libraries. Add what you need:
- libcublas for linear algebra
- libcufft for FFT
- cudnn for deep learning
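For example, the missing libraries can be added alongside the toolkit (a sketch following the package patterns above; confirm the attribute names with flox search):
flox install flox-cuda/cudaPackages_12_8.libcublas   # linear algebra
flox install flox-cuda/cudaPackages_12_8.libcufft    # FFT
flox install flox-cuda/cudaPackages_12_8.cudnn_9_11  # deep learning primitives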
License Conflicts
Every CUDA package may need explicit priority due to LICENSE file conflicts
No macOS Support
CUDA is Linux-only. Use Metal-accelerated packages on Darwin when available
Version Mixing
Don't mix CUDA versions. Use consistent _X_Y suffixes across all CUDA packages
Python Virtual Environments
CUDA Python packages (PyTorch, TensorFlow) should be installed in venv with correct CUDA version
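A minimal sketch of that pattern with uv, matching the cu129 index used in the hooks above (swap the index URL to match your installed CUDA version):
venv="$FLOX_ENV_CACHE/venv"
[ -d "$venv" ] || uv venv "$venv" --python python3   # create the venv once
uv pip install --python "$venv/bin/python" \
  torch torchvision --index-url https://download.pytorch.org/whl/cu129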
Driver Requirements
Ensure NVIDIA driver supports your CUDA version. Check with nvidia-smi
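A quick check using standard nvidia-smi options; note that the CUDA version nvidia-smi reports is the highest the driver supports, not necessarily the toolkit you installed:
# Driver version installed on the host
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Full summary; the banner shows the highest CUDA version the driver supports
nvidia-smi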
Troubleshooting
CUDA Not Found
# Check CUDA_HOME
echo $CUDA_HOME
# Check nvcc
which nvcc
nvcc --version
# Check library paths
echo $LD_LIBRARY_PATH
PyTorch Not Using GPU
import torch
print(torch.cuda.is_available()) # Should be True
print(torch.version.cuda) # Should match your CUDA version
# If False, reinstall with correct CUDA version
# uv pip install torch --index-url https://download.pytorch.org/whl/cu129
Compilation Errors
# Check gcc/g++ version
gcc --version
g++ --version
# Ensure gcc-unwrapped is installed
flox list | grep gcc-unwrapped
# Check include paths
echo $CPATH
echo $LIBRARY_PATH
Runtime Errors
# Check GPU visibility
echo $CUDA_VISIBLE_DEVICES
# Check for GPU
nvidia-smi
# Run with debug output
CUDA_LAUNCH_BLOCKING=1 python my_script.py
Related Skills
- flox-environments - Setting up development environments
- flox-sharing - Composing CUDA base with project environments
- flox-containers - Containerizing CUDA environments for deployment
- flox-services - Running CUDA workloads as services
Quick Install
/plugin add https://github.com/flox/flox-agentic/tree/main/flox-cuda
Copy and paste this command into Claude Code to install the skill.
