runpod

Name: runpod
Author: digitalsamba


About

This Claude skill provides cloud GPU processing through RunPod's serverless platform for running AI models such as image editing, upscaling, and text-to-speech. It handles endpoint setup, Docker deployment, and resource management, and covers five tool-specific images. Developers should use it when they need pay-per-second GPU access with no minimum commitment.

Quick install

Claude Code (recommended)

Primary command

npx skills add digitalsamba/claude-code-video-toolkit -a claude-code

Plugin command (alternative)

/plugin add https://github.com/digitalsamba/claude-code-video-toolkit

Git clone (alternative)

git clone https://github.com/digitalsamba/claude-code-video-toolkit.git ~/.claude/skills/runpod

Copy and paste this command into Claude Code to install this skill.

Documentation

RunPod Cloud GPU

Run open-source AI models on cloud GPUs via RunPod serverless. Pay-per-second, no minimums.

Setup

# 1. Create account at https://runpod.io
# 2. Add API key to .env
echo "RUNPOD_API_KEY=your_key_here" >> .env

# 3. Deploy any tool with --setup
python tools/image_edit.py --setup
python tools/upscale.py --setup
python tools/dewatermark.py --setup
python tools/sadtalker.py --setup
python tools/qwen3_tts.py --setup

Each --setup command:

Creates a RunPod template from the Docker image
Creates a serverless endpoint with appropriate GPU
Saves the endpoint ID to .env (e.g. RUNPOD_QWEN_EDIT_ENDPOINT_ID)

Available Images

All images are public on GHCR — no authentication needed.

Tool	Docker Image	GPU	VRAM	Typical Cost
image_edit	`ghcr.io/conalmullan/video-toolkit-qwen-edit:latest`	A6000/L40S	48GB+	~$0.05-0.15/job
upscale	`ghcr.io/conalmullan/video-toolkit-realesrgan:latest`	RTX 3090/4090	24GB	~$0.01-0.05/job
dewatermark	`ghcr.io/conalmullan/video-toolkit-propainter:latest`	RTX 3090/4090	24GB	~$0.05-0.30/job
sadtalker	`ghcr.io/conalmullan/video-toolkit-sadtalker:latest`	RTX 4090	24GB	~$0.05-0.15/job
qwen3_tts	`ghcr.io/conalmullan/video-toolkit-qwen3-tts:latest`	ADA 24GB	24GB	~$0.01-0.05/job

Total monthly cost: Rarely exceeds $10 even with heavy use.

How It Works

All tools follow the same pattern:

Local CLI → Upload input to cloud storage → RunPod API → Poll for result → Download output

File transfer: Tools use Cloudflare R2 when configured (R2_ACCOUNT_ID, R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_BUCKET_NAME), falling back to free upload services
RunPod API: Tools call the /run endpoint, then poll /status/{job_id} until complete
Cold vs warm start: First request after idle spins up a worker (~30-90s). Subsequent requests are fast (~5-15s)
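The submit-then-poll step above can be sketched in Python as follows. This is a minimal illustration, not the toolkit's own code: the `{"input": ...}` request body, the `id` field in the `/run` response, and the set of terminal status strings are assumptions about RunPod's serverless API that should be checked against its current documentation. The `request` and `sleep` parameters are hypothetical hooks that make the loop easy to test without network access.

```python
import json
import time
import urllib.request

API_BASE = "https://api.runpod.ai/v2"  # REST base URL given in this document

# Assumed terminal job states; verify against RunPod's documentation.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def http_json(method, url, api_key, payload=None):
    """Send one authenticated JSON request and return the decoded response."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def run_job(endpoint_id, api_key, job_input, request=http_json,
            poll_seconds=5.0, sleep=time.sleep):
    """Submit a job to /run, then poll /status/{job_id} until it finishes."""
    job = request("POST", f"{API_BASE}/{endpoint_id}/run", api_key,
                  {"input": job_input})
    job_id = job["id"]  # assumed response field
    while True:
        status = request("GET", f"{API_BASE}/{endpoint_id}/status/{job_id}",
                         api_key)
        if status.get("status") in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
```

A real caller would also bound the total wait. The loop here is left unbounded to keep the sketch short.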

Endpoint Management

Workers

workersMin: 0    — Scale to zero when idle (no cost)
workersMax: 1    — Max concurrent jobs (increase for throughput)
idleTimeout: 5   — Seconds before worker scales down

Across all endpoints, you share a total worker pool based on your RunPod plan. If you hit limits, reduce workersMax on endpoints you're not actively using.

Checking Endpoint Status

Each tool stores its endpoint ID in .env:

Tool	Env Var
image_edit	`RUNPOD_QWEN_EDIT_ENDPOINT_ID`
upscale	`RUNPOD_UPSCALE_ENDPOINT_ID`
dewatermark	`RUNPOD_DEWATERMARK_ENDPOINT_ID`
sadtalker	`RUNPOD_SADTALKER_ENDPOINT_ID`
qwen3_tts	`RUNPOD_QWEN3_TTS_ENDPOINT_ID`
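As a quick way to see which tools have been set up, the variables in the table above can be read back from `.env`. The sketch below assumes the file uses plain `KEY=value` lines, as in the Setup step. The helper names `parse_env` and `configured_endpoints` are illustrative and not part of the toolkit.

```python
from pathlib import Path

# Tool -> env var mapping copied from the table above.
ENDPOINT_VARS = {
    "image_edit": "RUNPOD_QWEN_EDIT_ENDPOINT_ID",
    "upscale": "RUNPOD_UPSCALE_ENDPOINT_ID",
    "dewatermark": "RUNPOD_DEWATERMARK_ENDPOINT_ID",
    "sadtalker": "RUNPOD_SADTALKER_ENDPOINT_ID",
    "qwen3_tts": "RUNPOD_QWEN3_TTS_ENDPOINT_ID",
}


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_endpoints(env_text):
    """Return {tool: endpoint_id} for tools whose variable is set and non-empty."""
    env = parse_env(env_text)
    return {tool: env[var] for tool, var in ENDPOINT_VARS.items() if env.get(var)}


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text() if env_file.exists() else ""
    for tool, endpoint_id in configured_endpoints(text).items():
        print(f"{tool}: {endpoint_id}")
```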

Disabling an Endpoint

To free worker slots without deleting the endpoint, set workersMax=0 via the RunPod dashboard or GraphQL API.

RunPod API Reference

Use these to query and manage endpoints programmatically. RunPod disables GraphQL introspection, so field names cannot be discovered from the schema. The names below have been verified and must be used exactly as written.

Authentication

All API calls require Authorization: Bearer $RUNPOD_API_KEY.

GraphQL: POST https://api.runpod.io/graphql
REST (Serverless): https://api.runpod.ai/v2/{endpoint_id}/...

GraphQL Queries

List all endpoints:

query { myself { endpoints { id name gpuIds templateId workersMax workersMin } } }

Current spend rate:

query { myself { currentSpendPerHr spendDetails { localStoragePerHour networkStoragePerHour gpuComputePerHour } } }

List pods:

query { myself { pods { id name runtime { uptimeInSeconds } machine { gpuDisplayName } desiredStatus } } }

Common mistakes: Field names are camelCase but abbreviate inconsistently. The spendDetails fields spell out Hour (localStoragePerHour, not localStoragePerHr), while currentSpendPerHr abbreviates it. Endpoints are endpoints, not serverlessWorkers. spending is not a field; use currentSpendPerHr and spendDetails.
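The queries above can be sent from Python with the standard library alone. The sketch below uses the GraphQL URL and Bearer header from the Authentication section. The `{"query": ...}` JSON body and the `errors` key in the response follow the usual GraphQL-over-HTTP convention, which is assumed here rather than confirmed for RunPod. The function names are illustrative.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.runpod.io/graphql"  # from the Authentication section

LIST_ENDPOINTS = (
    "query { myself { endpoints "
    "{ id name gpuIds templateId workersMax workersMin } } }"
)


def build_graphql_request(query, api_key):
    """Build (but do not send) an authenticated GraphQL POST request."""
    req = urllib.request.Request(
        GRAPHQL_URL,
        data=json.dumps({"query": query}).encode(),
        method="POST",
    )
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Content-Type", "application/json")
    return req


def graphql(query, api_key):
    """Send the query and return the 'data' object, raising on GraphQL errors."""
    req = build_graphql_request(query, api_key)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    if body.get("errors"):
        # A misspelled field name typically surfaces here.
        raise RuntimeError(body["errors"])
    return body["data"]
```

For example, `graphql(LIST_ENDPOINTS, api_key)["myself"]["endpoints"]` would yield the endpoint list if the response has the shape the query implies.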

GraphQL Mutations

Update endpoint GPU or config:

mutation { saveEndpoint(input: {
  id: "endpoint_id",
  name: "endpoint-name",
  templateId: "template_id",
  gpuIds: "AMPERE_24",
  workersMin: 0,
  workersMax: 1
}) { id gpuIds } }

saveEndpoint requires name and templateId even for updates — query first to get current values.
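Because every update must resend name and templateId, one approach is to build the mutation from the endpoint record returned by the list query and override only the fields being changed. The sketch below does this as plain string construction. `save_endpoint_mutation` is a hypothetical helper, and it assumes the six input fields shown in the mutation above are sufficient.

```python
import json

# Input fields taken from the saveEndpoint example above.
INPUT_FIELDS = ("id", "name", "templateId", "gpuIds", "workersMin", "workersMax")


def save_endpoint_mutation(current, **changes):
    """Build a saveEndpoint mutation that carries over the required fields."""
    fields = {key: current[key] for key in INPUT_FIELDS}
    fields.update(changes)
    # json.dumps yields double-quoted strings and bare integers, which are
    # also valid GraphQL literals for these simple values.
    args = ", ".join(f"{key}: {json.dumps(value)}" for key, value in fields.items())
    return f"mutation {{ saveEndpoint(input: {{ {args} }}) {{ id gpuIds }} }}"
```

Calling `save_endpoint_mutation(endpoint, workersMax=0)` on a record from the list query produces the mutation that disables the endpoint while keeping its name, template, and GPU selection unchanged.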

REST API (Serverless)

Action	Method	URL
Submit job	POST	`/v2/{id}/run`
Check status	GET	`/v2/{id}/status/{job_id}`
Cancel job	POST	`/v2/{id}/cancel/{job_id}`
List pending	GET	`/v2/{id}/requests`
Health/stats	GET	`/v2/{id}/health`

Health response includes job counts and worker state:

{
  "jobs": { "completed": 16, "failed": 1, "inProgress": 0, "inQueue": 2, "retried": 0 },
  "workers": { "idle": 0, "initializing": 1, "ready": 0, "running": 0, "throttled": 0 }
}
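A health response like the one above can be reduced to a short diagnosis. The following sketch uses only the field names shown in the sample. The state labels ("cold-starting", "throttled", "stalled", "ok") are this example's own vocabulary, not RunPod terminology.

```python
def summarize_health(health):
    """Classify an endpoint from the jobs/workers counts in a /health response."""
    jobs = health.get("jobs", {})
    workers = health.get("workers", {})

    queued = jobs.get("inQueue", 0)
    # Workers that can take, or are already running, a job.
    active = (workers.get("idle", 0) + workers.get("ready", 0)
              + workers.get("running", 0))

    if queued and not active and workers.get("throttled", 0):
        state = "throttled"      # jobs waiting, only throttled workers
    elif queued and not active and workers.get("initializing", 0):
        state = "cold-starting"  # jobs waiting while a worker spins up
    elif queued and not active:
        state = "stalled"        # jobs waiting and no worker at all
    else:
        state = "ok"

    finished = jobs.get("completed", 0) + jobs.get("failed", 0)
    failure_rate = jobs.get("failed", 0) / finished if finished else 0.0
    return {"state": state, "queued": queued, "failure_rate": failure_rate}
```

Applied to the sample response, this reports two queued jobs waiting on a worker that is still initializing, with 1 failure out of 17 finished jobs.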

Note: /requests only returns pending/queued jobs. Completed job history is not available via the API — check the RunPod web console for logs.

GPU Type IDs

ID	GPU	VRAM	Typical Cost
`AMPERE_24`	RTX 3090	24GB	~$0.34/hr
`ADA_24`	RTX 4090	24GB	~$0.69/hr
`AMPERE_48`	A6000	48GB	~$0.76/hr
`AMPERE_80`	A100	80GB	~$1.99/hr
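Since billing is per second, the hourly rates in the table translate into per-job costs by simple proportion. The sketch below treats the approximate table rates as exact and counts only the seconds a worker spends on the job. Any time billed while a worker sits idle before scaling down is not included.

```python
# Approximate hourly rates (USD) copied from the table above.
HOURLY_USD = {
    "AMPERE_24": 0.34,
    "ADA_24": 0.69,
    "AMPERE_48": 0.76,
    "AMPERE_80": 1.99,
}


def job_cost(gpu_id, seconds):
    """Estimated cost in USD of running one worker for the given seconds."""
    return HOURLY_USD[gpu_id] / 3600 * seconds


if __name__ == "__main__":
    # A 90-second run on an A6000 (AMPERE_48) comes to about $0.019.
    print(f"${job_cost('AMPERE_48', 90):.3f}")
```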

Availability note: ADA_24 (4090) is frequently throttled/unavailable on RunPod. Always configure endpoints with multiple fallback GPU types (comma-separated) to avoid jobs getting stuck in queue indefinitely:

gpuIds: "AMPERE_24,ADA_24"   # Try 3090 first, fall back to 4090

All toolkit tools also enforce a 5-minute queue timeout — if no GPU is available within 300 seconds, the job is automatically cancelled to prevent runaway billing from failed initialization cycles.

Cloudflare R2 via AWS CLI

R2 uses the S3-compatible API but requires --region auto:

AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY_ID" \
AWS_SECRET_ACCESS_KEY="$R2_SECRET_ACCESS_KEY" \
aws s3api list-objects-v2 \
  --bucket "$R2_BUCKET_NAME" \
  --endpoint-url "https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com" \
  --region auto

Common mistake: Omitting --region auto causes an InvalidRegionName error. Valid R2 regions are wnam, enam, weur, eeur, apac, oc, and auto.
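When the same listing needs to be run from a Python script, the command above can be assembled as an argument list and passed to `subprocess.run`. The sketch below only builds the command and environment. It mirrors the shell example and uses the four R2 variable names from this document. `r2_list_command` is an illustrative helper name.

```python
import os

REQUIRED = ("R2_ACCOUNT_ID", "R2_ACCESS_KEY_ID",
            "R2_SECRET_ACCESS_KEY", "R2_BUCKET_NAME")


def r2_list_command(env):
    """Return (argv, child_env) for listing the R2 bucket with the AWS CLI."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError(f"missing R2 settings: {', '.join(missing)}")
    argv = [
        "aws", "s3api", "list-objects-v2",
        "--bucket", env["R2_BUCKET_NAME"],
        "--endpoint-url",
        f"https://{env['R2_ACCOUNT_ID']}.r2.cloudflarestorage.com",
        "--region", "auto",  # required by R2, as noted above
    ]
    # The AWS CLI reads credentials from these two variables.
    child_env = {
        **os.environ,
        "AWS_ACCESS_KEY_ID": env["R2_ACCESS_KEY_ID"],
        "AWS_SECRET_ACCESS_KEY": env["R2_SECRET_ACCESS_KEY"],
    }
    return argv, child_env
```

A caller would then run `subprocess.run(argv, env=child_env, check=True)`, which requires the AWS CLI to be installed.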

Troubleshooting

Force Image Pull

When you push a new Docker image version, RunPod may still use the cached old one. To force a pull:

Update the template's imageName to use @sha256:DIGEST notation
Wait for the worker to restart
Revert to :latest tag after confirming
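The first step above swaps the tag for a digest. A small helper can do that rewrite safely, including for registries whose host contains a port. This is a sketch of the string manipulation only. `pin_to_digest` is a hypothetical name, and obtaining the digest (for example from the output of `docker push`) is left to the caller.

```python
def pin_to_digest(image, digest):
    """Replace an image's tag (if any) with an @sha256:... digest reference."""
    if not digest.startswith("sha256:"):
        raise ValueError("digest must look like 'sha256:<hex>'")
    name = image.split("@", 1)[0]  # drop any existing digest
    # A colon after the last slash is a tag separator; an earlier one
    # belongs to a registry port such as localhost:5000.
    if name.rfind(":") > name.rfind("/"):
        name = name[:name.rfind(":")]
    return f"{name}@{digest}"
```

For instance, `pin_to_digest("ghcr.io/conalmullan/video-toolkit-qwen3-tts:latest", "sha256:DIGEST")` returns `ghcr.io/conalmullan/video-toolkit-qwen3-tts@sha256:DIGEST`, which is the form to put in the template's imageName.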

Cold Start Too Slow

qwen3-tts: ~70s cold start, ~7s warm
sadtalker: ~60s cold start, ~10s warm
image_edit: ~90s cold start, ~15s warm

If cold starts are a problem, set workersMin: 1 (costs money when idle).

Job Fails with OOM

The model needs more VRAM than the GPU provides. Options:

Use a larger GPU tier
For dewatermark: reduce --resize-ratio (default 0.5 for safety)
For image_edit: reduce --steps

"No workers available"

You've hit your plan's concurrent worker limit. Either:

Wait for a running job to finish
Set workersMax=0 on endpoints you're not using
Upgrade your RunPod plan

Docker Images

All Dockerfiles live in docker/runpod-*/. Images use runpod/pytorch as the base to share layers across tools.

Building for RunPod (from Apple Silicon Mac):

docker buildx build --platform linux/amd64 -t ghcr.io/conalmullan/video-toolkit-<name>:latest docker/runpod-<name>/
docker push ghcr.io/conalmullan/video-toolkit-<name>:latest

GHCR packages default to private — you must manually make them public for RunPod to pull them. Go to GitHub > Packages > Package Settings > Change Visibility.

Cost Optimization

Keep workersMin: 0 on all endpoints (scale to zero)
Only deploy endpoints you actively need
Use workersMax=0 to disable idle endpoints without deleting them
Qwen3-TTS is significantly cheaper than ElevenLabs for voiceovers
Check the RunPod dashboard for usage and billing

GitHub repository

digitalsamba/claude-code-video-toolkit

Path: .claude/skills/runpod

Tags: ai-video-generator, claude-code, developer-tools, elevenlabs, open-source, openclaw

FAQ


What is the runpod skill?

runpod is a Claude Skill by digitalsamba. Skills package instructions and resources that Claude loads on demand, so Claude can perform runpod-related tasks without extra prompting.

How do I install runpod?

Use the install commands on this page: add runpod to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does runpod belong to?

runpod is in the Documentation category, tagged ai.

Is runpod free to use?

The skill itself is free: runpod is listed on AIMCP and costs nothing to install. Running the tools it describes does require a RunPod account and API key, and GPU time is billed per second.
