torchserve
About
TorchServe is a production-ready model serving engine for PyTorch that packages models into MAR files and serves them via REST/gRPC APIs. It's ideal when you need custom preprocessing/inference logic via Python handlers and automatic multi-GPU worker scaling. Use it to handle request batching, load balancing, and multiple model versions in deployment.
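The handler idea mentioned above can be illustrated with a minimal sketch. It assumes the common class-based handler shape (an initialize method plus a handle method that receives a batch of requests and returns one response per request). The class name EchoLengthHandler and the toy "model" are hypothetical, and the sketch deliberately does not import TorchServe, so it runs anywhere.

```python
import json


class EchoLengthHandler:
    """Hypothetical handler sketch; mirrors the initialize/handle shape
    but is not tied to the TorchServe package."""

    def __init__(self):
        self.initialized = False

    def initialize(self, context):
        # A real handler would load model weights here using the context.
        # This sketch ignores the context entirely.
        self.initialized = True

    def preprocess(self, requests):
        # Assumption: each request in the batch is a dict carrying its
        # payload under "data" or "body", possibly as raw bytes.
        texts = []
        for req in requests:
            payload = req.get("data") or req.get("body")
            if payload is None:
                payload = ""
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            texts.append(payload)
        return texts

    def inference(self, texts):
        # Stand-in for a model forward pass: return the text length.
        return [len(t) for t in texts]

    def postprocess(self, outputs):
        # One response per request, in the same order as the batch.
        return [json.dumps({"length": n}) for n in outputs]

    def handle(self, data, context):
        if not self.initialized:
            self.initialize(context)
        return self.postprocess(self.inference(self.preprocess(data)))


if __name__ == "__main__":
    handler = EchoLengthHandler()
    batch = [{"body": b"hello"}, {"data": "hi"}]
    print(handler.handle(batch, None))
```

Splitting the work into preprocess, inference, and postprocess keeps each stage testable on its own, and returning a list of the same length as the input batch is what lets batched requests be mapped back to their callers.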
Quick Install
Claude Code
Recommended: npx skills add cuba6112/skillfactory -a claude-code
/plugin add https://github.com/cuba6112/skillfactory
git clone https://github.com/cuba6112/skillfactory.git ~/.claude/skills/torchserve
Copy and paste one of these commands to install this skill in Claude Code.
GitHub Repository
Related Skills
qmd
Development
qmd is a local search and indexing CLI tool that enables developers to index and search through local files using hybrid search combining BM25, vector embeddings, and reranking. It supports both command-line usage and MCP (Model Context Protocol) mode for integration with Claude. The tool uses Ollama for embeddings and stores indexes locally, making it ideal for searching documentation or codebases directly from the terminal.
subagent-driven-development
Development
This skill executes implementation plans by dispatching a fresh subagent for each independent task, with code review between tasks. It enables fast iteration while maintaining quality gates through this review process. Use it when working on mostly independent tasks within the same session to ensure continuous progress with built-in quality checks.
mcporter
Development
The mcporter skill enables developers to manage and call Model Context Protocol (MCP) servers directly from Claude. It provides commands to list available servers, call their tools with arguments, and handle authentication and daemon lifecycle. Use this skill for integrating and testing MCP server functionality in your development workflow.
adk-deployment-specialist
Development
This skill deploys and orchestrates Vertex AI ADK agents using A2A protocol, managing AgentCard discovery, task submission, and supporting tools like Code Execution Sandbox and Memory Bank. It enables building multi-agent systems with sequential, parallel, or loop orchestration patterns in Python, Java, or Go. Use it when asked to deploy ADK agents or orchestrate agent workflows on Google Cloud.
