audiocraft-audio-generation
About
This Claude Skill provides text-to-music and text-to-audio generation using Meta's AudioCraft PyTorch library. It lets developers generate music from text descriptions, create sound effects, and perform melody-conditioned music generation. Core capabilities include using the MusicGen and AudioGen models for controllable, high-quality stereo audio output.
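As a rough illustration of the text-to-music path described above, the following minimal sketch calls MusicGen through the AudioCraft Python package. It assumes that `audiocraft` and PyTorch are installed and that the `facebook/musicgen-small` checkpoint is a suitable choice. The helper names `output_stems` and `generate_music` are hypothetical and not part of the skill or the library.

```python
from typing import List


def output_stems(prefix: str, count: int) -> List[str]:
    """Return one file stem per generated clip, e.g. 'clip_0', 'clip_1'."""
    return [f"{prefix}_{i}" for i in range(count)]


def generate_music(descriptions: List[str], duration: float = 8.0,
                   prefix: str = "clip") -> List[str]:
    """Generate one clip per text description and write each to disk."""
    # Imported lazily so output_stems works without audiocraft installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    # Assumption: the small checkpoint; larger ones need more GPU memory.
    model = MusicGen.get_pretrained("facebook/musicgen-small")
    model.set_generation_params(duration=duration)  # clip length in seconds

    # Returns a tensor shaped [batch, channels, samples].
    wavs = model.generate(descriptions)

    stems = output_stems(prefix, len(descriptions))
    for stem, wav in zip(stems, wavs):
        # audio_write appends the file extension and normalizes loudness.
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
    return stems
```

Calling `generate_music(["lo-fi beat with soft piano"])` would, under these assumptions, download the checkpoint on first use and write `clip_0.wav` to the working directory.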
Quick Install
Claude Code
Recommended:
npx skills add davila7/claude-code-templates -a claude-code
Other options:
/plugin add https://github.com/davila7/claude-code-templates
git clone https://github.com/davila7/claude-code-templates.git ~/.claude/skills/audiocraft-audio-generation
Copy the command and paste it into Claude Code to install this skill.
GitHub Repository
Related Skills
blip-2-vision-language
Category: Design
BLIP-2 is a vision-language framework that connects a frozen image encoder with a large language model for multimodal tasks. Use it for zero-shot image captioning, visual question answering, or image-text retrieval without task-specific fine-tuning. It's ideal for developers who need to add state-of-the-art visual understanding to LLM-based applications.
stable-diffusion-image-generation
Category: Meta
This skill enables text-to-image generation and image manipulation using Stable Diffusion via HuggingFace Diffusers. It supports image generation from prompts, image-to-image translation, inpainting, and custom pipeline creation. Developers should use it when building applications that require AI-powered visual content generation or editing.
whisper
Category: Other
Whisper is OpenAI's multilingual speech recognition model for transcription and translation across 99 languages. It handles tasks like speech-to-text, podcast transcription, and processing noisy or multilingual audio. Developers should use it for robust, production-ready automatic speech recognition (ASR).
segment-anything-model
Category: Meta
The segment-anything-model skill performs zero-shot image segmentation, allowing developers to isolate objects using prompts like points or bounding boxes, or to automatically generate all object masks. It's ideal for building annotation tools, generating training data, or processing images in new domains without task-specific training. Key capabilities include handling interactive prompts and providing strong out-of-the-box performance for various computer vision pipelines.
