nemo-guardrails
About
NeMo Guardrails is a runtime safety framework for LLM applications that adds programmable guardrails. It provides key safety features like jailbreak detection, input/output validation, and hallucination detection using the Colang 2.0 DSL. Use it to enforce safety and compliance rules in production LLM deployments.
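The input/output validation idea described above can be sketched in plain Python. This is a conceptual illustration only: it does not use the NeMo Guardrails API or Colang, and every name in it (`input_rail`, `output_rail`, `guarded_generate`, `echo_llm`, the pattern lists) is hypothetical. It assumes the model is any callable that maps a prompt string to a reply string.

```python
import re
from typing import Callable

# Hypothetical patterns for illustration; real jailbreak detection is far more robust.
JAILBREAK_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"pretend you have no (rules|restrictions)", re.IGNORECASE),
]

# Hypothetical markers of content that should never leave the application.
BLOCKED_OUTPUT_TERMS = ["ssn:", "password:"]

REFUSAL = "I'm sorry, I can't help with that."


def input_rail(user_message: str) -> bool:
    """Return True if the user message passes the input check."""
    return not any(p.search(user_message) for p in JAILBREAK_PATTERNS)


def output_rail(bot_message: str) -> bool:
    """Return True if the model reply passes the output check."""
    lowered = bot_message.lower()
    return not any(term in lowered for term in BLOCKED_OUTPUT_TERMS)


def guarded_generate(user_message: str, llm: Callable[[str], str]) -> str:
    """Wrap a model call with an input check before it and an output check after it."""
    if not input_rail(user_message):
        return REFUSAL
    reply = llm(user_message)
    if not output_rail(reply):
        return REFUSAL
    return reply


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: no network or API key needed)."""
    return f"You said: {prompt}"
```

Calling `guarded_generate("What is the capital of France?", echo_llm)` passes both checks and returns the stand-in model's reply, while a message matching one of the jailbreak patterns gets the refusal text without the model being called at all.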
Quick Install
Claude Code
Recommended:
npx skills add davila7/claude-code-templates -a claude-code
Alternatives:
/plugin add https://github.com/davila7/claude-code-templates
git clone https://github.com/davila7/claude-code-templates.git ~/.claude/skills/nemo-guardrails
Copy and paste one of these commands to install this skill; the /plugin command is run inside Claude Code.
GitHub Repository
Related Skills
huggingface-tokenizers
Documents
This skill provides high-performance tokenization using HuggingFace's Rust-based library, processing 1GB of text in under 20 seconds. It supports BPE, WordPiece, and Unigram algorithms while enabling custom tokenizer training and alignment tracking. Use it when you need production-fast tokenization or to build custom tokenizers integrated with the transformers ecosystem.
qdrant-vector-search
Meta
The qdrant-vector-search skill provides a high-performance vector similarity search engine for building production RAG systems. It enables fast nearest neighbor search, hybrid search with filtering, and scalable vector storage powered by Rust. Use it when you need low-latency semantic search with horizontal scaling capabilities and full data control.
crewai-multi-agent
Meta
CrewAI is a lightweight multi-agent orchestration framework for building teams of specialized AI agents that collaborate autonomously on complex tasks. It enables role-based agent collaboration with memory and supports sequential or hierarchical workflows for production use. The framework is built without LangChain dependencies for lean, fast execution.
training-llms-megatron
Design
This skill trains massive LLMs (2B-462B parameters) using NVIDIA's Megatron-Core framework for maximum GPU efficiency. Use it when training models over 1B parameters that need advanced parallelism such as tensor, pipeline, or expert parallelism. It's a production-ready framework proven on models like Nemotron and LLaMA.
