huggingface-tokenizers

zechenzhangAGI
Updated 28 days ago

About

This Claude Skill provides high-performance tokenization using Rust-based implementations that can process 1 GB of text in under 20 seconds. It supports popular algorithms such as BPE, WordPiece, and Unigram, and it enables custom vocabulary training and alignment tracking between tokens and the original text. Use it when you need fast, production-ready tokenization or want to train custom tokenizers for NLP pipelines.
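As a rough illustration of the custom training and alignment tracking mentioned above, the following is a minimal sketch using the Hugging Face `tokenizers` Python package. It assumes the package is installed (`pip install tokenizers`); the toy corpus, the vocabulary size, and the `[UNK]` special token are illustrative choices, not settings taken from the skill itself.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Toy in-memory corpus (hypothetical data, for illustration only).
corpus = [
    "fast tokenizers train fast",
    "tokenizers track offsets",
    "custom tokenizers for pipelines",
]

# Build an untrained BPE tokenizer that splits on whitespace and punctuation first.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Train a small custom vocabulary from the corpus.
trainer = BpeTrainer(vocab_size=100, special_tokens=["[UNK]"], show_progress=False)
tokenizer.train_from_iterator(corpus, trainer=trainer)

# Encode a string and map every token back to its character span in the input.
text = "fast tokenizers"
encoding = tokenizer.encode(text)
for token, (start, end) in zip(encoding.tokens, encoding.offsets):
    print(f"{token!r} -> text[{start}:{end}] = {text[start:end]!r}")
```

The `offsets` attribute is what makes alignment tracking possible: each token carries the start and end character positions of the span it came from, so downstream code can point back into the original string.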

Quick Install

Claude Code

Primary (Recommended)
npx skills add zechenzhangAGI/AI-research-SKILLs -a claude-code
Plugin Command (Alternative)
/plugin add https://github.com/zechenzhangAGI/AI-research-SKILLs
Git Clone (Alternative)
git clone https://github.com/zechenzhangAGI/AI-research-SKILLs.git ~/.claude/skills/huggingface-tokenizers

Copy and paste this command in Claude Code to install this skill

GitHub Repository

zechenzhangAGI/AI-research-SKILLs
Path: 02-tokenization/huggingface-tokenizers
Tags: ai, ai-research, claude, claude-code, claude-skills, codex