fine-tuning-with-trl

Name: fine-tuning-with-trl
Author: zechenzhangAGI

Updated 2 months ago

219 views

Other (tag: ai)

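About

This skill lets developers fine-tune LLMs with TRL's reinforcement learning pipeline. It covers SFT for instruction tuning, DPO for preference alignment, and PPO for reward optimization. It is designed for implementing RLHF workflows and aligning models with human preferences. The skill integrates directly with the HuggingFace ecosystem for seamless model training.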
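As a rough illustration of what "DPO for preference alignment" optimizes, the sketch below computes the DPO loss for a single preference pair in plain Python. It is not TRL's code: the function name `dpo_loss`, the `beta` default, and the log-probability values are all hypothetical, chosen only to show the shape of the objective. In practice TRL's trainers compute this internally from a policy model and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given summed sequence log-probabilities."""
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the frozen reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities, for illustration only.
no_preference = dpo_loss(-12.0, -12.0, -12.0, -12.0)
prefers_chosen = dpo_loss(-10.0, -14.0, -12.0, -12.0)
print(no_preference, prefers_chosen)
```

When the policy matches the reference model, the margin is zero and the loss equals log 2. The loss falls as the policy shifts probability toward the chosen response and away from the rejected one.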

Quick install

Claude Code

Recommended

Main

npx skills add zechenzhangAGI/AI-research-SKILLs -a claude-code

Alternative: plugin command

/plugin add https://github.com/zechenzhangAGI/AI-research-SKILLs

Alternative: git clone

git clone https://github.com/zechenzhangAGI/AI-research-SKILLs.git ~/.claude/skills/fine-tuning-with-trl

Copy and paste this command into Claude Code to install the skill.

GitHub repository

zechenzhangAGI/AI-research-SKILLs

Path: 06-post-training/trl-fine-tuning

Tags: ai, ai-research, claude, claude-code, claude-skills, codex

FAQ

What is the fine-tuning-with-trl skill?

fine-tuning-with-trl is a Claude Skill by zechenzhangAGI. Skills package instructions and resources that Claude loads on demand, so Claude can handle TRL fine-tuning tasks without extra prompting.

How do I install fine-tuning-with-trl?

Use the install commands on this page: add fine-tuning-with-trl to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does fine-tuning-with-trl belong to?

fine-tuning-with-trl is in the Other category, tagged ai.

Is fine-tuning-with-trl free to use?

Yes. fine-tuning-with-trl is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.

Related skills

llamaguard

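Other

LlamaGuard is Meta's 7-8 billion parameter model for moderating LLM inputs and outputs across six safety categories, such as violence and hate speech. It offers 94-95% accuracy and can be deployed with vLLM, Hugging Face, or Amazon SageMaker. Use this skill to integrate content filtering and safeguards into AI applications.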

cost-optimization

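Other

This Claude skill helps developers optimize cloud costs through resource right-sizing, tagging strategies, and spend analysis. It provides a framework for reducing cloud spend and enforcing cost governance across AWS, Azure, and GCP. Use it when you need to analyze infrastructure costs, right-size resources, or work within budget constraints.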

sports-betting-analyzer

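Other

This Claude skill analyzes sports betting markets (spreads, over/unders, prop bets, and similar) and identifies value bets by examining historical trends and situational statistics. It outputs actionable suggestions for educational purposes as structured Markdown. Developers can use it as a sports betting analysis tool, but note that it is limited to entertainment and educational purposes.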

quantizing-models-bitsandbytes

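Other

This skill uses bitsandbytes to quantize LLMs to 8-bit or 4-bit precision, cutting memory use by 50-75% with minimal accuracy loss. It is well suited to running larger models on limited GPU memory or speeding up inference, and supports formats such as INT8, NF4, and FP4. It integrates with HuggingFace Transformers and enables QLoRA training and 8-bit optimizers.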