metal
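About
The Metal skill analyzes a codebase and extracts its conceptual architecture as standardized skill, agent, and team definitions. It leaves implementation details aside, captures the project's purpose (WHAT) and roles (WHO), and abstracts its organizational patterns. Use it for onboarding, for bootstrapping an agent system, or for studying a project's DNA for inspiration.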
Quick install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/metal
Copy and paste this command into Claude Code to install the skill.
Documentation
Metal
Extract conceptual DNA of repo → roles + procedures + coordination patterns as generalized agentskills.io defs. Like noble metal from ore, separate IS (essence) from DOES (impl) → reusable skill/agent/team defs capturing organizational genome w/o reproducing codebase.
Use When
- Onboard new codebase → map conceptual architecture before code
- Bootstrap agentic system from existing project — implicit workflows → explicit defs
- Study project's organizational DNA for cross-pollination
- Build skill/agent/team library inspired by reference, no copy
- Project structure reveals creators' mental models + domain expertise
In
- Required: Path to repo/project root
- Required: Purpose statement — why extract? (onboard/bootstrap/study/cross-pollinate)
- Optional: Focus domains (default: all)
- Optional: Output depth: survey (prospect + assay), extract (full), report (extract + written) (default: extract)
- Optional: Max extractions cap (default: 15)
Ore Test
Central quality criterion:
Could concept exist in completely different impl?
YES → metal (essence). Extract. NO → gangue (impl detail). Leave.
Ex: weather app's "integrate external data source" = metal (any third-party fetch). "parse OpenWeatherMap v3 JSON res" = gangue (one API).
Skills = CLASS of task not instance. Agents = ROLE not person. Teams = COORDINATION PATTERN not org chart.
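A rough first-pass filter for the Ore Test can be sketched in Python. This is only a heuristic under stated assumptions: the function name `ore_test` and the `TECH_TERMS` list are hypothetical, and the real test remains a judgment call about whether the concept survives a different impl.

```python
import re

# Hypothetical term list for illustration; a real run would need a list
# tuned to the project being surveyed.
TECH_TERMS = {"openweathermap", "axios", "react", "postgresql", "nextjs"}

def ore_test(concept, tech_terms=TECH_TERMS):
    """Return 'gangue' if the concept names a known technology, else 'metal'."""
    tokens = set(re.findall(r"[a-z0-9.]+", concept.lower()))
    return "gangue" if tokens & tech_terms else "metal"

if __name__ == "__main__":
    print(ore_test("integrate external data source"))
    print(ore_test("parse OpenWeatherMap v3 JSON res"))
```

A "metal" verdict from this filter only means no listed term was found; the concept still needs the manual question applied.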
Do
Step 1: Prospect — Survey Ore Body
Survey repo structure, no judgment. Map terrain before mining.
- Glob tree for shape:
- Source dirs + org pattern (feature/layer/domain)
- Config: package.json, DESCRIPTION, setup.py, Cargo.toml, go.mod, Makefile
- Docs: README.md, CLAUDE.md, CONTRIBUTING.md, architecture
- CI/CD: .github/workflows/, Dockerfile, deploy configs
- Tests + structure
- Read self-description (README, manifest) → declared purpose
- Count files by type/lang → scope + primary tech
- Boundary — begins/ends, deps vs. provides
- Prospect Report:
Project: [name]
Declared Purpose: [from README/manifest]
Languages: [primary, secondary]
Size: [file count, approx LOC]
Shape: [monorepo/library/app/framework/docs]
External Surface: [CLI/API/UI/library exports/none]
→ Factual survey. No classification yet. Reads like geological survey not review.
If err: no README/manifest → infer from dirs, file content, test descs. >1000 files → narrow to most active dirs (git log freq or README refs).
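The factual part of the survey (file counts by type, which config and doc files exist) can be gathered mechanically. The sketch below assumes a local checkout; `prospect` and `MARKERS` are hypothetical names, and the marker list is copied from the Config and Docs bullets above.

```python
from collections import Counter
from pathlib import Path

# Marker files taken from the Config and Docs bullets of this step.
MARKERS = ["package.json", "DESCRIPTION", "setup.py", "Cargo.toml", "go.mod",
           "Makefile", "README.md", "CLAUDE.md", "CONTRIBUTING.md"]

def prospect(root):
    """Count files by extension and list marker files found at the root."""
    root_path = Path(root)
    files = [p for p in root_path.rglob("*")
             if p.is_file() and ".git" not in p.relative_to(root_path).parts]
    by_ext = Counter(p.suffix or "(none)" for p in files)
    return {
        "project": root_path.resolve().name,
        "file_count": len(files),
        "by_extension": by_ext.most_common(),
        "markers": [m for m in MARKERS if (root_path / m).is_file()],
    }

if __name__ == "__main__":
    print(prospect("."))
```

The result fills in the Size and Languages lines of the Prospect Report; Declared Purpose, Shape, and External Surface still come from reading.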
Step 2: Assay — Composition
Read representative files → conceptual DOES.
- Sample 5-10 representative files diverse, not exhaustive:
- Entry points (main, route handlers, CLI cmds)
- Core logic (most-imported/referenced)
- Tests (intent > impl)
- Config (operational + deploy ctx)
- Per area:
- Domains: subject areas ("auth", "data transformation", "reporting")
- Verbs: actions ("validate", "transform", "deploy", "notify")
- Roles: actors ("data engineer", "end user", "reviewer")
- Flows: sequences ("ingest → validate → transform → store")
- Classify each:
- Essential: any impl solving this would have
- Accidental: this impl's tech choices
- Assay Report: domains/verbs/roles/flows + tags
→ Conceptual map reads like domain glossary not code walkthrough. Tech-stack-naive reader understands.
If err: opaque codebase (heavy metaprogramming, generated, obfuscated) → tests + docs over source. No tests → commit msgs for intent.
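One way to hold the Assay findings is a small record per finding with its essential/accidental tag. This is a sketch, not a prescribed format; `Finding` and `essential_glossary` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    kind: str        # "domain", "verb", "role", or "flow"
    label: str
    essential: bool  # True = any impl would have it; False = accidental

def essential_glossary(findings):
    """Group essential findings by kind, dropping accidental ones."""
    glossary = {}
    for f in findings:
        if f.essential:
            glossary.setdefault(f.kind, []).append(f.label)
    return glossary

findings = [
    Finding("domain", "auth", True),
    Finding("verb", "validate", True),
    Finding("flow", "ingest → validate → transform → store", True),
    Finding("verb", "configure Axios interceptors", False),
]

if __name__ == "__main__":
    print(essential_glossary(findings))
```

Keeping accidental findings in the list, rather than deleting them, leaves a record of what was judged and why.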
Step 3: Meditate — Release Impl Bias
Pause + clear cognitive anchoring from reading code.
- Notice dominating framework/lang/pattern → label
- Release HOW: "uses React" → "has UI layer." "PostgreSQL" → "persistent structured storage."
- Apply Ore Test on Assay findings:
- "integrate external data source" → YES → metal
- "configure Axios interceptors" → NO → gangue
- Rewrite failures at higher abstraction
- Multi-perspective lenses:
- Archaeologist: structure → creators' mental models?
- Biologist: replicable genome vs. specific phenotype?
- Music theorist: form (sonata, rondo) vs. notes?
- Cartographer: which abstraction level → useful topology?
→ Assay free of framework-specific lang. All findings pass Ore Test. Concepts portable to any lang/framework.
If err: bias persists → invert: "If rewritten in completely different stack, which concepts survive?" Only those = metal.
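The "release HOW" move can be sketched as a lookup from named technology to the capability it provides. The two entries below come from the examples in this step; the function name `release_how` is hypothetical, and a real table would be built per project.

```python
# Technology -> capability pairs, taken from the examples in this step.
ABSTRACTIONS = {
    "React": "UI layer",
    "PostgreSQL": "persistent structured storage",
}

def release_how(notes):
    """Return (tech, capability) pairs for each known tech named in the notes."""
    lowered = notes.lower()
    return [(tech, cap) for tech, cap in ABSTRACTIONS.items()
            if tech.lower() in lowered]

if __name__ == "__main__":
    for tech, cap in release_how("uses React, stores data in PostgreSQL"):
        print(f"{tech} -> {cap}")
```

Each pair is a prompt to restate the finding in capability terms before applying the Ore Test.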
Step 4: Smelt — Separate Metal from Slag
Core extraction. Classify into skill/agent/team.
- Per essential concept, type:
Classification Criteria:
+--------+----------------------------+----------------------------+----------------------------+
| Type | What to Look For | Naming Convention | Test Question |
+--------+----------------------------+----------------------------+----------------------------+
| SKILL | Repeatable procedures, | Verb-first kebab-case: | "Could an agent follow |
| | workflows, transformations | validate-input, | this as a step-by-step |
| | with clear inputs/outputs | deploy-artifact | procedure?" |
+--------+----------------------------+----------------------------+----------------------------+
| AGENT | Persistent roles, domain | Noun/role kebab-case: | "Does this require ongoing |
| | expertise, judgment calls, | data-engineer, | context, expertise, or a |
| | communication styles | quality-reviewer | specific communication |
| | | | style?" |
+--------+----------------------------+----------------------------+----------------------------+
| TEAM | Multi-role coordination, | Group descriptor: | "Does this need more than |
| | handoffs, reviews, | pipeline-ops, | one distinct perspective |
| | parallel workstreams | review-board | to accomplish?" |
+--------+----------------------------+----------------------------+----------------------------+
- Per extracted:
- Generalized name, not project-specific. "UserAuthService" → identity-manager (agent). "deployToAWS()" → deploy-artifact (skill).
- One-line desc standalone-readable
- Source concept for traceability not reproduction
- Apply Ore Test final time
- Avoid classification errs:
- Not every fn = skill — look for PROCEDURES not single ops
- Not every module = agent — look for ROLES needing judgment
- Not every collab = team — look for COORDINATION w/ distinct specialties
- Most projects yield 3-8 skills, 2-4 agents, 0-2 teams. 20+ → too fine.
→ Classified inventory: type + generalized name + one-line desc. No source-specific tech/API/data refs.
If err: ambiguous (skill or agent?) → "DOING (skill) vs. BEING someone who does (agent)?" Skill = recipe, agent = chef. Unclear → default skill — easier to compose later.
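The classified inventory can be kept as simple records, with a mechanical check for the kebab-case part of the naming convention. This is a sketch; `Extraction` and `name_problems` are hypothetical names, and the verb-first / noun-role distinction is not checked because it needs human judgment.

```python
import re
from dataclasses import dataclass

KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
KINDS = {"skill", "agent", "team"}

@dataclass
class Extraction:
    kind: str         # "skill", "agent", or "team"
    name: str         # generalized kebab-case name
    description: str  # one line, standalone-readable
    source: str       # source concept, for traceability only

def name_problems(item):
    """Return a list of mechanical naming problems (empty list = none found)."""
    problems = []
    if item.kind not in KINDS:
        problems.append("unknown kind")
    if not KEBAB.match(item.name):
        problems.append("name is not kebab-case")
    return problems

if __name__ == "__main__":
    ok = Extraction("agent", "identity-manager", "Owns identity decisions", "UserAuthService")
    bad = Extraction("agent", "UserAuthService", "Owns identity decisions", "UserAuthService")
    print(name_problems(ok), name_problems(bad))
```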
Step 5: Heal — Verify Quality
Honest extraction — neither too much nor too little.
- Over-extraction: per def:
- Reconstruct original proprietary logic? → too much detail
- Refs specific libs/APIs/schemas/paths? → still gangue
- Full impl proc or concept sketch? → should be sketch
- Under-extraction: defs only (no source) ask:
- Understand KIND of project that inspired? → yes
- Capture essential nature? → yes
- Major capabilities missing? → no
- Generalization: per def:
- Name in different stack? → yes
- Desc framework-agnostic? → yes
- Useful in completely different domain? → ideally yes
- Balance: ratios:
- 3-8 skills, 2-4 agents, 0-2 teams = typical focused
- <3 total → under-extracted
- >15 total → over-extracted or insufficient generalization
→ Confidence right abstraction. Each def = seed for different soil, not cutting only surviving original garden.
If err: over-extracted → raise abstraction, merge specifics, collapse similar agents. Under-extracted → Step 2 + sample more. Generalization fails → strip tech refs, rewrite.
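The Balance ratios above reduce to a few comparisons. A minimal sketch, assuming the thresholds stated in this step (fewer than 3 total, more than 15 total, and the 3-8 / 2-4 / 0-2 typical ranges); the function name and return strings are hypothetical.

```python
def balance_check(skills, agents, teams):
    """Classify extraction counts against the Balance ratios of this step."""
    total = skills + agents + teams
    if total < 3:
        return "under-extracted"
    if total > 15:
        return "over-extracted or insufficient generalization"
    typical = 3 <= skills <= 8 and 2 <= agents <= 4 and 0 <= teams <= 2
    return "typical" if typical else "review ratios"

if __name__ == "__main__":
    print(balance_check(5, 3, 1))
    print(balance_check(12, 4, 2))
```

A "review ratios" result is not a failure; it flags a count that sits inside the total range but outside the typical per-type ranges.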
Step 6: Cast — Pour Metal into Forms
Produce agentskills.io standard outputs.
- Per skill skeletal:
# Skill: [generalized-name]
name: [generalized-name]
locale: caveman-ultra
source_locale: en
source_commit: 82c77053
translator: "Julius Brussee homage — caveman"
translation_date: "2026-04-19"
description: [one-line, framework-agnostic]
domain: [closest domain from the 52 existing domains, or suggest a new one]
complexity: [basic/intermediate/advanced]
# Concept-level procedure (3-5 steps, NOT full implementation):
# Step 1: [high-level action]
# Step 2: [high-level action]
# Step 3: [high-level action]
# Derived from: [source concept in original project]
- Per agent skeletal:
# Agent: [role-name]
name: [role-name]
description: [one-line purpose]
tools: [minimal tool set needed]
skills: [list of extracted skills this agent would carry]
# Derived from: [source role/module in original project]
- Per team skeletal:
# Team: [group-name]
name: [group-name]
description: [one-line purpose]
lead: [lead agent from extracted agents]
members: [list of member agents]
coordination: [hub-and-spoke/sequential/parallel/adaptive]
# Derived from: [source workflow/process in original project]
- Compile all → Assay Report w/ Skills/Agents/Teams sections + summary table
→ Structured report w/ all defs in agentskills.io format. Each skeletal (concept not impl) → starting point for create-skill/create-agent/create-team.
If err: >15 items → priority by centrality, keep most unique to project's domain. Generic ("manage-configuration") drop unless unusual twist.
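The cap on extractions can be sketched as a sort-and-truncate. The centrality score is an assumption here: the step says to prioritize by centrality but does not define how to measure it, so the numbers below are placeholders assigned by the reviewer.

```python
def cap_extractions(items, max_items=15):
    """Keep the max_items most central items.

    items: list of (name, centrality) pairs; centrality is a reviewer-assigned
    score (hypothetical), higher = more central to the project's domain.
    """
    ranked = sorted(items, key=lambda it: it[1], reverse=True)
    return ranked[:max_items]

if __name__ == "__main__":
    items = [("manage-configuration", 1), ("deploy-artifact", 5), ("validate-input", 3)]
    print(cap_extractions(items, max_items=2))
```

Generic items such as manage-configuration should get a low score unless the project gives them an unusual twist, so they fall off first.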
Step 7: Temper — Final Validation
Verify complete extraction + summary.
- Count: N skills, N agents, N teams
- Coverage: span major domains?
- Independence: read each w/o source ctx → stands alone?
- Final Ore Test on complete set:
Temper Assessment:
+-----+---------------------------+----------+------------------------------------+
| # | Name | Type | Ore Test Result |
+-----+---------------------------+----------+------------------------------------+
| 1 | [name] | skill | PASS / FAIL (reason) |
| 2 | [name] | agent | PASS / FAIL (reason) |
| ... | ... | ... | ... |
+-----+---------------------------+----------+------------------------------------+
- Final summary:
- Total (skills/agents/teams)
- Coverage (which domains represented)
- Confidence (high/med/low) + rationale
- Next steps: which defs ready to flesh out first
→ Validated Assay Report w/ table + confidence + actionable next steps. Self-contained — naive reader understands extracted concepts.
If err: >20% fail Ore Test → Step 4, re-extract higher abstraction. Coverage <60% → Step 2, sample more.
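The two fallback thresholds of this step can be expressed directly. A minimal sketch, assuming Ore Test results are recorded as booleans and coverage is counted as domains represented out of domains found; `temper_verdict` and its return strings are hypothetical.

```python
def temper_verdict(results, domains_covered, domains_total):
    """Apply the Step 7 thresholds.

    results: list of booleans, True = the def passed the Ore Test.
    """
    fail_rate = results.count(False) / len(results) if results else 0.0
    coverage = domains_covered / domains_total if domains_total else 0.0
    if fail_rate > 0.20:
        return "return to Step 4"
    if coverage < 0.60:
        return "return to Step 2"
    return "validated"

if __name__ == "__main__":
    print(temper_verdict([True] * 7 + [False] * 3, 4, 5))
```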
Check
- Prospect: structure, langs, size, declared purpose
- Assay: domains, verbs, roles, flows + essential/accidental
- Meditate: bias cleared, no framework-specific lang
- All elements pass Ore Test (essence not impl)
- Skills = verbs, agents = nouns, teams = group descriptors
- All names generalized — no project-specific refs
- Count in typical range (5-15 total, not 1, not 30)
- Outputs follow agentskills.io format (frontmatter + sections)
- Over + under-extraction checks pass
- Temper: count + coverage + confidence + next steps
- Complete report understandable w/o source
Traps
- Mirror dir structure: One skill per file → not cross-cutting concepts. Metal = CONCEPTUAL not filesystem. 20-file project ≠ 20 skills.
- Framework worship: "configure-nextjs-api-routes" instead of "define-api-endpoints". Strip framework, keep pattern. Ore Test catches: "Without Next.js?" No → gangue.
- Role inflation: Agent per module. Most projects = 2-5 genuine roles needing distinct expertise. Look JUDGMENT + COMMUNICATION STYLE differences, not just functional.
- Skip Ore Test: Biggest failure mode. Every output must pass: "Could concept exist in different impl?" Refs specific libs/APIs/schemas → slag.
- Impl guides: Skills = CONCEPT-LEVEL sketches (3-5 steps), not full procs. Seeds for create-skill, not finished. 50-step extraction = reproduction not essence.
- Under-generalize names: "UserAuthService" = class. "identity-manager" = role. "manage-user-identity" = skill. Specific → universal.
- Ignore coordination: Teams hardest because coordination implicit. Look review workflows, deploy pipelines, data handoffs, approval chains.
→
- athanor: when metal reveals project needs transformation not just essence
- chrysopoeia: value extraction at code level; metal = conceptual level above code
- transmute: convert extracted concepts between domains/paradigms
- create-skill: flesh out extracted skill sketches → full SKILL.md
- create-agent: flesh out extracted agent sketches → full agent defs
- create-team: flesh out extracted team sketches → full team compositions
- observe: deeper observation when prospect reveals unfamiliar domain
- analyze-codebase-for-mcp: complementary: metal extracts concepts, that extracts tool surfaces
- review-codebase: complementary: metal extracts essence, that evaluates quality
