metal
정보
메탈 스킬은 코드베이스를 분석하여 개념적 아키텍처를 표준화된 에이전트, 스킬, 팀 정의로 추출합니다. 구현 세부사항 없이 프로젝트의 목적(WHAT)과 역할(WHO)을 포착하여, 온보딩이나 에이전트 시스템 부트스트래핑에 이상적입니다. 이를 통해 개발자는 프로젝트의 조직적 DNA를 빠르게 이해하고 기존 코드에서 재사용 가능한 에이전트 라이브러리를 생성할 수 있습니다.
빠른 설치
Claude Code
추천npx skills add pjt222/agent-almanac -a claude-code/plugin add https://github.com/pjt222/agent-almanacgit clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/metalClaude Code에서 이 명령을 복사하여 붙여넣어 스킬을 설치하세요
문서
Metal
Extract the conceptual DNA of a repository — its roles, procedures, and coordination patterns — as generalized agentskills.io definitions. Like extracting noble metal from ore, the skill separates what a project IS (its essence) from what it DOES (its implementation), producing reusable skill, agent, and team definitions that capture the project's organizational genome without reproducing its codebase.
When to Use
- Onboarding to a new codebase and wanting to map its conceptual architecture before diving into code
- Bootstrapping an agentic system from an existing project — turning implicit workflows into explicit skill/agent/team definitions
- Studying a project's organizational DNA for cross-pollination into other projects
- Creating a skill/agent/team library inspired by a reference implementation without copying it
- Understanding what a project's structure reveals about its creators' mental models and domain expertise
Inputs
- Required: Path to the repository or project root directory
- Required: Purpose statement — why is essence being extracted? (onboarding, bootstrapping, study, or cross-pollination)
- Optional: Focus domains — specific areas of the project to concentrate on (default: all)
- Optional: Output depth —
survey(prospect + assay only),extract(full procedure), orreport(extraction + written report) (default:extract) - Optional: Maximum extractions — cap on total skills + agents + teams to produce (default: 15)
The Ore Test
The central quality criterion for all extraction:
Could this concept exist in a completely different implementation?
If YES — it is metal (essence). Extract it. If NO — it is gangue (implementation detail). Leave it behind.
Example: A weather app's concept "integrate external data source" is metal — it applies to any project fetching third-party data. But "parse OpenWeatherMap v3 JSON response" is gangue — it is specific to one API.
Extracted skills should describe the CLASS of task, not the specific instance. Extracted agents should describe the ROLE, not the person. Extracted teams should describe the COORDINATION PATTERN, not the org chart.
Procedure
Step 1: Prospect — Survey the Ore Body
Survey the repository structure without judgment. Map the terrain before mining.
- Glob the directory tree to understand the project's shape:
- Source directories and their organization pattern (by feature, by layer, by domain)
- Configuration files:
package.json,DESCRIPTION,setup.py,Cargo.toml,go.mod,Makefile - Documentation:
README.md,CLAUDE.md,CONTRIBUTING.md, architecture docs - CI/CD:
.github/workflows/,Dockerfile, deployment configs - Test directories and their structure
- Read the project's self-description (README, package manifest) to understand its declared purpose
- Count files by type/language to gauge scope and identify the primary technology
- Identify the project's boundary — where it begins and ends, what it depends on vs what it provides
- Produce the Prospect Report:
Project: [name]
Declared Purpose: [from README/manifest]
Languages: [primary, secondary]
Size: [file count, approx LOC]
Shape: [monorepo/library/app/framework/docs]
External Surface: [CLI/API/UI/library exports/none]
Got: A factual survey — what is here, how large, what does the project claim to be. No classification or judgment yet. The report reads like a geological survey, not a review.
If fail: If the repository has no README or manifest, infer purpose from directory names, file contents, and test descriptions. If the project is too large (>1000 source files), narrow the scope to the most active directories (use git log frequency or README references).
Step 2: Assay — Analyze the Composition
Read representative files to understand what the project DOES at the conceptual level.
- Sample 5-10 representative files from different areas of the project — diverse, not exhaustive:
- Entry points (main files, route handlers, CLI commands)
- Core logic (the most-imported or most-referenced modules)
- Tests (they reveal intended behavior more clearly than implementation)
- Configuration (reveals operational concerns and deployment context)
- For each sampled area, identify:
- Domains: What subject areas does the project touch? (e.g., "authentication", "data transformation", "reporting")
- Verbs: What actions does the project perform? (e.g., "validate", "transform", "deploy", "notify")
- Roles: What human or system actors does the code serve? (e.g., "data engineer", "end user", "reviewer")
- Flows: What sequences of actions form workflows? (e.g., "ingest → validate → transform → store")
- For each finding, classify as:
- Essential: Would exist in any implementation solving this problem
- Accidental: Specific to this implementation's technology choices
- Produce the Assay Report: a table of domains, verbs, roles, and flows with essential/accidental tags
Got: A conceptual map of the project that reads like a domain glossary, not a code walkthrough. Someone unfamiliar with the tech stack should understand what the project does from this report.
If fail: If the codebase is opaque (heavy metaprogramming, generated code, or obfuscated), lean on tests and documentation rather than source code. If no tests exist, read commit messages for intent.
Step 3: Meditate — Release Implementation Bias
Pause to clear the cognitive anchoring from reading code.
- Notice which framework, language, or architectural pattern is dominating your mental model — label it
- Release attachment to the HOW: "This project uses React" becomes "This project has a user interface layer." "This uses PostgreSQL" becomes "This has persistent structured storage."
- For each finding in the Assay Report, apply the Ore Test:
- "integrate external data source" — could exist anywhere? YES → metal
- "configure Axios interceptors" — could exist anywhere? NO → gangue
- Rewrite any findings that failed the Ore Test at a higher abstraction level
- If multiple perspectives help, consider the project through these lenses:
- Archaeologist: What does the code's structure reveal about its creators' mental models?
- Biologist: What is the replicable genome vs the specific phenotype?
- Music theorist: What is the form (sonata, rondo) vs the specific notes?
- Cartographer: What level of abstraction captures the useful topology?
Got: The Assay Report is now free of framework-specific language. Every finding passes the Ore Test. The concepts feel portable — they could apply to a project in any language or framework.
If fail: If bias persists (findings keep referencing specific technologies), try inverting: "If this project were rewritten in a completely different stack, which concepts would survive?" Only those are metal.
Step 4: Smelt — Separate Metal from Slag
The core extraction step. Classify each essential concept into skills, agents, or teams.
- For each essential concept from the purified Assay Report, determine its type:
Classification Criteria:
+--------+----------------------------+----------------------------+----------------------------+
| Type | What to Look For | Naming Convention | Test Question |
+--------+----------------------------+----------------------------+----------------------------+
| SKILL | Repeatable procedures, | Verb-first kebab-case: | "Could an agent follow |
| | workflows, transformations | validate-input, | this as a step-by-step |
| | with clear inputs/outputs | deploy-artifact | procedure?" |
+--------+----------------------------+----------------------------+----------------------------+
| AGENT | Persistent roles, domain | Noun/role kebab-case: | "Does this require ongoing |
| | expertise, judgment calls, | data-engineer, | context, expertise, or a |
| | communication styles | quality-reviewer | specific communication |
| | | | style?" |
+--------+----------------------------+----------------------------+----------------------------+
| TEAM | Multi-role coordination, | Group descriptor: | "Does this need more than |
| | handoffs, reviews, | pipeline-ops, | one distinct perspective |
| | parallel workstreams | review-board | to accomplish?" |
+--------+----------------------------+----------------------------+----------------------------+
-
For each extracted element:
- Assign a generalized name — not project-specific. "UserAuthService" becomes
identity-manager(agent). "deployToAWS()" becomesdeploy-artifact(skill). - Write a one-line description that makes sense without knowing the source project
- Note the source concept it derives from (for traceability, not reproduction)
- Apply the Ore Test one final time
- Assign a generalized name — not project-specific. "UserAuthService" becomes
-
Guard against common classification errors:
- Not every function is a skill — look for PROCEDURES, not individual operations
- Not every module is an agent — look for ROLES that require judgment
- Not every collaboration is a team — look for COORDINATION PATTERNS with distinct specialties
- Most projects yield 3-8 skills, 2-4 agents, and 0-2 teams. With 20+, you are extracting too fine.
Got: A classified inventory where each item has a type (skill/agent/team), a generalized name, and a one-line description. No item references the source project's specific technologies, APIs, or data structures.
If fail: If classification is ambiguous (is this a skill or an agent?), ask: "Is this about DOING something (skill) or BEING someone who does things (agent)?" A skill is a recipe; an agent is a chef. If still unclear, default to skill — skills are easier to compose later.
Step 5: Heal — Verify Extraction Quality
Assess whether the extraction is honest — neither too much nor too little.
-
Over-extraction check: Read each extracted definition and ask:
- Could someone reconstruct the original project's proprietary logic from this? → Too much detail
- Does this reference specific libraries, APIs, database schemas, or file paths? → Still gangue
- Is this a full implementation procedure or a concept-level sketch? → Should be sketch
-
Under-extraction check: Show only the extracted definitions (without the source project) and ask:
- Could someone understand what KIND of project inspired these? → Should be yes
- Do the definitions capture the project's essential nature? → Should be yes
- Are there major project capabilities not represented? → Should be no
-
Generalization check: For each definition:
- Would the name make sense in a different tech stack? → Should be yes
- Is the description framework-agnostic? → Should be yes
- Could this definition be useful to a project in a completely different domain? → Ideally yes
-
Balance check: Review the extraction ratios:
- 3-8 skills, 2-4 agents, 0-2 teams is typical for a focused project
- Fewer than 3 total extractions suggests under-extraction
- More than 15 total suggests over-extraction or insufficient generalization
Got: Confidence that the extraction is at the right level of abstraction. Each definition is a seed that could grow in different soil, not a cutting that survives only in the original garden.
If fail: If over-extracted, raise the abstraction level — merge specific skills into broader ones, collapse similar agents into a single role. If under-extracted, return to Step 2 and sample additional files. If generalization check fails, strip technology references and rewrite descriptions.
Step 6: Cast — Pour the Metal into Forms
Produce the agentskills.io-standard output documents.
- For each extracted skill, write a skeletal definition:
# Skill: [generalized-name]
name: [generalized-name]
description: [one-line, framework-agnostic]
domain: [closest domain from the 52 existing domains, or suggest a new one]
complexity: [basic/intermediate/advanced]
# Concept-level procedure (3-5 steps, NOT full implementation):
# Step 1: [high-level action]
# Step 2: [high-level action]
# Step 3: [high-level action]
# Derived from: [source concept in original project]
- For each extracted agent, write a skeletal definition:
# Agent: [role-name]
name: [role-name]
description: [one-line purpose]
tools: [minimal tool set needed]
skills: [list of extracted skills this agent would carry]
# Derived from: [source role/module in original project]
- For each extracted team, write a skeletal definition:
# Team: [group-name]
name: [group-name]
description: [one-line purpose]
lead: [lead agent from extracted agents]
members: [list of member agents]
coordination: [hub-and-spoke/sequential/parallel/adaptive]
# Derived from: [source workflow/process in original project]
- Compile all extractions into the Assay Report — a single document with sections for Skills, Agents, and Teams, plus a summary table
Got: A structured report containing all extracted definitions in agentskills.io format. Each definition is skeletal (concept-level, not implementation-level) and could serve as a starting point for the create-skill, create-agent, or create-team skills to flesh out.
If fail: If the output exceeds 15 items, prioritize by centrality — keep the concepts that are most unique to this project's domain. Generic concepts (like "manage-configuration") that exist in most projects should be dropped unless they have an unusual twist.
Step 7: Temper — Final Validation
Verify the complete extraction and produce the summary.
- Count the extractions: N skills, N agents, N teams
- Assess coverage: do they span the project's major domains?
- Verify independence: read each definition WITHOUT the source project context — does it stand alone?
- Run the Ore Test one final time on the complete set:
Temper Assessment:
+-----+---------------------------+----------+------------------------------------+
| # | Name | Type | Ore Test Result |
+-----+---------------------------+----------+------------------------------------+
| 1 | [name] | skill | PASS / FAIL (reason) |
| 2 | [name] | agent | PASS / FAIL (reason) |
| ... | ... | ... | ... |
+-----+---------------------------+----------+------------------------------------+
- Produce the final summary:
- Total extractions (skills / agents / teams)
- Coverage assessment (which project domains are represented)
- Confidence level (high / medium / low) with rationale
- Suggested next steps: which extracted definitions are ready to flesh out first
Got: A validated Assay Report with a summary table, confidence assessment, and actionable next steps. The report is self-contained — someone who has never seen the source project can read it and understand the extracted concepts.
If fail: If more than 20% of items fail the final Ore Test, return to Step 4 (Smelt) and re-extract at a higher abstraction level. If coverage is below 60% of identified domains, return to Step 2 (Assay) and sample additional files.
Validation Checklist
- Prospect report covers project structure, languages, size, and declared purpose
- Assay identifies domains, verbs, roles, and flows with essential/accidental classification
- Meditate checkpoint clears implementation bias — no framework-specific language in outputs
- Every extracted element passes the Ore Test (essence, not implementation detail)
- Skills are named with verbs, agents with nouns, teams with group descriptors
- All names are generalized — no project-specific references
- Extraction count is within typical range (5-15 total, not 1 and not 30)
- Output definitions follow agentskills.io format (frontmatter + sections)
- Over-extraction and under-extraction checks both pass
- Final Temper assessment includes count, coverage, confidence, and next steps
- The complete Assay Report is understandable without access to the source project
Pitfalls
- Mirroring the directory structure: Producing one skill per source file instead of extracting cross-cutting concepts. The metal should reflect the project's CONCEPTUAL structure, not its file system. A 20-file project does not have 20 skills.
- Framework worship: Extracting "configure-nextjs-api-routes" instead of "define-api-endpoints". Strip the framework; keep the pattern. The Ore Test catches this: "Could this exist without Next.js?" If no, it's gangue.
- Role inflation: Creating an agent for every module. Most projects have 2-5 genuine roles requiring distinct expertise, not 20. Look for JUDGMENT and COMMUNICATION STYLE differences, not functional differences alone.
- Skipping the Ore Test: The single biggest failure mode. Every output must pass: "Could this concept exist in a completely different implementation?" If it references specific libraries, APIs, or data schemas, it is slag, not metal.
- Producing implementation guides: Extracted skills should be CONCEPT-LEVEL sketches (3-5 high-level steps), not full implementation procedures. They are seeds to be fleshed out with
create-skill, not finished products. A 50-step extraction is a reproduction, not an essence. - Under-generalizing names: "UserAuthService" is a class name, not a concept. "identity-manager" is a role. "manage-user-identity" is a skill. Generalize from the specific to the universal.
- Ignoring coordination patterns: Teams are the hardest to extract because coordination is often implicit. Look for code review workflows, deployment pipelines, data handoffs between systems, and approval chains — these reveal team structures.
Related Skills
athanor— When metal reveals the project needs transformation, not just essence extractionchrysopoeia— Value extraction at the code level; metal works at the conceptual level above codetransmute— Converting extracted concepts between domains or paradigmscreate-skill— Flesh out extracted skill sketches into full SKILL.md implementationscreate-agent— Flesh out extracted agent sketches into full agent definitionscreate-team— Flesh out extracted team sketches into full team compositionsobserve— Deeper observation when the prospect phase reveals an unfamiliar domainanalyze-codebase-for-mcp— Complementary: metal extracts concepts, analyze-codebase-for-mcp extracts tool surfacesreview-codebase— Complementary: metal extracts essence, review-codebase evaluates quality
GitHub 저장소
연관 스킬
content-collections
메타이 스킬은 콘텐츠 콜렉션(Content Collections)을 위한 프로덕션 검증된 설정을 제공합니다. 콘텐츠 콜렉션은 Markdown/MDX 파일을 Zod 검증이 포함된 타입 안전한 데이터 콜렉션으로 변환해주는 TypeScript 최우선 도구입니다. 블로그, 문서 사이트 또는 콘텐츠 중심의 Vite + React 애플리케이션을 구축할 때 타입 안전성과 자동 콘텐츠 검증을 보장하기 위해 사용하세요. Vite 플러그인 구성과 MDX 컴파일부터 배포 최적화 및 스키마 검증에 이르기까지 모든 것을 다룹니다.
polymarket
메타이 스킬은 개발자들이 Polymarket 예측 시장 플랫폼을 활용한 애플리케이션을 구축할 수 있도록 지원하며, 거래 및 시장 데이터를 위한 API 통합 기능을 포함합니다. 또한 WebSocket을 통한 실시간 데이터 스트리밍을 제공하여 실시간 거래와 시장 활동을 모니터링할 수 있습니다. 이를 통해 거래 전략을 구현하거나 실시간 시장 업데이트를 처리하는 도구를 생성하는 데 활용할 수 있습니다.
creating-opencode-plugins
메타이 스킬은 개발자들이 명령어, 파일, LSP 작업 등 25개 이상의 이벤트 유형에 연결되는 OpenCode 플러그인을 만들 수 있도록 돕습니다. JavaScript/TypeScript 모듈을 위한 플러그인 구조, 이벤트 API 명세, 구현 패턴을 제공합니다. OpenCode AI 어시스턴트의 라이프사이클을 사용자 정의 이벤트 기반 로직으로 가로채거나, 모니터링하거나, 확장해야 할 때 사용하세요.
sglang
메타SGLang은 RadixAttention 프리픽스 캐싱을 활용하여 JSON, 정규식, 에이전트 워크플로우를 위한 고속 구조화 생성에 특화된 고성능 LLM 서빙 프레임워크입니다. 특히 반복되는 프리픽스가 있는 작업에서 상당히 빠른 추론 속도를 제공하여 복잡한 구조화 출력 및 다중 턴 대화에 이상적입니다. 제약 디코딩이 필요하거나 광범위한 프리픽스 공유가 있는 애플리케이션을 구축할 때는 vLLM과 같은 대안보다 SGLang을 선택하십시오.
