
test-team-coordination

Name: test-team-coordination
Author: pjt222


Updated 1 month ago


Category: Meta
Tags: ai, testing, design

About

This skill executes test scenarios against AI agent teams to validate and compare their coordination patterns. It observes behaviors, evaluates acceptance criteria, and generates structured RESULT.md reports. Use it for validating team performance, comparing coordination strategies, or establishing baseline metrics for team compositions.

Quick Install

Claude Code

Recommended

Primary

npx skills add pjt222/agent-almanac -a claude-code

Plugin Command (Alternative)

/plugin add https://github.com/pjt222/agent-almanac

Git Clone (Alternative)

git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/test-team-coordination

Copy and paste this command in Claude Code to install this skill

Documentation

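Test Team Coordination

Executes the test scenarios under tests/scenarios/teams/ against a target team. Observes coordination-pattern behaviors, evaluates acceptance criteria, scores the rubric, and produces RESULT.md under tests/results/.

When to Use

Validating that a team's coordination pattern produces the expected behaviors
Running structured tests after modifying a team definition or its agents
Running the same scenario against different teams to compare coordination patterns
Establishing baseline performance metrics for a team composition
Regression testing after adding new agents or changing team membership

Inputs

Required: path to the test scenario file (e.g., tests/scenarios/teams/test-opaque-team-cartographers-audit.md)
Optional: run ID override (default: auto-generated as YYYY-MM-DD-<target>-NNN)
Optional: team size override (default: taken from the scenario front matter)
Optional: skip scope change (default: false; if the scenario defines a scope change, it is injected)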
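
The default run ID format YYYY-MM-DD-<target>-NNN can be illustrated with a minimal sketch. It assumes NNN is a zero-padded, three-digit sequence that starts at 001 and increments per target and day; the source does not state this, and `next_run_id` is a hypothetical helper name.

```python
import re
from datetime import date
from typing import Iterable


def next_run_id(existing: Iterable[str], target: str, today: date) -> str:
    """Return the next free YYYY-MM-DD-<target>-NNN id.

    `existing` is the list of directory names already under tests/results/.
    Assumption: NNN is a 3-digit counter starting at 001.
    """
    prefix = f"{today.isoformat()}-{target}-"
    pattern = re.compile(re.escape(prefix) + r"(\d{3})$")
    used = []
    for name in existing:
        match = pattern.match(name)
        if match:
            used.append(int(match.group(1)))
    # Take the highest number already used for this day and target, plus one.
    return f"{prefix}{max(used, default=0) + 1:03d}"
```

Passing the existing directory names in as a list keeps the helper free of filesystem access, so the caller decides how tests/results/ is listed.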

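Steps

Step 1: Load and Validate the Test Scenario

1.1. Read the test scenario file specified in the input.

1.2. Parse the YAML front matter and extract:

target: the team under test
coordination-pattern: the expected pattern
team-size: the number of members to spawn
The acceptance criteria table
The rubric table (if present)
The ground truth data (if present)

1.3. Verify that the scenario file contains all required sections:

Objective
Pre-conditions
Task (with a Primary Task subsection)
Expected Behaviors
Acceptance Criteria
Observation Protocol

Expected: The scenario file is loaded, parsed, and contains all required sections.

On failure: If the file is missing or cannot be parsed, abort with an error message that identifies the missing file or the malformed section. If an optional section (Rubric, Ground Truth, Variants) is missing, note its absence and continue.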
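
The section check in Step 1 could look roughly like the sketch below. It assumes the scenario is a markdown file whose front matter is delimited by `---` lines and whose sections are `#`-style headings named exactly as listed above; both are assumptions about the scenario layout, and the function names are hypothetical.

```python
import re

# Section names as listed in Step 1.3; "Primary Task" is the required subsection of Task.
REQUIRED_SECTIONS = [
    "Objective", "Pre-conditions", "Task", "Primary Task",
    "Expected Behaviors", "Acceptance Criteria", "Observation Protocol",
]
OPTIONAL_SECTIONS = ["Rubric", "Ground Truth", "Variants"]


def split_front_matter(text: str) -> tuple[str, str]:
    """Split '---'-delimited front matter from the body; raise if it is absent."""
    match = re.match(r"\A---[ \t]*\n(.*?)\n---[ \t]*\n(.*)\Z", text, re.DOTALL)
    if match is None:
        raise ValueError("scenario has no parsable front matter block")
    return match.group(1), match.group(2)


def check_sections(body: str) -> tuple[list[str], list[str]]:
    """Return (missing required sections, missing optional sections)."""
    headings = set(re.findall(r"^#{1,6}[ \t]+(.+?)[ \t]*$", body, re.MULTILINE))
    missing_required = [s for s in REQUIRED_SECTIONS if s not in headings]
    missing_optional = [s for s in OPTIONAL_SECTIONS if s not in headings]
    return missing_required, missing_optional
```

A non-empty first list maps to the abort path; a non-empty second list only needs a note in the results.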

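Step 2: Verify Pre-conditions

2.1. Walk through each pre-condition checkbox in the scenario.

2.2. For file-existence checks, verify with Glob.

2.3. For registry-count checks, parse the relevant _registry.yml and compare total_* against the actual number of files on disk.

2.4. For branch or git-state checks, run git status --porcelain and git branch --show-current.

Expected: All pre-conditions are satisfied.

On failure: If any pre-condition fails, record it as BLOCKED in the results. Decide whether to continue (soft pre-condition) or abort (hard pre-condition, such as a missing target team file). Record the decision.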
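
The registry-count comparison and the continue-or-abort decision could be sketched as follows. The `hard` flag, the `PreconditionResult` type, and the three outcome strings are hypothetical; the source only distinguishes soft from hard pre-conditions and leaves the decision to the observer.

```python
from dataclasses import dataclass


@dataclass
class PreconditionResult:
    name: str
    passed: bool
    hard: bool  # assumption: hard pre-conditions abort the run, soft ones are only recorded
    detail: str = ""


def check_registry_count(name: str, declared_total: int, files_on_disk: list[str],
                         hard: bool = False) -> PreconditionResult:
    """Compare a registry's declared total_* value with the files actually found."""
    actual = len(files_on_disk)
    ok = declared_total == actual
    detail = "" if ok else f"registry declares {declared_total}, found {actual} on disk"
    return PreconditionResult(name, ok, hard, detail)


def decide(results: list[PreconditionResult]) -> str:
    """Return 'abort', 'continue-blocked', or 'proceed' for a set of checks."""
    failed = [r for r in results if not r.passed]
    if any(r.hard for r in failed):
        return "abort"
    return "continue-blocked" if failed else "proceed"
```

Whatever `decide` returns, the failed checks and the decision itself still belong in the results, as the step requires.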

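Step 3: Load Coordination Pattern Criteria

3.1. Read tests/_registry.yml and locate the coordination_patterns entry that matches the scenario's coordination-pattern value.

3.2. Extract the key_behaviors list for this pattern.

3.3. These behaviors become the observation checklist: during execution, each item must be watched for and recorded as observed or not observed.

Expected: The pattern's key behaviors are loaded and ready for observation.

On failure: If the coordination pattern is not defined in the registry, use the scenario's Expected Behaviors section as the sole source of observations. Log a warning.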
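
The lookup and its fallback can be sketched as below, working on an already-parsed registry. The shape is an assumption: `coordination_patterns` is treated as a mapping from pattern name to an entry containing `key_behaviors`, whereas the real tests/_registry.yml may use a list or different nesting. `build_checklist` is a hypothetical name.

```python
import warnings


def build_checklist(registry: dict, pattern: str, expected_behaviors: list[str]) -> list[str]:
    """Return the observation checklist for a coordination pattern.

    Falls back to the scenario's Expected Behaviors (with a warning)
    when the pattern has no key_behaviors in the registry.
    """
    entry = registry.get("coordination_patterns", {}).get(pattern)
    if entry and entry.get("key_behaviors"):
        return list(entry["key_behaviors"])
    warnings.warn(
        f"coordination pattern {pattern!r} not found in registry; "
        "using the scenario's Expected Behaviors only"
    )
    return list(expected_behaviors)
```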

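Step 4: Execute the Task

4.1. Create the results directory: tests/results/YYYY-MM-DD-<target>-NNN/.

4.2. Record T0 (task start timestamp).

4.3. Read the target team definition from teams/<target>.md, extract the CONFIG block, and launch the team: call TeamCreate with the team name, spawn teammates using each member's subagent_type, and create tasks from the CONFIG tasks list. Use the scenario's team-size. Pass the Primary Task prompt from the scenario's Task section verbatim.

4.4. Observe the team's execution phases. Record timestamps:

T1: form assessment / task decomposition complete
T2: role assignment visible

4.5. If the scenario defines a Scope Change Trigger and skip-scope-change is false:

Wait until Phase 2 (role assignment) is visible
Record T3 (scope change injection timestamp)
Send the scope change prompt to the team via SendMessage
Record T4 (scope change absorbed; role adjustment visible)

4.6. Continue observing until the team delivers its output.

Record T5 (integration begins)
Record T6 (final report delivered)

4.7. Capture the team's complete output.

Expected: The team executes the task through the phases of its coordination pattern. Every transition is timestamped. The scope change (if applicable) is injected and absorbed.

On failure: If the team produces no output, record the point of failure and any error messages. If the team stalls, note the last observed phase and the timeout. Proceed to evaluation with partial results.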
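
The T0 to T6 timestamps could be kept in a small log like the sketch below. `PhaseLog` is a hypothetical helper, not part of the skill; it refuses to overwrite a mark so that timestamps are set once, when the event happens, and it tolerates missing marks (for example T3 and T4 when no scope change is injected).

```python
from datetime import datetime, timezone
from typing import Optional

# Labels and meanings as listed in Step 4.
PHASES = {
    "T0": "task start",
    "T1": "form assessment / task decomposition complete",
    "T2": "role assignment visible",
    "T3": "scope change injected",
    "T4": "scope change absorbed",
    "T5": "integration begins",
    "T6": "final report delivered",
}


class PhaseLog:
    def __init__(self) -> None:
        self.marks: dict[str, datetime] = {}

    def mark(self, label: str, when: Optional[datetime] = None) -> None:
        """Record a timestamp for a phase label; each label can be set once."""
        if label not in PHASES:
            raise KeyError(f"unknown phase label: {label}")
        if label in self.marks:
            raise ValueError(f"{label} already recorded")
        self.marks[label] = when or datetime.now(timezone.utc)

    def durations(self) -> dict[str, float]:
        """Seconds between consecutive recorded marks, in T-order."""
        recorded = [label for label in PHASES if label in self.marks]
        return {
            f"{a}->{b}": (self.marks[b] - self.marks[a]).total_seconds()
            for a, b in zip(recorded, recorded[1:])
        }
```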

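Step 5: Evaluate Pattern Behaviors

5.1. For each key behavior from Step 3, determine whether it was observed during execution:

Observed: clear evidence in the team's output or coordination
Partial: some evidence, but incomplete or ambiguous
Not observed: no evidence

5.2. Apply the same evaluation to each task-specific behavior in the scenario's Expected Behaviors section.

5.3. Record the findings in the observation log.

Expected: All or most pattern-specific and task-specific behaviors are observed.

On failure: Unobserved behaviors are findings, not failures of the test procedure. Record them accurately; they indicate that the coordination pattern did not fully manifest.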

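Step 6: Evaluate Acceptance Criteria

6.1. Walk through each acceptance criterion in the scenario.

6.2. For each criterion, assign a verdict:

PASS: the criterion is clearly met, with observable evidence
PARTIAL: the criterion is partially met (counts toward the threshold at 0.5 weight)
FAIL: the criterion was not met despite the opportunity
BLOCKED: the criterion cannot be evaluated (pre-condition failure, team timeout, etc.)

6.3. If the scenario includes Ground Truth data, verify the report's findings against it:

Compute the accuracy for each category
Flag false positives and false negatives

6.4. If the scenario includes a rubric, score each dimension from 1 to 5 with a brief justification.

6.5. Compute the summary metrics:

Acceptance: X/N criteria passed (PARTIAL counts as 0.5)
Threshold: PASS if >= the threshold defined by the scenario
Rubric total: X/Y points (if applicable)

Expected: Every acceptance criterion has a verdict. The summary metrics are computed.

On failure: If fewer than half of the criteria can be evaluated (too many BLOCKED), the test run is inconclusive. Record the cause and recommend re-running after the pre-conditions are fixed.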
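
The summary arithmetic in Step 6 can be sketched as below. Two points are assumptions, since the source does not settle them: the threshold is treated as a score in the same units as X (for example 2.5 out of 4), and BLOCKED criteria stay in the denominator N. `summarize` is a hypothetical name.

```python
# Weights from Step 6.2: PARTIAL counts as half a pass.
WEIGHTS = {"PASS": 1.0, "PARTIAL": 0.5, "FAIL": 0.0, "BLOCKED": 0.0}


def summarize(verdicts: dict[str, str], threshold: float) -> dict:
    """Compute the acceptance score and the overall verdict.

    `verdicts` maps a criterion name to PASS, PARTIAL, FAIL, or BLOCKED.
    """
    for name, verdict in verdicts.items():
        if verdict not in WEIGHTS:
            raise ValueError(f"unknown verdict for {name}: {verdict}")
    total = len(verdicts)
    score = sum(WEIGHTS[v] for v in verdicts.values())
    evaluable = sum(1 for v in verdicts.values() if v != "BLOCKED")
    if evaluable * 2 < total:
        # Fewer than half of the criteria could be evaluated.
        overall = "INCONCLUSIVE"
    else:
        overall = "PASS" if score >= threshold else "FAIL"
    return {"score": score, "total": total, "evaluable": evaluable, "verdict": overall}
```

For instance, two PASS, one PARTIAL, and one FAIL give a score of 2.5 out of 4, which passes a threshold of 2.5 under these assumptions.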

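Step 7: Generate RESULT.md

7.1. Using the recording template from the scenario's Observation Protocol, create tests/results/YYYY-MM-DD-<target>-NNN/RESULT.md.

7.2. Fill in every section:

Run metadata (observer, timestamps, duration)
Phase log with all recorded timestamps
Role emergence log (for adaptive-team tests)
Acceptance criteria results table
Rubric scoring table (if applicable)
Ground truth verification table (if applicable)
Key observations (narrative)
Lessons learned

7.3. Include the team's raw output as an appendix, or as a separate file (team-output.md) in the same results directory.

7.4. Add a summary verdict at the top: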

**Verdict**: PASS | FAIL | INCONCLUSIVE
**Score**: X/N criteria (Y/Z rubric points)
**Duration**: Xm

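Expected: A complete RESULT.md with every section filled in and a clear verdict.

On failure: If the results file cannot be written, fall back to printing the results to stdout. Evaluation data must never be lost.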
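
The verdict header and the stdout fallback could be sketched like this. `render_header` and `write_result` are hypothetical helper names; the header layout follows the three-line summary in Step 7.4, and the rest of RESULT.md is assumed to be assembled elsewhere.

```python
import sys
from pathlib import Path


def render_header(verdict: str, passed: float, total: int,
                  rubric_points: int, rubric_max: int, minutes: int) -> str:
    """Render the three summary lines that go at the top of RESULT.md."""
    return (
        f"**Verdict**: {verdict}\n"
        f"**Score**: {passed:g}/{total} criteria ({rubric_points}/{rubric_max} rubric points)\n"
        f"**Duration**: {minutes}m\n"
    )


def write_result(result_dir: Path, content: str) -> bool:
    """Write RESULT.md; on failure, dump the content to stdout instead.

    Returns True if the file was written, False if the fallback was used.
    """
    try:
        result_dir.mkdir(parents=True, exist_ok=True)
        (result_dir / "RESULT.md").write_text(content, encoding="utf-8")
        return True
    except OSError as err:
        print(f"could not write RESULT.md ({err}); printing results instead", file=sys.stderr)
        sys.stdout.write(content)
        return False
```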

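Validation

Common Pitfalls

Evaluating output quality instead of coordination: This skill tests how the team coordinates, not whether the task output is perfect. A team that coordinates well but finds only 7 of 9 broken references still demonstrates the pattern.
Injecting the scope change too early: Wait until role assignment is clearly visible before injecting the scope change. If it is injected too early, the team has not yet differentiated, so there is nothing to adjust.
Confusing member output with team output: An opaque team should present unified output. If individual member reports are visible, that is a finding about opacity, not a test infrastructure problem.
Matching ground truth exactly: Ground truth counts are approximate. Evaluate whether the findings fall within a reasonable range, not whether they match exactly.
Forgetting to record timestamps: Timestamps are essential for measuring phase durations and adaptation speed. Set them as events occur, not retroactively.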
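
The ground-truth pitfall above can be made concrete with a minimal sketch: flag false positives and false negatives by set comparison, and judge counts against a tolerance rather than by equality. The 25% relative tolerance is an arbitrary illustrative value, not one the skill defines, and both function names are hypothetical.

```python
def diff_against_truth(reported: set[str], truth: set[str]) -> dict[str, list[str]]:
    """List findings that are not in the ground truth, and truth items that were missed."""
    return {
        "false_positives": sorted(reported - truth),
        "false_negatives": sorted(truth - reported),
    }


def within_tolerance(found: int, expected: int, rel_tol: float = 0.25) -> bool:
    """True if `found` is within rel_tol * expected of the approximate expected count."""
    return abs(found - expected) <= rel_tol * expected
```

With this tolerance, 7 findings against an expected 9 still count as in range, while 4 against 9 do not.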

GitHub Repository

pjt222/agent-almanac

Path: i18n/wenyan-lite/skills/test-team-coordination

Tags: agents, agentskills, ai-assisted-development, claude-code, skills, teams

FAQ

Frequently asked questions

What is the test-team-coordination skill?

test-team-coordination is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can perform test-team-coordination-related tasks without extra prompting.

How do I install test-team-coordination?

Use the install commands on this page: add test-team-coordination to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does test-team-coordination belong to?

test-team-coordination is in the Meta category, tagged ai, testing and design.

Is test-team-coordination free to use?

Yes. test-team-coordination is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.

Related Skills

content-collections

Meta

This skill provides a production-tested setup for Content Collections, a TypeScript-first tool that transforms Markdown/MDX files into type-safe data collections with Zod validation. Use it when building blogs, documentation sites, or content-heavy Vite + React applications to ensure type safety and automatic content validation. It covers everything from Vite plugin configuration and MDX compilation to deployment optimization and schema validation.


polymarket

Meta

This skill enables developers to build applications with the Polymarket prediction markets platform, including API integration for trading and market data. It also provides real-time data streaming via WebSocket to monitor live trades and market activity. Use it for implementing trading strategies or creating tools that process live market updates.


creating-opencode-plugins

Meta

This skill helps developers create OpenCode plugins that hook into 25+ event types like commands, files, and LSP operations. It provides the plugin structure, event API specifications, and implementation patterns for JavaScript/TypeScript modules. Use it when you need to intercept, monitor, or extend the OpenCode AI assistant's lifecycle with custom event-driven logic.


sglang

Meta

SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.

