test-a2a-interop
About
This skill tests the interoperability of A2A agents by validating Agent Card conformance, exercising every task lifecycle state, and verifying streaming and error handling. Use it to verify new A2A server implementations, validate interoperability between agents, run conformance tests in CI/CD, debug multi-agent workflows, or certify agents for protocol registries.
Quick install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/test-a2a-interop
Copy and paste this command into Claude Code to install this skill.
Documentation
Test A2A Interoperability
Validate that an A2A agent implementation conforms to the protocol specification by testing Agent Card discovery, task lifecycle management, SSE streaming, error handling, and multi-agent communication patterns.
When to Use
- Verifying a new A2A server implementation before deployment
- Validating interoperability between two or more A2A agents
- Running conformance tests as part of CI/CD for A2A services
- Debugging failures in multi-agent A2A workflows
- Certifying that an agent meets A2A protocol requirements for a registry
Inputs
- Required: Base URL of the A2A agent under test
- Required: Authentication credentials (if the agent requires them)
- Optional: Second agent URL for bidirectional interop testing
- Optional: Specific skills to test (default: all skills in the Agent Card)
- Optional: Test timeout per task (default: 60 seconds)
- Optional: Output format for the conformance report (json, markdown, junit)
Procedure
Step 1: Fetch and Validate Agent Cards
1.1. Fetch the Agent Card from the well-known endpoint:
curl -s https://agent.example.com/.well-known/agent.json -o agent-card.json
1.2. Validate required top-level fields:
const requiredFields = ["name", "description", "url", "skills"];
for (const field of requiredFields) {
  assert(agentCard[field] !== undefined, `Missing required field: ${field}`);
}
1.3. Validate each skill entry:
for (const skill of agentCard.skills) {
  assert(skill.id, "Skill missing id");
  assert(skill.name, "Skill missing name");
  assert(skill.description, "Skill missing description");
  assert(
    Array.isArray(skill.inputModes) && skill.inputModes.length > 0,
    `Skill ${skill.id} missing inputModes`
  );
  assert(
    Array.isArray(skill.outputModes) && skill.outputModes.length > 0,
    `Skill ${skill.id} missing outputModes`
  );
}
1.4. Validate authentication configuration:
- If authentication.schemes includes oauth2, verify credentials.oauth2 has tokenUrl
- If authentication.schemes includes apiKey, verify credentials.apiKey has headerName
1.5. Validate that capability flags are boolean values.
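Steps 1.4 and 1.5 have no snippet of their own, so here is a minimal sketch of both checks. The `AgentCardLike` type and the `checkAuthAndCapabilities` name are hypothetical, and the sketch assumes `credentials` sits under `authentication` and that flags live in a `capabilities` object; adjust the paths to the Agent Card schema actually under test.

```typescript
// Hypothetical, reduced Agent Card shape: only the fields checked here.
// Assumption: `credentials` is nested under `authentication`.
interface AgentCardLike {
  authentication?: {
    schemes?: string[];
    credentials?: Record<string, Record<string, unknown> | undefined>;
  };
  capabilities?: Record<string, unknown>;
}

// Returns a list of human-readable problems; an empty list means both checks passed.
function checkAuthAndCapabilities(card: AgentCardLike): string[] {
  const problems: string[] = [];
  const schemes = card.authentication?.schemes ?? [];
  const credentials = card.authentication?.credentials ?? {};

  // Step 1.4: each declared scheme needs its companion credential field.
  const requiredField: Record<string, string> = { oauth2: "tokenUrl", apiKey: "headerName" };
  for (const scheme of schemes) {
    const field = requiredField[scheme];
    if (field === undefined) continue; // this sketch has no rule for other schemes
    const value = credentials[scheme]?.[field];
    if (typeof value !== "string" || value.length === 0) {
      problems.push(`credentials.${scheme} is missing ${field}`);
    }
  }

  // Step 1.5: every capability flag must be a real boolean, not "true" or 1.
  const capabilities = card.capabilities ?? {};
  for (const flag of Object.keys(capabilities)) {
    if (typeof capabilities[flag] !== "boolean") {
      problems.push(`capabilities.${flag} must be boolean, got ${typeof capabilities[flag]}`);
    }
  }
  return problems;
}
```

Returning a list instead of asserting keeps every problem visible in one pass, which fits the rule that an invalid Agent Card is recorded rather than treated as a reason to stop.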
1.6. Record validation results in the conformance report:
interface ConformanceResult {
  test: string;
  category: "agent-card" | "lifecycle" | "streaming" | "error-handling" | "interop";
  status: "pass" | "fail" | "skip";
  message?: string;
  duration_ms?: number;
}
Expected: Agent Card passes all structural validation checks.
On failure: Record each validation failure with the specific field and reason. Do not abort; continue testing other aspects. An invalid Agent Card is itself a test result.
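One way to honor "do not abort" is to wrap each check so that a thrown assertion becomes a recorded failure. The sketch below is synchronous for brevity; `runCheck` is a hypothetical helper, and checks that make network calls would need an async variant that awaits the check inside the same try/catch.

```typescript
type Category = "agent-card" | "lifecycle" | "streaming" | "error-handling" | "interop";

interface ConformanceResult {
  test: string;
  category: Category;
  status: "pass" | "fail" | "skip";
  message?: string;
  duration_ms?: number;
}

// Runs one check and appends its outcome. A thrown error is converted into a
// "fail" entry, so later checks still run.
function runCheck(
  results: ConformanceResult[],
  test: string,
  category: Category,
  check: () => void
): void {
  const started = Date.now();
  try {
    check();
    results.push({ test, category, status: "pass", duration_ms: Date.now() - started });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    results.push({ test, category, status: "fail", message, duration_ms: Date.now() - started });
  }
}

// Usage: the first check fails, the second still runs and passes.
const results: ConformanceResult[] = [];
runCheck(results, "card-has-name", "agent-card", () => {
  throw new Error("Missing required field: name");
});
runCheck(results, "card-has-url", "agent-card", () => {});
```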
Step 2: Send Test Tasks Covering All Lifecycle States
2.1. Test: Task submission (submitted -> working -> completed)
Send a task that the agent should be able to handle based on its declared skills:
const submitResult = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 1,
  method: "tasks/send",
  params: {
    id: `test-${uuid()}`,
    sessionId: `session-${uuid()}`,
    message: {
      role: "user",
      parts: [{ type: "text", text: skillExamples[0] }],
    },
  },
});
assert(submitResult.result, "tasks/send should return a result");
assert(submitResult.result.id, "Result should include task ID");
assert(
  ["submitted", "working", "completed"].includes(submitResult.result.status.state),
  `Unexpected initial state: ${submitResult.result.status.state}`
);
2.2. Test: Task polling (tasks/get)
Poll until the task reaches a terminal state:
let task = submitResult.result;
const startTime = Date.now();
while (!["completed", "failed", "canceled"].includes(task.status.state)) {
  if (Date.now() - startTime > TEST_TIMEOUT_MS) {
    fail(`Task ${task.id} did not complete within ${TEST_TIMEOUT_MS}ms`);
    break;
  }
  await sleep(1000);
  const getResult = await sendJsonRpc(agentUrl, {
    jsonrpc: "2.0",
    id: 2,
    method: "tasks/get",
    params: { id: task.id },
  });
  task = getResult.result;
}
assert(task.status.state === "completed", `Task should complete, got: ${task.status.state}`);
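The loop above sleeps a fixed second between polls. If backoff is preferred, the wait times could be precomputed as in this sketch; `backoffSchedule` and its default values are illustrative assumptions, not part of the protocol.

```typescript
// Returns the sleep durations (ms) for a polling loop. Delays grow by `factor`,
// are capped at `maxDelayMs`, and their sum never exceeds `timeoutMs`.
function backoffSchedule(
  timeoutMs: number,
  initialDelayMs = 500,
  factor = 2,
  maxDelayMs = 8000
): number[] {
  if (initialDelayMs <= 0 || factor < 1) {
    throw new RangeError("initialDelayMs must be > 0 and factor must be >= 1");
  }
  const delays: number[] = [];
  let elapsed = 0;
  let delay = initialDelayMs;
  while (elapsed + delay <= timeoutMs) {
    delays.push(delay);
    elapsed += delay;
    delay = Math.min(delay * factor, maxDelayMs);
  }
  return delays;
}

// With a 10 second budget the schedule is [500, 1000, 2000, 4000]:
// the next delay (8000) would push the total past the timeout.
const schedule = backoffSchedule(10_000);
```

A polling loop would then iterate over the schedule, sleeping for each entry before the next tasks/get call, and report a timeout once the schedule is exhausted.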
2.3. Test: Task cancellation
Submit a task and immediately cancel it:
const cancelTask = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 3,
  method: "tasks/send",
  params: { id: `test-cancel-${uuid()}`, sessionId: `session-${uuid()}`, message: { ... } },
});
const cancelResult = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 4,
  method: "tasks/cancel",
  params: { id: cancelTask.result.id },
});
assert(
  cancelResult.result.status.state === "canceled",
  "Canceled task should be in canceled state"
);
2.4. Test: Input-required state (multi-turn)
If any skill supports multi-turn interaction, send an ambiguous request that should trigger input-required, then provide the follow-up:
// Send ambiguous request
const multiTurnTask = await sendJsonRpc(agentUrl, { ... });
// Poll with tasks/get (as in 2.2) until input-required or completed
let multiTurn = multiTurnTask.result;
// If input-required, send follow-up on the same task and session
if (multiTurn.status.state === "input-required") {
  const followUp = await sendJsonRpc(agentUrl, {
    jsonrpc: "2.0",
    id: 6,
    method: "tasks/send",
    params: {
      id: multiTurn.id,
      sessionId: multiTurn.sessionId,
      message: { role: "user", parts: [{ type: "text", text: "Column A and Column B" }] },
    },
  });
  assert(
    ["working", "completed"].includes(followUp.result.status.state),
    "Follow-up should resume task"
  );
}
2.5. Test: State transition history
If the Agent Card declares stateTransitionHistory: true:
const getWithHistory = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 7,
  method: "tasks/get",
  params: { id: completedTaskId, historyLength: 100 },
});
assert(
  Array.isArray(getWithHistory.result.history),
  "Task should include history array"
);
assert(
  getWithHistory.result.history.length >= 2,
  "History should have at least 2 entries (submitted and completed)"
);
Expected: All lifecycle state transitions work correctly. Tasks complete successfully, cancel cleanly, and multi-turn interaction functions when supported.
On failure: Record the specific state transition that failed, the expected state, and the actual state. Include the full JSON-RPC response in the report for debugging.
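To report "the transition that failed, the expected state, and the actual state" mechanically, the observed states can be checked against a transition table. The table below is an assumption made for this sketch (terminal states allow nothing further, input-required can resume or end); align it with the protocol version under test. `findInvalidTransition` is a hypothetical helper name.

```typescript
type TaskState = "submitted" | "working" | "input-required" | "completed" | "failed" | "canceled";

// Assumed transition table for this sketch, not quoted from the specification.
const allowedNext: Record<TaskState, TaskState[]> = {
  submitted: ["working", "completed", "failed", "canceled"],
  working: ["working", "input-required", "completed", "failed", "canceled"],
  "input-required": ["working", "completed", "failed", "canceled"],
  completed: [],
  failed: [],
  canceled: [],
};

interface TransitionFailure {
  index: number;         // position of the offending state in the observed sequence
  from: TaskState;       // state before the bad transition
  actual: TaskState;     // state the agent reported
  expected: TaskState[]; // states that would have been acceptable
}

// Returns the first illegal transition, or null when the sequence is valid.
function findInvalidTransition(states: TaskState[]): TransitionFailure | null {
  for (let i = 1; i < states.length; i++) {
    const from = states[i - 1];
    const actual = states[i];
    const expected = allowedNext[from];
    if (expected.indexOf(actual) === -1) {
      return { index: i, from, actual, expected };
    }
  }
  return null;
}
```

The returned object carries exactly the three pieces of information the failure note asks for, so it can be serialized straight into the result message.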
Step 3: Validate SSE Streaming Responses
3.1. Skip this step if the Agent Card declares streaming: false.
3.2. Send a tasks/sendSubscribe request and validate the SSE stream:
const response = await fetch(`${agentUrl}/subscribe`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    jsonrpc: "2.0",
    id: 10,
    method: "tasks/sendSubscribe",
    params: {
      id: `test-stream-${uuid()}`,
      sessionId: `session-${uuid()}`,
      message: { role: "user", parts: [{ type: "text", text: "Stream test task" }] },
    },
  }),
});
assert(
  response.headers.get("content-type")?.includes("text/event-stream"),
  "Response must be text/event-stream"
);
3.3. Parse SSE events and validate structure:
const events: SSEEvent[] = [];
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let currentEvent: Partial<SSEEvent> = {};
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  // Parse only complete lines; keep the trailing partial line in the buffer
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";
  for (const line of lines) {
    if (line.startsWith("event: ")) {
      currentEvent.type = line.slice(7);
    } else if (line.startsWith("data: ")) {
      currentEvent.data = JSON.parse(line.slice(6));
    } else if (line.trim() === "" && currentEvent.data !== undefined) {
      // A blank line terminates the event
      events.push(currentEvent as SSEEvent);
      currentEvent = {};
    }
  }
}
3.4. Validate the event sequence:
- First event should be a status event with state submitted or working
- Intermediate events may include status updates and artifact deliveries
- Final event should have final: true with a terminal state
- No events should arrive after the final event
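The four sequence rules above can be checked in one pass over the collected events. This is a sketch under assumptions: the `StreamEvent` shape (a `type` plus a `data` object carrying `status.state` and `final`) is hypothetical and should be mapped to whatever the parser in 3.3 actually produces.

```typescript
// Hypothetical event shape for this sketch.
interface StreamEvent {
  type: string; // e.g. "status" or "artifact"
  data: { status?: { state: string }; final?: boolean };
}

const TERMINAL_STATES = ["completed", "failed", "canceled"];

// Returns a list of violated rules; an empty list means the sequence is valid.
function validateEventSequence(events: StreamEvent[]): string[] {
  if (events.length === 0) return ["stream produced no events"];
  const problems: string[] = [];

  // Rule 1: first event is a status event in state submitted or working.
  const firstState = events[0].data.status?.state;
  if (events[0].type !== "status" || (firstState !== "submitted" && firstState !== "working")) {
    problems.push("first event must be a status event in state submitted or working");
  }

  // Rules 3 and 4: locate the first event flagged final and inspect what follows.
  let finalIndex = -1;
  for (let i = 0; i < events.length; i++) {
    if (events[i].data.final === true) {
      finalIndex = i;
      break;
    }
  }
  if (finalIndex === -1) {
    problems.push("no event with final: true");
  } else {
    const state = events[finalIndex].data.status?.state;
    if (state === undefined || TERMINAL_STATES.indexOf(state) === -1) {
      problems.push(`final event has non-terminal state: ${state}`);
    }
    const trailing = events.length - 1 - finalIndex;
    if (trailing > 0) {
      problems.push(`${trailing} event(s) arrived after the final event`);
    }
  }
  return problems;
}
```

Rule 2 (intermediate status and artifact events) needs no explicit check here because the sketch places no restriction on what appears between the first and final events.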
3.5. Validate that SSE connection cleanup works:
- Close the connection mid-stream
- Verify the task can still be retrieved via tasks/get
- Verify no server errors from the premature disconnect
Expected: SSE stream delivers correctly formatted events in the right sequence, ending with a final terminal event.
On failure: If SSE is advertised but the endpoint returns a non-SSE response, record as a conformance failure. If events arrive out of order, record the sequence. If the stream never terminates, record a timeout.
Step 4: Test Error Handling and Edge Cases
4.1. Test: Unknown method
const unknownMethod = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 20,
  method: "tasks/nonexistent",
  params: {},
});
assert(unknownMethod.error?.code === -32601, "Should return method not found");
4.2. Test: Malformed JSON-RPC request
const malformed = await fetch(agentUrl, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: '{"not": "valid jsonrpc"}',
});
const response = await malformed.json();
assert(response.error?.code === -32600, "Should return invalid request");
4.3. Test: Get nonexistent task
const notFound = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 22,
  method: "tasks/get",
  params: { id: "nonexistent-task-id" },
});
assert(notFound.error, "Should return error for nonexistent task");
4.4. Test: Cancel already completed task
const cancelCompleted = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 23,
  method: "tasks/cancel",
  params: { id: completedTaskId },
});
assert(cancelCompleted.error, "Should error when canceling completed task");
4.5. Test: Authentication enforcement
If authentication is configured, send a request without credentials:
const unauthResponse = await fetch(agentUrl, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ jsonrpc: "2.0", id: 24, method: "tasks/get", params: { id: "x" } }),
});
assert(unauthResponse.status === 401, "Should reject unauthenticated requests");
4.6. Test: Agent Card is publicly accessible without auth
const publicCard = await fetch(`${agentUrl}/.well-known/agent.json`);
assert(publicCard.status === 200, "Agent Card should be publicly accessible");
Expected: All error conditions return appropriate JSON-RPC error codes without crashing the server.
On failure: Record each error handling test that fails. Server crashes during error testing are critical failures that must be fixed before deployment.
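The error tests in this step only look at `error.code`. A slightly stricter sketch can also check the envelope that JSON-RPC 2.0 requires of an error response: `jsonrpc` equal to "2.0", no `result` member alongside `error`, an integer `code`, and a string `message`. The helper name `checkErrorResponse` is hypothetical; the numeric codes are the standard ones from the JSON-RPC 2.0 specification.

```typescript
// Standard error codes defined by JSON-RPC 2.0.
const JSON_RPC_ERRORS = {
  PARSE_ERROR: -32700,
  INVALID_REQUEST: -32600,
  METHOD_NOT_FOUND: -32601,
  INVALID_PARAMS: -32602,
  INTERNAL_ERROR: -32603,
};

// Returns a list of problems with an error response; empty means well-formed
// and carrying the expected code.
function checkErrorResponse(response: unknown, expectedCode: number): string[] {
  if (typeof response !== "object" || response === null) {
    return ["response is not a JSON object"];
  }
  const problems: string[] = [];
  const body = response as Record<string, unknown>;
  if (body.jsonrpc !== "2.0") problems.push('jsonrpc must be "2.0"');
  if ("result" in body) problems.push("error response must not contain result");

  const error = body.error;
  if (typeof error !== "object" || error === null) {
    problems.push("missing error object");
    return problems;
  }
  const { code, message } = error as Record<string, unknown>;
  if (typeof code !== "number" || Math.floor(code) !== code) {
    problems.push("error.code must be an integer");
  } else if (code !== expectedCode) {
    problems.push(`expected code ${expectedCode}, got ${code}`);
  }
  if (typeof message !== "string") problems.push("error.message must be a string");
  return problems;
}
```

For example, the unknown-method test in 4.1 could pass its parsed response together with `JSON_RPC_ERRORS.METHOD_NOT_FOUND` and record every returned problem in the result message.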
Step 5: Generate Interoperability Conformance Report
5.1. Aggregate all test results into a structured report:
interface ConformanceReport {
  agentUrl: string;
  agentName: string;
  agentVersion: string;
  testDate: string;
  summary: {
    total: number;
    passed: number;
    failed: number;
    skipped: number;
  };
  categories: {
    agentCard: ConformanceResult[];
    lifecycle: ConformanceResult[];
    streaming: ConformanceResult[];
    errorHandling: ConformanceResult[];
    interop: ConformanceResult[];
  };
  conformanceLevel: "full" | "partial" | "minimal" | "non-conformant";
}
5.2. Compute the conformance level:
- full: All tests pass, including streaming and push notifications
- partial: Core lifecycle tests pass, some optional features fail
- minimal: Agent Card valid and basic task send/get works
- non-conformant: Agent Card invalid or basic lifecycle broken
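One possible reading of these four levels as code is sketched below. Several choices are assumptions: the names in `BASIC_TESTS` are hypothetical identifiers for the basic send/get checks, "core lifecycle" is taken to mean every result in the lifecycle category, and a skipped test counts as not passed, so it lowers "full" to "partial".

```typescript
type ConformanceLevel = "full" | "partial" | "minimal" | "non-conformant";

interface Outcome {
  test: string;
  category: string;
  status: "pass" | "fail" | "skip";
}

// Hypothetical names of the checks that cover basic tasks/send and tasks/get.
const BASIC_TESTS = ["task-submit", "task-poll"];

function computeConformanceLevel(results: Outcome[]): ConformanceLevel {
  const hasFailure = (category: string): boolean =>
    results.some((r) => r.category === category && r.status === "fail");
  const basicsPass = BASIC_TESTS.every((testName) =>
    results.some((r) => r.test === testName && r.status === "pass")
  );

  // non-conformant: Agent Card invalid or basic lifecycle broken
  if (hasFailure("agent-card") || !basicsPass) return "non-conformant";
  // minimal: card valid and send/get work, but other lifecycle tests fail
  if (hasFailure("lifecycle")) return "minimal";
  // full only when every recorded test passed; anything else is partial
  return results.every((r) => r.status === "pass") ? "full" : "partial";
}
```

Teams that do not want skipped optional features to cost an agent the "full" level could filter out skipped results before the last line.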
5.3. Generate the report in the requested format:
- json: Machine-readable for CI/CD integration
- markdown: Human-readable with pass/fail tables
- junit: XML format for test framework integration
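For the junit option, a minimal sketch of a renderer follows. It emits the commonly used JUnit-style layout (a testsuite element with testcase children and failure or skipped markers); there is no single formal schema, so check the attributes against what the CI system at hand expects. `toJUnitXml` and `escapeXml` are hypothetical helper names.

```typescript
interface Outcome {
  test: string;
  category: string;
  status: "pass" | "fail" | "skip";
  message?: string;
  duration_ms?: number;
}

// Escapes the characters that are unsafe inside XML attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function toJUnitXml(suiteName: string, results: Outcome[]): string {
  const failures = results.filter((r) => r.status === "fail").length;
  const skipped = results.filter((r) => r.status === "skip").length;
  const lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<testsuite name="${escapeXml(suiteName)}" tests="${results.length}" failures="${failures}" skipped="${skipped}">`,
  ];
  for (const r of results) {
    // JUnit-style reports express time in seconds.
    const seconds = ((r.duration_ms ?? 0) / 1000).toFixed(3);
    const open = `  <testcase classname="${escapeXml(r.category)}" name="${escapeXml(r.test)}" time="${seconds}"`;
    if (r.status === "pass") {
      lines.push(`${open}/>`);
    } else if (r.status === "fail") {
      lines.push(`${open}><failure message="${escapeXml(r.message ?? "")}"/></testcase>`);
    } else {
      lines.push(`${open}><skipped/></testcase>`);
    }
  }
  lines.push("</testsuite>");
  return lines.join("\n");
}
```

Mapping the result category to classname lets most CI dashboards group the cases by agent-card, lifecycle, streaming, and so on.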
5.4. Include recommendations for fixing failures:
## Failed Tests
| Test | Category | Message | Recommendation |
|------|----------|---------|----------------|
| cancel-completed-task | error-handling | Server returned 500 | Add guard for terminal state transitions |
| sse-final-event | streaming | No final event received | Ensure SSE sends event with final:true |
5.5. If bidirectional testing was requested (two agents), validate:
- Agent A can discover Agent B's Agent Card
- Agent A can send a task to Agent B
- Agent B can send a task to Agent A
- Both agents handle concurrent tasks without interference
Expected: A complete conformance report with pass/fail results, conformance level, and actionable recommendations.
On failure: If the report generation itself fails, output raw test results to stdout as a fallback. The test data should never be lost due to a reporting error.
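The stdout fallback can be a thin wrapper around whichever renderer is in use. In this sketch `emitReport`, `render`, and `write` are hypothetical names; `write` defaults to `console.log` so the raw results land on stdout when rendering throws.

```typescript
// Renders the report; if rendering throws, writes the raw results as JSON
// instead so the test data survives. Returns false when the fallback was used.
function emitReport<T>(
  results: T[],
  render: (results: T[]) => string,
  write: (text: string) => void = console.log
): boolean {
  let text: string;
  let rendered = true;
  try {
    text = render(results);
  } catch (err) {
    rendered = false;
    const reason = err instanceof Error ? err.message : String(err);
    text = `report generation failed: ${reason}\n${JSON.stringify(results, null, 2)}`;
  }
  write(text);
  return rendered;
}
```

A CI job can use the boolean to exit with a distinct status, which separates a broken reporter from a non-conformant agent.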
Validation
- Agent Card is fetched and structurally validated
- At least one task completes the full lifecycle (submitted -> working -> completed)
- Task cancellation works correctly
- Error responses use correct JSON-RPC error codes
- SSE streaming is tested if advertised in capabilities
- Authentication is enforced on task endpoints but not on Agent Card
- Conformance report is generated in the requested format
- Failed tests include actionable remediation guidance
- Test suite can run in CI/CD without manual intervention
Common Pitfalls
- Testing against a cold server: Some agents take time to initialize. Add a health check or warmup request before running tests.
- Hardcoded test data: Use dynamic task and session IDs (UUIDs) to avoid collisions when running tests repeatedly. Never assume a specific task ID is available.
- Ignoring timing: Task transitions are asynchronous. Always poll with backoff instead of asserting immediate state changes.
- SSE parsing complexity: SSE events may span multiple chunks. Buffer incoming data and parse complete events, not raw chunks.
- Testing only the happy path: Error handling tests are as important as success tests. Malformed requests, invalid transitions, and auth failures must all be covered.
- Network dependency: Tests should be runnable against localhost for development and remote URLs for production. Parameterize the agent URL.
- Assuming skill behavior: The test suite validates protocol conformance, not skill correctness. Use example phrases from the Agent Card to trigger skills, but do not assert specific output content.
Related Skills
- design-a2a-agent-card - design the Agent Card being tested
- implement-a2a-server - implement the server being tested
- build-ci-cd-pipeline - integrate conformance tests into CI/CD
- troubleshoot-mcp-connection - debugging patterns applicable to A2A connectivity
- review-software-architecture - architecture review for multi-agent systems
