SKILL·CCFC58

test-a2a-interop

Name: test-a2a-interop
Author: pjt222

Updated 1 month ago

Category: Testing. Tags: ai, testing, automation, design

About

This skill tests A2A agent interoperability by validating Agent Card conformance, exercising every task lifecycle state, and verifying streaming and error handling. Use it to verify new A2A server implementations, validate interoperability between agents, run conformance tests in CI/CD, debug multi-agent workflows, or certify agents for protocol registries.

Quick install

Claude Code

Recommended

Primary

npx skills add pjt222/agent-almanac -a claude-code

Plugin command (alternative)

/plugin add https://github.com/pjt222/agent-almanac

Git clone (alternative)

git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/test-a2a-interop

Copy and paste this command into Claude Code to install this skill

Documentation

Test A2A Interoperability

Validate that an A2A agent implementation conforms to the protocol specification by testing Agent Card discovery, task lifecycle management, SSE streaming, error handling, and multi-agent communication patterns.

When to use

Verifying a new A2A server implementation before deployment
Validating interoperability between two or more A2A agents
Running conformance tests as part of CI/CD for A2A services
Debugging failures in multi-agent A2A workflows
Certifying that an agent meets A2A protocol requirements for a registry

Inputs

Required: Base URL of the A2A agent under test
Required: Authentication credentials (if the agent requires them)
Optional: Second agent URL for bidirectional interop testing
Optional: Specific skills to test (default: all skills in the Agent Card)
Optional: Test timeout per task (default: 60 seconds)
Optional: Output format for the conformance report (json, markdown, junit)
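The inputs above could be gathered into one configuration object before any test runs. The sketch below is only an illustration: the names `InteropTestConfig` and `resolveConfig` are hypothetical, and the `json` default format is an assumption. Only the 60-second timeout default and the three format names come from the list above.

```typescript
type ReportFormat = "json" | "markdown" | "junit";

// Hypothetical shape for the inputs listed above
interface InteropTestConfig {
  agentUrl: string;        // required: base URL of the agent under test
  credentials?: string;    // needed only if the agent enforces authentication
  secondAgentUrl?: string; // optional: enables bidirectional tests
  skills?: string[];       // optional: undefined means "all skills in the Agent Card"
  timeoutMs: number;       // per-task timeout
  format: ReportFormat;
}

const DEFAULT_TIMEOUT_MS = 60_000; // "default: 60 seconds" from the inputs list

function resolveConfig(
  input: Partial<InteropTestConfig> & { agentUrl: string }
): InteropTestConfig {
  if (input.agentUrl.trim() === "") {
    throw new Error("agentUrl is required");
  }
  return {
    ...input,
    // Strip trailing slashes so later path concatenation stays predictable
    agentUrl: input.agentUrl.replace(/\/+$/, ""),
    timeoutMs: input.timeoutMs ?? DEFAULT_TIMEOUT_MS,
    format: input.format ?? "json", // assumed default, not stated in the text
  };
}
```

Resolving defaults in one place keeps the later steps free of repeated fallback logic.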

Procedure

Step 1: Fetch and Validate Agent Cards

1.1. Fetch the Agent Card from the well-known endpoint:

curl -s https://agent.example.com/.well-known/agent.json -o agent-card.json

1.2. Validate required top-level fields:

const requiredFields = ["name", "description", "url", "skills"];
for (const field of requiredFields) {
  assert(agentCard[field] !== undefined, `Missing required field: ${field}`);
}

1.3. Validate each skill entry:

for (const skill of agentCard.skills) {
  assert(skill.id, "Skill missing id");
  assert(skill.name, "Skill missing name");
  assert(skill.description, "Skill missing description");
  assert(
    Array.isArray(skill.inputModes) && skill.inputModes.length > 0,
    `Skill ${skill.id} missing inputModes`
  );
  assert(
    Array.isArray(skill.outputModes) && skill.outputModes.length > 0,
    `Skill ${skill.id} missing outputModes`
  );
}

1.4. Validate authentication configuration:

If authentication.schemes includes oauth2, verify credentials.oauth2 has tokenUrl
If authentication.schemes includes apiKey, verify credentials.apiKey has headerName
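The two rules above can be expressed as one table-driven check. This is a minimal sketch, assuming the caller extracts the `authentication` and `credentials` objects from the Agent Card (the text does not say where `credentials` sits in the card). The names `validateAuthConfig` and `AuthIssue` are hypothetical.

```typescript
interface AuthIssue {
  field: string;
  reason: string;
}

// Scheme name -> companion field required by the rules above
const REQUIRED_BY_SCHEME = new Map<string, string>([
  ["oauth2", "tokenUrl"],
  ["apiKey", "headerName"],
]);

function validateAuthConfig(
  authentication: { schemes?: unknown } | undefined,
  credentials: Record<string, Record<string, unknown> | undefined> | undefined
): AuthIssue[] {
  const issues: AuthIssue[] = [];
  if (authentication === undefined) return issues; // no auth declared, nothing to check

  const schemes = authentication.schemes;
  if (!Array.isArray(schemes)) {
    issues.push({ field: "authentication.schemes", reason: "must be an array" });
    return issues;
  }

  for (const scheme of schemes) {
    if (typeof scheme !== "string") {
      issues.push({ field: "authentication.schemes", reason: "entries must be strings" });
      continue;
    }
    const required = REQUIRED_BY_SCHEME.get(scheme);
    if (required === undefined) continue; // scheme not covered by these two rules
    const value = credentials?.[scheme]?.[required];
    if (typeof value !== "string" || value === "") {
      issues.push({ field: `credentials.${scheme}.${required}`, reason: "missing or empty" });
    }
  }
  return issues;
}
```

Returning a list of issues instead of throwing lets the suite record every problem in a single pass.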

1.5. Validate that capability flags are boolean values.

1.6. Record validation results in the conformance report:

interface ConformanceResult {
  test: string;
  category: "agent-card" | "lifecycle" | "streaming" | "error-handling" | "interop";
  status: "pass" | "fail" | "skip";
  message?: string;
  duration_ms?: number;
}

Expected: Agent Card passes all structural validation checks.

On failure: Record each validation failure with the specific field and reason. Do not abort; continue testing other aspects. An invalid Agent Card is itself a test result.
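One way to follow the "record, do not abort" rule is to wrap each check so a thrown assertion becomes a `fail` entry instead of stopping the run. The sketch below assumes synchronous checks and reuses the `ConformanceResult` shape from 1.6. The helper name `runCheck` is hypothetical.

```typescript
interface ConformanceResult {
  test: string;
  category: "agent-card" | "lifecycle" | "streaming" | "error-handling" | "interop";
  status: "pass" | "fail" | "skip";
  message?: string;
  duration_ms?: number;
}

// Runs one check and records its outcome; never rethrows
function runCheck(
  results: ConformanceResult[],
  test: string,
  category: ConformanceResult["category"],
  check: () => void
): void {
  const start = Date.now();
  try {
    check();
    results.push({ test, category, status: "pass", duration_ms: Date.now() - start });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    results.push({ test, category, status: "fail", message, duration_ms: Date.now() - start });
  }
}
```

An asynchronous variant would take a `() => Promise<void>` and await it inside the same try/catch.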

Step 2: Send Test Tasks Covering All Lifecycle States

2.1. Test: Task submission (submitted -> working -> completed)

Send a task that the agent should be able to handle based on its declared skills:

const submitResult = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 1,
  method: "tasks/send",
  params: {
    id: `test-${uuid()}`,
    sessionId: `session-${uuid()}`,
    message: {
      role: "user",
      parts: [{ type: "text", text: skillExamples[0] }],
    },
  },
});

assert(submitResult.result, "tasks/send should return a result");
assert(submitResult.result.id, "Result should include task ID");
assert(
  ["submitted", "working", "completed"].includes(submitResult.result.status.state),
  `Unexpected initial state: ${submitResult.result.status.state}`
);
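The snippets in this step call `sendJsonRpc`, which the text does not define. The following is a minimal sketch of what such a helper might look like, assuming the agent accepts JSON-RPC 2.0 over HTTP POST at `agentUrl`. `makeSendJsonRpc` and `FetchLike` are hypothetical names. The fetch function is injected so the helper can be exercised without a network.

```typescript
// Structural type covering only the part of fetch() this helper uses
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ json(): Promise<unknown> }>;

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: string;
  params?: unknown;
}

interface JsonRpcResponse {
  jsonrpc?: string;
  id?: number | string | null;
  result?: any; // task payload; shape depends on the method called
  error?: { code: number; message: string; data?: unknown };
}

function makeSendJsonRpc(
  fetchImpl: FetchLike,
  extraHeaders: Record<string, string> = {}
) {
  return async function sendJsonRpc(
    agentUrl: string,
    request: JsonRpcRequest
  ): Promise<JsonRpcResponse> {
    const response = await fetchImpl(agentUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json", ...extraHeaders },
      body: JSON.stringify(request),
    });
    // Return the parsed body whether it carries `result` or `error`
    return (await response.json()) as JsonRpcResponse;
  };
}
```

In Node 18+ or a browser, `makeSendJsonRpc(fetch)` should yield a function with the two-argument shape used in these snippets. Authentication headers, if any, can be passed as the second argument.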

2.2. Test: Task polling (tasks/get)

Poll until the task reaches a terminal state:

let task = submitResult.result;
const startTime = Date.now();
while (!["completed", "failed", "canceled"].includes(task.status.state)) {
  if (Date.now() - startTime > TEST_TIMEOUT_MS) {
    fail(`Task ${task.id} did not complete within ${TEST_TIMEOUT_MS}ms`);
    break;
  }
  await sleep(1000);
  const getResult = await sendJsonRpc(agentUrl, {
    jsonrpc: "2.0",
    id: 2,
    method: "tasks/get",
    params: { id: task.id },
  });
  task = getResult.result;
}

assert(task.status.state === "completed", `Task should complete, got: ${task.status.state}`);

2.3. Test: Task cancellation

Submit a task and immediately cancel it:

const cancelTask = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 3,
  method: "tasks/send",
  params: { id: `test-cancel-${uuid()}`, sessionId: `session-${uuid()}`, message: { ... } },
});

const cancelResult = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 4,
  method: "tasks/cancel",
  params: { id: cancelTask.result.id },
});

assert(
  cancelResult.result.status.state === "canceled",
  "Canceled task should be in canceled state"
);

2.4. Test: Input-required state (multi-turn)

If any skill supports multi-turn interaction, send an ambiguous request that should trigger input-required, then provide the follow-up:

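// Send ambiguous request
const multiTurnTask = await sendJsonRpc(agentUrl, { ... });

// Poll until input-required or completed (same loop as in 2.2),
// keeping the latest state of this task in `task`
task = multiTurnTask.result;
// If input-required, send follow-up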
if (task.status.state === "input-required") {
  const followUp = await sendJsonRpc(agentUrl, {
    jsonrpc: "2.0",
    id: 6,
    method: "tasks/send",
    params: {
      id: task.id,
      sessionId: task.sessionId,
      message: { role: "user", parts: [{ type: "text", text: "Column A and Column B" }] },
    },
  });
  assert(
    ["working", "completed"].includes(followUp.result.status.state),
    "Follow-up should resume task"
  );
}

2.5. Test: State transition history

If the Agent Card declares stateTransitionHistory: true:

const getWithHistory = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 7,
  method: "tasks/get",
  params: { id: completedTaskId, historyLength: 100 },
});

assert(
  Array.isArray(getWithHistory.result.history),
  "Task should include history array"
);
assert(
  getWithHistory.result.history.length >= 2,
  "History should have at least 2 entries (submitted and completed)"
);

Expected: All lifecycle state transitions work correctly. Tasks complete successfully, cancel cleanly, and multi-turn interaction functions when supported.

On failure: Record the specific state transition that failed, the expected state, and the actual state. Include the full JSON-RPC response in the report for debugging.
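To report exactly which transition failed, the states observed while polling can be compared against a table of allowed transitions. The table below is an assumption inferred from the tests in this step (it is not quoted from the protocol specification), and `findInvalidTransitions` is a hypothetical helper name.

```typescript
type TaskState =
  | "submitted"
  | "working"
  | "input-required"
  | "completed"
  | "failed"
  | "canceled";

// Assumed transition table; terminal states allow no further transitions
const ALLOWED_TRANSITIONS: Record<TaskState, TaskState[]> = {
  submitted: ["working", "input-required", "completed", "failed", "canceled"],
  working: ["input-required", "completed", "failed", "canceled"],
  "input-required": ["working", "completed", "failed", "canceled"],
  completed: [],
  failed: [],
  canceled: [],
};

interface TransitionFailure {
  index: number; // position of the offending state in the observed list
  from: TaskState;
  to: TaskState;
}

function findInvalidTransitions(observed: TaskState[]): TransitionFailure[] {
  const failures: TransitionFailure[] = [];
  for (let i = 1; i < observed.length; i++) {
    const from = observed[i - 1];
    const to = observed[i];
    if (from === to) continue; // repeated polls of one state are not transitions
    if (!ALLOWED_TRANSITIONS[from].includes(to)) {
      failures.push({ index: i, from, to });
    }
  }
  return failures;
}
```

Each returned entry carries the `from` and `to` states, which map directly onto the "expected state" and "actual state" fields the failure record asks for.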

Step 3: Validate SSE Streaming Responses

3.1. Skip this step if the Agent Card declares streaming: false.

3.2. Send a tasks/sendSubscribe request and validate the SSE stream:

const response = await fetch(`${agentUrl}/subscribe`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    jsonrpc: "2.0",
    id: 10,
    method: "tasks/sendSubscribe",
    params: {
      id: `test-stream-${uuid()}`,
      sessionId: `session-${uuid()}`,
      message: { role: "user", parts: [{ type: "text", text: "Stream test task" }] },
    },
  }),
});

assert(
  response.headers.get("content-type")?.includes("text/event-stream"),
  "Response must be text/event-stream"
);

3.3. Parse SSE events and validate structure:

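type SSEEvent = { type?: string; data?: unknown };
const events: SSEEvent[] = [];
const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";
let currentEvent: SSEEvent = {};

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // Parse only complete lines; keep the trailing partial line for the next chunk
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";
  for (const rawLine of lines) {
    const line = rawLine.replace(/\r$/, ""); // tolerate CRLF line endings
    if (line.startsWith("event: ")) {
      currentEvent.type = line.slice(7);
    } else if (line.startsWith("data: ")) {
      // Assumes each event carries its JSON payload on a single data line
      currentEvent.data = JSON.parse(line.slice(6));
    } else if (line === "" && currentEvent.data !== undefined) {
      // A blank line terminates the current event
      events.push(currentEvent);
      currentEvent = {};
    }
  }
}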

3.4. Validate the event sequence:

The first event should be a status event with state submitted or working
Intermediate events may include status updates and artifact deliveries
Final event should have final: true with a terminal state
No events should arrive after the final event
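The four ordering rules above can be checked over the parsed events in one pass. This sketch assumes each event's `data` exposes `status.state` and `final` at the top level. Real payloads may wrap these in a JSON-RPC envelope, in which case the caller would unwrap them first. `StreamEvent` and `validateEventSequence` are hypothetical names.

```typescript
interface StreamEvent {
  type?: string;
  data: { status?: { state: string }; final?: boolean };
}

const TERMINAL_STATES = ["completed", "failed", "canceled"];

// Returns a list of human-readable problems; an empty list means the sequence is valid
function validateEventSequence(events: StreamEvent[]): string[] {
  if (events.length === 0) return ["stream produced no events"];
  const problems: string[] = [];

  const firstState = events[0].data.status?.state;
  if (firstState !== "submitted" && firstState !== "working") {
    problems.push(`first event should report submitted or working, got ${String(firstState)}`);
  }

  const finalIndex = events.findIndex((e) => e.data.final === true);
  if (finalIndex === -1) {
    problems.push("no event with final: true");
    return problems;
  }

  const finalState = events[finalIndex].data.status?.state;
  if (finalState === undefined || !TERMINAL_STATES.includes(finalState)) {
    problems.push(`final event should carry a terminal state, got ${String(finalState)}`);
  }

  const trailing = events.length - 1 - finalIndex;
  if (trailing > 0) {
    problems.push(`${trailing} event(s) arrived after the final event`);
  }
  return problems;
}
```

Intermediate events are deliberately left unconstrained, matching the rule that they may be status updates or artifact deliveries.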

3.5. Validate that SSE connection cleanup works:

Close the connection mid-stream
Verify the task can still be retrieved via tasks/get
Verify the premature disconnect causes no server errors

Expected: The SSE stream delivers correctly formatted events in the right sequence, ending with a final terminal event.

On failure: If SSE is advertised but the endpoint returns a non-SSE response, record as a conformance failure. If events arrive out of order, record the sequence. If the stream never terminates, record a timeout.

Step 4: Test Error Handling and Edge Cases

4.1. Test: Unknown method

const unknownMethod = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 20,
  method: "tasks/nonexistent",
  params: {},
});
assert(unknownMethod.error?.code === -32601, "Should return method not found");

4.2. Test: Malformed JSON-RPC request

const malformed = await fetch(agentUrl, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: '{"not": "valid jsonrpc"}',
});
const response = await malformed.json();
assert(response.error?.code === -32600, "Should return invalid request");

4.3. Test: Get nonexistent task

const notFound = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 22,
  method: "tasks/get",
  params: { id: "nonexistent-task-id" },
});
assert(notFound.error, "Should return error for nonexistent task");

4.4. Test: Cancel already completed task

const cancelCompleted = await sendJsonRpc(agentUrl, {
  jsonrpc: "2.0",
  id: 23,
  method: "tasks/cancel",
  params: { id: completedTaskId },
});
assert(cancelCompleted.error, "Should error when canceling completed task");

4.5. Test: Authentication enforcement

If authentication is configured, send a request without credentials:

const unauthResponse = await fetch(agentUrl, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ jsonrpc: "2.0", id: 24, method: "tasks/get", params: { id: "x" } }),
});
assert(unauthResponse.status === 401, "Should reject unauthenticated requests");

4.6. Test: Agent Card is publicly accessible without auth

const publicCard = await fetch(`${agentUrl}/.well-known/agent.json`);
assert(publicCard.status === 200, "Agent Card should be publicly accessible");

Expected: All error conditions return appropriate JSON-RPC error codes without crashing the server.

On failure: Record each error-handling test that fails. Server crashes during error testing are critical failures that must be fixed before deployment.
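The error tests in this step all check the same thing: that a response is a well-formed JSON-RPC 2.0 error, optionally with a specific code. A small shared checker could look like the sketch below. The code values are the predefined ones from JSON-RPC 2.0. `checkJsonRpcError` is a hypothetical helper name.

```typescript
// Predefined error codes from the JSON-RPC 2.0 specification
const JSON_RPC_ERRORS = {
  PARSE_ERROR: -32700,
  INVALID_REQUEST: -32600,
  METHOD_NOT_FOUND: -32601,
  INVALID_PARAMS: -32602,
  INTERNAL_ERROR: -32603,
} as const;

interface JsonRpcErrorBody {
  result?: unknown;
  error?: { code?: unknown; message?: unknown };
}

// Returns null when the response is a well-formed error (with the expected
// code, if one is given); otherwise returns a description of the problem
function checkJsonRpcError(
  response: JsonRpcErrorBody,
  expectedCode?: number
): string | null {
  if (response.result !== undefined) return "expected an error but received a result";
  if (response.error === undefined) return "response has neither result nor error";
  const { code, message } = response.error;
  if (typeof code !== "number" || !Number.isInteger(code)) {
    return "error.code must be an integer";
  }
  if (typeof message !== "string") return "error.message must be a string";
  if (expectedCode !== undefined && code !== expectedCode) {
    return `expected code ${expectedCode}, got ${code}`;
  }
  return null;
}
```

For tests 4.3 and 4.4, where the text only requires that some error is returned, the second argument can simply be omitted.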

Step 5: Generate Interoperability Conformance Report

5.1. Aggregate all test results into a structured report:

interface ConformanceReport {
  agentUrl: string;
  agentName: string;
  agentVersion: string;
  testDate: string;
  summary: {
    total: number;
    passed: number;
    failed: number;
    skipped: number;
  };
  categories: {
    agentCard: ConformanceResult[];
    lifecycle: ConformanceResult[];
    streaming: ConformanceResult[];
    errorHandling: ConformanceResult[];
    interop: ConformanceResult[];
  };
  conformanceLevel: "full" | "partial" | "minimal" | "non-conformant";
}

5.2. Calculate the conformance level:

full: All tests pass, including streaming and push notifications
partial: Core lifecycle tests pass, some optional features fail
minimal: Agent Card valid and basic task send/get works
non-conformant: Agent Card invalid or basic lifecycle broken
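The four levels above leave some room for interpretation, so the following is one possible reading rather than a definitive rule. It assumes that "basic task send/get" is identified by two hypothetical test names (`task-send`, `task-get`), that a skipped test prevents `full`, and that skipped lifecycle tests still allow `partial`.

```typescript
type ConformanceLevel = "full" | "partial" | "minimal" | "non-conformant";

interface LevelInput {
  test: string;
  category: string;
  status: "pass" | "fail" | "skip";
}

// Hypothetical names for the basic send/get tests
const BASIC_TESTS = ["task-send", "task-get"];

function computeConformanceLevel(results: LevelInput[]): ConformanceLevel {
  const passed = (r: LevelInput): boolean => r.status === "pass";

  // Agent Card invalid or basic lifecycle broken -> non-conformant
  const cardOk = results.filter((r) => r.category === "agent-card").every(passed);
  const basicOk = BASIC_TESTS.every((name) =>
    results.some((r) => r.test === name && passed(r))
  );
  if (!cardOk || !basicOk) return "non-conformant";

  // Every recorded test passed (assumption: a skip prevents "full")
  if (results.every(passed)) return "full";

  // No lifecycle test failed; only optional features fell short -> partial
  const lifecycleFailed = results.some(
    (r) => r.category === "lifecycle" && r.status === "fail"
  );
  return lifecycleFailed ? "minimal" : "partial";
}
```

Under this reading an agent that declares streaming: false ends up at `partial`. A stricter or looser policy would only change the `full` condition.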

5.3. Generate the report in the requested format:

json: Machine-readable for CI/CD integration
markdown: Human-readable with pass/fail tables
junit: XML format for test framework integration
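Of the three formats, junit is the one with the least obvious mapping. The sketch below renders results into the commonly used JUnit XML layout (a `testsuite` containing `testcase` elements). CI tools differ in which attributes they require, so treat the attribute set as an assumption. `toJUnitXml` and `escapeXml` are hypothetical names.

```typescript
interface JUnitInput {
  test: string;
  category: string;
  status: "pass" | "fail" | "skip";
  message?: string;
  duration_ms?: number;
}

// Escape characters that are not allowed inside XML attribute values
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function toJUnitXml(suiteName: string, results: JUnitInput[]): string {
  const failures = results.filter((r) => r.status === "fail").length;
  const skipped = results.filter((r) => r.status === "skip").length;

  const cases = results.map((r) => {
    const seconds = ((r.duration_ms ?? 0) / 1000).toFixed(3);
    const open =
      `  <testcase classname="${escapeXml(r.category)}" ` +
      `name="${escapeXml(r.test)}" time="${seconds}"`;
    if (r.status === "fail") {
      return `${open}>\n    <failure message="${escapeXml(r.message ?? "")}"/>\n  </testcase>`;
    }
    if (r.status === "skip") {
      return `${open}>\n    <skipped/>\n  </testcase>`;
    }
    return `${open}/>`;
  });

  return [
    `<?xml version="1.0" encoding="UTF-8"?>`,
    `<testsuite name="${escapeXml(suiteName)}" tests="${results.length}" ` +
      `failures="${failures}" skipped="${skipped}">`,
    ...cases,
    `</testsuite>`,
  ].join("\n");
}
```

The category is mapped to `classname` so that CI dashboards group results the same way the report's `categories` object does.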

5.4. Include recommendations for fixing failures:

## Failed Tests

| Test | Category | Message | Recommendation |
|------|----------|---------|----------------|
| cancel-completed-task | error-handling | Server returned 500 | Add guard for terminal state transitions |
| sse-final-event | streaming | No final event received | Ensure SSE sends event with final:true |

5.5. If bidirectional testing was requested (two agents), validate:

Agent A can discover Agent B's Agent Card
Agent A can send a task to Agent B
Agent B can send a task to Agent A
Both agents handle concurrent tasks without interference

Expected: A complete conformance report with pass/fail results, conformance level, and actionable recommendations.

On failure: If report generation itself fails, output raw test results to stdout as a fallback. Test data should never be lost due to a reporting error.
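The fallback rule above amounts to a try/catch around the renderer. A minimal sketch, assuming the renderer and the output sink are passed in as functions (`emitReport` is a hypothetical name):

```typescript
// Renders and writes the report; if rendering throws, writes the raw
// results as JSON instead so the test data is never lost
function emitReport<T>(
  results: T[],
  render: (results: T[]) => string,
  write: (text: string) => void
): "report" | "fallback" {
  let text: string;
  try {
    text = render(results);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    write(`Report generation failed (${reason}); raw results follow:`);
    write(JSON.stringify(results, null, 2));
    return "fallback";
  }
  write(text);
  return "report";
}
```

In a CI run, `write` would typically be `(text) => console.log(text)`, and the return value can drive the exit code.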

Validation

Agent Card is fetched and structurally validated
At least one task completes the full lifecycle (submitted -> working -> completed)
Task cancellation works correctly
Error responses use correct JSON-RPC error codes
SSE streaming is tested if advertised in capabilities
Authentication is enforced on task endpoints but not on Agent Card
Conformance report is generated in the requested format
Failed tests include actionable remediation guidance
Test suite can run in CI/CD without manual intervention

Common Pitfalls

Testing against a cold server: Some agents take time to initialize. Add a health check or warmup request before running tests.
Hardcoded test data: Use dynamic task and session IDs (UUIDs) to avoid collisions when running tests repeatedly. Never assume a specific task ID is available.
Ignoring timing: Task transitions are asynchronous. Always poll with backoff instead of asserting immediate state changes.
SSE parsing complexity: SSE events may span multiple chunks. Buffer incoming data and parse complete events, not raw chunks.
Testing only the happy path: Error handling tests are as important as success tests. Malformed requests, invalid transitions, and auth failures must all be covered.
Network dependency: Tests should be runnable against localhost for development and remote URLs for production. Parameterize the agent URL.
Assuming skill behavior: The test suite validates protocol conformance, not skill correctness. Use example phrases from the Agent Card to trigger skills, but do not assert specific output content.
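The "Ignoring timing" pitfall recommends polling with backoff, whereas the loop in step 2.2 sleeps a fixed second between polls. The sketch below shows one way to do that with exponential delays capped by the overall timeout. The parameter defaults (250 ms initial delay, factor 2, 5 s cap) are assumptions, and `backoffDelays` and `pollUntil` are hypothetical names. The `sleep` function is injected, matching the undefined `sleep` used in step 2.2.

```typescript
// Exponential delay schedule whose sum never exceeds timeoutMs
function backoffDelays(
  timeoutMs: number,
  initialMs = 250,
  factor = 2,
  maxMs = 5000
): number[] {
  if (initialMs <= 0 || factor < 1 || maxMs <= 0) {
    throw new Error("initialMs and maxMs must be positive and factor at least 1");
  }
  const delays: number[] = [];
  let elapsed = 0;
  let next = initialMs;
  while (elapsed < timeoutMs) {
    const delay = Math.min(next, maxMs, timeoutMs - elapsed);
    delays.push(delay);
    elapsed += delay;
    next *= factor;
  }
  return delays;
}

// Polls fetchState until isDone accepts the value or the schedule runs out
async function pollUntil<T>(
  fetchState: () => Promise<T>,
  isDone: (value: T) => boolean,
  timeoutMs: number,
  sleep: (ms: number) => Promise<void>
): Promise<T> {
  let value = await fetchState();
  for (const delay of backoffDelays(timeoutMs)) {
    if (isDone(value)) return value;
    await sleep(delay);
    value = await fetchState();
  }
  if (isDone(value)) return value;
  throw new Error(`Timed out after ${timeoutMs}ms`);
}
```

With a 2000 ms timeout the schedule is 250, 500, 1000, 250: quick first polls for fast agents, and fewer requests against slow ones.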

GitHub Repository

pjt222/agent-almanac

Path: i18n/de/skills/test-a2a-interop

agentsagentskillsai-assisted-developmentclaude-codeskillsteams

Frequently asked questions

What is the test-a2a-interop skill?

test-a2a-interop is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can perform test-a2a-interop-related tasks without extra prompting.

How do I install test-a2a-interop?

Use the install commands on this page: add test-a2a-interop to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does test-a2a-interop belong to?

test-a2a-interop is in the Testing category, tagged ai, testing, automation and design.

Is test-a2a-interop free to use?

Yes. test-a2a-interop is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
