SKILL·4721D9

conduct-empirical-wire-capture

Name: conduct-empirical-wire-capture
Author: pjt222

pjt222

Updated 1 month ago

9 views

Designdesign

About

This skill captures runtime HTTP and telemetry data from CLI tools using multiple channels like transcript files or proxies, outputting diff-friendly JSONL. It's used to confirm static findings, obtain payload shapes for re-implementation, or disambiguate actual network behavior. The skill includes an observability table mapping targets to the most efficient capture method.

Quick Install

Claude Code

Recommended

Primary

npx skills add pjt222/agent-almanac -a claude-code

Plugin CommandAlternative

/plugin add https://github.com/pjt222/agent-almanac

Git CloneAlternative

git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/conduct-empirical-wire-capture

Copy and paste this command in Claude Code to install this skill

Documentation

Conduct Empirical Wire Capture

Set up reproducible wire-capture harness for CLI tool's outbound HTTP + telemetry → match each observability target to cheapest channel that captures it.

Scope and Ethics

Read this before configuring any capture.

Wire capture = your own reqs vs. your own account, on your own machine. Capturing other users' traffic = exfiltration, not research, out of scope.
Credentials almost always appear in raw wire out. Redact at capture time (Step 6) → never "capture now, redact later."
Capture = observation, not modification. Don't use captured payloads to bypass server-side rate limits, replay another user's session, or activate dark-launched capability w/o auth.
This skill's out = internal artifact. Public publication of wire findings goes through redact-for-public-disclosure (Phase 5 of parent guide), not this skill.

Use When

Static finding (flag, endpoint ref, telemetry-event name) needs runtime confirmation it actually fires.
Payload shape needed for client re-impl, tracing instrumentation, or cross-version diff.
Dark-vs-live disambiguation → watch what binary actually sends, not what bundle suggests.
Behavior changed silently between vers → want reproducible artifact to compare vs. future vers.

Do NOT use for: version baselining (use monitor-binary-version-baselines), flag-state probing (use probe-feature-flag-state), or preparing redacted artifacts for public publication (use redact-for-public-disclosure).

In

Required: CLI harness binary runnable locally vs. own account.
Required: Specific question (e.g., "does endpoint X fire on event Y?", "what payload shape for telemetry event Z?"). Capture w/o question → log nobody reads.
Optional: Static findings from prior phases (marker catalog, candidate flag list, suspected endpoints) → scope capture targets.
Optional: Private workspace path for capture artifacts. Default ./captures/ → must be in .gitignore.

Do

Step 1: Build Observability Table First

Before configuring any capture → enumerate questions + map each to capture channel. One row per target.

target	observable via	blocker
Outbound HTTP to endpoint X	verbose-fetch stderr	TUI noise pollutes terminal
Telemetry event Y on user action	hook-driven subprocess	requires harness hook surface
Token-refresh handshake	outbound HTTP proxy	cert trust required
Scheduled-task lifecycle event	long-running session capture	wallclock alignment
Local config mutation	on-disk state diff	none — cheapest channel

Common channels, cheapest first:

On-disk state file mutation — harness writes state to known path → diff between snapshots = free.
Transcript file — harness already writes session transcript → parse direct. No instrumentation.
Verbose-fetch stderr — bundler-provided env var (e.g., bun's BUN_CONFIG_VERBOSE_FETCH=curl) routes every fetch to stderr. Noisy but captures every fetch.
Hook-driven subprocess — harness exposes lifecycle hooks (UserPromptSubmit, Stop, etc.) → spawn short capture subprocess per event.
Long-running session capture — one proc across session, wallclock-tagged. Use for sequences.
Outbound HTTP proxy — clean separation, but requires CA cert trust + breaks when harness pins certs.

Pick cheapest channel capturing target. 3-target capture answering one specific question > 20-target capture answering none.

→ Observability table w/ one row per question, each annotated w/ channel + known blockers. Targets w/o viable channel → flag "out of scope this session."

If err: Every target lands in proxy column → table too ambitious. Trim to 1-2 highest-value questions, revisit lower-cost channels for them.

Step 2: Prepare Disposable Workspace

Wire capture pollutes terminals, leaves files in unexpected places, may leak credentials into logs.

mkdir -p captures/$(date -u +%Y-%m-%dT%H-%M-%S)
cd captures/$(date -u +%Y-%m-%dT%H-%M-%S)
echo 'captures/' >> ../../.gitignore
git check-ignore captures/ || echo "WARNING: captures/ not git-ignored"

Confirm capture session ≠ primary working session → verbose-fetch + TUI rendering interfere.

→ Timestamped capture dir, git-ignored, separate from working session.

If err: git check-ignore reports dir not ignored → fix .gitignore before any capture cmd. Don't proceed w/ creds at risk.

Step 3: Hook-Driven Capture for Per-Event Targets

Target = discrete event (tool invocation, prompt submit, session stop) → use harness's hook surface. Spawn short-lived capture subprocess per event; don't sit in-process.

Pattern (synthetic example):

# Hook script, registered with the harness's hook config.
# Invoked once per event; writes one JSONL line; exits.
#!/usr/bin/env bash
set -euo pipefail
TS=$(date -u +%Y-%m-%dT%H:%M:%S.%3NZ)
EVENT="${1:-unknown}"
PAYLOAD=$(jq -c --arg ts "$TS" --arg ev "$EVENT" \
  '{ts:$ts, source:"hook", target:$ev, payload:.}' < /dev/stdin)
echo "$PAYLOAD" >> "$CAPTURE_DIR/events.jsonl"

Why subprocess-per-event:

No token state, no session coupling → each invocation indep.
Fail of one capture doesn't contaminate next.
Subprocess overhead OK → events rare (per-user-action, not per-byte).

→ One JSONL line per fired event in events.jsonl, each well-formed JSON parseable w/ jq.

If err: jq reports parse errs → payload has unescaped control chars / binary data → pipe through jq -R (raw in) + base64-encode payload field.

Step 4: Long-Running Session Capture for Sequential State

Target = sequence (multi-turn handshake, scheduled-task lifecycle, retry/backoff state machine) → one capture proc across session, wallclock-tagged.

# Run the harness with verbose-fetch routed to a tee-d log.
BUN_CONFIG_VERBOSE_FETCH=curl harness-cli run-task 2> >(
  while IFS= read -r line; do
    printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%S.%3NZ)" "$line"
  done >> "$CAPTURE_DIR/session.tsv"
)

Wallclock prefix makes ordering unambiguous when multi captures run concurrent. TSV (tab-separated) intentional → survives shells that mangle JSON quoting on stderr.

Convert TSV → JSONL after session ends (Step 5), not during.

→ TSV log w/ monotonic timestamps, one stderr line per row.

If err: Timestamps go backwards → harness buffering stderr → re-run w/ stdbuf -oL -eL or bundler's line-buffer flag.

Step 5: Normalize to JSONL

JSONL = artifact format: one JSON object per line, fields timestamp, source, target, payload. Diff-friendly, jq-filterable, stable across editor reloads.

# Parse the TSV from Step 4 into JSONL.
awk -F'\t' '{
  printf "{\"timestamp\":\"%s\",\"source\":\"verbose-fetch\",\"target\":\"%s\",\"payload\":%s}\n",
    $1, "session", $2
}' < session.tsv | jq -c . > session.jsonl

Valid. every line parses:

while IFS= read -r line; do
  echo "$line" | jq -e . > /dev/null || echo "BAD LINE: $line"
done < session.jsonl

Typical filter usage:

# Show only requests to a specific endpoint pattern.
jq -c 'select(.payload | tostring | test("/api/v1/example"))' session.jsonl

# Show timing between consecutive captures.
jq -r '.timestamp' session.jsonl | sort | uniq -c

→ Every line of *.jsonl parses w/ jq -e .; no BAD LINE warns.

If err: Some lines fail valid. → source TSV had embedded tabs in payload → re-run Step 4 w/ diff delimiter or base64-encode second field.

Step 6: Redact at Capture Time

Strip auth headers, session IDs, bearer tokens, PII before writing to disk. events.jsonl + session.jsonl should not, on first write, contain a single secret.

# Stream the raw capture through a redactor before persisting.
redact() {
  sed -E \
    -e 's/(authorization:[[:space:]]*Bearer[[:space:]]+)[A-Za-z0-9._-]+/\1<REDACTED>/gi' \
    -e 's/(x-api-key:[[:space:]]*)[A-Za-z0-9._-]+/\1<REDACTED>/gi' \
    -e 's/(cookie:[[:space:]]*)[^;]+/\1<REDACTED>/gi' \
    -e 's/("password"[[:space:]]*:[[:space:]]*)"[^"]*"/\1"<REDACTED>"/g' \
    -e 's/("token"[[:space:]]*:[[:space:]]*)"[^"]*"/\1"<REDACTED>"/g'
}

cat raw-capture.txt | redact > session.tsv

Post-capture, valid. nothing slipped through:

# Patterns that must not appear in any *.jsonl file.
grep -Ei 'bearer [A-Za-z0-9]{20,}|sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{20,}' captures/ \
  && { echo "LEAK DETECTED"; exit 1; } \
  || echo "redaction clean"

captured-then-redacted artifact always leaks something. Only safe pattern = redacted-as-captured. Unredacted token found in finalized artifact → treat whole capture as compromised → delete, rotate credential, re-run.

→ LEAK DETECTED check exits 0 (no matches). grep for known credential prefixes returns nothing.

If err: Leak check finds hit → don't edit file in place. Delete whole capture dir, extend redactor regex to cover leaked pattern category, re-run from Step 3 or 4.

Step 7: Classify Response Categories Before Recording

HTTP status codes carry diff semantic weight in diff contexts. Classify before recording → downstream jq filters operate on intent, not raw codes.

Observed status	Channel context	Classification
200 / 201	Any	success
401 on token-refresh endpoint	Handshake	expected handshake step
401 on data endpoint	After auth	auth failure (real)
404 on lazy-loaded resource	First fetch	expected miss
404 on documented endpoint	After feature gate	gate-induced absence
429	Any	rate-limit (back off; do not retry tight)
5xx	Any	server failure (record, do not assume)

Add class field at capture time:

jq -c '. + {class: (
  if (.payload.status == 401 and (.target | test("token|refresh"))) then "handshake"
  elif (.payload.status >= 200 and .payload.status < 300) then "success"
  elif (.payload.status == 401) then "auth-fail"
  elif (.payload.status == 429) then "rate-limit"
  elif (.payload.status >= 500) then "server-fail"
  else "other" end)}' session.jsonl > session.classified.jsonl

401 on token-refresh channel ≠ failure → first half of handshake. Misclassifying handshake steps as failures produces false-positive findings wasting reviewer attention.

→ Every line in *.classified.jsonl has class field w/ known value.

If err: Classification produces many other entries → table above incomplete for this harness → extend w/ one row per recurring other pattern before analysis.

Step 8: Persist Capture Manifest

Capture run reproducible only if inputs recorded alongside outs. Write manifest:

cat > capture-manifest.json <<EOF
{
  "captured_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "harness_version": "$(harness-cli --version 2>/dev/null || echo unknown)",
  "channel": "verbose-fetch",
  "question": "Does endpoint X fire on event Y?",
  "targets": ["endpoint-X", "event-Y"],
  "files": ["session.jsonl", "session.classified.jsonl"],
  "redaction_check": "passed"
}
EOF

Manifest = what makes capture diff-able vs. future vers.

→ capture-manifest.json exists, parses w/ jq, lists every artifact file in capture dir.

If err: Harness has no ver flag → record binary's sha256sum instead. Unidentified binary → uncomparable captures.

Check

Observability table built before any capture cmd run
Capture dir git-ignored + timestamped
Every *.jsonl file parses w/ jq -e . line-by-line
Redaction leak-check returns no matches for known credential prefixes
Each captured event has class field w/ known value
capture-manifest.json records harness ver (or sha256), channel, question
Capture dir contains only targets enumerated in Step 1 (no incidental traffic from other apps)

Traps

Capture-first, question-later: Log nobody reads = wasted disk + attention. Build observability table first; capture only what answers specific question.
Reach for mitmproxy first: Outbound proxy = most invasive channel. Requires cert trust, breaks on cert pinning, pollutes harness env. Use only when on-disk, transcript, verbose-fetch, hook channels all blocked.
Capture in primary working session: Verbose-fetch stderr bleeds into TUI rendering → can leak fragments of other work into capture. Always use disposable shell.
"We'll redact later": Every captured-then-redacted artifact has leaked credential at least once. Redact at capture time or don't capture.
Treat 4xx as fail uniformly: 401 on token-refresh channel = handshake step, not failure. Classify response categories per channel context (Step 7) before drawing conclusions.
Long-running capture for per-event targets: Session-long proc to capture 3 discrete events couples token state across captures → one bad event poisons next. Use hook-driven subprocesses for events; reserve session capture for sequences.
No manifest: JSONL file w/o capture-manifest.json not reproducible → can't diff vs. next month's binary if you don't know which ver produced it.
Capture other users' traffic: Out of scope. Wire capture = own account on own machine. Capture incidentally records another user's req → delete capture + tighten channel.

→

monitor-binary-version-baselines — Phase 1 parent methodology; produces version baseline this skill's manifest references.
probe-feature-flag-state — Phases 2-3; wire capture = one of its evidence prongs, this skill teaches capture half.
instrument-distributed-tracing — shares JSONL-over-wallclock philosophy; applied here to single binary vs. service mesh.
redact-for-public-disclosure — Phase 5; this skill covers only capture-time redaction for internal use, not publication-bar redaction needed before any capture leaves private workspace.

GitHub Repository

pjt222/agent-almanac

Path: i18n/caveman-ultra/skills/conduct-empirical-wire-capture

agentsagentskillsai-assisted-developmentclaude-codeskillsteams

FAQ

Frequently asked questions

What is the conduct-empirical-wire-capture skill?

conduct-empirical-wire-capture is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can perform conduct-empirical-wire-capture-related tasks without extra prompting.

How do I install conduct-empirical-wire-capture?

Use the install commands on this page: add conduct-empirical-wire-capture to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does conduct-empirical-wire-capture belong to?

conduct-empirical-wire-capture is in the Design category, tagged design.

Is conduct-empirical-wire-capture free to use?

Yes. conduct-empirical-wire-capture is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.

Related Skills

executing-plans

Design

Use the executing-plans skill when you have a complete implementation plan to execute in controlled batches with review checkpoints. It loads and critically reviews the plan, then executes tasks in small batches (default 3 tasks) while reporting progress between each batch for architect review. This ensures systematic implementation with built-in quality control checkpoints.

View skill

requesting-code-review

Design

This skill dispatches a code-reviewer subagent to analyze code changes against requirements before proceeding. It should be used after completing tasks, implementing major features, or before merging to main. The review helps catch issues early by comparing the current implementation with the original plan.

View skill

connect-mcp-server

Design

This skill provides a comprehensive guide for developers to connect MCP servers to Claude Code using HTTP, stdio, or SSE transports. It covers installation, configuration, authentication, and security for integrating external services like GitHub, Notion, and custom APIs. Use it when setting up MCP integrations, configuring external tools, or working with Claude's Model Context Protocol.

View skill

web-cli-teleport

Design

This skill helps developers choose between Claude Code Web and CLI interfaces based on task analysis, then enables seamless session teleportation between these environments. It optimizes workflow by managing session state and context when switching between web, CLI, or mobile. Use it for complex projects requiring different tools at various stages.

View skill