redact-for-public-disclosure

Name: redact-for-public-disclosure
Author: pjt222

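This skill provides a systematic way to remove sensitive details before publishing reverse-engineering results. It covers techniques such as a private/public repository split, pattern deny-lists, and CI checks that prevent accidental leaks. Use it when publishing research on software you do not own, or when preparing a public archive from private investigation material.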

Redact for Public Disclosure

Split a reverse-engineering research repo into a private source-of-truth and a public-disclosure subset using a redaction checker, pattern deny-lists, and an orphan-commit publish pattern. Methodology travels; specific findings stay private.

When to Use

Publishing methodology findings about a closed-source CLI harness you integrate with
Preparing an upstream proposal or bug report to a project you do not own
Archiving a private research repo as a public reference
Promoting investigation notes (Phase 1-4 artifacts) into a public guide
Establishing a publish pipeline before findings accumulate so leak risk does not back up
Cleaning up after a near-miss where a draft almost shipped a sensitive identifier

Inputs

Required: A private research repo with mixed-sensitivity content (the source of truth)
Required: A target public mirror (separate repo, or a public/ worktree) where redacted content will be published
Optional: An existing draft slated for publication
Optional: A version-lag policy (defaults to "current + 1 prior stays private")
Optional: A list of vendor identifiers, flag prefixes, or namespaces already known to be sensitive

Procedure

Step 1: Categorize Every Candidate Fact

Before writing or promoting any content, sort each fact into one of four categories. The category determines whether and when it can ship.

| Category | Definition | Shareable? |
| --- | --- | --- |
| methodology | The how of investigation, independent of any specific finding | Always |
| generic pattern | Class-level observations (e.g., "harnesses commonly use a single-prefix flag namespace") | Yes |
| version-specific finding | Concrete observation tied to a specific release (e.g., "in vN.M, the gate defaults off") | Only after the version-lag cool-off |
| live internal | Minified names, byte offsets, dark flag names, current-version gate logic, PRNG/salt constants, internal codenames | Never |

Annotate each draft section, capture log, or note with its category before reviewing for publication. A section that mixes categories splits — methodology lifts out clean, the rest stays private.

Got: Every candidate fact has a category label. Drafts intended for the public mirror contain only methodology and generic-pattern entries (plus version-specific findings older than the cool-off).

If fail: If a fact resists categorization, treat it as a live internal by default. Re-categorize only after explicit review against the version-lag policy.
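
The decision rule in the category table can be sketched as a small shell function. This is a minimal sketch, not part of the original skill: the function name `is_shareable`, the hyphenated category spellings, and the "releases behind current" argument are assumptions made for illustration. Anything unrecognized falls through to "private", mirroring the default above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether a fact may ship, following the category table.
# Usage: is_shareable <category> [releases_behind_current] [cooloff]
#   category: methodology | generic-pattern | version-specific | live-internal
#   releases_behind_current: 0 = current release, 1 = previous, ... (default 0)
#   cooloff: how many of the newest releases stay private (default 2)
is_shareable() {
  local category="$1" behind="${2:-0}" cooloff="${3:-2}"
  case "$category" in
    methodology|generic-pattern) return 0 ;;
    version-specific) [ "$behind" -ge "$cooloff" ] ;;   # ships only after the cool-off
    *) return 1 ;;   # live-internal and anything unrecognized never ships
  esac
}

# Example:
#   is_shareable version-specific 2 && echo "eligible for public draft"
```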

Step 2: Set the Version-Lag Cool-Off Policy

Decide up front how many versions sit between "current" and "shareable." Two is typical: current + 1 prior remain private, older patterns may be discussed. Write the policy into the private repo (e.g., REDACTION_POLICY.md) so future-you does not have to re-derive it.

# Redaction Policy

Version-lag cool-off: **2 releases**.
- Current release (vN): all version-specific findings PRIVATE.
- Previous release (vN-1): all version-specific findings PRIVATE.
- Releases vN-2 and earlier: version-specific findings may move to public draft after Step 5 review.

Source of truth for "current": output of `monitor-binary-version-baselines`.
Owner: <name>. Reviewed quarterly.

The "current" version must be empirical (read from the installed binary), not administrative. Tie the policy to the baseline scanner output rather than to a calendar.

Got: A committed REDACTION_POLICY.md in the private repo with an explicit cool-off and an owner.

If fail: If stakeholders cannot agree on the cool-off, default to the most conservative proposal. Cool-offs can be shortened later; recalling a leak cannot.
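
Keeping the cool-off in a committed file means scripts can read it instead of hard-coding it. The sketch below assumes the policy line is written exactly as in the sample above ("Version-lag cool-off: **2 releases**") and that releases can be treated as plain integers; `read_cooloff` and `newest_shareable_release` are hypothetical names, not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the cool-off number from REDACTION_POLICY.md.
# Fails (non-zero) when no policy line is found, so callers cannot proceed on a guess.
read_cooloff() {
  local policy_file="$1" value
  value="$(sed -nE 's/^Version-lag cool-off: \*\*([0-9]+) releases?\*\*.*/\1/p' "$policy_file" | head -n 1)"
  [ -n "$value" ] || { echo "no cool-off found in $policy_file" >&2; return 1; }
  echo "$value"
}

# Hypothetical helper: newest release whose version-specific findings may be
# considered for promotion. With current = N and cool-off = 2 this is N-2.
newest_shareable_release() {
  local current="$1" cooloff="$2"
  echo $(( current - cooloff ))
}

# Example:
#   cooloff="$(read_cooloff REDACTION_POLICY.md)" && newest_shareable_release 7 "$cooloff"
```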

Step 3: Build the Deny-List Scanner

Maintain patterns in a single executable script that is the source of truth for the redaction policy. The script lives in the private repo (tools/check-redaction.sh) and runs against the public mirror.

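#!/usr/bin/env bash
# Scan the public mirror for deny-listed identifier shapes.
set -u
PUBLIC_REPO="${1:-./public}"
LEAKS=0

# Each entry is "label|regex". The label must not contain "|"; the regex may.
PATTERNS=(
  "minified identifier shape|<regex matching short bundle-style identifiers>"
  "vendor-prefixed flag|<regex matching the vendor's flag prefix>"
  "PRNG/salt constant|<regex matching the specific constants>"
)

for entry in "${PATTERNS[@]}"; do
  desc="${entry%%|*}"     # text before the first "|"
  pattern="${entry#*|}"   # text after the first "|", so regex alternation survives
  # --hidden: rg skips dotfiles by default; --glob '!.git' keeps git internals out.
  rg -q --hidden --glob '!.git' -e "$pattern" "$PUBLIC_REPO"
  case $? in
    0) echo "LEAK: $desc"; LEAKS=$((LEAKS+1)) ;;
    1) ;;                 # no match
    *) echo "ERROR: scan failed for: $desc" >&2; LEAKS=$((LEAKS+1)) ;;  # fail closed
  esac
done
# Exit statuses wrap at 256, so cap the count to keep "many leaks" from reading as 0.
exit $(( LEAKS > 255 ? 255 : LEAKS ))

Each entry has a human-readable label and a regex, separated by the first |. Use one entry per sensitive identifier shape rather than per literal string, because shapes survive version churn. The exit code equals the number of leaks or scan errors (capped at 255, since exit statuses wrap at 256); a clean run exits 0.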

Got: tools/check-redaction.sh ./public-mirror runs in under a second on a small repo and exits 0 when nothing matches.

If fail: If rg is unavailable, fall back to grep -rqE. If patterns are too broad (every run reports leaks), narrow them at the source rather than adding suppressions.
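
The grep fallback mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions: `scan_entries` is a hypothetical function name, entries use the same "label|regex" shape as the scanner, and the patterns are written in POSIX extended syntax (ripgrep's regex dialect differs in places, so patterns may need adjusting when switching tools).

```shell
#!/usr/bin/env bash
# Hypothetical grep-based scanner for hosts without ripgrep.
# Usage: scan_entries <root> "label|regex" ["label|regex" ...]
# Prints one "LEAK: <label>" line per matching entry; the return status is the
# number of leaks or scan errors, capped at 255.
scan_entries() {
  local root="$1" entry desc pattern status leaks=0
  shift
  for entry in "$@"; do
    desc="${entry%%|*}"     # label: text before the first "|"
    pattern="${entry#*|}"   # regex: text after the first "|"
    status=0
    grep -rqE --exclude-dir=.git -e "$pattern" "$root" || status=$?
    case "$status" in
      0) echo "LEAK: $desc"; leaks=$((leaks + 1)) ;;
      1) ;;                 # no match
      *) echo "ERROR: scan failed for: $desc" >&2; leaks=$((leaks + 1)) ;;
    esac
  done
  [ "$leaks" -le 255 ] || leaks=255
  return "$leaks"
}

# Example (synthetic placeholder pattern):
#   scan_entries ./public-mirror 'synthetic widget|acme_widget_v[0-9]+'
```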

Step 4: Maintain the Deny-List Before Drafting

When a Phase 1-4 finding could leak through a draft, extend the scanner before the draft is written. Drafts are cheap; teaching the scanner new patterns is durable.

Workflow:

1. New finding lands in the private repo (e.g., a newly-discovered flag prefix).
2. Ask: "If this leaked, what would I want the scanner to catch?"
3. Add a pattern entry to tools/check-redaction.sh (label + regex).
4. Run the scanner against the entire public mirror to confirm the new pattern is not already tripped by legitimate content.
5. Only then draft any public content that touches the area.

This inverts the usual order: the scanner is updated first, the draft second. The scanner becomes the executable specification of "what is too sensitive to publish," and the draft cannot accidentally outpace it.

Got: Pattern entries in tools/check-redaction.sh predate any public-mirror content that could match them. git log tools/check-redaction.sh shows scanner updates landing before related draft commits.

If fail: If scanner updates lag drafts, audit the public mirror against the new pattern immediately. Redact, then commit the scanner update with a note explaining the discovered pattern.
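
Checking a candidate pattern against the existing mirror before committing it to the scanner can be done with a short dry run. A minimal sketch, assuming grep with extended regexes is an acceptable stand-in for the scanner's matcher; `preview_pattern` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical dry run: list published files a candidate pattern would trip on.
# Usage: preview_pattern <regex> <public-mirror-dir>
# Returns 0 when nothing matches, 1 when published files already match,
# 2 when the scan itself failed (for example, an invalid regex).
preview_pattern() {
  local pattern="$1" root="$2" hits status=0
  hits="$(grep -rlE --exclude-dir=.git -e "$pattern" "$root")" || status=$?
  case "$status" in
    0) echo "Pattern already matches published files:"
       echo "$hits"
       return 1 ;;
    1) echo "No existing matches; the pattern can be added." ;;
    *) echo "Scan failed; check the regex and the path." >&2
       return 2 ;;
  esac
}
```

A status of 1 means either the published content needs redaction or the pattern is too broad; Step 8 covers how to tell which.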

Step 5: Establish the Private/Public File-Set Split

Define an explicit allow-list of files that sync to the public mirror. New files default to private; promotion requires redaction-check clearance.

# tools/public-allowlist.txt
README.md
LICENSE
guides/methodology-overview.md
guides/category-classification.md
docs/contributing.md

The tools/sync-to-public.sh script reads the allow-list, copies only those files to the public mirror, and exits non-zero if the allow-list references a file that does not exist (this catches typos).

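#!/usr/bin/env bash
# Copy only allow-listed files from the private repo into the public mirror.
set -eu
PRIVATE_ROOT="${1:?private repo path required}"
PUBLIC_ROOT="${2:?public mirror path required}"
ALLOWLIST="$PRIVATE_ROOT/tools/public-allowlist.txt"

# The "|| [ -n ... ]" clause keeps a final entry that lacks a trailing newline.
while IFS= read -r path || [ -n "$path" ]; do
  [ -z "$path" ] && continue                # skip blank lines
  case "$path" in \#*) continue ;; esac     # skip comment lines
  src="$PRIVATE_ROOT/$path"
  dst="$PUBLIC_ROOT/$path"
  if [ ! -e "$src" ]; then
    echo "MISSING: $path" >&2; exit 2       # typo or stale entry in the allow-list
  fi
  mkdir -p "$(dirname "$dst")"
  cp -a "$src" "$dst"
done < "$ALLOWLIST"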

Promotion requires three things in order: the file is added to the allow-list, the file passes the redaction check, and a reviewer confirms the category labels from Step 1.

Got: The public mirror contains exactly the files listed in tools/public-allowlist.txt. No file appears in the public mirror that is not on the allow-list.

If fail: If a file appears in the public mirror but is missing from the allow-list, treat it as a leak event — investigate how it arrived, then either remove it or formally promote it after redaction review.
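
The "exactly the allow-listed files" expectation can be checked mechanically. The sketch below is an assumption-laden illustration: `check_mirror_matches_allowlist` is a hypothetical name, and it assumes every allow-list entry names a single file (not a directory) relative to the repo root.

```shell
#!/usr/bin/env bash
# Hypothetical check: the mirror's file set must equal the allow-list.
# Usage: check_mirror_matches_allowlist <allowlist-file> <public-mirror-dir>
# Prints each discrepancy and returns 1 if the two sets differ.
check_mirror_matches_allowlist() {
  local allowlist="$1" mirror="$2" tmp status=0
  tmp="$(mktemp -d)" || return 2
  # Allow-list entries, minus blank lines and # comments.
  grep -vE '^[[:space:]]*(#|$)' "$allowlist" | LC_ALL=C sort -u > "$tmp/expected"
  # Regular files present in the mirror, relative to its root, ignoring git metadata.
  (cd "$mirror" && find . -type f ! -path './.git' ! -path './.git/*') \
    | sed 's|^\./||' | LC_ALL=C sort -u > "$tmp/actual"
  # comm -13 keeps lines unique to the second file; comm -23 those unique to the first.
  LC_ALL=C comm -13 "$tmp/expected" "$tmp/actual" | sed 's/^/NOT ON ALLOW-LIST: /'
  LC_ALL=C comm -23 "$tmp/expected" "$tmp/actual" | sed 's/^/LISTED BUT MISSING: /'
  cmp -s "$tmp/expected" "$tmp/actual" || status=1
  rm -rf "$tmp"
  return "$status"
}
```

Any "NOT ON ALLOW-LIST" line is the leak event described above and warrants investigation before the next publish.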

Step 6: Publish via Orphan Commit

The public mirror is a single root commit on an orphan branch (created with git checkout --orphan), recreated at each publish. This prevents git log on the public repo from exposing pre-redaction drafts.

# In the public mirror (separate repo or worktree)
cd /path/to/public-mirror
git checkout --orphan publish-tmp
git rm -rf --ignore-unmatch .                   # Remove tracked files from the index and working tree
git clean -fd                                   # Remove untracked leftovers so "git add -A" cannot pick them up
# Sync from private using the allow-list
bash /path/to/private/tools/sync-to-public.sh /path/to/private .
git add -A
git commit -m "Publish: <date>"
git branch -D main 2>/dev/null || true
git branch -m main
git push --force origin main

The public repo's git log shows exactly one commit. Prior drafts and any redaction iterations stay in the private repo's history. No git log -p, git reflog, or branch listing on the public repo can recover pre-redaction content because it was never committed there.

Got: git log --oneline on the public mirror shows a single commit per publish. No references to the private repo's history (no parent SHAs, no merge commits, no tags from the private repo) appear.

If fail: If git push --force is rejected (branch protection), open a single-commit pull request from a clean orphan branch instead. Never solve a rejection by pushing the private history.
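
The single-commit expectation can be verified before pushing. A minimal sketch, assuming git is on the PATH; `is_single_root_commit` is a hypothetical name, and treating any tag as a failure is a conservative assumption rather than something the skill prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical pre-push check for the public mirror.
# Usage: is_single_root_commit <public-mirror-dir>
# Succeeds only when HEAD's history holds exactly one commit and the repo has no tags.
is_single_root_commit() {
  local repo="$1" count tags
  count="$(git -C "$repo" rev-list --count HEAD)" || return 2
  tags="$(git -C "$repo" tag --list)" || return 2
  [ "$count" -eq 1 ] && [ -z "$tags" ]
}

# Example:
#   is_single_root_commit /path/to/public-mirror || echo "refusing to push" >&2
```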

Step 7: Wire the CI Gate

Run tools/check-redaction.sh on every commit to the public-sync branch. A failed check blocks the publish, not just warns.

# .github/workflows/redaction-check.yml (in the public mirror repo)
name: redaction-check
on:
  push:
    branches: [main, publish-*]
  pull_request:
    branches: [main]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install ripgrep
        run: sudo apt-get update && sudo apt-get install -y ripgrep
      - name: Fetch redaction scanner
        env:
          GH_TOKEN: ${{ secrets.PRIVATE_REPO_TOKEN }}
        run: |
          gh api repos/<org>/<private-repo>/contents/tools/check-redaction.sh \
            --jq .content | base64 -d > check-redaction.sh
          chmod +x check-redaction.sh
      - name: Run scanner
        run: ./check-redaction.sh .

Two design choices here:

The scanner is pulled from the private repo at CI time so the deny-list itself never lives in the public repo (the patterns are themselves sensitive — publishing them would tell a reader exactly what to look for).
The job exits with the scanner's exit code; non-zero blocks the workflow.

Got: Pushes that introduce a deny-listed pattern fail CI; the publish does not land. Maintainers see the failing label (e.g., LEAK: vendor-prefixed flag) without seeing the regex itself.

If fail: If the private-repo token cannot be granted to the public CI, embed only a minimum-leak portion of the scanner in the public repo (broad shape patterns that do not themselves identify the vendor) and run the full scanner pre-push from the private repo.
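
For the pre-push fallback, the same "block, not warn" rule can be enforced locally with a small wrapper. This is a sketch only: `gated_publish` is a hypothetical name, and the scanner and publish commands are whatever the private repo already provides.

```shell
#!/usr/bin/env bash
# Hypothetical local gate: run the publish command only if the scanner exits 0.
# Usage: gated_publish <scanner-cmd> <mirror-dir> <publish-cmd> [args...]
gated_publish() {
  local scanner="$1" mirror="$2"
  shift 2
  if ! "$scanner" "$mirror"; then
    echo "BLOCKED: redaction scanner reported leaks; nothing was published" >&2
    return 1
  fi
  "$@"   # the scanner passed, so run the publish step as given
}

# Example:
#   gated_publish tools/check-redaction.sh ./public-mirror git -C ./public-mirror push --force origin main
```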

Step 8: Handle False Positives Honestly

When the scanner trips on legitimate content, prefer narrowing the pattern over adding an ignore-line. Broad deny-lists with local suppressions rot fast — six months later no one remembers why a particular line was suppressed, and the next leak slides past unnoticed.

Decision tree:

1. Is the match actually safe? Re-categorize using Step 1. If the content turns out to be a live internal in disguise, redact it; do not suppress the scanner.
2. Is the pattern too broad? Tighten the regex so the safe content no longer matches. Document the tightening with a comment in check-redaction.sh linking to the case that motivated it.
3. Only if 1 and 2 both fail, and the pattern is structurally too entangled with legitimate content to narrow further, use a single-line suppression with a # REASON: comment that states why the suppression is safe. Date the comment.

# Bad — mystery suppression
echo "API endpoint pattern" >> ignore.txt

# Good — narrowed pattern with rationale
# Pattern v2: tightened from `\bgate\(` to `\bgate\(['\"][a-z]+_phase` after
# legitimate `gate(true)` calls in our own SDK examples started matching. 2026-04-15.
PATTERNS+=("vendor flag predicate|\\bgate\\(['\"][a-z]+_phase")

Got: Each scanner pattern has zero or one inline comment explaining a tightening. Suppressions, if any, carry a date and a rationale.

If fail: If suppressions accumulate (more than one per quarter), the deny-list is mis-shaped. Schedule a redaction-policy review and rebuild the patterns from the categorized fact inventory.
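
A tightened pattern is easier to trust when it is checked against one sample that must match and one that must not. The sketch below uses grep with extended regexes as a stand-in for the scanner's matcher; `pattern_behaves` is a hypothetical name, and `gate('alpha_phase')` is an invented sample in the same spirit as the example above.

```shell
#!/usr/bin/env bash
# Hypothetical regression check for a deny-list pattern.
# Usage: pattern_behaves <regex> <sample that must match> <sample that must not match>
pattern_behaves() {
  local pattern="$1" sensitive="$2" safe="$3"
  if ! printf '%s\n' "$sensitive" | grep -qE -e "$pattern"; then
    echo "FAIL: pattern misses the sensitive sample"
    return 1
  fi
  if printf '%s\n' "$safe" | grep -qE -e "$pattern"; then
    echo "FAIL: pattern still matches the safe sample"
    return 1
  fi
  echo "OK"
}

# Example, using the tightened pattern from the entry above:
#   pattern_behaves "\\bgate\\(['\"][a-z]+_phase" "gate('alpha_phase')" "gate(true)"
```

Keeping one such check per tightened pattern records the motivating case in executable form, which is harder to lose than a comment.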

Step 9: Periodic Redaction Sweeps

Not all redaction work is incident-driven. Run a periodic sweep (monthly is typical) that re-categorizes the most recent additions to the private repo and re-runs the scanner against the public mirror. Drift catches itself before it becomes incident-grade.

Sweep checklist:

Re-read the version-lag policy; confirm the empirical "current" version is unchanged or update the policy
Audit the last month of private-repo commits for newly-added findings that were not categorized (Step 1)
Run tools/check-redaction.sh against the public mirror (should still exit 0)
Review any scanner patterns added since last sweep — are any too broad? Tighten if so
If any version has aged past the cool-off, identify findings now eligible for promotion
Confirm tools/public-allowlist.txt matches the actual public-mirror file set

Got: A short sweep log per month in the private repo (e.g., sweeps/2026-04.md) with checklist outcomes and any actions taken.

If fail: If the sweep is repeatedly skipped, automate a calendar reminder. If the sweep keeps finding the same drift, the workflow upstream of it is the problem — investigate why categorization is being skipped at draft time.
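
Sweep freshness can be checked from the log filenames alone. This sketch assumes logs are named sweeps/YYYY-MM.md as in the example above, and it approximates "within the last 35 days" as "this month or last month"; `month_index` and `sweep_is_fresh` are hypothetical names. File modification times are deliberately not used, since a fresh clone resets them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert YYYY-MM into a running month count.
month_index() {
  local ym="$1" year month
  year="${ym%-*}"
  month="${ym#*-}"
  month="${month#0}"   # strip a leading zero so "08" is not parsed as octal
  echo $(( year * 12 + month ))
}

# Usage: sweep_is_fresh <sweeps-dir> [current YYYY-MM, defaults to today]
# Succeeds when the newest YYYY-MM.md log is from the current or previous month.
sweep_is_fresh() {
  local dir="$1" now="${2:-$(date +%Y-%m)}" f latest=""
  for f in "$dir"/[0-9][0-9][0-9][0-9]-[0-9][0-9].md; do
    [ -e "$f" ] || continue          # the glob stays literal when nothing matches
    latest="$(basename "$f" .md)"    # globs expand in sorted order, so the last one wins
  done
  [ -n "$latest" ] || return 1
  [ $(( $(month_index "$now") - $(month_index "$latest") )) -le 1 ]
}
```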

Validation

Every file in the public mirror is on tools/public-allowlist.txt
tools/check-redaction.sh ./public-mirror exits 0
git log --oneline on the public mirror shows a single orphan commit per publish
REDACTION_POLICY.md exists in the private repo with an explicit version-lag cool-off
Every Phase 1-4 finding has a category label (methodology / generic pattern / version-specific / live internal)
Public CI runs the scanner on every push; a deliberate test pattern fails the build
The deny-list scanner itself does not live in the public repo
The most recent monthly sweep log is dated within the last 35 days

Pitfalls

"Just one example to make it concrete." The temptation to include one specific finding "to ground the methodology" is the most common leak path. Use synthetic placeholders (e.g., acme_widget_v3, widget_handler_42) — clearly invented, never traceable to a real product.
Using git rebase or git filter-branch to scrub a leak in place on the public repo. Force-pushing rewritten history still leaves traces in clones and forks. The orphan-commit publish pattern is a structural fix; ad-hoc history rewriting is not.
Suppressions instead of pattern tightening. A scanner with twenty suppressions is a scanner with zero meaningful coverage. Every suppression is a future leak waiting for context to fade.
Public CI that warns instead of failing. Warnings get ignored. The CI gate must block the publish (non-zero exit, no merge button).
Allow-list drift. New files added to the private repo do not automatically belong on the allow-list. Default-deny is the only safe posture.
Mistaking encoding or hashing for redaction. Encoding, hashing, or rot13-ing a sensitive identifier and publishing the result still publishes it: encodings reverse trivially, and a hash of a short identifier can be matched by hashing candidate names. Redact means "does not appear at all."
Publishing the deny-list. The patterns themselves are a finding catalog: a reader who sees the regex knows exactly what to grep for in the binary. Keep the scanner private; only its labels (e.g., LEAK: vendor-prefixed flag) should appear in public CI logs.
Treating the private repo as a draft pile. It is the source of truth for the research, not a scratch space. Apply the same versioning, review, and backup discipline you would to any production artifact.
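
The encoding pitfall is easy to demonstrate. In this minimal sketch, `rot13` is a one-line helper built on tr, and the identifier is the synthetic placeholder suggested above, not a real finding.

```shell
#!/usr/bin/env bash
# rot13 is its own inverse: applying it twice returns the input unchanged.
rot13() { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }

# Example:
#   printf 'acme_widget_v3\n' | rot13           # prints npzr_jvqtrg_i3
#   printf 'npzr_jvqtrg_i3\n' | rot13           # prints acme_widget_v3
```

Anyone who sees the "obscured" form recovers the original with the same one-liner, which is why the deny-list should match the identifier shape wherever it appears rather than rely on encoding.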

Related Skills

monitor-binary-version-baselines — Phase 1, baselines feed the version-lag policy: what counts as "current" is an empirical fact, not a calendar fact
probe-feature-flag-state — Phases 2-3, classification findings here enter the redaction pipeline at category step (Step 1)
conduct-empirical-wire-capture — Phase 4, capture artifacts (wire logs, payload schemas) need redaction before any can be referenced publicly
security-audit-codebase — both pipelines benefit from deny-list-style scanning; this skill specializes for research disclosure rather than secret leakage
manage-git-branches — the orphan-commit publish pattern is a branch operation; safe execution requires the branch hygiene practices documented there

GitHub repository

pjt222/agent-almanac

Path: i18n/caveman-lite/skills/redact-for-public-disclosure

FAQ

What is the redact-for-public-disclosure skill?

redact-for-public-disclosure is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can perform redact-for-public-disclosure-related tasks without extra prompting.

How do I install redact-for-public-disclosure?

Use the install commands on this page: add redact-for-public-disclosure to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does redact-for-public-disclosure belong to?

redact-for-public-disclosure is in the Communication category, tagged ai.

Is redact-for-public-disclosure free to use?

Yes. redact-for-public-disclosure is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
