run-puzzle-tests
About
This skill runs the jigsawR test suite through WSL R execution, supporting full tests, pattern filtering, or single file runs. It interprets pass/fail/skip results and identifies failures while avoiding the --vanilla flag for proper renv activation. Developers should use it after code changes, before commits, or when debugging specific test failures.
Quick Install
Claude Code
- Recommended: `npx skills add pjt222/agent-almanac -a claude-code`
- `/plugin add https://github.com/pjt222/agent-almanac`
- `git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/run-puzzle-tests`
Copy and paste this command in Claude Code to install the skill.
Skill Documentation
Run Puzzle Tests
Run jigsawR test suite. Read results.
When to Use
- After modifying R source in package
- After adding new puzzle type or feature
- Before commit to verify nothing broken
- Debugging specific test failure
Inputs
- Required: Test scope (`full`, `filtered`, `single`)
- Optional: Filter pattern (filtered mode, e.g. `"snic"`, `"rectangular"`)
- Optional: Specific test file path (single mode)
Steps
Step 1: Choose Test Scope
| Scope | Use when | Duration |
|---|---|---|
| Full | Before commits, after major changes | ~2-5 min |
| Filtered | Working on one puzzle type | ~30s |
| Single | Debugging a specific test file | ~10s |
Got: Scope selected by workflow: full before commits, filtered for one type, single for one debug.
If fail: Unsure? Default to full. Slower but catches cross-type regressions.
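The scope table can be sketched as a small shell helper that maps a scope to the R expression used in Step 2. The function name `test_expr` and its argument order are hypothetical, not part of jigsawR; unknown scopes fall back to the full suite, following the advice above.

```shell
# Hypothetical helper: map a test scope to the R expression passed to Rscript -e.
# Usage: test_expr <full|filtered|single> [filter-or-path]
test_expr() {
  scope="$1"
  arg="${2:-}"
  case "$scope" in
    full)     printf '%s\n' "devtools::test()" ;;
    filtered) printf '%s\n' "devtools::test(filter = '$arg')" ;;
    single)   printf '%s\n' "testthat::test_file('$arg')" ;;
    *)
      # Unsure? Default to the full suite.
      printf '%s\n' "devtools::test()" ;;
  esac
}
```

A call such as `"$R_EXE" -e "$(test_expr filtered snic)"` would then run only the matching test files.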
Step 2: Create and Execute Test Script
Full suite.
Create a script (e.g., /tmp/run_tests.R) containing the test call, or pass the call inline with -e.
devtools::test()
R_EXE="/mnt/c/Program Files/R/R-4.5.0/bin/Rscript.exe"
cd /mnt/d/dev/p/jigsawR && "$R_EXE" -e "devtools::test()"
Filtered by pattern.
"$R_EXE" -e "devtools::test(filter = 'snic')"
Single file.
"$R_EXE" -e "testthat::test_file('tests/testthat/test-snic-puzzles.R')"
Got: Test output with pass/fail/skip counts.
If fail:
- Do NOT use the `--vanilla` flag; renv needs `.Rprofile` to activate
- renv errors? Run `renv::restore()` first
- Complex commands fail with Exit 5? Write to script file
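A minimal sketch of the script-file workaround, assuming a hypothetical helper named `write_test_script`. It writes the R call to a file so that no R code has to survive shell quoting. The script path is left to the caller; a Windows `Rscript.exe` launched from WSL may need a path it can resolve, such as one inside the project directory, which the source does not confirm either way.

```shell
# Hypothetical helper: write an R test script to a file so the R code
# never passes through shell quoting on the command line.
# Usage: write_test_script <script-path> [filter]
write_test_script() {
  script="$1"
  filter="${2:-}"
  if [ -n "$filter" ]; then
    # Filtered run: only test files matching the pattern.
    printf 'devtools::test(filter = "%s")\n' "$filter" > "$script"
  else
    # Full suite.
    printf 'devtools::test()\n' > "$script"
  fi
}

# Example invocation (not executed here). No --vanilla, so renv can activate.
# write_test_script run_tests.R snic
# "$R_EXE" run_tests.R
```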
Step 3: Interpret Results
Look for summary line.
[ FAIL 0 | WARN 0 | SKIP 7 | PASS 2042 ]
- PASS: Tests succeeded
- FAIL: Tests failed (need investigation)
- SKIP: Tests skipped (usually missing optional packages like `snic`)
- WARN: Warnings during tests (review but not blocking)
Got: Summary line parsed for PASS, FAIL, SKIP, WARN. FAIL = 0 = clean run.
If fail: Summary not visible? Runner crashed before completing. Check R-level errors above. Output truncated? Redirect to file: "$R_EXE" -e "devtools::test()" > test_results.txt 2>&1.
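When output has been redirected to a file, the summary counters can be pulled out with standard tools. The sketch below uses a hypothetical helper, `summary_count`, and assumes the summary line appears as plain text in exactly the format shown above (no color codes). It prints nothing if no summary line is found, which matches the "runner crashed" case.

```shell
# Hypothetical helper: print one counter (FAIL, WARN, SKIP or PASS) from the
# last summary line found in saved test output.
# Usage: summary_count <FAIL|WARN|SKIP|PASS> <results-file>
summary_count() {
  key="$1"
  file="$2"
  grep -E '\[ FAIL [0-9]+ \| WARN [0-9]+ \| SKIP [0-9]+ \| PASS [0-9]+ \]' "$file" \
    | tail -n 1 \
    | sed -E "s/.*$key ([0-9]+).*/\1/"
}
```

For example, `summary_count FAIL test_results.txt` prints the failure count of the final summary line in `test_results.txt`.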
Step 4: Investigate Failures
If tests fail.
- Read failure msg — includes file, line, expected vs actual
- Check if new failure or pre-existing
- Assertion failures: read test + function being tested
- Error failures: check function signature changed
# Run just the failing test with verbose output
"$R_EXE" -e "testthat::test_file('tests/testthat/test-failing.R', reporter = 'summary')"
Got: Root cause of each failing test identified. Failure = real regression (fix code) or test env issue (missing dep, path).
If fail: Failure msg unclear? Add browser() or print() to test, re-run with testthat::test_file() for interactive debug.
Step 5: Verify Skip Reasons
Skipped tests normal when optional deps missing.
- `snic` package tests skip with `skip_if_not_installed("snic")`
- Tests needing specific OS skip with `skip_on_os()`
- CRAN-only skips with `skip_on_cran()`
Confirm skip reasons legitimate, not masking real failures.
Got: All skips accounted for by legitimate reasons (optional dep, platform skip, CRAN-only). No skips masking actual failures.
If fail: Skip suspicious? Temporarily remove skip_if_*() and run test to see pass or hidden failure.
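One way to account for every skip is to list the skip calls with their file and line number and review each against a known reason. This is a sketch with a hypothetical helper name, `list_skips`; it assumes test files end in `.R`, live under `tests/testthat` by default, and that the installed `grep` supports `--include`.

```shell
# Hypothetical helper: list every skip call under a test directory with its
# file and line number, so each one can be matched to a legitimate reason.
# Usage: list_skips [test-dir]
list_skips() {
  dir="${1:-tests/testthat}"
  # Matches skip(, skip_if_not_installed(, skip_on_os(, skip_on_cran(, etc.
  grep -rnE --include='*.R' 'skip(_[a-z_]+)?\(' "$dir"
}
```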
Checks
- All tests pass (FAIL = 0)
- No unexpected warnings
- Skip count matches expected (only optional dep skips)
- Test count not decreased (no tests removed by accident)
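The checklist can be approximated as a gate on the summary line. The helper name `check_run` and the idea of a stored baseline are assumptions, and the sum of the FAIL, SKIP and PASS counters is used only as a rough proxy for "test count". Warnings are reported but do not fail the gate, since they are described above as non-blocking.

```shell
# Hypothetical gate: succeed only if FAIL is 0 and the combined
# FAIL + SKIP + PASS count has not dropped below a baseline.
# Usage: check_run "<summary line>" <baseline-total>
check_run() {
  line="$1"
  baseline="$2"
  fail=$(printf '%s\n' "$line" | sed -E 's/.*FAIL ([0-9]+).*/\1/')
  warn=$(printf '%s\n' "$line" | sed -E 's/.*WARN ([0-9]+).*/\1/')
  skip=$(printf '%s\n' "$line" | sed -E 's/.*SKIP ([0-9]+).*/\1/')
  pass=$(printf '%s\n' "$line" | sed -E 's/.*PASS ([0-9]+).*/\1/')
  total=$((fail + skip + pass))
  # Warnings are not blocking, but flag them for review.
  if [ "$warn" -gt 0 ]; then
    echo "review: $warn warning(s)" >&2
  fi
  [ "$fail" -eq 0 ] && [ "$total" -ge "$baseline" ]
}
```

With the summary line shown in Step 3, `check_run '[ FAIL 0 | WARN 0 | SKIP 7 | PASS 2042 ]' 2049` succeeds, because the counters sum to 2049.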
Pitfalls
- Using `--vanilla`: Breaks renv activation. Never with jigsawR.
- Complex `-e` strings: Shell escaping = Exit 5. Use script files.
- Stale package state: Run `devtools::load_all()` or `devtools::document()` before testing if NAMESPACE-affecting code changed.
- Missing test deps: Some tests need suggested packages. Check `DESCRIPTION` Suggests.
- Parallel test issues: Tests interfere? Run sequential with `testthat::test_file()`.
See Also
- `generate-puzzle`: generate puzzles to verify behavior matches tests
- `add-puzzle-type`: new types need comprehensive test suites
- `write-testthat-tests`: general patterns for writing R tests
- `validate-piles-notation`: test PILES parsing independently
