learn-datalake
About
The `learn-datalake` skill is a continuous orchestrator that monitors a directory, processes new PDFs through quality review loops, and ingests other file types into graph memory. It automatically extracts and links framework controls (like NIST or ATT&CK) from PDF content, enabling semantic search and multi-hop traversal. Use this skill to automatically build a queryable knowledge graph from a watched folder of documents and assets.
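The routing and control-linking behavior described above can be illustrated with a minimal sketch. It is not the skill's actual implementation: the function names `route_new_files` and `extract_control_ids`, the in-memory `seen` set, and both regular expressions are hypothetical, chosen only to show how new PDFs might be separated from other file types and how ATT&CK-style (`T1059.001`) and NIST 800-53-style (`AC-2(1)`) identifiers might be picked out of extracted text.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration; the skill's real matching rules
# are not documented in this listing.
ATTACK_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")                    # e.g. T1059, T1059.001
NIST_ID = re.compile(r"\b[A-Z]{2}-\d{1,2}(?!\d)(?:\(\d{1,2}\))?")    # e.g. AC-2, AC-2(1)


def route_new_files(watch_dir, seen):
    """Return (pdfs, others) for files not yet in `seen`, and mark them seen."""
    pdfs, others = [], []
    for path in sorted(Path(watch_dir).iterdir()):
        if not path.is_file() or path.name in seen:
            continue
        seen.add(path.name)
        # PDFs would go to the review loop; everything else to direct ingestion.
        (pdfs if path.suffix.lower() == ".pdf" else others).append(path.name)
    return pdfs, others


def extract_control_ids(text):
    """Collect framework control identifiers mentioned in extracted PDF text."""
    return {
        "attack": sorted(set(ATTACK_ID.findall(text))),
        "nist": sorted(set(NIST_ID.findall(text))),
    }
```

A continuous orchestrator would call something like `route_new_files` on a timer or from a filesystem-event callback, then pass each PDF's extracted text to `extract_control_ids` before writing the resulting links to graph memory.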
Quick Install
Claude Code
Recommended: `npx skills add grahama1970/agent-skills -a claude-code`
Alternatives:
`/plugin add https://github.com/grahama1970/agent-skills`
`git clone https://github.com/grahama1970/agent-skills.git ~/.claude/skills/learn-datalake`
Related Skills
release-standards
This skill provides semantic versioning (semver) guidelines and changelog formatting standards for software releases. Use it when preparing releases to correctly increment version numbers (major/minor/patch) and structure changelog entries. It includes rules for pre-release identifiers and clear examples for developers.
commit-standards
This skill formats Git commit messages according to the Conventional Commits standard. It provides templates and type definitions (like `feat`, `fix`, `refactor`) to ensure consistency when writing or reviewing commits. Use it during the commit process to create a clear, structured commit history.
huggingface-tokenizers
This skill provides high-performance tokenization using HuggingFace's Rust-based library, processing 1GB of text in under 20 seconds. It supports the BPE, WordPiece, and Unigram algorithms, and enables custom tokenizer training and alignment tracking. Use it when you need fast, production-ready tokenization or want to build custom tokenizers integrated with the transformers ecosystem.
nano-pdf
nano-pdf is a CLI tool that lets developers edit PDFs using natural-language instructions, like changing text or fixing typos on specific pages. It's ideal for quick, programmatic PDF modifications directly from the terminal. Always verify the output, as page numbering can vary between versions.
