gemma_telemetry_retention_detector

Foundup

About

This skill provides fast binary classification of YouTube telemetry records to determine retention strategy. It uses pattern matching to scan heartbeat data as the first phase in a cleanup workflow. Developers should use it for quick initial classification before passing records to downstream agents for retention execution.

Documentation

Gemma Telemetry Retention Detector

Purpose: Fast pattern matching to classify YouTube DAE heartbeat records for retention strategy

Architecture: Phase 1 of Gemma→Qwen→0102 cleanup wardrobe pattern

WSP Compliance

  • WSP 77: Agent Coordination (Gemma fast classification → Qwen strategy)
  • WSP 91: DAEMON Observability (telemetry lifecycle management)
  • WSP 96: WRE Skills Wardrobe (autonomous cleanup execution)

Task Description

Scan data/foundups.db::youtube_heartbeats table and classify records into retention categories using fast binary pattern matching.
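
Before the scan runs, a quick sanity check of the table can confirm what the classifier assumes: a timestamp column holding ISO-8601 strings in youtube_heartbeats. A minimal sketch, using only the standard library and the same database path as the input contract below:

import sqlite3

def preview_heartbeats(db_path: str = "data/foundups.db") -> None:
    """Print row count and timestamp bounds before classification."""
    conn = sqlite3.connect(db_path)
    try:
        count, oldest, newest = conn.execute(
            "SELECT COUNT(*), MIN(timestamp), MAX(timestamp) FROM youtube_heartbeats"
        ).fetchone()
        print(f"{count} heartbeat records spanning {oldest} .. {newest}")
    finally:
        conn.close()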

Input Contract

{
  "database_path": "data/foundups.db",
  "table": "youtube_heartbeats",
  "scan_limit": 1000,
  "current_timestamp": "2025-10-27T20:00:00Z"
}
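
A caller might validate this contract and fill in defaults before dispatching the scan. A minimal sketch, assuming the request arrives as a JSON string with the field names shown above (the helper name is illustrative):

import json
from datetime import datetime, timezone

def parse_scan_request(raw: str) -> dict:
    """Validate the retention-scan input contract and apply defaults."""
    req = json.loads(raw)
    req.setdefault("database_path", "data/foundups.db")
    req.setdefault("table", "youtube_heartbeats")
    req.setdefault("scan_limit", 1000)
    ts = req.get("current_timestamp")
    if ts is None:
        req["current_timestamp"] = datetime.now(timezone.utc).isoformat()
    else:
        # Raises ValueError if the timestamp is not valid ISO-8601
        datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return req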

Classification Rules (Fast Binary Decisions)

Rule 1: Recent Activity (KEEP)

  • Age: < 30 days
  • Pattern: High training value, operational visibility
  • Binary decision: category = "keep_recent"

Rule 2: Training Data (KEEP)

  • Age: 30-90 days
  • Pattern: Historical patterns for Gemma learning
  • Binary decision: category = "keep_training"

Rule 3: Archivable (ARCHIVE)

  • Age: 91-365 days
  • Pattern: Historical value but low operational need
  • Binary decision: category = "archive_candidate"

Rule 4: Purgeable (PURGE)

  • Age: > 365 days
  • Pattern: Minimal value, disk space reclamation
  • Binary decision: category = "purge_candidate"

Output Contract

{
  "scan_timestamp": "2025-10-27T20:00:00Z",
  "total_records_scanned": 3719,
  "categories": {
    "keep_recent": {
      "count": 1200,
      "age_range_days": "0-30",
      "disk_mb": 45
    },
    "keep_training": {
      "count": 1500,
      "age_range_days": "30-90",
      "disk_mb": 95
    },
    "archive_candidate": {
      "count": 800,
      "age_range_days": "91-365",
      "disk_mb": 70
    },
    "purge_candidate": {
      "count": 219,
      "age_range_days": ">365",
      "disk_mb": 19
    }
  },
  "recommendation": "archive_and_vacuum",
  "estimated_reclaim_mb": 89,
  "confidence": 0.95
}

Execution Logic (Gemma Implementation)

from datetime import datetime, timedelta, timezone
import sqlite3

def classify_heartbeat_age(timestamp_iso: str, now: datetime) -> str:
    """Fast binary classification by age"""
    ts = datetime.fromisoformat(timestamp_iso.replace('Z', '+00:00'))
    age_days = (now - ts).days

    if age_days < 30:
        return "keep_recent"
    elif age_days < 91:
        return "keep_training"
    elif age_days < 366:
        return "archive_candidate"
    else:
        return "purge_candidate"

def scan_telemetry_retention(db_path: str, now: datetime | None = None) -> dict:
    """Gemma fast scan for retention categories.

    `now` maps to the input contract's current_timestamp; defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)

    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    # Get all heartbeat timestamps, newest first
    cursor.execute("SELECT timestamp FROM youtube_heartbeats ORDER BY timestamp DESC")
    rows = cursor.fetchall()
    conn.close()

    categories = {
        "keep_recent": [],
        "keep_training": [],
        "archive_candidate": [],
        "purge_candidate": []
    }

    # Fast classification loop: one age comparison per record
    for (ts,) in rows:
        categories[classify_heartbeat_age(ts, now)].append(ts)

    # Assemble the output contract
    return {
        "scan_timestamp": now.isoformat(),
        "total_records_scanned": len(rows),
        "categories": {
            cat: {
                "count": len(records),
                "age_range_days": _get_age_range(cat),
                "disk_mb": round(len(records) * 0.06, 1)  # Rough per-record size estimate
            }
            for cat, records in categories.items()
        },
        "recommendation": "archive_and_vacuum" if len(categories["archive_candidate"]) > 500 else "no_action",
        "estimated_reclaim_mb": round(
            (len(categories["archive_candidate"]) + len(categories["purge_candidate"])) * 0.06, 1
        ),
        "confidence": 0.95
    }

def _get_age_range(category: str) -> str:
    ranges = {
        "keep_recent": "0-30",
        "keep_training": "30-90",
        "archive_candidate": "91-365",
        "purge_candidate": ">365"
    }
    return ranges[category]
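
A minimal usage sketch, assuming the functions above are in scope; the measured wall-clock time corresponds to the execution_time_ms field recorded under Pattern Memory Integration below:

import json
import time

start = time.perf_counter()
report = scan_telemetry_retention("data/foundups.db")
elapsed_ms = round((time.perf_counter() - start) * 1000)

print(json.dumps(report, indent=2))
print(f"Scan completed in {elapsed_ms} ms, recommendation: {report['recommendation']}")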

Performance Metrics

  • Scan speed: 10,000 records/second (pure SQLite query)
  • Classification: <1ms per record (simple age comparison)
  • Total execution: <50ms for 3,719 records
  • Token cost: 50-100 tokens (output generation only, no LLM inference)

Pattern Memory Integration

Store execution results in wre_core/recursive_improvement/metrics/telemetry_cleanup_metrics.jsonl:

{
  "skill": "gemma_telemetry_retention_detector",
  "timestamp": "2025-10-27T20:00:00Z",
  "execution_time_ms": 47,
  "records_scanned": 3719,
  "recommendation": "archive_and_vacuum",
  "estimated_reclaim_mb": 89,
  "pattern_fidelity": 0.95
}
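
A sketch of that append, reusing the report and elapsed_ms from the usage example above; mapping pattern_fidelity to the report's confidence field is an assumption, and the metrics directory is created if it does not yet exist:

import json
from pathlib import Path

METRICS_PATH = Path("wre_core/recursive_improvement/metrics/telemetry_cleanup_metrics.jsonl")

def record_cleanup_metrics(report: dict, elapsed_ms: int) -> None:
    """Append one JSONL line per scan so pattern fidelity can be tracked over time."""
    entry = {
        "skill": "gemma_telemetry_retention_detector",
        "timestamp": report["scan_timestamp"],
        "execution_time_ms": elapsed_ms,
        "records_scanned": report["total_records_scanned"],
        "recommendation": report["recommendation"],
        "estimated_reclaim_mb": report["estimated_reclaim_mb"],
        "pattern_fidelity": report["confidence"],
    }
    METRICS_PATH.parent.mkdir(parents=True, exist_ok=True)
    with METRICS_PATH.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")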

Next Phase

When recommendation == "archive_and_vacuum", trigger:

  • Phase 2: qwen_telemetry_cleanup_strategist - Strategic cleanup plan
  • Phase 3: 0102 validation and execution
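
A minimal sketch of that handoff check; the dispatcher is a hypothetical stand-in for however the wardrobe actually invokes the next skill:

def dispatch_skill(skill_name: str, payload: dict) -> None:
    """Hypothetical stand-in for the wardrobe's real skill-invocation mechanism."""
    print(f"Dispatching {skill_name} for {payload['total_records_scanned']} classified records")

def maybe_trigger_cleanup(report: dict) -> None:
    """Hand the Gemma report to Phase 2 only when archiving is recommended."""
    if report["recommendation"] == "archive_and_vacuum":
        dispatch_skill("qwen_telemetry_cleanup_strategist", payload=report)
    else:
        print("No cleanup needed; skipping Phases 2 and 3")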

Training Value

Gemma learns:

  • Fast age-based classification patterns
  • Binary decision thresholds (30/90/365 days)
  • Disk usage estimation heuristics

Pattern reuse:

  • Same logic applies to other telemetry tables
  • Reusable for foundups_selenium telemetry
  • Generic retention classifier for any time-series data (see the sketch after this list)
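
A sketch of that generalization, assuming the target table exposes an ISO-8601 timestamp column and reusing classify_heartbeat_age from Execution Logic above; the table and column names passed in are illustrative:

import sqlite3
from collections import Counter
from datetime import datetime, timezone

def scan_table_retention(db_path: str, table: str, ts_column: str = "timestamp") -> dict:
    """Apply the same age-based retention buckets to any time-series table."""
    conn = sqlite3.connect(db_path)
    try:
        # Identifiers cannot be bound as SQL parameters, so only trusted names should be passed
        rows = conn.execute(f"SELECT {ts_column} FROM {table}").fetchall()
    finally:
        conn.close()

    now = datetime.now(timezone.utc)
    return dict(Counter(classify_heartbeat_age(ts, now) for (ts,) in rows))

Pointing the same call at a hypothetical selenium telemetry table would cover the foundups_selenium case without changing the thresholds.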

Quick Install

/plugin add https://github.com/Foundup/Foundups-Agent/tree/main/gemma_telemetry_retention_detector

Copy and paste this command in Claude Code to install this skill

GitHub Repository

Foundup/Foundups-Agent
Path: modules/communication/livechat/skills/gemma_telemetry_retention_detector
Tags: bitcoin, blockchain-technology, daes, dao, foundups, partifact

Related Skills

subagent-driven-development

Development

This skill executes implementation plans by dispatching a fresh subagent for each independent task, with code review between tasks. It enables fast iteration while maintaining quality gates through this review process. Use it when working on mostly independent tasks within the same session to ensure continuous progress with built-in quality checks.

cost-optimization

Other

This Claude Skill helps developers optimize cloud costs through resource rightsizing, tagging strategies, and spending analysis. It provides a framework for reducing cloud expenses and implementing cost governance across AWS, Azure, and GCP. Use it when you need to analyze infrastructure costs, right-size resources, or meet budget constraints.

algorithmic-art

Meta

This Claude Skill creates original algorithmic art using p5.js with seeded randomness and interactive parameters. It generates .md files for algorithmic philosophies, plus .html and .js files for interactive generative art implementations. Use it when developers need to create flow fields, particle systems, or other computational art while avoiding copyright issues.

executing-plans

Design

Use the executing-plans skill when you have a complete implementation plan to execute in controlled batches with review checkpoints. It loads and critically reviews the plan, then executes tasks in small batches (default 3 tasks) while reporting progress between each batch for architect review. This ensures systematic implementation with built-in quality control checkpoints.
