observability-monitoring
About
This skill implements observability and monitoring for Cloudflare Workers using built-in tools. It provides logging through wrangler tail, metrics via Workers Analytics, and health checks for endpoint monitoring. Use it when setting up monitoring, configuring alerts, or debugging production issues in your Workers applications.
Quick Install
Claude Code
Recommended/plugin add https://github.com/greyhaven-ai/claude-code-configgit clone https://github.com/greyhaven-ai/claude-code-config.git ~/.claude/skills/observability-monitoringCopy and paste this command in Claude Code to install this skill
Documentation
Grey Haven Observability and Monitoring
Implement comprehensive monitoring for Grey Haven applications using Cloudflare Workers built-in observability tools.
Observability Stack
Grey Haven Monitoring Architecture
- Logging: Cloudflare Workers logs + wrangler tail
- Metrics: Cloudflare Workers Analytics dashboard
- Custom Events: Cloudflare Analytics Engine
- Health Checks: Cloudflare Health Checks for endpoint availability
- Error Tracking: Console errors visible in Cloudflare dashboard
Cloudflare Workers Logging
Console Logging in Workers
// app/utils/logger.ts
export interface LogEvent {
level: "debug" | "info" | "warn" | "error";
message: string;
context?: Record<string, unknown>;
userId?: string;
tenantId?: string;
requestId?: string;
duration?: number;
}
export function log(event: LogEvent) {
const logData = {
timestamp: new Date().toISOString(),
level: event.level,
message: event.message,
environment: process.env.ENVIRONMENT,
user_id: event.userId,
tenant_id: event.tenantId,
request_id: event.requestId,
duration_ms: event.duration,
...event.context,
};
// Structured console logging (visible in Cloudflare dashboard)
console[event.level](JSON.stringify(logData));
}
// Convenience methods
export const logger = {
debug: (message: string, context?: Record<string, unknown>) =>
log({ level: "debug", message, context }),
info: (message: string, context?: Record<string, unknown>) =>
log({ level: "info", message, context }),
warn: (message: string, context?: Record<string, unknown>) =>
log({ level: "warn", message, context }),
error: (message: string, context?: Record<string, unknown>) =>
log({ level: "error", message, context }),
};
Logging Middleware
// app/middleware/logging.ts
import { logger } from "~/utils/logger";
import { v4 as uuidv4 } from "uuid";
export async function loggingMiddleware(
request: Request,
next: () => Promise<Response>
) {
const requestId = uuidv4();
const startTime = Date.now();
try {
const response = await next();
const duration = Date.now() - startTime;
logger.info("Request completed", {
request_id: requestId,
method: request.method,
url: request.url,
status: response.status,
duration_ms: duration,
});
return response;
} catch (error) {
const duration = Date.now() - startTime;
logger.error("Request failed", {
request_id: requestId,
method: request.method,
url: request.url,
error: error.message,
stack: error.stack,
duration_ms: duration,
});
throw error;
}
}
Cloudflare Workers Analytics
Workers Analytics Dashboard
Access metrics at: https://dash.cloudflare.com → Workers → Analytics
Key Metrics:
- Request rate (requests/second)
- CPU time (milliseconds)
- Error rate (%)
- Success rate (%)
- Response time (P50, P95, P99)
- Invocations per day
- GB-seconds (compute usage)
Wrangler Tail (Real-time Logs)
# Stream production logs
npx wrangler tail --config wrangler.production.toml
# Filter by status code
npx wrangler tail --status error --config wrangler.production.toml
# Filter by method
npx wrangler tail --method POST --config wrangler.production.toml
# Filter by IP address
npx wrangler tail --ip 1.2.3.4 --config wrangler.production.toml
# Output to file
npx wrangler tail --config wrangler.production.toml > logs.txt
Accessing Logs in Cloudflare Dashboard
- Go to
https://dash.cloudflare.com - Navigate to Workers & Pages
- Select your Worker
- Click "Logs" tab
- View real-time logs with filtering
Log Features:
- Real-time streaming
- Filter by status code
- Filter by request method
- Search log content
- Export logs (JSON)
Analytics Engine (Custom Events)
Setup Analytics Engine
wrangler.toml:
[[analytics_engine_datasets]]
binding = "ANALYTICS"
Track Custom Events
// app/utils/analytics.ts
export async function trackEvent(
env: Env,
eventName: string,
data: {
user_id?: string;
tenant_id?: string;
duration_ms?: number;
[key: string]: string | number | undefined;
}
) {
try {
await env.ANALYTICS.writeDataPoint({
blobs: [eventName],
doubles: [data.duration_ms || 0],
indexes: [data.user_id || "", data.tenant_id || ""],
});
} catch (error) {
console.error("Failed to track event:", error);
}
}
// Usage in server function
export const loginUser = createServerFn({ method: "POST" }).handler(
async ({ data, context }) => {
const startTime = Date.now();
const user = await authenticateUser(data);
const duration = Date.now() - startTime;
// Track login event
await trackEvent(context.env, "user_login", {
user_id: user.id,
tenant_id: user.tenantId,
duration_ms: duration,
});
return user;
}
);
Query Analytics Data
Use Cloudflare GraphQL API:
query GetLoginStats {
viewer {
accounts(filter: { accountTag: $accountId }) {
workersAnalyticsEngineDataset(dataset: "my_analytics") {
query(
filter: {
blob1: "user_login"
datetime_gt: "2025-01-01T00:00:00Z"
}
) {
count
dimensions {
blob1 # event name
index1 # user_id
index2 # tenant_id
}
}
}
}
}
}
Health Checks
Health Check Endpoint
// app/routes/api/health.ts
import { createServerFn } from "@tanstack/start";
import { db } from "~/lib/server/db";
export const GET = createServerFn({ method: "GET" }).handler(async ({ context }) => {
const startTime = Date.now();
const checks: Record<string, string> = {};
// Check database
let dbHealthy = false;
try {
await db.execute("SELECT 1");
dbHealthy = true;
checks.database = "ok";
} catch (error) {
console.error("Database health check failed:", error);
checks.database = "failed";
}
// Check Redis (if using Upstash)
let redisHealthy = false;
if (context.env.REDIS) {
try {
await context.env.REDIS.ping();
redisHealthy = true;
checks.redis = "ok";
} catch (error) {
console.error("Redis health check failed:", error);
checks.redis = "failed";
}
}
const duration = Date.now() - startTime;
const healthy = dbHealthy && (!context.env.REDIS || redisHealthy);
return new Response(
JSON.stringify({
status: healthy ? "healthy" : "unhealthy",
checks,
duration_ms: duration,
timestamp: new Date().toISOString(),
environment: process.env.ENVIRONMENT,
}),
{
status: healthy ? 200 : 503,
headers: { "Content-Type": "application/json" },
}
);
});
Cloudflare Health Checks
Configure in Cloudflare dashboard:
- Go to Traffic → Health Checks
- Create health check for
/api/health - Configure:
- Interval: 60 seconds
- Timeout: 5 seconds
- Retries: 2
- Expected status: 200
- Set up notifications (email/webhook)
Error Tracking
Structured Error Logging
// app/utils/error-handler.ts
import { logger } from "~/utils/logger";
export function handleError(error: Error, context?: Record<string, unknown>) {
// Log error with full context
logger.error(error.message, {
error_name: error.name,
stack: error.stack,
...context,
});
// Also log to Analytics Engine for tracking
if (context?.env) {
trackEvent(context.env as Env, "error_occurred", {
error_name: error.name,
error_message: error.message,
});
}
}
// Usage in server function
export const updateUser = createServerFn({ method: "POST" }).handler(
async ({ data, context }) => {
try {
return await userService.update(data);
} catch (error) {
handleError(error, {
user_id: context.user?.id,
tenant_id: context.tenant?.id,
env: context.env,
});
throw error;
}
}
);
Viewing Errors in Cloudflare
- Workers Dashboard: View errors in real-time
- Wrangler Tail:
npx wrangler tail --status error - Analytics: Check error rate metrics
- Health Checks: Monitor endpoint failures
Supporting Documentation
All supporting files are under 500 lines per Anthropic best practices:
-
examples/ - Complete monitoring examples
- cloudflare-logging.md - Structured console logging
- wrangler-tail.md - Real-time log streaming
- analytics-engine.md - Custom event tracking
- health-checks.md - Health check implementations
- error-tracking.md - Error handling patterns
- INDEX.md - Examples navigation
-
reference/ - Monitoring references
- cloudflare-metrics.md - Available metrics
- wrangler-commands.md - Wrangler CLI reference
- alert-configuration.md - Setting up alerts
- INDEX.md - Reference navigation
-
templates/ - Copy-paste ready templates
- logger.ts - Cloudflare logger template
- health-check.ts - Health check endpoint
-
checklists/ - Monitoring checklists
- observability-setup-checklist.md - Setup checklist
When to Apply This Skill
Use this skill when:
- Setting up monitoring for new Cloudflare Workers projects
- Implementing structured logging with console
- Debugging production issues with wrangler tail
- Setting up health checks
- Implementing custom metrics tracking with Analytics Engine
- Configuring Cloudflare alerts
Template Reference
These patterns are from Grey Haven's production monitoring:
- Cloudflare Workers Analytics: Request and performance metrics
- Wrangler tail: Real-time log streaming
- Console logging: Structured JSON logs
- Analytics Engine: Custom event tracking
Critical Reminders
- Structured logging: Use JSON.stringify for console logs
- Request IDs: Track requests with UUIDs for debugging
- Error context: Include tenant_id, user_id in all error logs
- Health checks: Monitor database and external service connections
- Wrangler tail: Use filters to narrow down logs (--status, --method)
- Performance: Track duration_ms for all operations
- Environment: Log environment in all messages for filtering
- Analytics Engine: Use for custom metrics and event tracking
- Dashboard access: Logs available in Cloudflare Workers dashboard
- Real-time debugging: Use wrangler tail for live production debugging
GitHub Repository
Related Skills
sglang
MetaSGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.
evaluating-llms-harness
TestingThis Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.
content-collections
MetaThis skill provides a production-tested setup for Content Collections, a TypeScript-first tool that transforms Markdown/MDX files into type-safe data collections with Zod validation. Use it when building blogs, documentation sites, or content-heavy Vite + React applications to ensure type safety and automatic content validation. It covers everything from Vite plugin configuration and MDX compilation to deployment optimization and schema validation.
llamaguard
OtherLlamaGuard is Meta's 7-8B parameter model for moderating LLM inputs and outputs across six safety categories like violence and hate speech. It offers 94-95% accuracy and can be deployed using vLLM, Hugging Face, or Amazon SageMaker. Use this skill to easily integrate content filtering and safety guardrails into your AI applications.
