distributed-tracing

aj-geddes

About

This Claude Skill helps developers implement distributed tracing using Jaeger and Zipkin to track requests across microservices. It's particularly useful for debugging distributed systems, analyzing performance bottlenecks, and tracing request flows. The skill provides setup instructions and instrumentation examples for comprehensive tracing implementation.

Documentation

Distributed Tracing

Overview

Set up distributed tracing infrastructure with Jaeger or Zipkin to track requests across microservices and identify performance bottlenecks.

When to Use

Debugging microservice interactions
Identifying performance bottlenecks
Tracking request flows
Analyzing service dependencies
Root cause analysis

Instructions

1. Jaeger Setup

# docker-compose.yml
version: '3.8'
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
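      - "5775:5775/udp"   # legacy agent port (zipkin.thrift, compact protocol)
      - "6831:6831/udp"   # agent port used by the client examples (jaeger.thrift, compact protocol)
      - "16686:16686"     # Jaeger UI and query API
      - "14268:14268"     # collector HTTP endpoint (jaeger.thrift sent directly by clients)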
    networks:
      - tracing

networks:
  tracing:

2. Node.js Jaeger Instrumentation

// tracing.js
const { initTracer } = require('jaeger-client');

const initJaegerTracer = (serviceName) => {
  const config = {
    serviceName,
    sampler: {
      // 'const' with param 1 samples every request; use a lower rate in production
      type: 'const',
      param: 1
    },
    reporter: {
      logSpans: true,
      agentHost: process.env.JAEGER_AGENT_HOST || 'localhost',
      // Environment variables are strings, so convert the port to a number
      agentPort: Number(process.env.JAEGER_AGENT_PORT) || 6831
    }
  };

  return initTracer(config, {});
};

const tracer = initJaegerTracer('api-service');
module.exports = { tracer };

3. Express Tracing Middleware

// middleware.js
const { tracer } = require('./tracing');
const opentracing = require('opentracing');

const tracingMiddleware = (req, res, next) => {
  // Continue the caller's trace if the request carries trace headers
  const wireCtx = tracer.extract(
    opentracing.FORMAT_HTTP_HEADERS,
    req.headers
  );

  const span = tracer.startSpan(req.path, {
    childOf: wireCtx,
    tags: {
      [opentracing.Tags.SPAN_KIND]: opentracing.Tags.SPAN_KIND_RPC_SERVER,
      [opentracing.Tags.HTTP_METHOD]: req.method,
      [opentracing.Tags.HTTP_URL]: req.url
    }
  });

  // Expose the span so route handlers can create child spans
  req.span = span;

  res.on('finish', () => {
    span.setTag(opentracing.Tags.HTTP_STATUS_CODE, res.statusCode);
    // Flag server errors so failed requests are easy to filter
    if (res.statusCode >= 500) {
      span.setTag(opentracing.Tags.ERROR, true);
    }
    span.finish();
  });

  next();
};

module.exports = tracingMiddleware;

4. Python Jaeger Integration

# tracing.py
from flask import Flask, g, request
from jaeger_client import Config
from opentracing.propagation import Format

def init_jaeger_tracer(service_name):
    config = Config(
        config={
            # 'const' with param 1 samples every trace; lower the rate in production
            'sampler': {'type': 'const', 'param': 1},
            'local_agent': {
                'reporting_host': 'localhost',
                'reporting_port': 6831,
            },
            'logging': True,
        },
        service_name=service_name,
    )
    return config.initialize_tracer()

# Flask integration
app = Flask(__name__)
tracer = init_jaeger_tracer('api-service')

@app.before_request
def before_request():
    # Continue the caller's trace if the request carries trace headers
    ctx = tracer.extract(Format.HTTP_HEADERS, request.headers)
    # Keep the span on flask.g, Flask's per-request storage
    g.span = tracer.start_span(
        request.path,
        child_of=ctx,
        tags={
            'http.method': request.method,
            'http.url': request.url,
        }
    )

@app.after_request
def after_request(response):
    g.span.set_tag('http.status_code', response.status_code)
    g.span.finish()
    return response

@app.route('/api/users/<user_id>')
def get_user(user_id):
    # The child span is finished automatically when the with block exits
    with tracer.start_span('fetch-user', child_of=g.span) as span:
        span.set_tag('user.id', user_id)
        # Fetch user from database
        return {'user': {'id': user_id}}

5. Distributed Context Propagation

// propagation.js
const axios = require('axios');
const { tracer } = require('./tracing');
const opentracing = require('opentracing');

async function callDownstreamService(span, url, data) {
  const headers = {};

  // Inject trace context
  tracer.inject(span, opentracing.FORMAT_HTTP_HEADERS, headers);

  try {
    const response = await axios.post(url, data, { headers });
    span.setTag('downstream.success', true);
    return response.data;
  } catch (error) {
    span.setTag(opentracing.Tags.ERROR, true);
    span.log({
      event: 'error',
      message: error.message
    });
    throw error;
  }
}

module.exports = { callDownstreamService };
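
To make the propagation step less opaque, here is a minimal Python sketch of what an inject and extract pair does with Jaeger's `uber-trace-id` header, whose value has the form `{trace-id}:{span-id}:{parent-span-id}:{flags}` in hexadecimal. The `SpanContext` tuple and the `inject` and `extract` functions are hypothetical stand-ins written for illustration; in real services the client library's `tracer.inject` and `tracer.extract` do this work.

```python
from typing import NamedTuple, Optional

HEADER = "uber-trace-id"

class SpanContext(NamedTuple):
    """Illustrative stand-in for the context a tracer propagates."""
    trace_id: int
    span_id: int
    parent_id: int
    flags: int  # bit 0x01 means "sampled"

def inject(ctx: SpanContext, headers: dict) -> None:
    """Write the context into outgoing HTTP headers as hex fields."""
    headers[HEADER] = (
        f"{ctx.trace_id:x}:{ctx.span_id:x}:{ctx.parent_id:x}:{ctx.flags:x}"
    )

def extract(headers: dict) -> Optional[SpanContext]:
    """Read the context from incoming headers; return None if absent or malformed."""
    # HTTP header names are case-insensitive
    value = next((v for k, v in headers.items() if k.lower() == HEADER), None)
    if value is None:
        return None
    parts = value.split(":")
    if len(parts) != 4:
        return None
    try:
        trace_id, span_id, parent_id, flags = (int(p, 16) for p in parts)
    except ValueError:
        return None
    if trace_id == 0 or span_id == 0:
        return None
    return SpanContext(trace_id, span_id, parent_id, flags)

if __name__ == "__main__":
    outgoing = {}
    inject(SpanContext(0xABC, 0x1, 0x0, 1), outgoing)
    print(outgoing)           # headers sent to the downstream service
    print(extract(outgoing))  # context rebuilt on the receiving side
```

A receiving service that gets `None` back simply starts a new trace, which is why a missing or corrupted header shows up as a broken trace rather than a failed request.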

6. Zipkin Integration

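// zipkin-setup.js
const CLSContext = require('zipkin-context-cls');
const { Tracer, BatchRecorder, jsonEncoder: { JSON_V2 } } = require('zipkin');
// HttpLogger is published in the zipkin-transport-http package, not in zipkin itself
const { HttpLogger } = require('zipkin-transport-http');
const zipkinMiddleware = require('zipkin-instrumentation-express').expressMiddleware;

const recorder = new BatchRecorder({
  logger: new HttpLogger({
    endpoint: 'http://localhost:9411/api/v2/spans',
    // The /api/v2/spans endpoint expects spans in the v2 JSON format
    jsonEncoder: JSON_V2,
    headers: { 'Content-Type': 'application/json' }
  })
});

const ctxImpl = new CLSContext('zipkin');
// localServiceName identifies this service in the Zipkin UI
const tracer = new Tracer({ recorder, ctxImpl, localServiceName: 'api-service' });

module.exports = {
  tracer,
  zipkinMiddleware: zipkinMiddleware({
    tracer,
    serviceName: 'api-service'
  })
};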

7. Trace Analysis

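# query-traces.py
import requests

JAEGER_QUERY_URL = 'http://localhost:16686/api/traces'

def query_traces(service_name, operation=None, limit=20):
    params = {
        'service': service_name,
        'limit': limit
    }
    if operation:
        params['operation'] = operation

    response = requests.get(JAEGER_QUERY_URL, params=params, timeout=10)
    response.raise_for_status()
    return response.json()['data']

def trace_duration_us(trace):
    # The trace JSON carries a duration per span, not per trace,
    # so derive it from span start and end times (microseconds)
    spans = trace['spans']
    start = min(s['startTime'] for s in spans)
    end = max(s['startTime'] + s['duration'] for s in spans)
    return end - start

def find_slow_traces(service_name, min_duration_ms=1000):
    traces = query_traces(service_name, limit=100)
    slow_traces = [
        t for t in traces
        if t.get('spans') and trace_duration_us(t) > min_duration_ms * 1000
    ]
    return sorted(slow_traces, key=trace_duration_us, reverse=True)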
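
Beyond listing individual slow traces, it often helps to see which operations are slow in aggregate. The sketch below groups span durations by operation name and reports nearest-rank percentiles. It assumes the trace JSON shape returned by the Jaeger query endpoint used above, where each trace has a `spans` list and each span has `operationName` and `duration` in microseconds; `summarize_operations` and `percentile` are hypothetical helper names.

```python
import math
from collections import defaultdict

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def summarize_operations(traces):
    """Return {operation: {count, p50_ms, p95_ms, max_ms}} across all spans."""
    durations_ms = defaultdict(list)
    for trace in traces:
        for span in trace.get("spans", []):
            # Span durations are assumed to be reported in microseconds
            durations_ms[span["operationName"]].append(span["duration"] / 1000.0)

    summary = {}
    for operation, values in durations_ms.items():
        values.sort()
        summary[operation] = {
            "count": len(values),
            "p50_ms": percentile(values, 50),
            "p95_ms": percentile(values, 95),
            "max_ms": values[-1],
        }
    return summary

if __name__ == "__main__":
    # Hand-written sample in the assumed response shape
    sample = [
        {"spans": [
            {"operationName": "GET /users", "duration": 120000},
            {"operationName": "db.query", "duration": 80000},
        ]},
        {"spans": [
            {"operationName": "GET /users", "duration": 40000},
        ]},
    ]
    for operation, stats in summarize_operations(sample).items():
        print(operation, stats)
```

Percentiles are used instead of averages because a handful of very slow requests can hide behind a healthy-looking mean.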

Best Practices

✅ DO

Sample appropriately for your traffic volume
Propagate trace context across services
Add meaningful span tags
Log errors with spans
Use consistent service naming
Monitor trace latency
Document trace format
Keep instrumentation lightweight

❌ DON'T

Sample 100% in production
Skip trace context propagation
Log sensitive data in spans
Create excessive spans
Ignore sampling configuration
Use unbounded cardinality tags
Deploy without testing collection
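
Three of these practices can be sketched in a few lines of Python: sampling below 100%, keeping sensitive values out of spans, and keeping tag values bounded. This is an illustration under stated assumptions, not the clients' actual implementation: `should_sample` makes a deterministic decision from the trace ID so that every service reaches the same verdict for a given trace, and `SENSITIVE_KEYS`, `redact_tags`, and `normalize_path` are hypothetical names. With the Jaeger clients shown earlier, the usual route is to configure a sampler such as `probabilistic` with a `param` like 0.1 rather than writing your own.

```python
import random
import re

MAX_TRACE_ID = (1 << 64) - 1

def should_sample(trace_id: int, rate: float) -> bool:
    """Deterministic probabilistic sampling: same trace ID, same decision."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return (trace_id & MAX_TRACE_ID) < rate * MAX_TRACE_ID

# Hypothetical list of key fragments that should never reach the tracing backend
SENSITIVE_KEYS = ("password", "authorization", "token", "cookie", "secret")

def redact_tags(tags: dict) -> dict:
    """Replace values whose key looks sensitive before tagging a span."""
    return {
        key: "[REDACTED]" if any(s in key.lower() for s in SENSITIVE_KEYS) else value
        for key, value in tags.items()
    }

# Path segments that are purely numeric or UUID-shaped are treated as identifiers
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)

def normalize_path(path: str) -> str:
    """Collapse identifier segments so operation names stay bounded."""
    return "/".join(
        ":id" if _ID_SEGMENT.match(segment) else segment
        for segment in path.split("/")
    )

if __name__ == "__main__":
    rng = random.Random(7)
    trace_ids = [rng.getrandbits(64) for _ in range(10_000)]
    kept = sum(should_sample(t, 0.1) for t in trace_ids)
    print(f"kept {kept} of {len(trace_ids)} traces")
    print(redact_tags({"http.method": "GET", "auth_token": "abc123"}))
    print(normalize_path("/api/users/12345"))
```

Normalizing the path matters for the middleware examples in particular, since they use the raw request path as the span name and would otherwise create one operation per user ID.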

Key Concepts

Trace: Complete request flow across services
Span: Single operation within a trace
Tag: Metadata attached to spans
Log: Timestamped events within spans
Context: Trace information propagated between services
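
These concepts fit together as a small data model. The following Python sketch is a toy, not the OpenTracing or Jaeger API: the `Span` class and `start_trace` helper are hypothetical and exist only to show how a trace ID ties spans together, how a parent ID forms the tree, and where tags and logs live.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Span:
    """One operation within a trace."""
    operation: str
    trace_id: str                       # shared by every span in the trace
    parent_id: Optional[str] = None     # None marks the root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    tags: dict = field(default_factory=dict)   # metadata about the operation
    logs: list = field(default_factory=list)   # timestamped events
    start: float = field(default_factory=time.time)
    end: Optional[float] = None

    def set_tag(self, key, value):
        self.tags[key] = value

    def log(self, event, **fields):
        self.logs.append({"timestamp": time.time(), "event": event, **fields})

    def child(self, operation):
        # The context handed to a child is the trace ID plus this span's ID
        return Span(operation, trace_id=self.trace_id, parent_id=self.span_id)

    def finish(self):
        self.end = time.time()

def start_trace(operation):
    """Create a root span with a fresh trace ID."""
    return Span(operation, trace_id=uuid.uuid4().hex)

if __name__ == "__main__":
    root = start_trace("GET /api/users")
    db = root.child("fetch-user")
    db.set_tag("user.id", "42")
    db.log("cache_miss", key="user:42")
    db.finish()
    root.finish()
    print(root.trace_id == db.trace_id, db.parent_id == root.span_id)
```

When the child lives in another service, the trace ID and parent span ID travel in request headers, which is the context that the propagation step carries.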

Quick Install

/plugin add https://github.com/aj-geddes/useful-ai-prompts/tree/main/distributed-tracing

Copy and paste this command in Claude Code to install this skill

GitHub Repository

aj-geddes/useful-ai-prompts

Path: skills/distributed-tracing
