
Convex Agents Debugging

Sstobo
Updated Today

About

This debugging skill helps developers troubleshoot Convex agent behavior by logging LLM interactions and inspecting database state. Use it when agent responses are unexpected, to understand the context the LLM receives, or to diagnose data-related issues. It provides visibility into raw requests/responses, tool calls, and storage for building reliable agent applications.

Quick Install

Claude Code

Plugin Command (Recommended)
/plugin add https://github.com/Sstobo/convex-skills
Git Clone (Alternative)
git clone https://github.com/Sstobo/convex-skills.git ~/.claude/skills/"Convex Agents Debugging"

Copy and paste this command in Claude Code to install this skill

Documentation

Purpose

These debugging tools show what is happening inside an agent: what the LLM receives, what it returns, and what gets stored. That visibility is essential for building reliable agent applications.

When to Use This Skill

  • Agent behavior is unexpected
  • LLM responses are off-target
  • Investigating why certain context isn't being used
  • Understanding message ordering
  • Checking file storage and references
  • Auditing tool calls and results
  • Profiling token usage

Log Raw LLM Requests and Responses

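import { Agent } from "@convex-dev/agent";
import { openai } from "@ai-sdk/openai";
import { components, internal } from "./_generated/api";

const myAgent = new Agent(components.agent, {
  name: "My Agent",
  languageModel: openai.chat("gpt-4o-mini"),
  rawRequestResponseHandler: async (ctx, { request, response }) => {
    // Quick check in the Convex logs.
    console.log("LLM Request:", JSON.stringify(request, null, 2));
    console.log("LLM Response:", JSON.stringify(response, null, 2));

    // Archive the call. saveLLMCall is a mutation you define yourself;
    // its argument validators must accept these payloads.
    await ctx.runMutation(internal.logging.saveLLMCall, {
      request,
      response,
      timestamp: Date.now(),
    });
  },
});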
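
Raw request and response payloads can be large, and some values may not serialize cleanly. The following is a minimal sketch of bounding a payload before logging or storing it. `toLogString` and its `maxChars` parameter are hypothetical names for illustration, not part of `@convex-dev/agent`.

```typescript
// Hypothetical helper: turn any payload into a bounded, log-safe string.
function toLogString(value: unknown, maxChars = 2000): string {
  // Tracks objects already visited so cycles do not crash JSON.stringify.
  // Any object encountered a second time (which covers cycles) becomes a marker.
  const seen = new WeakSet<object>();
  let text: string;
  try {
    text =
      JSON.stringify(
        value,
        (_key, v) => {
          if (typeof v === "bigint") return v.toString();
          if (typeof v === "object" && v !== null) {
            if (seen.has(v)) return "[Circular]";
            seen.add(v);
          }
          return v;
        },
        2,
      ) ?? String(value); // JSON.stringify yields undefined for e.g. undefined
  } catch (_err) {
    text = String(value);
  }
  if (text.length <= maxChars) return text;
  const dropped = text.length - maxChars;
  return `${text.slice(0, maxChars)}... [truncated ${dropped} chars]`;
}
```

Inside the handler, one option would be `console.log("LLM Request:", toLogString(request))`, which keeps log lines readable when prompts grow.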

Log Context Messages

See exactly what context the LLM receives:

const myAgent = new Agent(components.agent, {
  name: "My Agent",
  languageModel: openai.chat("gpt-4o-mini"),
  contextHandler: async (ctx, args) => {
    console.log("Context Messages:", {
      recent: args.recent.length,
      search: args.search.length,
      input: args.inputMessages.length,
    });

    args.allMessages.forEach((msg, i) => {
      console.log(`Message ${i}:`, {
        role: msg.role,
        contentLength: typeof msg.content === "string"
          ? msg.content.length
          : JSON.stringify(msg.content).length,
      });
    });

    return args.allMessages;
  },
});
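
When a thread grows, per-message log lines get noisy. As a sketch, the counts could be rolled up into a single summary object instead. The `ContextMessage` shape and the `summarizeContext` name below are assumptions made for this example; real messages carry more fields.

```typescript
// Minimal message shape assumed for this sketch.
type ContextMessage = { role: string; content: unknown };

type ContextSummary = {
  total: number;
  byRole: Record<string, number>;
  totalChars: number;
};

// Length of string content, or of the JSON form for structured content.
function contentLength(content: unknown): number {
  if (typeof content === "string") return content.length;
  return (JSON.stringify(content) ?? "").length;
}

// Counts messages per role and sums their content lengths.
function summarizeContext(messages: ContextMessage[]): ContextSummary {
  const summary: ContextSummary = { total: 0, byRole: {}, totalChars: 0 };
  for (const msg of messages) {
    summary.total += 1;
    summary.byRole[msg.role] = (summary.byRole[msg.role] ?? 0) + 1;
    summary.totalChars += contentLength(msg.content);
  }
  return summary;
}
```

A single `console.log("Context summary:", summarizeContext(args.allMessages))` in the handler would then show at a glance whether, for example, system or tool messages dominate the context.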

Inspect Database Tables

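Query agent data through the component's API. The agent's tables live inside the component, so ctx.db in your app cannot read them directly:

import { listMessages } from "@convex-dev/agent";
import { v } from "convex/values";
import { components } from "./_generated/api";
import { query } from "./_generated/server";

export const getThreadMessages = query({
  args: { threadId: v.string() },
  handler: async (ctx, { threadId }) => {
    // Reads one page of messages for the thread via the agent component.
    const result = await listMessages(ctx, components.agent, {
      threadId,
      paginationOpts: { cursor: null, numItems: 100 },
    });
    return result.page;
  },
});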

Fetch Context Manually

Inspect what context would be used:

import { fetchContextWithPrompt } from "@convex-dev/agent";

export const inspectContext = action({
  args: { threadId: v.string(), prompt: v.string() },
  handler: async (ctx, { threadId, prompt }) => {
    const { messages } = await fetchContextWithPrompt(ctx, components.agent, {
      threadId,
      prompt,
    });

    return {
      contextMessages: messages.length,
      messages: messages.map((msg) => ({
        role: msg.role,
        contentType: typeof msg.content,
      })),
    };
  },
});

Trace Tool Calls

Log all tool invocations:

export const myTool = createTool({
  description: "My tool",
  args: z.object({ query: z.string() }),
  handler: async (ctx, { query }): Promise<string> => {
    console.log("[TOOL] myTool called with:", query);
    const result = await someOperation(query);
    console.log("[TOOL] myTool returned:", result);
    return result;
  },
});
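
Adding log lines by hand to every tool gets repetitive. One possible approach is a small wrapper that adds the same logging around any async handler. This is a sketch only: `withToolLogging` and `ToolHandler` are hypothetical names, and the wrapper is plain TypeScript with no dependency on the agent library.

```typescript
type ToolHandler<Ctx, Args, Result> = (ctx: Ctx, args: Args) => Promise<Result>;

// Hypothetical wrapper: logs arguments, duration, and outcome of a handler.
function withToolLogging<Ctx, Args, Result>(
  name: string,
  handler: ToolHandler<Ctx, Args, Result>,
  log: (line: string) => void = console.log,
): ToolHandler<Ctx, Args, Result> {
  return async (ctx, args) => {
    const started = Date.now();
    log(`[TOOL] ${name} called with: ${JSON.stringify(args)}`);
    try {
      const result = await handler(ctx, args);
      const ms = Date.now() - started;
      log(`[TOOL] ${name} returned after ${ms}ms: ${JSON.stringify(result)}`);
      return result;
    } catch (error) {
      const ms = Date.now() - started;
      const message = error instanceof Error ? error.message : String(error);
      log(`[TOOL] ${name} failed after ${ms}ms: ${message}`);
      throw error; // rethrow so the caller still sees the failure
    }
  };
}
```

The wrapped function has the same signature as the original, so it could be passed wherever the unwrapped handler was used. Logging failures as well as successes matters here, because a tool that throws would otherwise leave only the "called with" line behind.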

Fix Type Errors

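A common cause is circular type inference: when a handler's return type is inferred from other functions referenced through the generated api, TypeScript can end up with a type that depends on itself. Annotating the return type explicitly breaks the cycle: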

// WRONG - no return type
export const myFunction = action({
  args: { prompt: v.string() },
  handler: async (ctx, { prompt }) => {
    return await someLogic();
  },
});

// CORRECT - explicit return type
export const myFunction = action({
  args: { prompt: v.string() },
  returns: v.string(),
  handler: async (ctx, { prompt }): Promise<string> => {
    return await someLogic();
  },
});

Analyze Message Structure

Debug message ordering:

import { listMessages } from "@convex-dev/agent";

export const analyzeMessages = query({
  args: { threadId: v.string() },
  handler: async (ctx, { threadId }) => {
    const messages = await listMessages(ctx, components.agent, {
      threadId,
      paginationOpts: { cursor: null, numItems: 100 },
    });

    // listMessages returns a Convex pagination result; the rows are in .page.
    return messages.page.map((msg) => ({
      order: msg.order,
      stepOrder: msg.stepOrder,
      // message is optional on stored rows, so guard the access.
      role: msg.message?.role,
      status: msg.status,
    }));
  },
});
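
Once the rows are reduced to order and stepOrder, ordering problems can be checked mechanically rather than by eye. The sketch below assumes the rows are expected in ascending (order, stepOrder) sequence with no repeated positions; if your page comes back newest-first, reverse it before checking. `OrderedRow` and `findOrderingIssues` are hypothetical names for this example.

```typescript
type OrderedRow = { order: number; stepOrder: number };

// Returns human-readable problems found in a list that is expected to be
// sorted ascending by (order, stepOrder) with no duplicate positions.
function findOrderingIssues(rows: OrderedRow[]): string[] {
  const issues: string[] = [];
  for (let i = 1; i < rows.length; i++) {
    const prev = rows[i - 1];
    const curr = rows[i];
    // Compare by order first, then by stepOrder when orders are equal.
    const cmp = curr.order - prev.order || curr.stepOrder - prev.stepOrder;
    if (cmp === 0) {
      issues.push(`duplicate position ${curr.order}.${curr.stepOrder} at index ${i}`);
    } else if (cmp < 0) {
      issues.push(
        `index ${i} (${curr.order}.${curr.stepOrder}) sorts before ` +
          `index ${i - 1} (${prev.order}.${prev.stepOrder})`,
      );
    }
  }
  return issues;
}
```

An empty result means the sequence is consistent; any entries point at the exact index where the ordering breaks.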

Key Principles

  • Log early: Capture data while developing
  • Use console for quick checks: Fast iteration
  • Save important events: Archive LLM calls for analysis
  • Explicit return types: Prevents circular references
  • Dashboard inspection: Easiest way to see database state

Next Steps

  • See playground for interactive debugging
  • See fundamentals for agent setup
  • See context for context-aware debugging

GitHub Repository

Sstobo/convex-skills
Path: convex-agents-debugging
