
phoenix-observability

davila7
Updated 11 days ago
218 views
View on GitHub
Testing, Observability, Phoenix, Arize, Tracing, Evaluation, Monitoring, LLM Ops, OpenTelemetry

About

Phoenix is an open-source AI observability platform for tracing, evaluating, and monitoring LLM applications. It provides detailed traces for debugging, runs evaluations against datasets, and enables real-time monitoring of production systems. Core features include experiment pipelines and self-hosted observability without vendor lock-in.
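
A minimal sketch of the tracing workflow, assuming the arize-phoenix Python package: px.launch_app() serves the self-hosted UI locally, and phoenix.otel.register wires an OpenTelemetry tracer provider to it. The project and span names below are illustrative, not prescribed by the skill:

import phoenix as px
from phoenix.otel import register

# Launch the self-hosted Phoenix UI (typically http://localhost:6006).
session = px.launch_app()

# Register an OpenTelemetry tracer provider that exports spans to Phoenix.
tracer_provider = register(project_name="my-llm-app")
tracer = tracer_provider.get_tracer(__name__)

# Any code wrapped in a span now appears as a trace in the Phoenix UI.
with tracer.start_as_current_span("generate-answer") as span:
    span.set_attribute("llm.prompt", "What is observability?")
    answer = "..."  # call your LLM here
    span.set_attribute("llm.response", answer)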

Quick Install

Claude Code

Recommended
Primary
npx skills add davila7/claude-code-templates -a claude-code
Plugin Command (Alternative)
/plugin add https://github.com/davila7/claude-code-templates
Git Clone (Alternative)
git clone https://github.com/davila7/claude-code-templates.git ~/.claude/skills/phoenix-observability

Copy this command and paste it into Claude Code to install this skill.

GitHub Repository

davila7/claude-code-templates
Path: cli-tool/components/skills/ai-research/observability-phoenix
anthropic, anthropic-claude, claude, claude-code

Related Skills

railway-metrics

Other

This skill queries Railway service metrics including CPU, memory, network, and disk usage to monitor performance and debug issues. It's triggered when developers ask about resource utilization or service performance, and requires environment and service IDs from the Railway CLI. The skill provides actionable insights through Bash commands that fetch real-time analytics data.

View Skill
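
A rough illustration of the kind of request the railway-metrics skill wraps. Railway's public GraphQL endpoint is documented, but the exact shape of the metrics query (field and measurement names) is an assumption here; check Railway's public API reference before relying on it:

import os
import requests

RAILWAY_API = "https://backboard.railway.app/graphql/v2"

# Assumed query shape -- verify field and measurement names against
# Railway's public API schema before use.
QUERY = """
query ServiceMetrics($environmentId: String!, $serviceId: String!, $startDate: DateTime!) {
  metrics(
    environmentId: $environmentId
    serviceId: $serviceId
    startDate: $startDate
    measurements: [CPU_USAGE, MEMORY_USAGE_GB]
  ) {
    measurement
    values { ts value }
  }
}
"""

response = requests.post(
    RAILWAY_API,
    json={
        "query": QUERY,
        "variables": {
            # Environment and service IDs come from the Railway CLI.
            "environmentId": os.environ["RAILWAY_ENVIRONMENT_ID"],
            "serviceId": os.environ["RAILWAY_SERVICE_ID"],
            "startDate": "2024-01-01T00:00:00Z",
        },
    },
    headers={"Authorization": f"Bearer {os.environ['RAILWAY_API_TOKEN']}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())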

evaluating-code-models

Meta

This skill benchmarks code generation models using industry-standard evaluations like HumanEval and MBPP across multiple programming languages. It calculates pass@k metrics for comparing model performance, testing multi-language support, and measuring code quality. Developers should use it when rigorously evaluating or comparing coding models, as it's the same tool powering HuggingFace's code leaderboards.

View Skill
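
To make the pass@k metric concrete: the standard unbiased estimator from the Codex paper (Chen et al., 2021), which code-evaluation harnesses like this one report, is pass@k = 1 - C(n-c, k) / C(n, k), where n is the number of samples generated per problem and c is the number that pass the unit tests:

from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for a single problem."""
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 200 samples per problem, 13 of which pass:
print(pass_at_k(200, 13, 1))   # 0.065 (= 13/200)
print(pass_at_k(200, 13, 10))  # ~0.50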

langsmith-observability

Meta

LangSmith provides LLM observability for tracing, evaluating, and monitoring AI applications. Developers should use it for debugging prompts and chains, systematic output evaluation, and monitoring production systems. Its key capabilities include performance tracing, dataset testing, and analysis of latency and token usage.

View Skill
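
A minimal sketch of LangSmith tracing, assuming the langsmith Python package and a LANGSMITH_API_KEY in the environment; the project name and function below are illustrative:

import os
from langsmith import traceable

os.environ.setdefault("LANGSMITH_TRACING", "true")
os.environ.setdefault("LANGSMITH_PROJECT", "my-llm-app")  # illustrative project name

@traceable(run_type="chain", name="answer_question")
def answer_question(question: str) -> str:
    # Call your LLM here; inputs, outputs, and latency are captured in the run.
    return f"Stub answer to: {question}"

print(answer_question("What does tracing capture?"))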

evaluating-llms-harness

Testing

This skill runs standardized LLM evaluations across 60+ academic benchmarks like MMLU and GSM8K using the industry-standard lm-evaluation-harness. Use it for benchmarking model quality, comparing different models, or tracking training progress with support for HuggingFace, vLLM, and API-based models. It provides a consistent, widely-adopted method for reporting academic results.

View Skill
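
A minimal programmatic sketch, assuming lm-evaluation-harness is installed (pip install lm-eval); the model and task here are examples, and the limit argument keeps this to a quick smoke test rather than a full benchmark run:

import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # HuggingFace backend; vLLM and API backends also exist
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["gsm8k"],
    num_fewshot=5,
    limit=10,  # evaluate only 10 examples as a smoke test
)
print(results["results"]["gsm8k"])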