LangSight vs Langfuse vs LangWatch vs LangTrace vs Arize Phoenix

Not another LLM eval platform.
Agent runtime reliability, built for MCP.

Langfuse and LangWatch are excellent tools for LLM evaluation and prompt quality. LangSight solves a different problem: monitoring and securing the tools your agents call at runtime, namely the MCP servers, HTTP APIs, and functions that can fail without anyone noticing.

LangSight

MCP-native · runtime

Agent runtime reliability for AI toolchains. Purpose-built for MCP health monitoring, security scanning, circuit breakers, loop detection, and tool call tracing.

Langfuse

LLM eval & tracing

Strong platform for LLM prompt engineering, evals, and cost tracking. Not designed for MCP server health or security — complementary to LangSight.

LangWatch

LLM quality & guardrails

Focuses on LLM output quality, guardrails, and safety evaluations. Does not cover MCP infrastructure health, CVE scanning, or tool-level security.

LangTrace

OTEL-native tracing

OpenTelemetry-native tracing for LLMs. Good for span capture and latency visibility. Does not do runtime guardrails, MCP health checks, or security scanning.

Arize Phoenix

LLM eval & RAG

Strong for RAG pipeline evaluation, retrieval quality, and LLM tracing. Does not cover MCP server health, circuit breakers, or agent runtime security.

LangSmith

LangChain-native

LangChain's observability platform for prompt debugging, dataset management, and evals. Tightly coupled to LangChain/LangGraph. No MCP health monitoring or security scanning.

What each platform covers

Rows with a Yes under LangSight alone are capabilities that none of the other platforms compared here offers today.

Feature | LangSight | Langfuse | LangWatch | LangTrace | Arize Phoenix
MCP server health monitoring: LangSight is the only platform with native MCP health checks. | Yes
MCP security scanning (CVE + OWASP): CVE detection and 5 of 10 OWASP MCP checks, built-in. | Yes
Tool poisoning detection: Injection, unicode, and base64-encoded payload detection. | Yes
Schema drift detection: Alerts when a tool's schema changes unexpectedly between scans. | Yes
Loop detection + auto-kill: Argument-hash and sliding-window detection, configurable terminate action. | Yes | Partial
Budget guardrails (cost limits) | Yes
Tool-level circuit breakers | Yes
Agent tool call tracing | Yes | Yes | Yes | Yes | Yes
LLM input / output capture | Yes | Yes | Yes | Yes | Yes
Multi-agent call tree | Yes | Partial | Partial | Partial | Partial
Cost attribution per tool call | Yes | Yes | Partial | Partial
Anomaly detection | Yes | Partial | Partial
SLO tracking | Yes | Partial
CI/CD security gate (--ci flag) | Yes
Self-hosted (free forever) | Yes | Yes | Yes | Yes | Yes
License | Apache 2.0 | MIT / ELv2 | Apache 2.0 | Apache 2.0 | ELv2
Primary focus | Agent runtime reliability | LLM evals + tracing | LLM quality + guardrails | OTEL-native tracing | LLM eval + RAG quality

Comparison based on publicly available documentation as of March 2026. Features may change — check each project's docs for the latest.
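
To make the loop detection row concrete, here is a minimal Python sketch of argument-hash detection over a sliding window. It is an illustration only, not LangSight's implementation: the names `call_fingerprint` and `LoopDetector`, and the `window_size` and `max_repeats` parameters, are hypothetical. The terminate action is left to the caller.

```python
import hashlib
import json
from collections import Counter, deque


def call_fingerprint(tool_name: str, arguments: dict) -> str:
    """Hash a tool name plus its arguments; key order does not affect the result."""
    canonical = json.dumps({"tool": tool_name, "args": arguments}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class LoopDetector:
    """Hypothetical detector: flags a loop when one fingerprint repeats too often
    among the most recent calls."""

    def __init__(self, window_size: int = 10, max_repeats: int = 3) -> None:
        # A bounded deque is the sliding window; the oldest call drops out automatically.
        self.window = deque(maxlen=window_size)
        self.max_repeats = max_repeats

    def record(self, tool_name: str, arguments: dict) -> bool:
        """Record a call and return True if it should be treated as a loop."""
        self.window.append(call_fingerprint(tool_name, arguments))
        counts = Counter(self.window)
        return counts[self.window[-1]] >= self.max_repeats


detector = LoopDetector(window_size=5, max_repeats=3)
results = [
    detector.record("search", {"q": "weather", "limit": 5}),
    detector.record("search", {"limit": 5, "q": "weather"}),  # same call, keys reordered
    detector.record("fetch", {"url": "https://example.com"}),
    detector.record("search", {"q": "weather", "limit": 5}),  # third repeat in the window
]
print(results)  # [False, False, False, True]
```

A caller would stop or pause the agent when `record` returns True. Hashing canonical JSON means two calls with the same arguments in a different key order count as the same call.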

LangSight + Langfuse work great together.

Use Langfuse for prompt evaluation and LLM quality. Use LangSight for the runtime layer: MCP health, security, and tool call tracing. They solve different problems at different layers of the stack, so the two overlap only on basic tracing and do not conflict.
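
As an illustration of the runtime layer, the schema drift check listed in the comparison table can be sketched by hashing each tool's schema on every scan and comparing the digests. This is a minimal sketch under stated assumptions, not LangSight's actual code: `schema_digest`, `detect_drift`, and the scan format (a dict mapping tool name to JSON schema) are hypothetical.

```python
import hashlib
import json


def schema_digest(schema: dict) -> str:
    """Stable digest of a tool schema; key order and whitespace do not matter."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_drift(previous: dict, current: dict) -> dict:
    """Compare two scans, each a hypothetical mapping of tool name -> schema."""
    prev_names, curr_names = set(previous), set(current)
    changed = sorted(
        name
        for name in prev_names & curr_names
        if schema_digest(previous[name]) != schema_digest(current[name])
    )
    return {
        "added": sorted(curr_names - prev_names),
        "removed": sorted(prev_names - curr_names),
        "changed": changed,
    }


last_scan = {
    "search": {"type": "object", "properties": {"q": {"type": "string"}}},
    "fetch": {"type": "object", "properties": {"url": {"type": "string"}}},
}
this_scan = {
    # "search" gained a parameter since the last scan
    "search": {
        "type": "object",
        "properties": {"q": {"type": "string"}, "limit": {"type": "integer"}},
    },
    "fetch": {"properties": {"url": {"type": "string"}}, "type": "object"},  # unchanged
    "delete_file": {"type": "object", "properties": {"path": {"type": "string"}}},
}
report = detect_drift(last_scan, this_scan)
print(report)  # {'added': ['delete_file'], 'removed': [], 'changed': ['search']}
```

An alert would fire whenever any of the three lists is non-empty. A new tool or a changed parameter list is the kind of unexpected change between scans that the table row describes.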