Evaluation Context Protocol (ECP)
Portable evaluations for AI agents.
ECP is a vendor-neutral protocol for testing agent outputs, tool calls, and evaluator-visible audit context across frameworks, models, eval platforms, and CI systems.
Think of it as the evaluation contract layer: MCP gives agents a common way to use tools; ECP gives evaluators a common way to inspect what an agent returned, what tools it used, and what audit evidence it exposed.
Why ECP Exists
Agent evaluations are still too tied to individual frameworks, tracing tools, and hosted platforms. Those tools are useful, but the evaluation contract itself should be portable.
ECP separates the protocol from the platform:
- run evals locally or in CI
- wrap agents built with plain Python, LangChain, LlamaIndex, CrewAI, or PydanticAI
- grade final outputs, tool calls, and evaluation_context
- emit JSON and HTML reports that other systems can ingest
- implement the same JSON-RPC contract in another language or runtime
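To make the last bullet concrete, here is a minimal sketch of what carrying an agent result over JSON-RPC 2.0 could look like. Only the envelope fields (jsonrpc, id, method, params, result) come from the JSON-RPC 2.0 standard. The method name example/run_step, the helper functions, and the shape of each tool call are hypothetical placeholders, not taken from the ECP specification.

```python
import json


def make_request(method, params, request_id):
    """Build a JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def make_response(request, result):
    """Build the success response for a request, echoing its id."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


# "example/run_step" is a placeholder method name, NOT defined by ECP.
request = make_request("example/run_step", {"input": "Where is my order?"}, 1)

# The three result keys are the surfaces this page names; the value shapes
# (string context, name/arguments tool calls) are assumptions.
response = make_response(request, {
    "public_output": "Your order shipped yesterday.",
    "tool_calls": [{"name": "lookup_order", "arguments": {"order_id": "A-100"}}],
    "evaluation_context": "Looked up order A-100 before answering.",
})

wire = json.dumps(response)  # what would travel between runtime and agent
print(wire)
```

Because the envelope is plain JSON, an agent written in any language can produce the same message; the Specification is the authority on the real method names and result schema.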
What ECP Checks
Most evals start with the final answer. ECP also checks the behavior behind that answer.
| Evaluation Need | ECP Surface |
|---|---|
| Did the user-visible answer satisfy the task? | public_output |
| Did the agent call the required tool with the right arguments? | tool_calls |
| Did the agent expose evaluator-safe audit evidence? | evaluation_context |
| Can this run in CI and fail a build? | ecp run --manifest ... |
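As a rough illustration of the first three rows, the sketch below grades one result on each surface separately. It is not the ECP runtime's grader: the grade function, the substring matching, and the assumption that evaluation_context is a string and each tool call is a dict with name and arguments keys are all illustrative.

```python
def grade(result, expected_phrase, required_tool, required_args, evidence_phrase):
    """Return one pass/fail flag per surface (hypothetical helper)."""
    output = result.get("public_output") or ""
    context = result.get("evaluation_context") or ""
    calls = result.get("tool_calls") or []

    # A tool call passes if its name matches and every required argument
    # is present with the expected value.
    tool_ok = any(
        call.get("name") == required_tool
        and all(call.get("arguments", {}).get(k) == v for k, v in required_args.items())
        for call in calls
    )
    return {
        "public_output": expected_phrase.lower() in output.lower(),
        "tool_calls": tool_ok,
        "evaluation_context": evidence_phrase.lower() in context.lower(),
    }


# The answer reads fine, but the agent never called the policy tool.
result = {
    "public_output": "Your refund has been approved.",
    "tool_calls": [],
    "evaluation_context": "Refund policy allows returns within 30 days.",
}
checks = grade(
    result,
    expected_phrase="refund has been approved",
    required_tool="check_refund_policy",
    required_args={"order_id": "A-100"},
    evidence_phrase="30 days",
)
print(checks)
# {'public_output': True, 'tool_calls': False, 'evaluation_context': True}
```

An output-only eval would pass this run; checking tool_calls separately is what exposes the missing policy lookup.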
private_thought is still accepted as a compatibility alias, but new agents should use evaluation_context. ECP is not asking providers to expose raw chain-of-thought; it is asking agents to expose evaluator-safe evidence.
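One way a consumer might handle the alias is to normalize results before grading, as in the sketch below. The normalize_result helper is hypothetical, and letting evaluation_context win when both keys are present is an assumption; the Specification defines the actual precedence.

```python
def normalize_result(result):
    """Return a copy that uses evaluation_context instead of private_thought."""
    normalized = dict(result)  # leave the caller's dict untouched
    legacy = normalized.pop("private_thought", None)
    # Assumption: if both keys are present, the newer key takes precedence.
    if "evaluation_context" not in normalized and legacy is not None:
        normalized["evaluation_context"] = legacy
    return normalized


old_style = {"public_output": "Done.", "private_thought": "Checked the policy first."}
print(normalize_result(old_style))
```

Graders downstream of this step only need to read evaluation_context, regardless of which key the agent emitted.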
Developer Path
pip install "ecp-runtime==0.3.1" "ecp-sdk==0.3.1"
ecp init
ecp validate ecp_eval/manifest.yaml
ecp run --manifest ecp_eval/manifest.yaml --json
For a realistic example that shows why output-only evals are not enough:
ecp run --manifest examples/customer_support_demo/manifest.yaml --report report.html
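For CI, the commands above can be wrapped in a small gate script. The sketch below uses only the flags shown on this page (--manifest, --json, --report). It assumes that ecp run exits with a non-zero status when an eval fails and that --json and --report may be combined; neither is confirmed here. The build_command and run_gate helpers are hypothetical.

```python
import subprocess


def build_command(manifest, json_output=False, report=None):
    """Assemble an `ecp run` invocation from the flags shown above."""
    cmd = ["ecp", "run", "--manifest", manifest]
    if json_output:
        cmd.append("--json")
    if report is not None:
        cmd.extend(["--report", report])
    return cmd


def run_gate(cmd, runner=subprocess.run):
    """Run the command and return its exit code.

    Assumption: a non-zero exit code means the eval run failed.
    `runner` is injectable so the gate can be tested without the CLI.
    """
    completed = runner(cmd)
    return completed.returncode


print(build_command("ecp_eval/manifest.yaml", json_output=True))
```

In a pipeline step, calling sys.exit(run_gate(build_command("ecp_eval/manifest.yaml", json_output=True))) would pass the CLI's exit status through to the build.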
What Is In This Repo
- sdk/python/src/ecp - Python SDK for implementing ECP agents
- runtime/python/src/ecp_runtime - reference runtime and ecp CLI
- examples/customer_support_demo - flagship policy/tool-use demo
- examples/*_demo - framework integrations
- schema/ - JSON Schema contracts for manifests, agent results, tool calls, and reports
- spec/ and docs/spec.md - protocol specification
Go to Quickstart to run your first eval, Examples for integration patterns, or Specification to implement ECP in another runtime.