AI agent evaluation is evolving: it's no longer just about what an agent outputs, but how it gets there. In Part 3 of this community webinar series with Arize AI, we will dive into advanced agent evaluation techniques, including path-based reasoning, convergence analysis, and even using agents to evaluate other agents.
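To make convergence analysis concrete, here is a minimal sketch of one common formulation: run the agent several times on variants of the same task, count the steps in each trace, and score convergence as the shortest observed path length divided by the average path length, so a score near 1.0 means the agent reliably finds an efficient path. The `AgentTrace` type and `convergence_score` function are hypothetical illustrations, not an Arize API.

```python
from dataclasses import dataclass, field
from statistics import mean

# Hypothetical record of one agent run: the ordered steps it took.
@dataclass
class AgentTrace:
    question: str
    steps: list[str] = field(default_factory=list)

def convergence_score(traces: list[AgentTrace]) -> float:
    """Illustrative convergence metric: shortest observed path length
    divided by the average path length across runs of similar queries.
    Near 1.0 means the agent consistently converges on an efficient
    path; lower scores indicate wandering or redundant runs."""
    lengths = [len(t.steps) for t in traces if t.steps]
    if not lengths:
        return 0.0
    return min(lengths) / mean(lengths)

# Example: three runs of similar tasks with varying path lengths.
traces = [
    AgentTrace("refund order #123", ["lookup_order", "check_policy", "issue_refund"]),
    AgentTrace("refund order #456", ["lookup_order", "check_policy", "issue_refund"]),
    AgentTrace("refund order #789", ["lookup_order", "search_docs", "lookup_order",
                                     "check_policy", "issue_refund"]),
]
print(f"convergence: {convergence_score(traces):.2f}")  # 3 / mean(3, 3, 5) ≈ 0.82
```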
Explore how to measure the efficiency and structure of agent reasoning paths, assess collaboration in multi-agent systems, and evaluate the quality of planning in complex setups like hierarchical or crew-based frameworks. You will also get a look at emerging techniques like self-evaluation, peer review, and agent-as-judge models, in which agents critique and improve each other in real time.
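The agent-as-judge pattern can also be sketched briefly: a judge agent receives another agent's reasoning trace and returns a structured critique the target agent can act on. The `call_llm` function below is a hypothetical stand-in for any chat-completion client, and the rubric is illustrative, not the exact prompt covered in the session.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for your chat-completion client."""
    raise NotImplementedError("wire this to your LLM provider")

JUDGE_PROMPT = """You are an evaluator agent. Review another agent's reasoning trace.
Score each criterion from 1-5 and explain briefly. Reply as JSON with keys:
"efficiency", "tool_use", "goal_adherence", "feedback".

Trace:
{trace}
"""

def judge_trace(trace: str) -> dict:
    """Agent-as-judge: one agent critiques another agent's trace,
    returning scores and feedback the judged agent can use to revise."""
    raw = call_llm(JUDGE_PROMPT.format(trace=trace))
    return json.loads(raw)  # assumes the judge replies in the requested JSON format

# Example trace from a worker agent (abbreviated):
trace = "1. search_docs('refund policy')\n2. lookup_order(123)\n3. issue_refund(123)"
# verdict = judge_trace(trace)  # e.g. {"efficiency": 4, ..., "feedback": "..."}
```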
What We Will Cover:
- Measuring the efficiency and structure of agent reasoning paths with path-based and convergence metrics
- Assessing collaboration and planning quality in multi-agent setups, including hierarchical and crew-based frameworks
- Emerging techniques: self-evaluation, peer review, and agent-as-judge models
Missed the earlier parts? Catch up on Part 1 and Part 2 of the series!
Head of Developer Relations at Arize AI