8 posts tagged with "agents"
Agent Optimization Pipeline
Build a tool-calling agent, evaluate it with domain-specific judges, align those judges to expert feedback, and optimize the system prompt with GEPA.
Tags: evaluation, optimization, agents, prompts
Tracing and Evaluating a LangGraph Agent
Build a tool-calling travel planning agent with LangGraph, trace every step with MLflow, and evaluate tool selection accuracy.
Tags: agents, tracing, evaluation, langgraph
Evaluating a Multi-Turn Conversational Agent
Evaluate multi-turn customer support chat quality with MLflow's conversational scorers.
Tags: agents, evaluation, multi-turn
Tracing and Evaluating OpenAI Agents
Build an e-commerce agent with OpenAI function calling, trace it with MLflow, and evaluate tool selection accuracy.
Tags: agents, tracing, evaluation, openai
Evaluating Databricks Genie Spaces
A complete pipeline for tracing, evaluating, and improving a Databricks Genie space using MLflow.
Tags: databricks, genie, evaluation, tracing, agents
Genie Space Improvement Generator
Take traces that failed evaluation, combine them with your Genie space config, and generate copy-paste-ready fixes with an LLM.
Tags: databricks, genie, evaluation, agents
Genie Evaluation with LLM Judges
Score Genie traces with built-in and custom judges to find quality issues in responses and SQL generation.
Tags: databricks, genie, evaluation, agents
Genie Conversation Tracing Pipeline
Pull conversations from a Genie space and log each one as an MLflow trace for inspection and evaluation.
Tags: databricks, genie, tracing, agents