MLflow

8 posts tagged with "agents"

Agent Optimization Pipeline

Build a tool-calling agent, evaluate it with domain-specific judges, align those judges to expert feedback, and optimize the system prompt with GEPA.

evaluation, optimization, agents, prompts

Tracing and Evaluating a LangGraph Agent

Build a tool-calling travel planning agent with LangGraph, trace every step with MLflow, and evaluate tool selection accuracy.

agents, tracing, evaluation, langgraph

Evaluating a Multi-Turn Conversational Agent

Evaluate multi-turn customer support chat quality with MLflow's conversational scorers.

agents, evaluation, multi-turn

Tracing and Evaluating OpenAI Agents

Build an e-commerce agent with OpenAI function calling, trace it with MLflow, and evaluate tool selection accuracy.

agents, tracing, evaluation, openai

Evaluating Databricks Genie Spaces

A complete pipeline for tracing, evaluating, and improving a Databricks Genie space using MLflow.

databricks, genie, evaluation, tracing, agents

Genie Space Improvement Generator

Take traces that failed evaluation, combine them with your Genie space config, and generate copy-paste-ready fixes with an LLM.

databricks, genie, evaluation, agents

Genie Evaluation with LLM Judges

Score Genie traces with built-in and custom judges to find quality issues in responses and SQL generation.

databricks, genie, evaluation, agents

Genie Conversation Tracing Pipeline

Pull conversations from a Genie space and log each one as an MLflow trace for inspection and evaluation.

databricks, genie, tracing, agents