Evaluation & Analysis Guides
============================

**Problem-solving guides** for running experiments and evaluating LLM outputs in HoneyHive.

.. tip::
   **New to experiments?** Start with the :doc:`../../tutorials/05-run-first-experiment` tutorial.
   It walks you through running your first experiment with evaluators in 15 minutes!

Overview
--------

Experiments in HoneyHive help you systematically test and improve AI applications. These guides show you how to solve specific evaluation challenges.

**What You Can Do:**

- Run experiments with the ``evaluate()`` function
- Create custom evaluators to measure quality
- Compare experiments to track improvements
- Manage datasets for systematic testing
- Evaluate multi-step pipelines and agents
- Analyze results to identify patterns
- Apply best practices for reliable evaluation
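
As a rough illustration of how these pieces fit together, the sketch below runs a function over a small dataset and scores each output with a custom evaluator. It is plain Python and does not call the HoneyHive SDK: ``run_experiment``, ``answer_question``, ``exact_match``, and the datapoint layout are hypothetical names chosen for this example, not the signature of ``evaluate()``. See the guides for the real API.

.. code-block:: python

   from statistics import mean


   def answer_question(inputs):
       """Hypothetical stand-in for the application under test."""
       return inputs["question"].strip().upper()


   def exact_match(output, ground_truth):
       """Custom evaluator: 1.0 if the output equals the expected value, else 0.0."""
       return 1.0 if output == ground_truth else 0.0


   def run_experiment(function, dataset, evaluators):
       """Illustrative helper (not the HoneyHive API): run and score every datapoint."""
       rows = []
       for datapoint in dataset:
           output = function(datapoint["inputs"])
           scores = {
               ev.__name__: ev(output, datapoint["ground_truth"])
               for ev in evaluators
           }
           rows.append({"output": output, "scores": scores})
       # Average each evaluator's score across the whole dataset.
       summary = {
           ev.__name__: mean(row["scores"][ev.__name__] for row in rows)
           for ev in evaluators
       }
       return rows, summary


   # Assumed datapoint shape for this sketch: inputs plus an expected value.
   dataset = [
       {"inputs": {"question": "ping"}, "ground_truth": "PING"},
       {"inputs": {"question": " pong "}, "ground_truth": "PONG"},
       {"inputs": {"question": "foo"}, "ground_truth": "bar"},
   ]

   rows, summary = run_experiment(answer_question, dataset, [exact_match])
   print(summary)

Keeping per-datapoint scores alongside the aggregate makes it possible both to compare runs at a glance and to inspect individual failures.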

See the guides below for specific evaluation scenarios.

.. toctree::
   :maxdepth: 1
   :caption: Experiments & Evaluation

   running-experiments
   creating-evaluators
   comparing-experiments
   dataset-management
   dataset-crud
   server-side-evaluators
   multi-step-experiments
   result-analysis
   best-practices
   troubleshooting
