This repository contains the code for the LLM Apps: Evaluation course.
Learn to build reliable evaluation pipelines for LLM applications by combining programmatic checks with LLM-based judges. Develop techniques for automated evaluation, from writing effective criteria to aligning automated scores with human judgment.
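
As a taste of what "combining programmatic checks with LLM-based judges" means, here is a minimal sketch. It is not the course's actual pipeline: the `openai` client, the model name, and the PASS/FAIL prompt are illustrative assumptions, and the course may use different tooling.

```python
# Minimal sketch: combine a cheap programmatic check with an LLM judge.
# Assumptions (not from this README): the `openai` library and "gpt-4o-mini"
# are placeholder choices, not necessarily what the course uses.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def programmatic_check(output: str, required_keyword: str) -> bool:
    """Deterministic check: does the output mention a required keyword?"""
    return required_keyword.lower() in output.lower()

def llm_judge(question: str, output: str) -> bool:
    """LLM-based judge: ask a model to grade the answer against criteria."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": (
                "You are an evaluator. Reply with only PASS or FAIL.\n"
                f"Question: {question}\nAnswer: {output}\n"
                "Does the answer address the question accurately and concisely?"
            ),
        }],
    )
    return "PASS" in resp.choices[0].message.content.upper()

question = "What does W&B stand for?"
output = "W&B stands for Weights & Biases."
# Combine both signals: the fast check gates, the judge scores quality.
passed = programmatic_check(output, "Weights") and llm_judge(question, output)
print("PASS" if passed else "FAIL")
```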
For more LLM, MLOps, and W&B platform courses, visit AI Academy.
- Create a new conda environment using the provided `requirements.txt`:

  ```bash
  conda create --name eval-course --file requirements.txt
  ```
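- After the environment is created, activate it before running the course code (a standard conda step, not stated above; the name matches the command in the previous step):

  ```bash
  conda activate eval-course
  ```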