Creating a LLM-as-a-Judge that drives business results

from blog Simon Willison's Weblog, | ↗ original
Creating a LLM-as-a-Judge that drives business results Hamel Husain's sequel to Your AI product needs evals. This is packed with hard-won actionable advice. Hamel warns against using scores on a 1-5 scale, instead promoting an alternative he calls "Critique Shadowing". Find a domain expert (one is better than many, because you want to keep their...