LLM search relevance judge that double-checks its work

From Doug Turnbull's blog
Expanding on my previous post, I show the impact of checking both directions in pairwise evaluation.
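
As a rough illustration of what "checking both directions" could mean, here is a minimal sketch, not the post's actual implementation. It assumes a judge that is shown a query and two results and answers `"LHS"`, `"RHS"`, or `"Neither"`. The names `double_checked_preference`, `always_left`, and `keyword_judge` are hypothetical, and the two toy judges stand in for a real LLM call. The idea is to ask the same question twice with the results swapped and keep the preference only when both answers point at the same result.

```python
from typing import Callable, Optional

# Hypothetical judge signature: (query, left_item, right_item) -> "LHS" | "RHS" | "Neither".
# In a real setup this would wrap an LLM prompt; here it is just a plain function.
Judge = Callable[[str, str, str], str]


def double_checked_preference(judge: Judge, query: str, a: str, b: str) -> Optional[str]:
    """Return the preferred item only if the judge agrees in both orderings."""
    forward = judge(query, a, b)   # a is shown on the left
    backward = judge(query, b, a)  # b is shown on the left

    # Translate each positional answer back to the item it refers to.
    forward_pick = {"LHS": a, "RHS": b}.get(forward)
    backward_pick = {"LHS": b, "RHS": a}.get(backward)

    if forward_pick is not None and forward_pick == backward_pick:
        return forward_pick
    return None  # undecided, or the answer flipped when the order flipped


def always_left(query: str, left: str, right: str) -> str:
    """Toy judge with pure position bias: it always prefers the left item."""
    return "LHS"


def keyword_judge(query: str, left: str, right: str) -> str:
    """Toy judge that prefers the item sharing more words with the query."""
    words = set(query.lower().split())
    left_score = len(words & set(left.lower().split()))
    right_score = len(words & set(right.lower().split()))
    if left_score > right_score:
        return "LHS"
    if right_score > left_score:
        return "RHS"
    return "Neither"


if __name__ == "__main__":
    print(double_checked_preference(always_left, "red shoes", "red sneakers", "blue hat"))
    print(double_checked_preference(keyword_judge, "red shoes", "red sneakers", "blue hat"))
```

Under these assumptions, a judge that only ever favors one position yields no preference, because its answer changes when the order changes, while a judge whose answer follows the content survives the swap.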