Quantization matters

From Simon Willison's Weblog
What impact does quantization have on the performance of an LLM? I've been wondering about this for quite a while, and now here are numbers from Paul Gauthier. He ran differently quantized versions of Qwen 2.5 32B Instruct through his Aider code editing benchmark and saw a range of scores. The original released weights (BF16) scored...