ngrok
ngrokHQ
Quantization can make an LLM 4x smaller and 2x faster, with barely any quality loss. But what is it? @samwhoo crafted a beautiful interactive essay explaining it from first principles, aimed at coders, not mathematicians. ngrok.com/blog/quantizat…
Posted Mar 25, 2026 at 4:35PM
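The "4x smaller" claim follows from storing weights as 8-bit integers instead of 32-bit floats. A minimal sketch of one common scheme, absmax int8 quantization (an assumption for illustration; the linked essay may use a different scheme), using NumPy:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map fp32 weights to int8 by scaling the largest magnitude to 127."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate fp32 weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(w.nbytes, q.nbytes)  # int8 buffer is 4x smaller than fp32
print(np.max(np.abs(w - w_hat)))  # rounding error is at most scale / 2
```

Each value loses at most half a quantization step (`scale / 2`) of precision, which is why quality barely drops for well-behaved weight distributions.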