models

Quantization

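A technique that reduces the memory and compute footprint of an AI model by representing its weights (and sometimes its activations) with lower numerical precision, for example 8-bit integers instead of 32-bit floating-point values. The smaller model is faster and cheaper to run, usually at the cost of a small loss in accuracy.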
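
As a rough illustration of the idea, the sketch below applies symmetric per-tensor 8-bit quantization to a short list of weights in plain Python. The function names `quantize_int8` and `dequantize` and the sample values are hypothetical, chosen for this example only; real frameworks use more elaborate schemes (per-channel scales, calibration, and so on).

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in quantized]


# Hypothetical sample weights, for illustration only.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The worst-case rounding error is about half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is now stored as a single small integer plus one shared scale factor, which is where the memory saving comes from. The value `0.003` collapses to zero, which shows the kind of precision that is given up in exchange.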

Last updated 2026-05-12