models

Distillation

A technique for training a smaller, faster model to mimic the behavior of a larger, more capable one — trading some performance for dramatically lower cost and latency.
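In the classic soft-label recipe (Hinton et al., 2015), the student is trained to match the teacher's full output distribution rather than just the hard labels. Below is a minimal sketch, assuming PyTorch; the temperature, loss weighting, and training-step structure are illustrative, not taken from any particular system.

```python
# Minimal soft-label distillation sketch, assuming PyTorch.
# Hyperparameters (temperature, alpha) are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term with ordinary cross-entropy.

    The KL term pushes the student's temperature-softened distribution
    toward the teacher's; the T**2 factor keeps its gradient scale
    comparable as the temperature changes.
    """
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(log_student, soft_targets, reduction="batchmean")
    ce = F.cross_entropy(student_logits, labels)
    return alpha * (temperature ** 2) * kl + (1 - alpha) * ce

def train_step(student, teacher, optimizer, inputs, labels):
    # The frozen teacher only provides targets; gradients flow
    # through the student alone.
    with torch.no_grad():
        teacher_logits = teacher(inputs)
    student_logits = student(inputs)
    loss = distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```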

Last updated 2026-05-12