Google has released the Quantization Aware Training (QAT) API, which enables developers to train and deploy models with the performance benefits of quantization (the process of mapping input values from a large set to output values in a smaller set) while retaining close to their original accuracy. The goal is to support the development of smaller, faster, and more efficient machine learning models that are well suited to run on off-the-shelf machines, such as those in small and medium-sized business environments where computational resources are at a premium.
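To make the idea of quantization concrete, here is a minimal sketch in plain Python of mapping floating-point values onto 8-bit integers and back. It is not the QAT API itself: the function names `quantize` and `dequantize`, the uniform affine scheme, and the sample weights are all assumptions chosen for illustration.

```python
def quantize(values, num_bits=8):
    """Map floats onto integers in [0, 2**num_bits - 1].

    Illustrative uniform affine quantization; not the QAT API.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Size of one integer step in float units (fall back to 1.0 if all values are 0).
    scale = (hi - lo) / (qmax - qmin) or 1.0
    # Integer that represents the float value 0.0.
    zero_point = round(qmin - lo / scale)
    quantized = [
        min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(q - zero_point) * scale for q in quantized]


# Hypothetical weights, used only to demonstrate the round trip.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q)
print(restored)
print(max_error)
```

Each restored value differs from the original by a small rounding error, which is the accuracy cost that quantization introduces. Quantization-aware training, as the name suggests, exposes the model to this kind of rounding during training so that it can learn to compensate.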