Intel® Low Precision Optimization Tool (Intel® LPOT)

Speed Up Inference Deployment without Sacrificing Accuracy

Features

Supports Automatic Accuracy-Driven Tuning Strategies

Implements unified low-precision inference APIs that auto-tune, generate, and deploy a low-precision inference model from a pretrained FP32 model, meeting production performance and accuracy goals.
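
For illustration, a minimal sketch of this flow using LPOT's Python entry points (`Quantization` and `common.Model` from `lpot.experimental`); the config path and model file below are hypothetical placeholders:

```python
from lpot.experimental import Quantization, common

# conf.yaml (hypothetical) declares the framework, calibration dataloader,
# tuning strategy, and accuracy criterion for the auto-tuning run.
quantizer = Quantization('./conf.yaml')

# Wrap a pretrained FP32 model; the path below is a placeholder.
quantizer.model = common.Model('./model_fp32.pb')

# Run accuracy-driven auto-tuning, then save the resulting
# low-precision model for deployment.
q_model = quantizer()
q_model.save('./model_int8')
```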

Optimizes for Performance, Model Size, and Memory Footprint

Supports FP32, BF16, Int8, and mixed precision on Intel platforms during tuning, covering both post-training quantization and quantization-aware training.
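
As an illustrative sketch, a hypothetical conf.yaml selecting the quantization approach and accuracy criterion could be generated as below; the field names follow LPOT's YAML schema, but the model name, approach choice, and thresholds are placeholder assumptions:

```python
# A hypothetical conf.yaml sketch written out from Python.
CONF_YAML = """
model:
  name: example_model
  framework: tensorflow                   # placeholder framework choice

quantization:
  approach: post_training_static_quant    # or quant_aware_training

tuning:
  accuracy_criterion:
    relative: 0.01                        # tolerate <=1% relative accuracy drop
  exit_policy:
    timeout: 0                            # 0 = tune until the criterion is met
"""

with open('conf.yaml', 'w') as f:
    f.write(CONF_YAML)
```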

Provides Easy Extension Capability 

Delivers an extensible API design to add new tuning strategies, framework backends, metrics, and objectives.
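
A hedged sketch of one such extension point: registering a user-defined metric with the tuner. The `MyAccuracy` class is a hypothetical example assuming LPOT's update/reset/result metric interface and the `common.Metric` wrapper:

```python
from lpot.experimental import Quantization, common

class MyAccuracy:
    """Hypothetical user-defined metric following the
    update/reset/result metric interface."""
    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, labels):
        # Accumulate statistics batch by batch during evaluation.
        for p, l in zip(preds, labels):
            self.correct += int(p == l)
            self.total += 1

    def reset(self):
        self.correct = 0
        self.total = 0

    def result(self):
        # Final score consumed by the tuning strategy's accuracy criterion.
        return self.correct / max(self.total, 1)

quantizer = Quantization('./conf.yaml')      # hypothetical config path
quantizer.metric = common.Metric(MyAccuracy)  # register the custom metric
```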