Paper list

This paper list is not comprehensive. Prior work is grouped into several categories based on my personal understanding.

Master Branch for Me

Deep learning with limited numerical precision. 2015 IBM
DoReFa-Net: Training low bit-width convolutional neural networks with low bit-width gradients. 2016
BNN: Binarized Neural Networks. NIPS2016
TWNs: Ternary weight networks. NIPS2016 ucas
XNOR-Net: ImageNet Classification using binary convolutional neural networks. ECCV2016 washington
Hardware-oriented approximation of convolutional neural networks. ICLR2016
Quantized convolutional neural networks for mobile devices. CVPR2016 nlpr

Flexpoint: an adaptive numerical format for efficient training of deep neural networks. 2017 intel
INQ: Incremental network quantization, towards lossless CNNs with low-precision weights. ICLR2017 intel labs china
TTQ: Trained ternary quantization. ICLR2017 stanford
WRPN: wide reduced-precision networks. 2017 Accelerator Architecture Lab, Intel
HWGQ: Deep Learning with Low Precision by Half-wave Gaussian Quantization. CVPR2017
A Survey of Model Compression and Acceleration for Deep Neural Networks. 2017
LP-SGD: Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent. ISCA2017
How to Train a Compact Binary Neural Network with High Accuracy? NLPR Microsoft

VNQ: Variational network quantization. ICLR2018
WAGE: Training and Inference with Integers in Deep Neural Networks. ICLR2018 oral tsinghua. Quantizes not only weights and activations but also errors and gradients.
Alternating multi-bit quantization for recurrent neural networks. ICLR2018 alibaba
Mixed Precision Training. ICLR2018 baidu. FP16 training.
Model Compression via distillation and quantization. ICLR2018 google
Quantized back-propagation: training binarized neural networks with quantized gradients. ICLR2018

Clip-Q: Deep network compression learning by In-Parallel Pruning Quantization. CVPR2018 SFU
ELQ: Explicit loss-error-aware quantization for low-bit deep neural networks. CVPR2018 intel tsinghua
Quantization and training of neural networks for efficient integer-arithmetic-only inference. CVPR2018 Google
TSQ: two-step quantization for low-bit neural networks. CVPR2018
SYQ: learning symmetric quantization for efficient deep neural networks. CVPR2018 xilinx
Towards Effective Low-bitwidth Convolutional Neural Networks. CVPR2018

LQ-NETs: learned quantization for highly accurate and compact deep neural networks. ECCV2018 Microsoft
Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved Representational capability and advanced training algorithm. ECCV2018 HKU
V-Quant: Value-aware quantization for training and inference of neural networks. ECCV2018 facebook

Heterogeneous Bitwidth Binarization in Convolutional Neural Networks. NIPS2018 microsoft
HAQ: Hardware-Aware automated quantization. NIPS workshop 2018 mit
Scalable methods for 8-bits training of neural networks. NIPS2018 intel

Synergy: Algorithm-hardware co-design for convnet accelerators on embedded FPGAs. 2018 UC Berkeley
Efficient Non-uniform quantizer for quantized neural network targeting Re-configurable hardware. 2018
HALP: High-Accuracy Low-Precision Training. 2018 stanford
PACT: parameterized clipping activation for quantized neural networks. 2018 IBM
QUENN: Quantization engine for low-power neural networks. CF18ACM
UNIQ: Uniform noise injection for non-uniform quantization of neural networks. 2018
Training competitive binary neural networks from scratch. 2018
A white-paper: Quantizing deep convolutional networks for efficient inference. 2018 google

==== ICLR

ACIQ: analytical clipping for integer quantization of neural networks. ICLR2019 Intel
Per-Tensor Fixed-point quantization of the back-propagation algorithm. ICLR2019
RQ: Relaxed Quantization for discretized NNs. ICLR2019

==== NIPS

Post training 4-bit quantization of convolution networks for rapid-deployment. NIPS 2019 AIPG, Intel

==== ICCV

DSQ: Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks. ICCV2019 SenseTime, Beihang

==== CVPR

FQN: Fully Quantized Network for Object Detection. CVPR2019
QIL: Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss. CVPR2019

==== Other

SAWB: Accurate and efficient 2-bit quantized neural networks. sysml2019
SQuantizer: Simultaneous Learning for Both Sparse and Low-precision Neural Networks. 2019 AIPG, Intel
Distributed Low Precision Training Without Mixed Precision. Oxford snowcloud.ai
Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers. ICT Cambricon. int8 for weights and activations, int16 for most of the gradients.
WAGEUBN: Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers. Arguably an advanced version of WAGE.

LSQ: Learned Step Size Quantization. ICLR2020
Mixed Precision DNNs: All you need is a good parametrization. ICLR2020 sony
HAWQv2: Hessian Aware trace-Weighted Quantization of Neural Networks
LLSQ: Learned Symmetric Quantization of Neural Networks for Low-precision Integer Hardware. ICLR2020 ICT

Fixed point quantization of deep convolutional networks. 2016
Training a binary weight object detector by knowledge transfer for autonomous driving. 2018
Low-bit Quantization of Neural Networks for Efficient Inference. 2019 huawei