Paper list¶
This paper list is not comprehensive; prior work is grouped into categories based on my personal understanding.
Master Branch for Me¶
2015-2016¶
- Deep learning with limited numerical precision. 2015 IBM
- DoReFa-Net: Training low bit-width convolutional neural networks with low bit-width gradients. 2016
- BNN: Binarized Neural Networks. NIPS2016
- TWNs: Ternary weight networks. NIPS2016 ucas
- XNOR-Net: ImageNet Classification using binary convolutional neural networks. ECCV2016 washington
- Hardware-oriented approximation of convolutional neural networks. ICLR2016
- Quantized convolutional neural networks for mobile devices. CVPR2016 nlpr
2017¶
- Flexpoint: an adaptive numerical format for efficient training of deep neural networks. 2017 intel
- INQ: Incremental network quantization, towards lossless CNNs with low-precision weights. ICLR2017 intel labs china
- TTQ: Trained ternary quantization. ICLR2017 stanford
- WRPN: wide reduced-precision networks. 2017 Accelerator Architecture Lab, Intel
- HWGQ: Deep Learning with Low Precision by Half-wave Gaussian Quantization. CVPR2017
- A Survey of Model Compression and Acceleration for Deep Neural Networks. 2017
- LP-SGD: Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent. ISCA2017
- How to Train a Compact Binary Neural Network with High Accuracy? NLPR, Microsoft
2018¶
ICLR2018¶
- VNQ: Variational network quantization. ICLR2018
- WAGE: Training and Inference with Integers in Deep Neural Networks. ICLR2018 oral tsinghua. Quantizes not only weights and activations but also errors and gradients.
- Alternating multi-bit quantization for recurrent neural networks. ICLR2018 alibaba
- Mixed Precision Training (FP16 training). ICLR2018 baidu
- Model Compression via distillation and quantization. ICLR2018 google
- Quantized back-propagation: training binarized neural networks with quantized gradients. ICLR2018
CVPR2018¶
- Clip-Q: Deep network compression learning by In-Parallel Pruning-Quantization. CVPR2018 SFU
- ELQ: Explicit loss-error-aware quantization for low-bit deep neural networks. CVPR2018 intel tsinghua
- Quantization and training of neural networks for efficient integer-arithmetic-only inference. CVPR2018 Google
- TSQ: two-step quantization for low-bit neural networks. CVPR2018
- SYQ: learning symmetric quantization for efficient deep neural networks. CVPR2018 xilinx
- Towards Effective Low-bitwidth Convolutional Neural Networks. CVPR2018
ECCV2018¶
- LQ-NETs: learned quantization for highly accurate and compact deep neural networks. ECCV2018 Microsoft
- Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved Representational capability and advanced training algorithm. ECCV2018 HKU
- V-Quant: Value-aware quantization for training and inference of neural networks. ECCV2018 facebook
NIPS2018¶
- Heterogeneous Bitwidth Binarization in Convolutional Neural Networks. NIPS2018 microsoft
- HAQ: Hardware-Aware automated quantization. NIPS workshop 2018 mit
- Scalable methods for 8-bits training of neural networks. NIPS2018 intel
AAAI2018¶
- From Hashing to CNNs: Training Binary Weight Networks via Hashing. AAAI2018 nlpr
Other¶
- Synergy: Algorithm-hardware co-design for convnet accelerators on embedded FPGAs. 2018 UC Berkeley
- Efficient Non-uniform quantizer for quantized neural network targeting Re-configurable hardware. 2018
- HALP: High-Accuracy Low-Precision Training. 2018 stanford
- PACT: parameterized clipping activation for quantized neural networks. 2018 IBM
- QUENN: Quantization engine for low-power neural networks. CF18 ACM
- UNIQ: Uniform noise injection for non-uniform quantization of neural networks. 2018
- Training competitive binary neural networks from scratch. 2018
- A white-paper: Quantizing deep convolutional networks for efficient inference. 2018 google
2019¶
ICLR2019¶
- ACIQ: analytical clipping for integer quantization of neural networks. ICLR2019 Intel
- Per-Tensor Fixed-point quantization of the back-propagation algorithm. ICLR2019
- RQ: Relaxed Quantization for discretized NNs. ICLR2019
NIPS2019¶
- Post training 4-bit quantization of convolution networks for rapid-deployment. NIPS 2019 AIPG, Intel
ICCV2019¶
- DSQ: Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks. ICCV2019 SenseTime, Beihang
CVPR2019¶
- FQN: Fully Quantized Network for Object Detection. CVPR2019
- QIL: Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss. CVPR2019
Other¶
- SAWB: Accurate and efficient 2-bit quantized neural networks. sysml2019
- SQuantizer: Simultaneous Learning for Both Sparse and Low-precision Neural Networks. 2019 AIPG, Intel
- Distributed Low Precision Training Without Mixed Precision. Oxford snowcloud.ai
- Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers. ICT, Cambricon. Uses int8 for weights and activations and int16 for most of the gradients.
- WAGEUBN: Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers. Arguably an advanced version of WAGE.
2020¶
- LSQ: Learned Step Size Quantization. ICLR2020
- Mixed Precision DNNs: All you need is a good parametrization. ICLR2020 sony
- HAWQv2: Hessian Aware trace-Weighted Quantization of Neural Networks
- LLSQ: Learned Symmetric Quantization of Neural Networks for Low-precision Integer Hardware. ICLR2020 ICT
Unclear for Me¶
- Fixed point quantization of deep convolutional networks. 2016
- Training a binary weight object detector by knowledge transfer for autonomous driving. 2018
- Low-bit Quantization of Neural Networks for Efficient Inference. 2019 huawei