Paper list

This paper list is not comprehensive. Prior work is grouped into categories according to my own understanding.

Master Branch for Me

2015-2016

  1. Deep learning with limited numerical precision. 2015 IBM
  2. DoReFa-Net: Training low bit-width convolutional neural networks with low bit-width gradients. 2016
  3. BNN: Binarized Neural Networks. NIPS2016
  4. TWNs: Ternary weight networks. NIPS2016 ucas
  5. XNOR-Net: ImageNet Classification using binary convolutional neural networks. ECCV2016 washington
  6. Hardware-oriented approximation of convolutional neural networks. ICLR2016
  7. Quantized convolutional neural networks for mobile devices. CVPR2016 nlpr

2017

  1. Flexpoint: an adaptive numerical format for efficient training of deep neural networks. 2017 intel
  2. INQ: Incremental network quantization, towards lossless CNNs with low-precision weights. ICLR2017 intel labs china
  3. TTQ: Trained ternary quantization. ICLR2017 stanford
  4. WRPN: wide reduced-precision networks. 2017 Accelerator Architecture Lab, Intel
  5. HWGQ: Deep Learning with Low Precision by Half-wave Gaussian Quantization. CVPR2017
  6. A Survey of Model Compression and Acceleration for Deep Neural Networks. 2017
  7. LP-SGD: Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent. ISCA2017
  8. How to Train a Compact Binary Neural Network with High Accuracy? NLPR Microsoft

2018

ICLR2018

  1. VNQ: Variational network quantization. ICLR2018
  2. WAGE: Training and Inference with Integers in Deep Neural Networks. ICLR2018 oral tsinghua. Quantizes not only weights and activations but also errors and gradients.
  3. Alternating multi-bit quantization for recurrent neural networks. ICLR2018 alibaba
  4. Mixed Precision Training. FP16 training ICLR2018 baidu
  5. Model Compression via distillation and quantization. ICLR2018 google
  6. Quantized back-propagation: training binarized neural networks with quantized gradients. ICLR2018

CVPR2018

  1. CLIP-Q: Deep network compression learning by In-Parallel Pruning-Quantization. CVPR2018 SFU
  2. ELQ: Explicit loss-error-aware quantization for low-bit deep neural networks. CVPR2018 intel tsinghua
  3. Quantization and training of neural networks for efficient integer-arithmetic-only inference. CVPR2018 Google
  4. TSQ: two-step quantization for low-bit neural networks. CVPR2018
  5. SYQ: learning symmetric quantization for efficient deep neural networks. CVPR2018 xilinx
  6. Towards Effective Low-bitwidth Convolutional Neural Networks. CVPR2018

ECCV2018

  1. LQ-NETs: learned quantization for highly accurate and compact deep neural networks. ECCV2018 Microsoft
  2. Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved Representational capability and advanced training algorithm. ECCV2018 HKU
  3. V-Quant: Value-aware quantization for training and inference of neural networks. ECCV2018 facebook

NIPS2018

  1. Heterogeneous Bitwidth Binarization in Convolutional Neural Networks. NIPS2018 microsoft
  2. HAQ: Hardware-Aware automated quantization. NIPS workshop 2018 mit
  3. Scalable methods for 8-bits training of neural networks. NIPS2018 intel

AAAI2018

  1. From Hashing to CNNs: Training Binary Weight Networks via Hashing. AAAI2018 nlpr

Other

  1. Synergy: Algorithm-hardware co-design for convnet accelerators on embedded FPGAs. 2018 UC Berkeley
  2. Efficient Non-uniform quantizer for quantized neural network targeting Re-configurable hardware. 2018
  3. HALP: High-Accuracy Low-Precision Training. 2018 stanford
  4. PACT: parameterized clipping activation for quantized neural networks. 2018 IBM
  5. QUENN: Quantization engine for low-power neural networks. CF18ACM
  6. UNIQ: Uniform noise injection for non-uniform quantization of neural networks. 2018
  7. Training competitive binary neural networks from scratch. 2018
  8. Quantizing deep convolutional networks for efficient inference: A whitepaper. 2018 google

2019

ICLR2019

  1. ACIQ: analytical clipping for integer quantization of neural networks. ICLR2019 Intel
  2. Per-Tensor Fixed-point quantization of the back-propagation algorithm. ICLR2019
  3. RQ: Relaxed Quantization for discretized NNs. ICLR2019

NIPS2019

  1. Post training 4-bit quantization of convolutional networks for rapid-deployment. NIPS2019 AIPG, Intel

ICCV2019

  1. DSQ: Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks. ICCV2019 SenseTime, Beihang

CVPR2019

  1. FQN: Fully Quantized Network for Object Detection. CVPR2019
  2. QIL: Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss. CVPR2019

Other

  1. SAWB: Accurate and efficient 2-bit quantized neural networks. sysml2019
  2. SQuantizer: Simultaneous Learning for Both Sparse and Low-precision Neural Networks. 2019 AIPG, Intel
  3. Distributed Low Precision Training Without Mixed Precision. Oxford snowcloud.ai
  4. Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers. ICT Cambricon. Uses int8 for weights and activations, int16 for most of the gradients.
  5. WAGEUBN: Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers. Arguably an advanced version of WAGE.

2020

  1. LSQ: Learned Step Size Quantization. ICLR2020
  2. Mixed Precision DNNs: All you need is a good parametrization. ICLR2020 sony
  3. HAWQv2: Hessian Aware trace-Weighted Quantization of Neural Networks
  4. LLSQ: Learned Symmetric Quantization of Neural Networks for Low-precision Integer Hardware. ICLR2020 ICT

Unclear to Me

  1. Fixed point quantization of deep convolutional networks. 2016
  2. Training a binary weight object detector by knowledge transfer for autonomous driving. 2018
  3. Low-bit Quantization of Neural Networks for Efficient Inference. 2019 huawei