CMALab - Computing and Memory Architecture Lab.

Deep learning software optimization

Neural networks are known to be redundant and can therefore be optimized to reduce computation while preserving output quality. We previously demonstrated the potential of such software optimization [ICLR2016]. Our current focus is quantization, which reduces the data bitwidth of both activations and weights in neural networks [CVPR2017].
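
As a rough illustration of what reducing data bitwidth means, the sketch below applies plain uniform (linear) quantization to a handful of weights. This is not the weighted entropy-based method of [CVPR2017]; it is a generic textbook scheme, and the function names and sample values are hypothetical, chosen only for this example.

```python
def quantize_uniform(values, bits):
    """Map each float to an integer code on 2**bits evenly spaced levels.

    Illustrative only: a generic uniform quantizer, not the method of [CVPR2017].
    """
    if bits < 1:
        raise ValueError("bits must be >= 1")
    lo, hi = min(values), max(values)
    steps = 2 ** bits - 1              # number of intervals between lo and hi
    if hi == lo:                       # degenerate case: all values identical
        return [0] * len(values), lo, 0.0
    scale = (hi - lo) / steps          # width of one quantization step
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical sample weights, quantized to 4 bits (16 levels).
weights = [-0.8, -0.1, 0.0, 0.3, 0.75]
codes, lo, scale = quantize_uniform(weights, bits=4)
approx = dequantize(codes, lo, scale)
max_err = max(abs(w - a) for w, a in zip(weights, approx))
print(codes, max_err)
```

Each weight is stored as a 4-bit code instead of a 32-bit float, and the reconstruction error of any value is bounded by half a quantization step. Lowering `bits` shrinks storage further at the cost of a larger step and therefore a larger error.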

[ICLR2016] Y. Kim, E. Park, S. Yoo, T. Lee, L. Yang, D. Shin, "Compression of Deep Convolutional Neural Networks for Fast and Low Power Applications," Proc. International Conference on Learning Representations (ICLR), May 2016.

[CVPR2017] E. Park, J. Ahn, S. Yoo, "Weighted Entropy-based Quantization for Deep Neural Networks," Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.