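System ID U0026-1308201815172600
Title (Chinese) 具自適應輸出範圍之雙通道二值化網路
Title (English) Dual Path Binary Neural Network with Adaptive Output Range
Institution National Cheng Kung University
Department (Chinese) 資訊工程學系
Department (English) Institute of Computer Science and Information Engineering
Academic year 106 (ROC calendar)
Semester 2
Year of publication 107 (ROC calendar, 2018)
Author (Chinese) 游輝亮
Author (English) Hui-Liang Yu
Student ID P76054096
Degree Master's
Language English
Number of pages 34
Oral defense committee Advisor: 陳培殷
Keywords (Chinese) 圖片分類 (image classification)  類神經網路 (neural network)  模型壓縮 (model compression)  二值化網路 (binary neural network)
Keywords (English) image classification  neural network  model compression  binary neural network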
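Abstract (translated from Chinese) In recent years, deep neural networks (DNNs) have achieved major breakthroughs in image recognition, semantic segmentation, speech recognition, and machine translation. Newer DNNs, however, typically involve a large number of parameters and heavy computation. AlexNet, for example, has more than 60 million parameters, uses 249 MB of memory in total, and needs 1.5 billion floating-point operations to classify a single image, so graphics processing units (GPUs) are needed to accelerate training and shorten inference time. Embedded devices such as mobile phones and Internet of Things (IoT) devices have only limited memory, battery power, and computing resources, so deploying DNNs on them is difficult, and applying models efficiently on embedded devices has become a popular research topic. In the field of model compression, the binary neural network is a very promising technique that combines low power consumption with low storage usage, but its prediction accuracy falls well short of that of full-precision networks. This thesis addresses that shortcoming and proposes a binary neural network with roughly the same storage usage but with prediction accuracy close to that of a full-precision network.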
Abstract (English) In recent years, deep neural networks (DNNs) have achieved state-of-the-art results in image recognition, semantic segmentation, and machine translation. However, powerful DNNs usually have a large number of parameters and require complex computation. For instance, AlexNet, the winner of the 2012 ImageNet classification challenge, has a model size of about 249 MB and 60 million parameters, and needs about 1.5 billion floating-point operations to classify a 224 x 224 image. GPU-based machines are therefore usually used to speed up training and shorten inference time. Embedded devices such as smartphones and Internet of Things devices, however, have only a small amount of memory, battery power, and computing resources, so it is difficult to deploy DNNs on them. In the field of model compression, the binary neural network (BNN) is a very promising method that offers low power consumption and low storage usage, but its prediction accuracy lags far behind that of full-precision networks. This thesis proposes a BNN that has about the same storage usage as other BNNs and prediction accuracy close to that of a full-precision network.
The proposed method has three characteristics. First, each convolution layer receives two input sources through a dual-path design. Second, the batch normalization output is rounded. Third, the output of each layer is adjusted by a trainable parameter.
The experiments show that our model is about the same size as other BNNs but achieves much higher prediction accuracy. On the CIFAR-10 dataset, the prediction accuracy is at least 2.85% higher than that of other BNNs and even better than that of a ternary network, with a loss of only 0.69% compared with the full-precision network. On the SVHN dataset, the prediction accuracy is at least 0.21% higher than that of other BNNs and even exceeds that of the full-precision network by 0.58%.
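The storage argument in the abstract can be illustrated with a minimal sketch. It is not the method proposed in the thesis: it assumes plain sign binarization (a weight of 0 maps to +1), 32-bit full-precision weights, and the round figure of 60 million parameters quoted above. The function names `binarize`, `pack_bits`, and `storage_mb` are hypothetical and introduced only for this example.

```python
def binarize(weights):
    """Map each real-valued weight to +1 or -1 by its sign (assumption: 0 maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def pack_bits(signs):
    """Pack +1/-1 values into bytes, one bit per weight (+1 -> bit 1, -1 -> bit 0)."""
    out = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s == 1:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def storage_mb(num_params, bits_per_param):
    """Storage in megabytes (10**6 bytes) for a parameter count and bit width."""
    return num_params * bits_per_param / 8 / 1e6


# Nine example weights need two bytes once packed at one bit each.
weights = [0.7, -1.2, 0.0, 0.3, -0.05, 2.1, -0.4, 0.9, -3.0]
signs = binarize(weights)
packed = pack_bits(signs)

# Assumed figures: 60 million parameters, 32-bit floats versus 1-bit weights.
full_mb = storage_mb(60_000_000, 32)
binary_mb = storage_mb(60_000_000, 1)
print(signs, len(packed), full_mb, binary_mb)
```

Under these assumptions, 60 million 32-bit weights take 240 MB, which is close to the roughly 249 MB quoted for AlexNet, while one bit per weight takes 7.5 MB, a 32-fold reduction before counting any per-layer parameters that remain in full precision.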
Table of contents Abstract (Chinese)
Abstract
Acknowledgments
Contents
Table Captions
Figure Captions
Chapter 1. Introduction
Chapter 2. Background
2.1 Neural network
2.2 Convolution layer
2.3 Fully Connected layer
2.4 Batch Normalization layer
2.5 Binary Connect
2.6 Binary Neural Network (BNN)
2.7 XNOR net
Chapter 3. Proposed Method
3.1 Dual path
3.2 The round function
3.3 1.5-bit method
Chapter 4. Experiments and Comparisons
4.1 Experiments configuration
4.2 Accuracy
4.3 Storage usage
4.4 Execution time
Chapter 5. Conclusion and Future Work
References
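  • Authorized for on-campus browsing and printing of the electronic full text; publicly available from 2023-08-12.
  • Authorized for off-campus browsing and printing of the electronic full text; publicly available from 2023-08-12.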
