

   The electronic thesis has not yet been authorized for public access; for the print copy, please check the library catalog.
(Note: if the record cannot be found, or the holding status shows "closed stacks, not public," the thesis is not in the stacks and cannot be accessed.)
System ID U0026-1112202010162100
Title (Chinese) 卷積神經網路模型壓縮與量化感知訓練
Title (English) Convolutional Neural Network Model Compression and Quantization-aware Training
Institution National Cheng Kung University
Department (Chinese) 電機工程學系
Department (English) Department of Electrical Engineering
Academic Year 109 (2020–21)
Semester 1
Year of Publication 109 (2020)
Author (Chinese) 賴昱安
Author (English) Yu-An Lai
Student ID N26061725
Degree Master's
Language Chinese
Pages 90
Committee Advisor: 郭致宏
Committee member: 陳中和
Committee member: 邱瀝毅
Keywords (Chinese) 卷積神經網路, 模型量化, 參數剪枝, 量化感知訓練
Keywords (English) Convolutional Neural Network, Model Quantization, Parameter Pruning, Quantization-aware Training
Subject Classification
Abstract (Chinese) Deep Convolutional Neural Networks (DCNNs) have achieved major advances in computer vision, speech recognition, natural language processing, and related fields. However, state-of-the-art DCNNs typically contain a large number of parameters and demand substantial computational resources, making their deployment on edge devices a major challenge.
This thesis proposes DCNN pruning and quantization methods that reduce model complexity and the resources required for computation. Parameter pruning reduces the number of parameters a DCNN needs, while the quantization scheme lets the DCNN perform convolutions with 8-bit integers during inference, which is more efficient than floating-point computation. With the proposed pruning and quantization methods, we reduce the parameter storage of the YOLOv3-tiny model by 89.75% and its bit operations by 97%, with only a 2.58% accuracy loss.
This thesis additionally adopts quantization-aware training, which quantizes the neural network during training while simultaneously simulating the multiply-accumulate behavior of analog circuits; this quantizes the model to low bit widths while adapting it to the nonlinear multiply-accumulate behavior of analog circuits. We design dedicated network training flows for the analog multiply-accumulate architectures proposed in [39] and [40].
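The 8-bit integer convolution scheme described in the abstract relies on mapping floating-point values to integers via a scale and zero point, in the style of integer-arithmetic-only inference [37]. A minimal per-tensor affine sketch in Python; the function names are illustrative, not from the thesis:

```python
def quantize_params(x, num_bits=8):
    """Affine (scale/zero-point) quantization of a list of floats to num_bits integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(x), max(x)
    lo, hi = min(lo, 0.0), max(hi, 0.0)      # the range must include real zero
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against an all-zero tensor
    zero_point = round(qmin - lo / scale)     # integer code that represents 0.0
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in x]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from integer codes."""
    return [(v - zero_point) * scale for v in q]
```

During inference, convolutions can then accumulate products of such integer codes, with the scales folded into a single requantization step at the output.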
Abstract (English) Deep Convolutional Neural Networks (DCNNs) have made significant progress in computer vision, speech recognition, natural language processing, and related fields. Advanced DCNNs usually contain large numbers of parameters and require huge computational resources, so deploying a DCNN on edge devices has become a major challenge.
This thesis proposes DCNN pruning and quantization methods that reduce model complexity and computation costs. Parameter pruning is used to reduce the number of parameters in the DCNN, and the quantization scheme enables the DCNN to perform convolution operations with 8-bit integers during inference, which is more efficient than floating-point inference. With the proposed pruning and quantization methods, we reduce model parameter storage by 89.75% and bit operations by 97% on the YOLOv3-tiny model, with only a 2.58% mAP loss.
This thesis also adopts a quantization-aware training technique that quantizes the neural network during training while simultaneously simulating the computational behavior of analog circuits. We design dedicated network training flows based on the analog multiply-accumulate (MAC) architectures proposed in [39] and [40].
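Quantization-aware training of the kind described above commonly applies "fake quantization" (quantize-then-dequantize) in the forward pass and the straight-through estimator [41] in the backward pass. A minimal scalar sketch in Python, assuming a uniform quantizer on a fixed range; the toy loss and names are illustrative, not the thesis's training flow:

```python
def fake_quantize(w, num_bits=8, w_min=-1.0, w_max=1.0):
    """Quantize w uniformly to 2**num_bits - 1 steps on [w_min, w_max], then dequantize."""
    scale = (w_max - w_min) / (2 ** num_bits - 1)
    w_clipped = max(w_min, min(w_max, w))
    return w_min + round((w_clipped - w_min) / scale) * scale

# Toy QAT loop: the loss is evaluated on the quantized value w_hat, but the
# update is applied to the full-precision shadow weight w, treating the
# rounding op as the identity (the straight-through estimator).
w, target, lr = 0.3, -0.37, 0.5
for _ in range(50):
    w_hat = fake_quantize(w, num_bits=4)   # forward: network sees quantized weight
    grad = 2.0 * (w_hat - target)          # d/dw_hat of the loss (w_hat - target)**2
    w -= lr * grad                         # backward: STE assumes d w_hat / d w = 1
```

In a full DCNN the same quantize-dequantize op wraps each weight tensor and activation; for the analog MAC circuits of [39] and [40], the forward pass would additionally model the circuit's nonlinear multiply-accumulate behavior.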
Contents Abstract (Chinese) I
Acknowledgments XII
Table of Contents XIII
List of Tables XV
List of Figures XVII
Chapter 1 Introduction 1
1-1 Preface 1
1-2 Motivation 1
1-3 Contributions 2
1-4 Thesis Organization 3
Chapter 2 Background 4
2-1 Neural Networks 4
2-2 The YOLO Object Detection Network 8
2-3 Neural Network Compression 11
2-4 Analog-Signal Computation Neural Networks 12
Chapter 3 Literature Review 13
3-1 Parameter Pruning 13
3-2 Model Quantization 19
Chapter 4 Convolutional Neural Network Compression and Quantization-Aware Training 23
4-1 Filter Pruning Based on Sensitivity Analysis 23
4-2 Filter Pruning Based on Variational Inference 35
4-3 INT8-16-8 Convolutional Neural Network Quantization 42
4-4 Quantization-Aware Training for Mixed-Signal Computation Neural Networks 52
4-5 Quantization-Aware Training for Computing-in-Memory (CIM) [40] 59
Chapter 5 Experimental Environment and Data Analysis 65
5-1 Experimental Environment 65
5-2 Results of Sensitivity-Analysis-Based Pruning 65
5-3 Results of Variational-Inference-Based Pruning 69
5-4 INT8-16-8 Network Quantization 77
5-5 Results of Mixed-Signal Quantization-Aware Training 81
5-6 Results of CIM Quantization-Aware Training 82
Chapter 6 Conclusion and Future Work 83
6-1 Conclusion 83
6-2 Future Work 83
References 85

References [1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[2] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Communications of the ACM, vol. 60, no. 6, pp. 84–90, 2017.
[3] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, IEEE, 2009.
[4] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
[5] F. Rosenblatt, “The perceptron: a probabilistic model for information storage and organization in the brain,” Psychological Review, vol. 65, no. 6, p. 386, 1958.
[6] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788, 2016.
[7] J. Redmon and A. Farhadi, “YOLO9000: better, faster, stronger,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7263–7271, 2017.
[8] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 580–587, 2014.
[9] R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE international conference on computer vision, pp. 1440–1448, 2015.
[10] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” in Advances in neural information processing systems, pp. 91–99, 2015.
[11] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” in European conference on computer vision, pp. 21–37, Springer, 2016.
[12] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2117–2125, 2017.
[13] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015.
[14] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, “Inception-v4, Inception-ResNet and the impact of residual connections on learning,” arXiv preprint arXiv:1602.07261, 2016.
[15] B. Wu, F. Iandola, P. H. Jin, and K. Keutzer, “Squeezedet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 129–137, 2017.
[16] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1251–1258, 2017.
[17] E. A. Vittoz, “Future of analog in the VLSI environment,” in IEEE International Symposium on Circuits and Systems, pp. 1372–1375, IEEE, 1990.
[18] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, “EIE: efficient inference engine on compressed deep neural network,” ACM SIGARCH Computer Architecture News, vol. 44, no. 3, pp. 243–254, 2016.
[19] S. Han, J. Pool, J. Tran, and W. Dally, “Learning both weights and connections for efficient neural network,” in Advances in neural information processing systems, pp. 1135–1143, 2015.
[20] Y. Guo, A. Yao, and Y. Chen, “Dynamic network surgery for efficient DNNs,” in Advances in neural information processing systems, pp. 1379–1387, 2016.
[21] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” The journal of machine learning research, vol. 15, no. 1, pp. 1929–1958, 2014.
[22] A. N. Gomez, I. Zhang, K. Swersky, Y. Gal, and G. E. Hinton, “Targeted dropout,” in 2018 CDNNRIA Workshop at the 32nd Conference on Neural Information Processing Systems. NeurIPS, 2018.
[23] W. Wen, C. Wu, Y. Wang, Y. Chen, and H. Li, “Learning structured sparsity in deep neural networks,” in Advances in neural information processing systems, pp. 2074–2082, 2016.
[24] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf, “Pruning filters for efficient ConvNets,” arXiv preprint arXiv:1608.08710, 2016.
[25] Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang, “Learning efficient convolutional networks through network slimming,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 2736–2744, 2017.
[26] J.-H. Luo and J. Wu, “An entropy-based pruning method for CNN compression,” arXiv preprint arXiv:1706.05791, 2017.
[27] H. Hu, R. Peng, Y.-W. Tai, and C.-K. Tang, “Network trimming: A data driven neuron pruning approach towards efficient deep architectures,” arXiv preprint arXiv:1607.03250, 2016.
[28] J.-H. Luo, J. Wu, and W. Lin, “Thinet: A filter level pruning method for deep neural network compression,” in Proceedings of the IEEE international conference on computer vision, pp. 5058–5066, 2017.
[29] Y. He, P. Liu, Z. Wang, Z. Hu, and Y. Yang, “Filter pruning via geometric median for deep convolutional neural networks acceleration,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4340–4349, 2019.
[30] M. Lin, R. Ji, Y. Wang, Y. Zhang, B. Zhang, Y. Tian, and L. Shao, “Hrank: Filter pruning using high-rank feature map,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1529–1538, 2020.
[31] S. Han, H. Mao, and W. J. Dally, “Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding,” arXiv preprint arXiv:1510.00149, 2015.
[32] A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen, “Incremental network quantization: Towards lossless CNNs with low-precision weights,” arXiv preprint arXiv:1702.03044, 2017.
[33] M. Courbariaux, Y. Bengio, and J.-P. David, “Binaryconnect: Training deep neural networks with binary weights during propagations,” in Advances in neural information processing systems, pp. 3123–3131, 2015.
[34] F. Li, B. Zhang, and B. Liu, “Ternary weight networks,” arXiv preprint arXiv:1605.04711, 2016.
[35] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, “XNOR-Net: ImageNet classification using binary convolutional neural networks,” in European conference on computer vision, pp. 525–542, Springer, 2016.
[36] S. Migacz, “8-bit inference with TensorRT,” in GPU Technology Conference, 2017.
[37] B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko, “Quantization and training of neural networks for efficient integer-arithmetic-only inference,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2704–2713, 2018.
[38] D. P. Kingma, T. Salimans, and M. Welling, “Variational dropout and the local reparameterization trick,” in Advances in neural information processing systems, pp. 2575–2583, 2015.
[39] 林柏翰, “A mixed-signal neural network accelerator based on an 8-bit successive-approximation analog-to-digital converter operating at 100 MHz,” Master's thesis, Department of Electrical Engineering, National Cheng Kung University, 2020.
[40] 林育緯, “8T SRAM computing-in-memory for energy-efficient multi-bit convolutional neural network edge devices,” Master's thesis, Department of Electrical Engineering, National Cheng Kung University, 2020.
[41] Y. Bengio, N. Léonard, and A. Courville, “Estimating or propagating gradients through stochastic neurons for conditional computation,” arXiv preprint arXiv:1308.3432, 2013.
[42] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., “Pytorch: An imperative style, high-performance deep learning library,” in Advances in neural information processing systems, pp. 8026–8037, 2019.
[43] J. Redmon, “Darknet: Open source neural networks in C,” [Online]. Available: http://pjreddie.com/darknet/, 2013–2016.
[44] 曾微中, “Layer-wise fixed-point quantization for deep convolutional networks and implementation of a YOLOv3 inference engine,” Master's thesis, Institute of Computer and Communication Engineering, National Cheng Kung University, 2019.
[45] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, “The Pascal visual object classes challenge,” International journal of computer vision, vol. 88, no. 2, pp. 303–338, 2010.
[46] A. Krizhevsky, V. Nair, and G. Hinton, “CIFAR-10 (Canadian Institute for Advanced Research),” [Online]. Available: http://www.cs.toronto.edu/kriz/cifar.html, 2010.
Full-Text Access Permissions
  • The author agrees to authorize on-campus browsing/printing of the electronic full text, available from 2025-12-01.
  • The author agrees to authorize off-campus browsing/printing of the electronic full text, available from 2025-12-01.

