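System ID: U0026-2810201513542400
Thesis title (Chinese): 基於深度學習之靜態影像超解析度技術
Thesis title (English): Image Super Resolution Based on Deep Learning
University: National Cheng Kung University (成功大學)
Department (Chinese): 電機工程學系
Department (English): Department of Electrical Engineering
Academic year: 104 (ROC calendar)
Semester: 1
Year of publication: 104 (ROC calendar; 2015)
Author (Chinese): 郭柏宏
Author (English): Po-Hung Kuo
Student ID: N26024901
Degree: Master's
Language: Chinese
Pages: 61
Oral defense committee: Advisor: 郭致宏
Committee member: 雷曉方
Committee member: 張敏寬
Committee member: 葉家宏
Keywords (translated from Chinese): Super Resolution; Deep Learning; Convolutional Restricted Boltzmann Machine; Contrastive Divergence; Convolutional Neural Network; Backpropagation; Parallel Computing
Keywords (English): Super Resolution; Deep Learning; Convolutional Restricted Boltzmann Machine; Convolutional Neural Network; Parallel Computing
Subject classification: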
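Abstract (translated from Chinese): Deep learning is a class of artificial neural networks (ANNs) that aims to mimic the biological central nervous system, especially the brain, by learning hierarchical abstractions. Although deep learning is most commonly applied to classification and recognition, some studies have found that it also performs well in super resolution. In this thesis, we first implement super resolution with a single-layer convolutional restricted Boltzmann machine (CRBM) to explore the potential of the CRBM in this application. We then convert this architecture into a convolutional neural network (CNN) to obtain better super resolution results. Training artificial neural networks is time-consuming. Thanks to the strong general-purpose computing performance of modern graphics processing units (GPUs), training time can be shortened considerably. In this thesis, the training procedures of the CRBM and the CNN are highly parallelized for the GPU and implemented in OpenCL.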
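Deep learning methods can be divided by learning mode into supervised learning and unsupervised learning. In classification tasks, supervised learning trains with backpropagation and requires a large labeled training dataset, that is, inputs paired with their desired outputs. Labeled data are usually scarce, however, and most available data are unlabeled, so the network must first be pre-trained on unlabeled data by unsupervised learning. Unsupervised learning is usually carried out with a restricted Boltzmann machine (RBM), trained with contrastive divergence (CD) or a variant of it.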
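To make the contrastive divergence step above concrete, here is a minimal sketch of CD-1 for a small binary RBM in plain Python. It is not taken from the thesis: the function names (`cd1_update`, `train_rbm`), the learning rate, and the choice to use probabilities rather than sampled states in the reconstruction are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    # P(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W[i][j])
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    # P(v_i = 1 | h) = sigmoid(b_i + sum_j W[i][j] * h_j)
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single training vector v0; updates W, b, c in place."""
    ph0 = hidden_probs(v0, W, c)                    # positive phase
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]  # sample hidden states
    pv1 = visible_probs(h0, W, b)                   # one Gibbs step: reconstruction
    ph1 = hidden_probs(pv1, W, c)                   # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            # <v h>_data minus <v h>_reconstruction
            W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
    for i in range(len(b)):
        b[i] += lr * (v0[i] - pv1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    # Squared reconstruction error, useful only as a rough progress monitor.
    return sum((v0[i] - pv1[i]) ** 2 for i in range(len(b)))


def train_rbm(data, n_hidden, epochs=200, lr=0.1, seed=0):
    """Hypothetical training loop: unlabeled vectors in, RBM parameters out."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    errors = []
    for _ in range(epochs):
        errors.append(sum(cd1_update(v, W, b, c, lr, rng) for v in data))
    return W, b, c, errors
```

Calling `train_rbm(data, n_hidden=2)` on a list of binary vectors returns the weights, both bias vectors, and the per-epoch reconstruction error. No labels are involved at any point, which is what makes this kind of update usable for pre-training on unlabeled data.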
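An RBM takes its data in vector form. In image applications this ignores the two-dimensional relationships within an image, which led to the convolutional restricted Boltzmann machine. The CRBM uses weight sharing, so the original vector inner product becomes a convolution. This accounts for the two-dimensional structure of the image, makes the algorithm more efficient, and makes the network size flexible.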
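The effect of weight sharing can be illustrated with a short sketch in plain Python. It is an illustration only, not the thesis implementation: `conv2d_valid` and `dense_weight_count` are hypothetical helper names, and the sliding window is written in cross-correlation form (no kernel flip), which is an assumption about convention.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image ('valid' region, no padding).

    Each output value is the inner product of the same kernel with a
    different image patch, so the kernel weights are shared across positions.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            row.append(sum(image[r + dr][c + dc] * kernel[dr][dc]
                           for dr in range(kh) for dc in range(kw)))
        out.append(row)
    return out


def dense_weight_count(ih, iw, kh, kw):
    """Weights a fully connected layer would need for the same output size."""
    n_out = (ih - kh + 1) * (iw - kw + 1)
    return ih * iw * n_out


image = [[4 * r + c for c in range(4)] for r in range(4)]   # 4x4 test image
kernel = [[1, 0], [0, -1]]                                  # 4 shared weights
features = conv2d_valid(image, kernel)                      # 3x3 feature map
```

For the 4x4 image above, the 2x2 kernel carries 4 weights, while `dense_weight_count(4, 4, 2, 2)` gives 144 for a fully connected mapping to the same 3x3 output. The same kernel also applies unchanged to an image of any other size, which is the sense in which the network size becomes flexible.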
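Experimental results show that the super resolution quality of our CNN is quite close to that of sparse coding, while its execution time is only about 1/126 as long. After the training algorithms of the CRBM and the CNN are parallelized on the GPU, training runs 69 to 171 times faster.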
Abstract (English): We develop two super resolution methods based on different deep learning architectures: the first uses a convolutional restricted Boltzmann machine (CRBM), and the second uses a convolutional neural network (CNN). To accelerate training, we implement parallelized training algorithms on a GPU. Our experiments reveal that the super resolution performance of our methods is comparable to that of sparse coding, while processing is much faster.
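Table of contents
Abstract (Chinese)
Contents
List of figures
List of tables
Chapter 1 Introduction
1-1 Preface
1-2 Motivation
1-3 Contributions
1-4 Thesis organization
Chapter 2 Background and related work
2-1 Super resolution techniques
2-1-1 Still-image super resolution
2-1-2 Video super resolution
2-1-3 Temporal super resolution
2-1-4 Sparse representation
2-2 Deep learning
2-2-1 Restricted Boltzmann machine
2-2-2 Contrastive divergence
2-2-3 Deep belief network
2-2-4 Convolutional restricted Boltzmann machine
2-2-5 Convolutional neural network
2-2-6 Backpropagation
2-3 Literature review
2-3-1 Super resolution based on restricted Boltzmann machines
2-3-2 Super resolution based on deep belief networks
2-3-3 Super resolution based on convolutional neural networks
Chapter 3 Neural networks for image super resolution
3-1 Single-layer convolutional restricted Boltzmann machine for super resolution
3-2 Sparsity
3-3 Training algorithm of the convolutional restricted Boltzmann machine for image super resolution
3-4 Convolutional neural network for super resolution
3-5 Training algorithm of the convolutional neural network for image super resolution
Chapter 4 Deep learning system implementation and simulation results
4-1 OpenCL
4-1-1 Convolution
4-1-2 Summation
4-2 Experimental results
4-2-1 Convolutional restricted Boltzmann machine for image super resolution
4-2-2 Convolutional neural network for image super resolution
4-2-3 Comparison of CPU and GPU execution speed
Chapter 5 Conclusions and future work
5-1 Conclusions
5-2 Future work
References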
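Full-text access rights
  • On-campus browsing and printing of the electronic full text is authorized; publicly available from 2017-11-09.
  • Off-campus browsing and printing of the electronic full text is authorized; publicly available from 2017-11-09.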

