

The electronic full text has not yet been authorized for public release; for the print copy, please consult the library catalog.
(Note: if the thesis cannot be found, or its holding status shows "closed stacks, not public", the volume is not in the stacks and cannot be accessed.)
System ID: U0026-1808202016280300
Title (Chinese): 雙向引導擴散式神經網路應用於立體視覺影像匹配
Title (English): Dual-GDNet: Dual Guided-diffusion Network for Stereo Image Dense Matching
University: National Cheng Kung University (成功大學)
Department (Chinese): 測量及空間資訊學系
Department (English): Department of Geomatics
Academic year: 108 (2019–2020)
Semester: 2
Year of publication: 109 (2020)
Author (Chinese): 王瑞評
Author (English): Ruei-Ping Wang
Student ID: P66074086
Degree: Master's
Language: English
Pages: 71
Committee: Advisor - 林昭宏
Committee member - 張智安
Committee member - 蔡富安
Committee member - 曾義星
Keywords (Chinese): 立體視覺影像匹配; 深度卷積神經網路
Keywords (English): Stereo matching; Deep Convolutional Neural Network
Subject classification:
Abstract (Chinese): Stereo image dense matching is one of the key steps in 3D reconstruction, yet it remains a challenging task in photogrammetry and computer vision. Beyond window-based matching, recent machine-learning research has made great progress in dense matching through deep convolutional neural networks (DCNNs). This study proposes the dual guided-diffusion network (Dual-GDNet), which not only adopts the conventional left-to-right matching architecture in the network design and training, but also incorporates a right-to-left, left-right consistency check into training, with the aim of reducing the probability of mismatches. In addition, this study proposes suppressed regression, which estimates disparity by removing irrelevant probability information before disparity regression, preventing estimates that belong to no peak of a multi-peak probability distribution. The left-right consistency concept in Dual-GDNet can be applied to existing DCNN models to further improve disparity estimation. To evaluate the performance of each proposed network design, GANet was selected as the backbone and main baseline, and the Scene Flow and KITTI 2015 stereo datasets were adopted for training and evaluation. Experimental results demonstrate superior performance compared with related models in terms of end-point error (EPE), >1-pixel error rate, and top-2 error, with improvements of 2–10% on the Scene Flow dataset and 2–8% on the KITTI 2015 dataset.
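In classical stereo pipelines, the right-to-left consistency idea described in the abstract is realized as a left-right consistency check applied to a pair of disparity maps. The sketch below illustrates that classical check only; the function name, the threshold, and the x_right = x_left − d convention are assumptions for illustration, not the thesis's network-internal formulation:

```python
import numpy as np

def lr_consistency_mask(disp_left, disp_right, thresh=1.0):
    """Keep a left-image pixel only if the right-image disparity at its
    matched column agrees within `thresh` (convention: x_right = x_left - d)."""
    h, w = disp_left.shape
    xs = np.arange(w)
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        xr = np.round(xs - disp_left[y]).astype(int)  # matched columns in the right image
        valid = (xr >= 0) & (xr < w)                  # matches that fall inside the image
        diff = np.abs(disp_left[y, valid] - disp_right[y, xr[valid]])
        mask[y, valid] = diff <= thresh
    return mask

# Toy 1x6 pair: true disparity is 2 everywhere; one left pixel is corrupted.
dl = np.full((1, 6), 2.0)
dr = np.full((1, 6), 2.0)
dl[0, 4] = 4.0                                        # mismatched estimate
print(lr_consistency_mask(dl, dr)[0])
```

Pixels failing the check (typically occlusions or mismatches) are what such a test flags in post-processing; the thesis instead builds the consistency idea into the network design and training.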
Abstract (English): Stereo dense matching, which plays a key role in 3D reconstruction, remains a challenging task in photogrammetry and computer vision. In addition to block-based matching, recent studies based on machine learning have achieved great progress in stereo dense matching by using deep convolutional neural networks (DCNNs). In this thesis, a novel neural network called the dual guided-diffusion network (Dual-GDNet) is proposed, which utilizes not only left-to-right but also right-to-left image matching in the network design and training, with a consistency-enforcement process to reduce the possibility of mismatching. In addition, suppressed regression is proposed to refine disparity estimation by removing unrelated information before regression, preventing ambiguous predictions on multi-peak probability distributions. The proposed Dual-GDNet design can be applied to existing DCNN models for further improvement of disparity estimation. To evaluate the performance, GANet is selected as the backbone, and the model was evaluated on the Scene Flow and KITTI 2015 stereo datasets. Experimental results demonstrate the superior performance of the proposed model, compared with related models, in terms of end-point error, >1-pixel error rate, and top-2 error. Improvements of 2–10% on the Scene Flow and 2–8% on the KITTI 2015 datasets were obtained.
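The effect of suppressed regression on a multi-peak distribution can be illustrated numerically. The sketch below contrasts it with standard soft-argmin regression; the window-around-the-argmax masking rule and the `window` parameter are assumptions for illustration, since the abstract does not specify how unrelated probability mass is identified:

```python
import numpy as np

def soft_argmin(prob, disp):
    """Standard disparity regression: expectation over the full distribution."""
    return float(np.sum(prob * disp))

def suppressed_regression(prob, disp, window=2):
    """Zero out probability mass outside a small window around the strongest
    peak, renormalize, then regress (hypothetical masking rule)."""
    peak = int(np.argmax(prob))
    lo, hi = max(0, peak - window), min(len(prob), peak + window + 1)
    masked = np.zeros_like(prob)
    masked[lo:hi] = prob[lo:hi]
    masked /= masked.sum()
    return float(np.sum(masked * disp))

# Bimodal distribution over 64 disparity levels: peaks near 10 and 40.
disp = np.arange(64, dtype=np.float64)
prob = 0.45 * np.exp(-0.5 * ((disp - 10) / 1.5) ** 2) \
     + 0.55 * np.exp(-0.5 * ((disp - 40) / 1.5) ** 2)
prob /= prob.sum()

print(soft_argmin(prob, disp))            # falls between the peaks
print(suppressed_regression(prob, disp))  # stays on the dominant peak
```

On this bimodal example, plain soft-argmin regresses to roughly 26.5, a disparity lying on neither peak, while the suppressed variant stays on the dominant peak near 40.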
Table of Contents
Abstract (Chinese) - i
Abstract (English) - ii
Acknowledgements - iii
Table of Contents - iv
List of Tables - vi
List of Figures - vii
Chapter 1. Introduction - 1
Chapter 2. Related Work - 5
Chapter 3. Background - 7
Semi-Global Matching (SGM) - 7
Matching cost calculation - 7
Cost aggregation - 17
Disparity computation/optimization - 19
Disparity refinement - 19
GANet - 21
Feature extraction - 22
Cost volume construction - 24
Guidance network - 25
Semi-global aggregation (SGA) - 26
Local guided aggregation (LGA) - 29
Disparity regression - 29
Chapter 4. Methodology - 32
Dual-GDNet - 34
Visible and invisible pixels problem - 36
Probability distribution with higher generalizability - 38
Flipped training - 39
Guided Diffusion Layer (GD) - 44
Training in Cross Entropy with Softmax - 46
Suppressed Regression - 50
Chapter 5. Experimental Results - 53
Evaluation on Scene Flow Dataset - 54
Evaluation on KITTI 2015 Dataset - 56
Effect on Flipped Training - 64
Chapter 6. Conclusions - 68
References - 69
References
Atienza, R. (2018). Fast disparity estimation using dense networks. CoRR, abs/1805.07499. Retrieved from http://arxiv.org/abs/1805.07499

Žbontar, J., & LeCun, Y. (2015). Stereo matching by training a convolutional neural network to compare image patches. CoRR, abs/1510.05970. Retrieved from http://dblp.uni-trier.de/db/journals/corr/corr1510.html#ZbontarL15

Chang, J., & Chen, Y. (2018). Pyramid stereo matching network. CoRR, abs/1803.08669. Retrieved from http://arxiv.org/abs/1803.08669

Chen, J., & Yuan, C. (2016, 09). Convolutional neural network using multiscale information for stereo matching cost computation. In 2016 IEEE International Conference on Image Processing (ICIP) (pp. 3424–3428). doi: 10.1109/ICIP.2016.7532995

Cheng, F., He, X., & Zhang, H. (2017, 08). Learning to refine depth for robust stereo estimation. Pattern Recognition, 74. doi: 10.1016/j.patcog.2017.07.027

Cheng, X., Wang, P., & Yang, R. (2018). Learning depth with convolutional spatial propagation network. arXiv preprint arXiv:1810.02695

Godard, C., Aodha, O. M., & Brostow, G. J. (2016). Unsupervised monocular depth estimation with left-right consistency. CoRR, abs/1609.03677. Retrieved from http://arxiv.org/abs/1609.03677

He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. CoRR, abs/1512.03385. Retrieved from http://arxiv.org/abs/1512.03385

Hinton, G. E., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. ArXiv, abs/1503.02531

Hirschmüller, H. (2008). Stereo processing by semi-global matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 328–341

Huang, G., Liu, Z., & Weinberger, K. Q. (2016). Densely connected convolutional networks. CoRR, abs/1608.06993. Retrieved from http://arxiv.org/abs/1608.06993

Kang, J., Chen, L., Deng, F., & Heipke, C. (2019, 09). Context pyramidal network for stereo matching regularized by disparity gradients. ISPRS Journal of Photogrammetry and Remote Sensing, 157. doi:10.1016/j.isprsjprs.2019.09.012

Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., & Bry, A. (2017). End-to-end learning of geometry and context for deep stereo regression. CoRR, abs/1703.04309. Retrieved from http://arxiv.org/abs/1703.04309

Kingma, D., & Ba, J. (2014, 12). Adam: A method for stochastic optimization. International Conference on Learning Representations

Liu, S., De Mello, S., Gu, J., Zhong, G., Yang, M.H., & Kautz, J. (2017). Learning affinity via spatial propagation networks. In I. Guyon et al. (Eds.), Advances in neural information processing systems 30 (pp. 1520–1530). Curran Associates, Inc. Retrieved from http://papers.nips.cc/paper/6750-learning-affinity-via-spatial-propagation-networks.pdf

Luo, W., Schwing, A. G., & Urtasun, R. (2016). Efficient deep learning for stereo matching. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 5695–5703)

Mayer, N., Ilg, E., Häusser, P., Fischer, P., Cremers, D., Dosovitskiy, A., & Brox, T. (2015). A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. CoRR, abs/1512.02134. Retrieved from http://arxiv.org/abs/1512.02134

Menze, M., & Geiger, A. (2015). Object scene flow for autonomous vehicles. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3061–3070)

Mozerov, M., & Weijer, J. (2015, 01). Accurate stereo matching by two-step energy minimization. IEEE Transactions on Image Processing, 24. doi: 10.1109/TIP.2015.2395820

Pierrot-Deseilligny, M., & Paparoditis, N. (2006). A multiresolution and optimization-based image matching approach: An application to surface reconstruction from SPOT5-HRS stereo imagery. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(1/W41), 328–341

Scharstein, D., & Szeliski, R. (2002). A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47(1–3), 7–42

Seki, A., & Pollefeys, M. (2017). SGM-Nets: Semi-global matching with neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 6640–6649)

Wolf, P. R., & Dewitt, B. A. (2000). Elements of Photogrammetry: with applications in GIS. New York: McGraw-Hill, 30

Zhang, F., Prisacariu, V., Yang, R., & Torr, P. H. (2019). GA-Net: Guided aggregation net for end-to-end stereo matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 185–194)
Full-Text Use Authorization
  • On-campus browsing/printing of the electronic full text is authorized, publicly available from 2022-08-19.
  • Off-campus browsing/printing of the electronic full text is authorized, publicly available from 2022-08-19.

