System ID: U0026-1906201812051600
Title (Chinese): 以自動編碼器及生成對抗網路系統建置高解析三維點雲模型
Title (English): Reconstruction of high resolution 3D point cloud models based on Auto-encoder and Generative Adversarial Networks System
Institution: National Cheng Kung University
Department (Chinese): 工程科學系
Department (English): Department of Engineering Science
Academic year: 106
Semester: 2
Year of publication: 107 (2018)
Author (Chinese): 陳致佑
Author (English): Chih-Yu Chen
Student ID: N96054154
Degree: Master's
Language: Chinese
Pages: 54
Oral defense committee: Advisor - 黃悅民
Committee member - 李維聰
Committee member - 張傳育
Committee member - 黃顯詔
Committee member - 陳靜茹
Keywords (Chinese): 稀疏點雲; 稠密點雲; 自動編碼器; 生成對抗網路; 三維重建
Keywords (English): sparse point clouds; dense point clouds; autoencoder; GANs; 3D reconstruction
Subject classification:
Abstract (Chinese): With the rapid growth of robotics, smart cities, autonomous driving, augmented reality, virtual reality, and related fields, effectively acquiring three-dimensional spatial information from the real world has become a shared research goal across these areas. Reconstruction by visual or LiDAR scanning often suffers from occlusions and holes, and when the captured point cloud is sparse, its features are insufficient to reconstruct the 3D object effectively; considerably more time must then be spent gathering enough points to fill in the whole object and complete a higher-resolution reconstruction.

This thesis proposes a 3D generative system that reconstructs high-resolution dense 3D point clouds from sparse 3D point clouds using an end-to-end autoencoder and generative adversarial network (GAN) framework. We demonstrate the framework's unsupervised learning process on 3D point clouds and how it generates data. The key idea of the network architecture is to train an autoencoder jointly with cycle-consistent generative adversarial networks (CycleGANs). Each input sparse point cloud is produced by randomly sampling one thousand points from the ground-truth point cloud, and because the correspondence between the two domains is crucial during translation training, paired-sample training is used to learn the mapping between them effectively. The goal is for the sparse point clouds to learn the dense-style translation and be reconstructed into high-resolution dense point clouds.
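A minimal sketch of this paired-data construction, assuming NumPy; the function name and the dense-cloud size are illustrative, not taken from the thesis:

```python
import numpy as np

def make_sparse_input(dense_cloud, n_points=1000, seed=None):
    # Randomly pick n_points indices without replacement, mirroring the
    # thesis's construction of a paired sparse input from ground truth.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(dense_cloud), size=n_points, replace=False)
    return dense_cloud[idx]

# Toy usage: a dense cloud of 10,000 xyz points paired with a 1,000-point sparse input.
dense = np.random.rand(10000, 3).astype(np.float32)
sparse = make_sparse_input(dense, seed=0)
print(sparse.shape)  # (1000, 3)
```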

Finally, the evaluation shows that the proposed architecture learns the distribution translation of real 3D point clouds through cycle-consistent adversarial training, and that the dense-style translation can be learned even from a single training category: reconstruction of untrained categories also performs well, and the IoU metric confirms that the trained network generalizes broadly. Most importantly, from an input of only one thousand sparse 3D points, the system recovers broken surfaces and fine details, effectively reconstructing a high-resolution dense 3D point cloud.
Abstract (English): In this thesis, a 3D generative system is proposed that reconstructs the complete 3D structure of high-resolution point clouds from sparse point clouds using an end-to-end autoencoder and generative adversarial networks. We present the deep learning and data generation process of the 3D generative system. The key idea of the system is to combine an autoencoder with the cycle-consistent generative adversarial network (CycleGAN) framework.
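For reference, a minimal TensorFlow sketch of the cycle-consistency term that CycleGAN adds to the adversarial objective; the function and argument names are illustrative, and the weight lam = 10.0 follows the original CycleGAN paper [27] rather than this thesis:

```python
import tensorflow as tf

def cycle_consistency_loss(x, y, G_x2y, G_y2x, lam=10.0):
    # Forward cycle: sparse -> dense -> sparse should recover x.
    x_rec = G_y2x(G_x2y(x))
    # Backward cycle: dense -> sparse -> dense should recover y.
    y_rec = G_x2y(G_y2x(y))
    # L1 reconstruction error on both cycles, weighted by lambda.
    return lam * (tf.reduce_mean(tf.abs(x_rec - x)) +
                  tf.reduce_mean(tf.abs(y_rec - y)))
```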

The input sparse point clouds are derived by randomly sampling a thousand points from the ground-truth point clouds. A paired training approach is used, which lets the system learn the mapping between an input sparse point cloud and the corresponding output dense point cloud, since the relationship between the paired data is essential. The goal is to translate the dense style learned by the sparse point clouds into reconstructed high-resolution dense point clouds.
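The sparse-to-dense mapping is realized by an autoencoder generator (Section 4-4-1; the model name below echoes the "X2Y autoencoder" of Section 5-4). A hypothetical Keras sketch follows; all layer widths, the output point count, and activations are assumptions for illustration, not the thesis's actual architecture:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_generator(n_in=1000, n_out=4000, latent_dim=128):
    # Encoder: flatten the (n_in, 3) coordinates and compress to a latent code.
    inp = keras.Input(shape=(n_in, 3))
    h = layers.Flatten()(inp)
    h = layers.Dense(512, activation='relu')(h)
    z = layers.Dense(latent_dim, activation='relu')(h)  # bottleneck code
    # Decoder: expand the code back out to n_out xyz points.
    h = layers.Dense(512, activation='relu')(z)
    out = layers.Dense(n_out * 3, activation='tanh')(h)  # coords scaled to [-1, 1]
    out = layers.Reshape((n_out, 3))(out)
    return keras.Model(inp, out, name='G_x2y')

G = build_generator()
G.summary()
```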

Finally, the results show that the network learns to translate sparse point clouds to the dense distribution through CycleGAN training. A network trained on a single category can also be applied to other, untrained categories, and the results achieve strong scores on the IoU metric. In summary, the 3D generative system in this thesis can recover broken surfaces and details from a thousand sparse points, effectively reconstructing high-resolution dense point clouds.
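The IoU metric referenced here is typically computed on voxel occupancy grids [69]. A minimal sketch, assuming point clouds normalized to the unit cube and an assumed 32^3 grid resolution (the thesis's actual resolution is not stated in this record):

```python
import numpy as np

def voxelize(points, res=32):
    # Map xyz points assumed to lie in [0, 1]^3 onto a res^3 occupancy grid.
    grid = np.zeros((res, res, res), dtype=bool)
    idx = np.clip((points * res).astype(int), 0, res - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

def iou(pred_points, gt_points, res=32):
    # Intersection over Union between predicted and ground-truth grids.
    a, b = voxelize(pred_points, res), voxelize(gt_points, res)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0
```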
Table of Contents:
Abstract I
Extended Abstract II
Acknowledgements X
Table of Contents XI
List of Tables XIV
List of Figures XV
Chapter 1: Introduction 1
1-1 Research Motivation and Background 1
1-2 Research Objectives 4
1-3 Thesis Organization 4
Chapter 2: Literature Review 5
2-1 3D Spatial Sensing Technologies 5
2-1-1 Stereo Vision Algorithms 5
2-1-2 Time-of-Flight Methods 7
2-1-3 Structured-Light Methods 9
2-2 Geometry-Based 3D Reconstruction 12
2-3 Deep-Learning-Based 3D Reconstruction 14
2-3-1 Multi-View Reconstruction with Deep Learning 14
2-3-2 3D-Dataset Reconstruction with Deep Learning 15
Chapter 3: Development Platform and Environment 17
3-1 TensorFlow / Keras Deep Learning Development Environment 17
3-1-1 TensorFlow 17
3-1-2 Keras 18
3-2 NVIDIA Tesla K20 GPU 19
3-3 MeshLab 20
3-4 ModelNet Dataset 21
Chapter 4: System Design and Implementation 22
4-1 System Architecture and Workflow 22
4-2 Full Objective Function 24
4-3 Background and Mathematical Models 24
4-3-1 Autoencoder [22] 24
4-3-2 GAN Adversarial Loss 25
4-3-3 Cycle Consistency Loss 26
4-4 Network Architecture 27
4-4-1 Generator (Autoencoder) Architecture 27
4-4-2 Discriminator Architecture 28
4-5 Training Process 29
Chapter 5: Experiments and Results Analysis 31
5-1 Experiments on Tuning CycleGAN Training 31
5-2 IoU Evaluation Metric 33
5-2-1 IoU Comparison with the 3D-RecGAN Paper 33
5-2-2 Validation of Single-Category vs. All-Category Training 36
5-2-3 Discussion of All-Category Reconstruction 38
5-3 CycleGAN Generation Results from All-Category Training 40
5-4 Outputs of Sparse and Ground-Truth Point Clouds through the X2Y Autoencoder 41
5-5 MeshLab Point Cloud Visualization Results 42
Chapter 6: Conclusions and Future Work 45
References 46
Appendix 52
Appendix 1: Visualization Validation for All Categories 52
References:
[1]"Augmented reality", Wikipedia, 2018. [Online]. Available: https://en.wikipedia.org/wiki/Augmented_reality. [Accessed: Apr. 2018].
[2]"Virtual reality", Wikipedia, 2018. [Online]. Available: https://en.wikipedia.org/wiki/Virtual_reality. [Accessed: Apr. 2018].
[3]"105 年度發展無人飛行載具系統測繪作業", 內政部國土測繪中心, 2016. [Online]. Available: https://www.nlsc.gov.tw/uploadfile/272110.pdf. [Accessed: Apr. 2018].
[4]"傾斜攝影技術-傾斜攝影三維建模無人機", 每日頭條, 2017. [Online]. Available: https://kknews.cc/tech/a8vz89v.html. [Accessed: Apr. 2018].
[5]"光學雷達", Wikipedia, 2018. [Online]. Available: https://zh.wikipedia.org/zh-tw/%E5%85%89%E5%AD%A6%E9%9B%B7%E8%BE%BE. [Accessed: Apr. 2018].
[6]"Point cloud", Wikipedia, 2018. [Online]. Available: https://en.wikipedia.org/wiki/Point_cloud. [Accessed: Apr. 2018].
[7]"運用雷射雷達感測功能提升智慧汽車效能", 財團法人車輛研究測試中心, 2015. [Online]. Available: https://www.artc.org.tw/upfiles/ADUpload/knowledge/tw_knowledge_501525402.pdf. [Accessed: Apr. 2018].
[8]"特斯拉和Google,誰更靠近「完全自動駕駛」?", 數位時代, 2016. [Online]. Available: https://www.bnext.com.tw/article/42245/tesla-google-competition-of-full-self-driving-tech. [Accessed: Apr. 2018].
[9]"提供自動駕駛車輛使用的高精度地圖", Mercedes-benz Taiwan, 2017. [Online]. Available: https://www.mercedes-benz.com.tw/content/taiwan/mpc/mpc_taiwan_website/twng/home_mpc/passengercars/home/world/innovation/news/high_precision_maps.html. [Accessed: Apr. 2018].
[10]"AI帶動機器人商機爆發,LiveWorx大會五大趨勢", 數位時代, 2017. [Online]. Available: https://www.bnext.com.tw/article/44813/ai-5-robotics-trends-from-liveworx-2017. [Accessed: Apr. 2018].
[11]"掃地機器人成為智慧家庭發展焦點", 財團法人資訊工業策進會專家觀點, 2017. [Online]. Available: https://www.iii.org.tw/Focus/FocusDtl.aspx?fm_sqno=12&f_sqno=RPyus7CNaWZtx/fF4x98cQ__. [Accessed: Apr. 2018].
[12]"淺談室內三維模型重建之技術", 國家實驗研究院儀器科技研究中心, 2013. [Online]. Available: http://www.itrc.narl.org.tw/Publication/Newsletter/no115/p08.php. [Accessed: Apr. 2018].
[13]"三角測量", Wikipedia, 2018. [Online]. Available: https://zh.wikipedia.org/wiki/%E4%B8%89%E8%A7%92%E6%B8%AC%E9%87%8F. [Accessed: Apr. 2018].
[14]"DEPTH SENSOR SHOOTOUT", Stimulant, 2016. [Online]. Available: https://stimulant.com/depth-sensor-shootout-2/. [Accessed: Apr. 2018].
[15]"Time of flight", Wikipedia, 2018. [Online]. Available: https://en.wikipedia.org/wiki/Time_of_flight. [Accessed: Apr. 2018].
[16]"Structured light", Wikipedia, 2018. [Online]. Available: https://en.wikipedia.org/wiki/Structured_light. [Accessed: Apr. 2018].
[17]Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
[18]Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[19]Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[20]He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
[21]Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.
[22]Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.
[23]Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
[24]Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013).
[25]Makhzani, Alireza, et al. "Adversarial autoencoders." arXiv preprint arXiv:1511.05644 (2015).
[26]Isola, Phillip, et al. "Image-To-Image Translation With Conditional Adversarial Networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
[27]Zhu, Jun-Yan, et al. "Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
[28]Kim, Taeksoo, et al. "Learning to discover cross-domain relations with generative adversarial networks." arXiv preprint arXiv:1703.05192 (2017).
[29]Yi, Zili, et al. "Dualgan: Unsupervised dual learning for image-to-image translation." arXiv preprint (2017).
[30]Wu, Zhirong, et al. "3d shapenets: A deep representation for volumetric shapes." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
[31]Wu, Jiajun, et al. "Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling." Advances in Neural Information Processing Systems. 2016.
[32]Dai, Angela, Charles Ruizhongtai Qi, and Matthias Nießner. "Shape completion using 3d-encoder-predictor cnns and shape synthesis." Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). Vol. 3. 2017.
[33]Hegde, Vishakh, and Reza Zadeh. "Fusionnet: 3d object classification using multiple data representations." arXiv preprint arXiv:1607.05695 (2016).
[34]Shi, Baoguang, et al. "Deeppano: Deep panoramic representation for 3-d shape recognition." IEEE Signal Processing Letters 22.12 (2015): 2339-2343.
[35]Garcia-Garcia, Alberto, et al. "Pointnet: A 3d convolutional neural network for real-time object class recognition." Neural Networks (IJCNN), 2016 International Joint Conference on. IEEE, 2016.
[36]"3D感測行動發展,人臉辨識領先鋒", 工商時報, 2017. [Online]. Available:http://www.chinatimes.com/newspapers/20170507000772-260204. [Accessed: Apr. 2018].
[37]"透過立體視覺打造3D影像", NI LabVIEW, 2017. [Online]. Available: http://www.ni.com/white-paper/14103/zht/. [Accessed: Apr. 2018].
[38]"Time-of-Flight Camera – An Introduction", TEXAS INSTRUMENTS, 2014. [Online]. Available: http://www.ti.com/lit/wp/sloa190b/sloa190b.pdf. [Accessed: Apr. 2018].
[39]羅至中 and 張文鐘. "單視域之遞迴式深度估測補償." Diss. 2012.
[40]Geng, Jason. "Structured-light 3D surface imaging: a tutorial." Advances in Optics and Photonics 3.2 (2011): 128-160.
[41]Hartley, Richard, and Andrew Zisserman. "Multiple view geometry in computer vision." Cambridge university press, 2003.
[42]Simakov, Denis, Darya Frolova, and Ronen Basri. "Dense Shape Reconstruction of a Moving Object under Arbitrary, Unknown Lighting." ICCV. Vol. 3. 2003.
[43]Zollhöfer, Michael, et al. "Real-time non-rigid reconstruction using an RGB-D camera." ACM Transactions on Graphics (TOG) 33.4 (2014): 156.
[44]Henry, Peter, et al. "RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments." The International Journal of Robotics Research 31.5 (2012): 647-663.
[45]Mitra, Niloy J., Leonidas J. Guibas, and Mark Pauly. "Partial and approximate symmetry detection for 3D geometry." ACM Transactions on Graphics (TOG) 25.3 (2006): 560-568.
[46]Pauly, Mark, et al. "Discovering structural regularity in 3D geometry." ACM Transactions on Graphics (TOG) 27.3 (2008): 43.
[47]Li, Yangyan, et al. "Database‐Assisted Object Retrieval for Real‐Time 3D Reconstruction." Computer Graphics Forum. Vol. 34. No. 2. 2015.
[48]Rock, Jason, et al. "Completing 3D object shape from one depth image." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[49]Soltani, Amir Arsalan, et al. "Synthesizing 3d shapes via modeling multi-view depth maps and silhouettes with deep generative networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
[50]Gadelha, Matheus, Subhransu Maji, and Rui Wang. "3d shape induction from 2d views of multiple objects." arXiv preprint arXiv:1612.05872 (2016).
[51]Rezende, Danilo Jimenez, et al. "Unsupervised learning of 3d structure from images." Advances in Neural Information Processing Systems. 2016.
[52]Su, Hang, et al. "Multi-view convolutional neural networks for 3d shape recognition." Proceedings of the IEEE international conference on computer vision. 2015.
[53]Choy, Christopher B., et al. "3d-r2n2: A unified approach for single and multi-view 3d object reconstruction." European Conference on Computer Vision. Springer, Cham, 2016.
[54]Tatarchenko, Maxim, Alexey Dosovitskiy, and Thomas Brox. "Multi-view 3d models from single images with a convolutional network." European Conference on Computer Vision. Springer, Cham, 2016.
[55]Yan, Xinchen, et al. "Perspective transformer nets: Learning single-view 3d object reconstruction without 3d supervision." Advances in Neural Information Processing Systems. 2016.
[56]Lun, Zhaoliang, et al. "3D shape reconstruction from sketches via multi-view convolutional networks." arXiv preprint arXiv:1707.06375 (2017).
[57]Yang, Bo, et al. "3d object reconstruction from a single depth view with adversarial learning." arXiv preprint arXiv:1708.07969 (2017).
[58]Achlioptas, Panos, et al. "Representation learning and adversarial generation of 3D point clouds." arXiv preprint arXiv:1707.02392 (2017).
[59]Sharma, Abhishek, Oliver Grau, and Mario Fritz. "Vconv-dae: Deep volumetric shape learning without object labels." European Conference on Computer Vision. Springer, Cham, 2016.
[60]Brock, Andrew, et al. "Generative and discriminative voxel modeling with convolutional neural networks." arXiv preprint arXiv:1608.04236 (2016).
[61]Smith, Edward, and David Meger. "Improved adversarial systems for 3D object generation and reconstruction." arXiv preprint arXiv:1707.09557 (2017).
[62]"TensorFlow", TensorFlow, 2018. [Online]. Available: https://www.tensorflow.org/. [Accessed: Apr. 2018].
[63]"Keras", Keras, 2018. [Online]. Available: https://keras.io/. [Accessed: Apr. 2018].
[64]"Maximus 工作站中的 Tesla", NVIDIA, 2018. [Online]. Available: http://www.nvidia.com.tw/object/workstation-solutions-tesla-tw.html. [Accessed: Apr. 2018].
[65]"Tesla K20 工作站加速卡規格", NVIDIA, 2013. [Online]. Available: http://www.nvidia.com.tw/content/PDF/kepler/Tesla-K20-Active-BD-06499-001-v04.pdf. [Accessed: Apr. 2018].
[66]"Meshlab", Meshlab, 2018. [Online]. Available: http://www.meshlab.net/. [Accessed: Apr. 2018].
[67]"ModelNet", Princeton ModelNet, 2018. [Online]. Available: http://modelnet.cs.princeton.edu/. Accessed: Apr. 2018].
[68]Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.
[69]"Intersection over Union for object detection", Pyimagesearch, 2016. [Online]. Available: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/. [Accessed: May. 2018]
Full-text usage rights
  • On-campus browsing/printing of the electronic full text is authorized, publicly available from 2018-07-11.
  • Off-campus browsing/printing of the electronic full text is authorized, publicly available from 2018-07-11.

