System ID U0026-2408202022352800
Title (Chinese) 基於RGBD-CNN及卷積自編碼之3D物品方位辨識及抓取點生成法則
Title (English) 3D Object Model Aided RGBD-CNN Object Orientation Justification and Convolutional Autoencoder Grasping Points Generation Method
Institution National Cheng Kung University
Department (Chinese) 電機工程學系
Department (English) Department of Electrical Engineering
Academic Year 107 (2018–2019)
Semester 2
Year of Publication 108 (2019)
Author (Chinese) 張凱傑
Author (English) Kai-Chieh Chang
Student ID N26064202
Degree Master
Language Chinese
Pages 102
Oral Defense Committee Advisor - 李祖聖
Committee Member - 余國瑞
Committee Member - 林惠勇
Committee Member - 邱俊賢
Committee Member - 郭昭霖
Keywords (Chinese) Convolutional Autoencoder; Convolutional Neural Network; 3D KD-Tree
Keywords (English) 3D KD-Tree; Convolutional Autoencoder; RGBD-CNN
Abstract (Chinese, translated) This thesis constructs 3D point-cloud models of objects to aid object pose estimation and grasping point generation, so that a robot can build an object coordinate system, plan grasping points, and grasp and reorient objects. To estimate the pose of an object, an object detection method first provides the object's location and bounding box, which are used to filter out the background and normalize the images. The normalized color and depth images are then fed to an RGBD convolutional neural network trained to classify the object's rotation-angle class. Given the angle class, the object model is rotated and translated, the point cloud of the estimated observable region is extracted, and the iterative closest point (ICP) algorithm matches this model point cloud against the actually observed point cloud to estimate the object's rotation matrix and hence its pose. The object coordinate system can then be established, and the model can be transformed into the 3D scene to fill in the unobserved parts of the object. The thesis further proposes a convolutional autoencoder network to learn the graspable regions of an object. Normal-vector maps and a depth map of the object are generated first, a rectangular gripper grasping range is defined, and a normalized image of each grasping rectangle sampled on the object images is labeled as graspable or not. The convolutional autoencoder then encodes these images into a three-dimensional representation space in which graspable points form clusters. At verification time, a 3D KD-tree is used to compare a candidate against a pre-built database of grasping points and evaluate the feasibility of grasping at that point. Experimental results show that the proposed method effectively estimates an object's rotation angle and constructs its coordinate system, and that the estimated coordinate system can be used to plan grasping points and placement orientations, reorienting the object to a defined facing.
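The pose-refinement step described above matches the model's observable point cloud against the actually observed one with ICP. As a rough, self-contained illustration (a minimal sketch, not the thesis implementation), the following Python example shows the core of such an alignment loop: KD-tree nearest-neighbour association followed by a closed-form SVD (Kabsch) solve for the rigid transform. The function name, array shapes, iteration limit, and tolerance are illustrative assumptions.

```python
# Minimal ICP-style rigid registration sketch (illustrative, not the thesis code).
import numpy as np
from scipy.spatial import cKDTree

def icp_align(model_pts, observed_pts, max_iters=30, tol=1e-6):
    """Align model_pts (N, 3) onto observed_pts (M, 3); return R (3, 3), t (3,)."""
    src = model_pts.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    tree = cKDTree(observed_pts)            # index the observed cloud once
    prev_err = np.inf
    for _ in range(max_iters):
        dists, idx = tree.query(src)        # closest observed point per model point
        tgt = observed_pts[idx]
        src_c, tgt_c = src.mean(axis=0), tgt.mean(axis=0)
        H = (src - src_c).T @ (tgt - tgt_c)  # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T                      # Kabsch: optimal rotation
        if np.linalg.det(R) < 0:            # guard against a reflection solution
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = tgt_c - R @ src_c
        src = src @ R.T + t                 # apply the incremental transform
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dists.mean()
        if abs(prev_err - err) < tol:       # stop once the mean residual stabilizes
            break
        prev_err = err
    return R_total, t_total
```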
Abstract (English) A 3D object grasping point learning system is proposed in this thesis, comprising object coordinate construction and grasping point learning. To construct the object coordinate system, the pose of the object must first be estimated. An RGBD Convolutional Neural Network (RGBD-CNN) is proposed to classify the orientation type of the object; an object model and the iterative closest point (ICP) algorithm are then applied to estimate the object pose, so that the object coordinate system can be constructed. To learn the graspable region of an object, normal-vector images and a depth image of the object are obtained first, and the grasping range of the end effector (palm) is simulated on these images. Finally, a Convolutional Autoencoder (CAE) is applied to encode the physical characteristics of the simulated palm images. By comparing the features of a simulated palm against a pre-built database through a 3D KD-tree, the proposed method can evaluate candidate grasping points. By integrating the object coordinate system with the learned grasping points, the robot plans a suitable grasping point for the assigned task. It is worth mentioning that most existing studies emphasize either object orientation justification or grasping point generation alone, whereas this research treats object pose estimation and grasping point generation jointly, so the robot can adapt to different task situations in real time. The first experiment shows that the robot can understand the spatial relationships among objects through their coordinate systems. In the second experiment, the robot successfully moves an object from a random pose to the assigned pose.
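As a companion sketch of the grasp-learning pipeline summarized in both abstracts, the hypothetical Keras example below builds a small convolutional autoencoder whose bottleneck is the three-dimensional representation space mentioned above, then evaluates a candidate grasp rectangle by querying a KD-tree over the encoded graspable samples. The 64×64×4 input shape (depth plus normal-vector channels), layer widths, and distance threshold are assumptions for illustration, not the thesis architecture.

```python
# Illustrative CAE + 3D KD-tree grasp-feasibility sketch (assumed shapes/widths).
import numpy as np
from scipy.spatial import cKDTree
from tensorflow import keras
from tensorflow.keras import layers

def build_cae(input_shape=(64, 64, 4), latent_dim=3):
    """Convolutional autoencoder with a 3-D bottleneck; returns (autoencoder, encoder)."""
    inp = keras.Input(shape=input_shape)
    x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    z = layers.Dense(latent_dim, name="latent")(x)        # 3-D representation space
    x = layers.Dense(16 * 16 * 32, activation="relu")(z)
    x = layers.Reshape((16, 16, 32))(x)
    x = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2DTranspose(input_shape[-1], 3, strides=2,
                                 padding="same", activation="sigmoid")(x)
    cae = keras.Model(inp, out)
    cae.compile(optimizer="adam", loss="mse")             # reconstruction objective
    return cae, keras.Model(inp, z)

def grasp_feasible(encoder, graspable_patches, candidate_patch, radius=0.1):
    """Check whether a candidate patch encodes near the graspable cluster(s)."""
    db = encoder.predict(graspable_patches, verbose=0)    # (N, 3) latent codes
    tree = cKDTree(db)                                    # 3D KD-tree over the database
    z = encoder.predict(candidate_patch[None], verbose=0)[0]
    dist, _ = tree.query(z)                               # nearest graspable code
    return dist < radius                                  # 'radius' is an assumed threshold
```

Thresholding the nearest-neighbour distance in the latent space mirrors the abstracts' idea that graspable samples cluster together after encoding; in practice the database would be encoded once offline rather than on every query.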
Table of Contents Abstract (Chinese) I
Abstract II
Acknowledgement III
Contents IV
List of Tables VI
List of Figures VII
List of Variables XI
Chapter 1 1
1.1 Motivation 1
1.2 Related Work 2
1.2.1 Object Orientation Justification 3
1.2.2 Grasping Points Learning 5
1.3 System Overview 6
1.4 Thesis Organization 7
Chapter 2 9
2.1 Introduction 9
2.2 Feature Extraction and Pre-processing 10
2.2.1 Modeling Platform 10
2.2.2 Feature Extraction 12
2.3 Point Cloud Concatenation and Model Result 13
Chapter 3 15
3.1 Introduction 15
3.2 Image Pre-Processing 17
3.3 RGBD-CNN Structure and Training 20
3.3.1 Training Premise 20
3.3.2 RGBD-CNN Structure 22
3.4 ICP Based Compensation 27
3.4.1 Model Occlusion 28
3.4.2 Iterative Closest Point (ICP) 30
3.5 Simulations 36
Chapter 4 38
4.1 Introduction 38
4.2 Image Pre-Processing 41
4.3 CAE Training Data Generation and Testing Data Detection 43
4.3.1 Stage I 43
4.3.2 Stage II 44
4.4 Autoencoder Structure and Training 54
4.5 Evaluation and Results 60
Chapter 5 65
5.1 Introduction 65
5.2 Experiment I 68
5.2.1 Experiment I-I 70
5.2.2 Experiment I-II 76
5.3 Experiment II 81
Chapter 6 89
6.1 Conclusion 89
6.2 Future Works & Discussion 90
References 92
Appendix 97
Full-Text Use Authorization
  • On-campus electronic full-text browsing/printing authorized, available from 2025-08-01.
  • Off-campus electronic full-text browsing/printing authorized, available from 2025-08-01.

