System ID U0026-0309202013254300
Title (Chinese) 開發一套用於手持治療機器人之手勢辨識演算法
Title (English) Development of a Gesture Recognition Algorithm for Therapeutic Holding Robot
University National Cheng Kung University
Department (Chinese) 生物醫學工程學系
Department (English) Department of BioMedical Engineering
Academic Year 108
Semester 2
Year of Publication 109 (2020)
Author (Chinese) 陳瑋泰
Author (English) Wei-Tai Chen
Student ID P86071058
Degree Master's
Language English
Number of Pages 85
Committee Advisor: 蘇芳慶
Committee Member: 郭立杰
Committee Member: 徐秀雲
Committee Member: 揭小鳳
Committee Member: 林倩如
Keywords (Chinese) 機器學習、深度學習、手勢模式、手勢辨認
Keywords (English) Machine Learning, Deep Learning, Hand Gesture Pattern, Gesture Recognition
Subject Classification
Abstract (Chinese) According to a World Health Organization report, the proportion of older adults in the population has increased substantially in recent years. Maintaining their physical and mental health is essential for future policy planning. Many recent studies have indicated that companion robots have a positive effect on the physical and mental health of older adults, because these robots provide various kinds of feedback in response to stimulation from humans. Recognizing this human stimulation is therefore the first step toward human-robot interaction. One form of human-robot interaction is tactile interaction, which is considered the preferred channel for conveying intimate emotions. This study therefore focuses on using machine learning and deep learning to recognize the hand gestures of human social touch.
Machine learning and deep learning are known to be powerful tools for classification tasks. This study used a support vector machine (SVM), a random forest (RF), and one-, two-, and three-dimensional convolutional neural networks (1D-CNN, 2D-CNN, and 3D-CNN) to recognize six social touch gestures: pat, stroke, grab, poke, scratch, and no touch. The dataset contained 17,716 samples in total. Subjects performed the gestures in two postures, with the device resting on a table and with the device held in the hand; the device was fitted with a pressure-sensing element.
Cross-validation was used to evaluate the performance of each model; the accuracies of the SVM, RF, 1D-CNN, 2D-CNN, and 3D-CNN were 16.76%, 49.37%, 70.51%, 70.46%, and 73.63%, respectively. These results show that all of the models could distinguish the different gestures from the pressure data. Future work will add other sensing elements to improve recognition accuracy, and further investigation of the relationship between gestures and emotions is also needed.
Abstract (English) According to the World Health Organization's report, the percentage of older adults has increased significantly in recent years. Maintaining their physical and mental health is essential for future policy planning. Many recent studies have found that companion robots have beneficial effects on the physical and mental health of older adults, because such robots can offer various responses to stimulation from humans. Thus, recognizing this stimulation is the first step toward human-robot interaction. One example is tactile interaction, which is the preferred channel for communicating intimate emotions. Therefore, this study focused on recognizing the hand gestures of human social touch using machine learning and deep learning.
Machine learning and deep learning, both powerful tools for classification, have been widely applied to recognizing social touch gestures. In this study, five algorithms were compared on hand gesture recognition: a support vector machine (SVM), a random forest (RF), and one-, two-, and three-dimensional convolutional neural networks (1D-CNN, 2D-CNN, and 3D-CNN). These models recognized six types of social touch gestures: pat, stroke, grab, poke, scratch, and no touch. The dataset included 17,716 samples of these six gestures. All gestures were performed in two postures, stationary and holding, on a pressure-mapping sensor mat attached to a cylinder-shaped companion robot simulator.
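For illustration only, the sketch below shows how a compact 3D-CNN classifier of the kind described above might be set up in TensorFlow/Keras. It is not the code used in the thesis: the input window (30 pressure frames from an assumed 8x8 sensor grid), the layer sizes, and the optimizer settings are placeholder assumptions, since the abstract does not specify them.

```python
# Hypothetical sketch of a small 3D-CNN for classifying sequences of
# pressure frames into six touch-gesture classes. The input shape and
# all layer sizes are illustrative assumptions, not the thesis design.
import tensorflow as tf

NUM_CLASSES = 6                  # pat, stroke, grab, poke, scratch, no touch
FRAMES, ROWS, COLS = 30, 8, 8    # assumed window length and sensor grid

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(FRAMES, ROWS, COLS, 1)),
    tf.keras.layers.Conv3D(16, kernel_size=(3, 3, 3), padding="same", activation="relu"),
    tf.keras.layers.MaxPooling3D(pool_size=(2, 2, 2)),
    tf.keras.layers.Conv3D(32, kernel_size=(3, 3, 3), padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling3D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The 1D-CNN and 2D-CNN variants mentioned above differ mainly in whether convolution is applied along the time axis, the two spatial axes of each frame, or (as here) all three.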
Ten-fold cross-validation was used to evaluate the performance of all models. The final accuracies for the SVM, RF, 1D-CNN, 2D-CNN, and 3D-CNN were 16.76%, 49.37%, 70.51%, 70.46%, and 75.78%, respectively. These results indicate that the models can classify hand gestures from pressure data. Future work is required to increase the accuracy, either by enlarging the dataset or by using higher-resolution pressure sensors. Furthermore, the relationships between hand gestures and emotional states should also be investigated.
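As a minimal sketch of the evaluation protocol, the following shows how the SVM and random-forest baselines could be scored with stratified ten-fold cross-validation in scikit-learn. The placeholder data and the flattened feature layout are assumptions made for illustration; the actual pre-processing and feature extraction are described in Chapter 2 of the thesis.

```python
# Hypothetical sketch of ten-fold cross-validation for the SVM and RF
# baselines using scikit-learn. The random placeholder data and the
# flattened 30x8x8 feature layout are assumptions, not the thesis setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((1000, 30 * 8 * 8))   # placeholder features (real dataset: 17,716 samples)
y = rng.integers(0, 6, size=1000)    # six gesture classes

models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified folds keep the class balance of the six gestures in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy", n_jobs=-1)
    print(f"{name}: {scores.mean():.2%} +/- {scores.std():.2%}")
```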
Table of Contents
Abstract (Chinese) I
Abstract II
Acknowledgements IV
Content V
List of Figures VII
List of Tables X
Chapter 1 Introduction 1
1.1 Aging Population Problems 1
1.2 Animal-Assisted Therapy and Activities 2
1.3 Robotic Pets in Healthcare 2
1.4 How Robots Interact with Humans 3
1.5 State-of-the-Art on Gesture Recognition 4
1.5.1 Gesture Recognition with Machine Learning 4
1.5.2 Gesture Recognition with Deep Learning 5
1.6 Motivation 7
1.7 Research Questions and Hypotheses 8
Chapter 2 Materials and Methods 9
2.1 Pressure Sensor and Microcontroller Board 9
2.2 Data Acquisition 13
2.2.1 Experiment Setup 13
2.2.2 Subjects and Gestures 15
2.3 Pre-processing and Feature Extraction 17
2.4 Machine Learning Models 20
2.4.1 Support Vector Machine (SVM) 20
2.4.2 Decision Tree and Random Forest 21
2.5 Deep Learning Models 22
2.5.1 Convolutional Neural Networks 22
2.5.2 Model Architecture 25
2.6 Training Strategy 37
2.7 K-Fold Cross-Validation 38
Chapter 3 Results 43
3.1 Model Performance for Different Postures 43
3.1.1 Recognition Accuracy 43
3.1.2 Loss and Accuracy Curves 46
3.1.3 Confusion Matrices 49
3.2 Inter-Subjects’ Model Performance 54
3.2.1 Recognition Accuracy 54
3.2.2 Loss and Accuracy Curves 58
3.2.3 Confusion Matrices 63
Chapter 4 Discussion 70
4.1 Influence of Different Postures on Model’s Performance 70
4.2 Gesture Movement Patterns 72
4.3 Robust Algorithms 73
4.4 Model Comparison 74
4.4.1 Machine Learning Models 74
4.4.2 Deep Learning Models 76
4.4.3 Model Comparison 76
4.4.4 Comparison of Algorithms and Humans 78
Chapter 5 Conclusion 79
References 80

Full-Text Access Permissions
  • On-campus browsing/printing of the electronic full text is authorized, available from 2022-10-12.
  • Off-campus browsing/printing of the electronic full text is authorized, available from 2022-10-12.

