System ID U0026-1002201822172100
Title (Chinese) 陪伴機器人
Title (English) Companion Robot
University National Cheng Kung University
Department (Chinese) 工程科學系
Department (English) Department of Engineering Science
Academic Year 106
Semester 1
Year of Publication 107
Author (Chinese) 蘇品儒
Author (English) Pin-Ju Su
Student ID N96054405
Degree Master's
Language Chinese
Pages 98
Committee Advisor: 周榮華
Keywords (Chinese) 陪伴機器人, 情緒辨識, 語音辨識
Keywords (English) Companion Robot, Emotion Recognition, Speech Recognition
Abstract (Chinese) This thesis develops a companion robot that recognizes the user's emotion from speech or judges whether an emergency has occurred, and reacts accordingly, either comforting the user or notifying people nearby to assist the user.
The robot's internal support structure and hands are 3D printed, and the microcontroller, speaker, and motors are mounted on the robot.
Speech emotion recognition uses the Hilbert-Huang Transform (HHT) to decompose the signal into different intrinsic mode functions and classifies emotions by the energy in different frequency bands; the overall recognition rate is 93.75%.
The robot uses speech recognition to detect whether the user says "help": 39 Mel-Frequency Cepstral Coefficient (MFCC) features are extracted, and Dynamic Time Warping (DTW) classifies whether an emergency has occurred; the final recognition rate is 87.9%.
Abstract (English) This thesis develops a companion robot that recognizes the user's emotion and judges whether the user is in an urgent situation. The robot reacts to the user in different ways to provide comfort, and it can also raise an alarm to people nearby who can help a user in danger.
The arm linkage and the supporting structure that holds the microcontroller, speaker, and motors are fabricated by 3D printing.
The brain of the robot is a dsPIC30F4011 microcontroller, which drives the motors that move the arms, controls the speaker to play soothing music, and sends the results to a mobile application over Bluetooth.
The Hilbert-Huang Transform (HHT) is adopted to process the voice signal, decomposing it into multiple oscillatory components at different frequencies. From the energy of these frequency components, the robot recognizes the user's emotion; the recognition rate is 93.75%.
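The Hilbert-transform step at the core of HHT can be sketched briefly. This is a hypothetical Python illustration, not the thesis implementation (which runs on the dsPIC30F4011): it assumes NumPy, uses a pure 50 Hz tone as a stand-in for a single intrinsic mode function, and skips the EMD sifting stage that a full HHT applies first.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based Hilbert transform: build the analytic signal
    by zeroing negative frequencies and doubling positive ones."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

fs = 1000.0                               # assumed sampling rate (Hz)
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 50.0 * t)          # one 50 Hz "mode"

z = analytic_signal(x)
phase = np.unwrap(np.angle(z))            # instantaneous phase
inst_freq = np.diff(phase) * fs / (2 * np.pi)
print(round(float(np.median(inst_freq)), 1))  # ≈ 50.0
```

The instantaneous frequency and amplitude obtained this way are what the marginal spectrum of Chapter 4 accumulates into per-frequency energy for emotion classification.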
If the user calls for help, the robot raises an alarm. To judge the user's situation, speech recognition is applied, using Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction. With 39 MFCC-based features per frame and Dynamic Time Warping (DTW), the robot classifies whether the user is in danger; the recognition rate is 87.9%.
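The DTW matching step described above can be sketched as follows. This is a minimal illustration with scalar features and hypothetical toy sequences; the thesis matches sequences of 39-dimensional MFCC frames, so a real implementation would use a per-frame vector distance instead of `abs`.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences,
    filled in by the standard O(len(a)*len(b)) dynamic program."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])            # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],    # insertion
                                 cost[i, j - 1],    # deletion
                                 cost[i - 1, j - 1])  # match
    return float(cost[n, m])

# A template utterance and a time-stretched version of it:
template = [0.0, 1.0, 2.0, 1.0, 0.0]
query = [0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0]
print(dtw_distance(template, query))  # 0.0: the warp absorbs the stretching
```

Classification then amounts to comparing the query's DTW distance to stored "help" templates against a threshold.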
Table of Contents
Abstract I
Extended Abstract II
Chapter 1 Introduction 1
1.1 Preface 1
1.2 Research Motivation 3
1.3 Literature Review 3
1.3.1 Literature Review on Companion Robots 3
1.3.2 Literature Review on Feature Extraction for Speech Recognition 7
1.3.3 Literature Review on Emotion Recognition 11
1.4 Thesis Organization 15
Chapter 2 System Architecture and Hardware/Software Overview 16
2.1 Overall System Architecture 16
2.2 System Hardware 17
2.2.1 Microcontroller dsPIC30F4011 17
2.2.2 DC Buck Converter Module 20
2.2.3 DC Motors 20
2.2.4 Motor Driver IC TA7291P 21
2.2.5 Optocoupler IC PC817 23
2.2.6 MicroSD Card Module 24
2.2.7 LM386 25
2.3 Robot Hardware Design 27
2.3.1 Robot Interior 28
2.3.2 Arm Linkage 29
2.4 Software Specifications 31
Chapter 3 Speech Recognition 34
3.1 Overview 34
3.1.1 System Block Diagram 35
3.1.2 Cepstrum 36
3.1.3 Mel-Frequency Cepstral Coefficients (MFCC) 37
3.2 Endpoint Detection 37
3.3 Pre-emphasis 39
3.4 Framing 41
3.5 Hamming Window 42
3.6 Fast Fourier Transform (FFT) 43
3.7 Triangular Bandpass Filter 44
3.8 Discrete Cosine Transform 45
3.9 Log Energy 46
3.10 Delta Cepstrum and Delta-Delta Cepstrum 46
3.11 Dynamic Time Warping 48
Chapter 4 Emotion Recognition 50
4.1 Overview 50
4.2 Hilbert-Huang Transform (HHT) 50
4.2.1 Empirical Mode Decomposition (EMD) 51
4.2.2 Hilbert Transform 54
4.3 Marginal Spectrum 56
Chapter 5 Program Flow Design 57
5.1 Overall Software Architecture 57
5.2 Speech Recognition Program Design 59
5.3 Emotion Recognition Program Design 60
Chapter 6 Experimental Methods, Results, and Discussion 85
6.1 Experimental Methods 85
6.1.1 Speech Recognition Experimental Method 85
6.1.2 Emotion Recognition Experimental Method 86
6.2 Results and Discussion 86
6.2.1 Speech Recognition Results and Discussion 86
6.2.2 Emotion Recognition Results and Discussion 89
6.3 Robot Motions and App Display Results 90
Chapter 7 Conclusions and Suggestions 94
7.1 Conclusions 94
7.2 Suggestions 94
References 95
References
[1] R. Aminuddin, A. Sharkey and L. Levita, “Interaction with the Paro Robot May Reduce Psychophysiological Stress Responses,” 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 593-594, 2016
[2] W.-L. Chang, S. Šabanović and L. Huber, “Use of Seal-Like Robot PARO in Sensory Group Therapy for Older Adults with Dementia,” IEEE 13th International Conference on Rehabilitation Robotics (ICORR), pp. 101-102, 2013
[3] http://tpcjournal.taipower.com.tw/article/index/id/181, November 2017
[4] http://technews.tw/2015/02/25/robear/, November 2017
[5] C. Jayawardena, I.-H. Kuo, U. Unger, A. Igic, R. Wong, C. I. Watson, R. Q. Stafford, E. Broadbent, P. Tiwari, J. Warren, J. Sohn and B. A. MacDonald, “Deployment of a Service Robot to Help Older People,” IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5990-5995, 2010
[6] P. Benavidez, M. Kumar, S. Agaian and M. Jamshidi, “Design of a Home Multi-Robot System for the Elderly and Disabled,” 10th System of Systems Engineering Conference (SoSE), pp. 392-397, 2011
[7] J. Chan and G. Nejat, “A Learning-based Control Architecture for an Assistive Robot Providing Social Engagement during Cognitively Stimulating Activities,” IEEE International Conference on Robotics and Automation, pp. 3928-3933, 2011
[9] J. Woo, K. Wada and N. Kubota, “Robot Partner System for Elderly People Care by Using Sensor Network,” The Fourth IEEE RAS/EMBS International Conference on Biomedical Robotics and Biomechatronics, pp. 1329-1334, 2012
[10] M. A. Anusuya and S. K. Katti, “Speech Recognition by Machine: A Review,” International Journal of Computer Science and Information Security (IJCSIS), pp. 181-205, 2009
[11] H. Gupta and D. N. Gupta, “LPC and LPCC Method of Feature Extraction in Speech Recognition System,” IEEE 2016 6th International Conference - Cloud System and Big Data Engineering, pp. 498-502, 2016
[12] S. B. Davis and P. Mermelstein, “Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences,” IEEE Transactions on Acoustics, Speech, and Signal Processing, pp. 357-366, 1980
[13] N. S. Nehe and R. S. Holambe, “DWT and LPC Based Feature Extraction Methods for Isolated Word Recognition,” EURASIP Journal on Audio, Speech, and Music Processing, 2012
[14] U. Shrawankar and V. Thakare, “Techniques for Feature Extraction in Speech Recognition System: A Comparative Study,” International Journal of Computer Applications in Engineering, Technology and Sciences (IJCAETS), pp. 412-418, 2010
[15] P. Shen, C.-J. Zhou and X. Chen, “Automatic Speech Emotion Recognition Using Support Vector Machine,” Proceedings of 2011 International Conference on Electronic & Mechanical Engineering and Information Technology, pp. 621-625, 2011
[16] http://emodb.bilderbar.info/start.html, November 2017
[17] https://www.csie.ntu.edu.tw/~cjlin/libsvm/, November 2017
[18] A. Milton, S. S. Roy and S. T. Selvi, “SVM Scheme for Speech Emotion Recognition Using MFCC Feature,” International Journal of Computer Applications, pp. 34-39, 2013
[19] Shambhavi S. S. and V. N. Nitnaware, “Emotion Speech Recognition Using MFCC and SVM,” International Journal of Engineering Research & Technology, pp. 1067-1070, 2015
[20] W. Zhang, X.-Y. Zhang and Y. Sun, “Based on EEMD-HHT Marginal Spectrum of Speech Emotion Recognition,” 2012 International Conference on Computing, Measurement, Control and Sensor Network, pp. 91-94, 2012
[21] L. Xiang, W.-H. Xiong, J.-F. Li and R.-S. Ji, “Application of EEMD and Hilbert Marginal Spectrum in Speech Emotion Feature Extraction,” Control Conference, pp. 3686-3689, 2012
[22] Z.-L. Wang, H.-F. Li and L. Ma, “HHT Based Long Term Feature Extracting Method for Speech Emotion Classification,” Audio, Language and Image Processing, pp. 276-281, 2012
[23] http://ww1.microchip.com/downloads/en/devicedoc/70135C.pdf, November 2017
[24] https://hobbytronics.com.pk/product/lm2596-adjustable-dc-dc-step-down-power-supply-module/, November 2017
[25] http://goods.ruten.com.tw/item/show?21204135611196, November 2017
[26] http://akizukidenshi.com/catalog/g/gI-02001/, November 2017
[27] https://uge-one.com/pc817-optocoupler-optoisolator-dip-ic.html, November 2017
[28] http://www.playrobot.com/storage/1758-micro-sd-tf-card-memory-shield-module-spi.html, November 2017
[29] https://iamzxlee.wordpress.com/2014/07/20/lm386-low-voltage-audio-power-amplifier/, November 2017
[30] https://lowvoltage.wordpress.com/tag/lm386/, November 2017
[31] http://www.pmai.tn.edu.tw/df_ufiles/df_pics/32710%E7%AC%AC14%E7%AB%A0.pdf, November 2017
[32] S. D. Dhingra, G. Nijhawan and P. Pandit, “Isolated Speech Recognition Using MFCC and DTW,” International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, pp. 4085-4092, 2013
[33] W. H. Abdulla, D. Chow and G. Sin, “Cross-Words Reference Template for DTW Based Speech Recognition Systems,” TENCON 2003 Conference on Convergent Technologies for Asia-Pacific Region, pp. 1576-1579, 2003
[34] http://blog.csdn.net/zouxy09/article/details/9156785, November 2017
[35] http://mirlab.org/jang/books/audiosignalprocessing/ptFreqDomainCepstrum.asp?title=7-8%20Cepstrum, November 2017
[36] Y.-K. Lau and C.-K. Chan, “Speech Recognition Based on Zero Crossing Rate and Energy,” IEEE Transactions on Acoustics, Speech, and Signal Processing, pp. 320-323, 1985
[37] D. Jurafsky and J. H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition,” ch. 9, pp. 1-72, 2007
[39] http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/, November 2017
[40] S. Salvador and P. Chan, “FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space,” KDD Workshop on Mining Temporal and Sequential Data, pp. 70-80, 2004
[41] M. Ayadi, M. S. Kamel and F. Karray, “Survey on Speech Emotion Recognition: Features, Classification Schemes, and Databases,” Pattern Recognition, pp. 572-587, 2010
[42] N. E. Huang and S. P. Shen, “Hilbert-Huang Transform and Its Applications,” World Scientific Pub Co Inc, 2014
[44] http://perso.ens-lyon.fr/patrick.flandrin/emd.html, November 2017
[45] C.-D. Jiang, H.-W. Ko, C.-C. Wu, H.-C. Min, T.-J. Pyng and C.-W. Ling, “Applications of Hilbert-Huang Transform to Structural Damage Detection,” Structural Engineering and Mechanics, pp. 1-20, 2011
[46] M. Kedadouche, M. Thomas and A. Tahan, “A Comparative Study between Empirical Wavelet Transforms and Empirical Mode Decomposition Methods: Application to Bearing Defect Diagnosis,” Mechanical Systems and Signal Processing, pp. 87-107, 2016
  • Electronic full text authorized for on-campus browsing/printing, publicly available from 2018-02-14.
  • Electronic full text authorized for off-campus browsing/printing, publicly available from 2018-02-14.
