System ID  U0026-2708201722371300
Title (Chinese)  應用具語句關注之連續對話狀態追蹤與強化學習之面試訓練系統
Title (English)  Sentence Attention-based Continuous Dialog State Tracking and Reinforcement Learning for Interview Coaching
University  National Cheng Kung University (成功大學)
Department (Chinese)  資訊工程學系
Department (English)  Institute of Computer Science and Information Engineering
Academic Year  105 (ROC calendar; 2016-2017)
Semester  2
Year of Publication  106 (ROC calendar; 2017)
Author (Chinese)  陳垂康
Author (English)  Chu-Kwang Chen
E-mail  chuei1992@gmail.com
Student ID  p76044067
Degree  Master's
Language  English
Pages  55
Committee  Advisor: 吳宗憲
Committee member: 陳嘉平
Committee member: 王家慶
Committee member: 禹良治
Committee member: 陳有圳
Keywords (Chinese)  面試訓練, 對話系統, 對話管理, 主題模型, 注意力模型, 長短期記憶遞歸神經網路, 自編碼器, 強化學習
Keywords (English)  interview coaching, dialog system, dialog management, topic model, attention model, LSTM, autoencoder, reinforcement learning
Subject Classification
Chinese Abstract  Interviews are a very commonly used admission channel. Everyone recognizes their importance, yet few people actually seek out interview experts for practice. The most direct way to practice is to invite an expert to act as the interviewer, but the labor cost and scheduling are problematic, and students can hardly practice repeatedly. This thesis aims to develop an interview-coaching dialog system that can more flexibly give students opportunities for repeated interview practice.
The research topic of this thesis is dialog management. A dialog system relies on dialog management to decide the dialog flow, which comprises dialog state tracking and dialog policy. Traditional dialog state tracking requires manually defined semantic slots; this thesis instead uses topic probability distributions as the semantic representation of sentences for state tracking. When a dialog turn consists of multiple sentences, some of them may be irrelevant, so this thesis combines a Convolutional Neural Tensor Network (CNTN) with topic distributions to apply sentence attention over multi-sentence turns, assigning each sentence an importance weight. An LSTM-based autoencoder then models the transitions and accumulation between sentences and dialog turns to obtain the dialog state. Finally, this thesis designs a reward function for the interview flow and uses Double Q-learning, a reinforcement learning method, to learn the mapping between observed states and system actions.
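The Double Q-learning step mentioned in the abstract can be sketched as follows. This is an illustrative tabular version under assumed states, actions, and hyperparameters; it is not the policy model actually trained in the thesis, which operates on learned dialog-state vectors:

```python
import random

def double_q_update(q_a, q_b, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One Double Q-learning update: pick a table at random, choose the greedy
    next action with that table, but evaluate it with the other table. This
    decoupling is what reduces the overestimation bias of plain Q-learning."""
    if random.random() < 0.5:
        best = max(actions, key=lambda x: q_a.get((s_next, x), 0.0))
        target = r + gamma * q_b.get((s_next, best), 0.0)
        q_a[(s, a)] = q_a.get((s, a), 0.0) + alpha * (target - q_a.get((s, a), 0.0))
    else:
        best = max(actions, key=lambda x: q_b.get((s_next, x), 0.0))
        target = r + gamma * q_a.get((s_next, best), 0.0)
        q_b[(s, a)] = q_b.get((s, a), 0.0) + alpha * (target - q_b.get((s, a), 0.0))
```

The state and action names here (e.g. an "ask" versus "follow-up" action) are hypothetical stand-ins for the interview system's actual action set.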
This thesis collected 260 interview dialogs and adopted five-fold cross-validation for evaluation. The experimental results show that, compared with the traditional method, the proposed method comes closer to the corpus averages in the numbers of normal questions, follow-up questions, and total questions, and also achieves a higher accumulated total reward.
English Abstract  Admission interviews are one of the most frequently used methods of student selection. Even though people know how important such interviews are, very few practice their interview skills effectively by seeking professional help. Many students thus lack interview experience and are likely to be nervous during an interview. There are many ways to improve students' interview skills, one of which is to hire a professional interview coach. This is the most direct way to practice interview skills, but it is also rather expensive.
The main purpose of this thesis is thus to develop a dialog manager for an interview coaching system. In a dialog system, both Dialog State Tracking (DST) and Dialog Policy are important tasks. Traditional approaches define semantic slots manually for dialog state representation and tracking. This thesis instead adopts the topic profiles of sentences as the representation of a dialog state. When the input sequence consists of several sentences, the summary vector is likely to contain noise from irrelevant feature vectors, so this thesis applies a sentence attention mechanism that combines a Convolutional Neural Tensor Network (CNTN) with topic profiles for dialog state tracking. An LSTM-based autoencoder is used as the dialog state tracker to model the transition and accumulation of dialog states. Finally, by applying Reinforcement Learning (RL) with the designed reward functions, the agent learns its action-selection behavior from interactions with the environment.
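As a rough illustration of the sentence attention idea described above, the sketch below scores each sentence's topic profile against a turn-level query profile and forms a softmax-weighted summary vector, so that irrelevant sentences contribute less. The cosine scoring is a simplified stand-in for the CNTN relevance model; the two-dimensional topic profiles are assumptions for illustration only:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def attend(sentence_profiles, query_profile):
    """Softmax-normalized attention weights over sentences, plus the
    weighted summary profile used as the turn-level representation."""
    scores = [cosine(p, query_profile) for p in sentence_profiles]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    summary = [sum(w * p[i] for w, p in zip(weights, sentence_profiles))
               for i in range(len(query_profile))]
    return weights, summary
```

A sentence whose topic profile aligns with the query receives a larger weight, pulling the summary toward the relevant content of a multi-sentence turn.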
This study collected 260 interview dialogs containing 3,016 dialog turns. A five-fold cross-validation scheme was employed for evaluation. The results show that the proposed method outperformed the semantic slot-based baseline in the numbers of normal actions taken, follow-up actions taken, and the accumulated reward achieved by the dialog policy on the collected corpus.
Table of Contents  Abstract (Chinese) I
Abstract II
Acknowledgements IV
Contents V
List of Tables VII
List of Figures VIII
Chapter 1 Introduction 1
1.1 Background 1
1.2 Motivation 2
1.3 Literature Review 3
1.3.1 Interview Coaching System 3
1.3.2 Dialog State Tracking 4
1.3.3 Attention Mechanisms 5
1.3.4 Dialog Policy 6
1.4 Problems and Proposed Methods 9
1.5 Research Framework 12
Chapter 2 MHMC Interview Database 13
2.1 Data Collection 13
2.2 Corpus Introduction 13
Chapter 3 Proposed Methods 16
3.1 Establishment of Topic Model 17
3.2 Sentence Attention Mechanism 20
3.2.1 Convolutional Neural Tensor Network (CNTN) 21
3.2.2 Sentence Attention – CNTN 23
3.2.3 Sentence Attention – Topic Profile 24
3.3 Dialog State Tracking 25
3.3.1 Long Short-Term Memory 26
3.3.2 LSTM-based Autoencoder 28
3.3.3 Establishment and Training of the DST 29
3.4 Reinforcement Learning in Dialog Policy 31
3.4.1 Agent of Reinforcement Learning 32
3.4.2 Reward Function 36
3.4.3 Policy Model Training 38
Chapter 4 Experimental Results and Discussion 40
4.1 Relevance Classification Performance 40
4.2 Evaluation of the LSTM-based Autoencoder 42
4.3 Evaluation of System Performance 45
4.3.1 Comparison of Topic Profile and Semantic Slot 45
4.3.2 Evaluation of Sentence Representation with Attention Mechanism 48
4.3.3 Discussion 49
Chapter 5 Conclusion and Future Work 51
References 53
References  [1] J. Williams, A. Raux, and M. Henderson, "The dialog state tracking challenge series: A review," Dialogue & Discourse, vol. 7, no. 3, pp. 4-33, 2016.
[2] (2016). Starting this year, individual applications account for up to 70% of university admissions [in Chinese]. Available: http://www.chinatimes.com/realtimenews/20160222005034-260405
[3] Palladian. Available: http://www.palladiancr.com/
[4] H. Jones and N. Sabouret, "TARDIS-A simulation platform with an affective virtual recruiter for job interviews," IDGEI (Intelligent Digital Games for Empowerment and Inclusion), 2013.
[5] M. E. Hoque, M. Courgeon, J.-C. Martin, B. Mutlu, and R. W. Picard, "Mach: My automated conversation coach," in Proceedings of the 2013 ACM international joint conference on Pervasive and ubiquitous computing, 2013, pp. 697-706: ACM.
[6] M. J. Smith et al., "Virtual reality job interview training in adults with autism spectrum disorder," Journal of Autism and Developmental Disorders, vol. 44, no. 10, pp. 2450-2463, 2014.
[7] M. Henderson, "Machine learning for dialog state tracking: A review," in Proc. of The First International Workshop on Machine Learning in Spoken Language Processing, 2015.
[8] V. Zue et al., "JUPITER: A telephone-based conversational interface for weather information," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 1, pp. 85-96, 2000.
[9] S. Larsson and D. R. Traum, "Information state and dialogue management in the TRINDI dialogue move engine toolkit," Natural Language Engineering, vol. 6, no. 3-4, pp. 323-340, 2000.
[10] J. D. Williams, "Web-style ranking and SLU combination for dialog state tracking," in Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), 2014, pp. 282-291.
[11] M. Henderson, B. Thomson, and S. J. Young, "Deep Neural Network Approach for the Dialog State Tracking Challenge," in SIGDIAL Conference, 2013, pp. 467-471.
[12] S. Lee, "Structured discriminative model for dialog state tracking," in Proceedings of the SIGDIAL 2013 Conference, 2013, pp. 442-451.
[13] H. Ren, W. Xu, Y. Zhang, and Y. Yan, "Dialog state tracking using conditional random fields," in Proceedings of SIGDIAL, 2013.
[14] M. Henderson, B. Thomson, and S. Young, "Word-based dialog state tracking with recurrent neural networks," in Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), 2014, pp. 292-299.
[15] L. Zilka and F. Jurcicek, "Incremental LSTM-based dialog state tracker," in Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on, 2015, pp. 757-762: IEEE.
[16] O. Plátek, P. Bělohlávek, V. Hudeček, and F. Jurčíček, "Recurrent Neural Networks for Dialogue State Tracking," arXiv preprint arXiv:1606.08733, 2016.
[17] S.-s. Shen and H.-y. Lee, "Neural attention models for sequence classification: Analysis and application to key term extraction and dialogue act detection," arXiv preprint arXiv:1604.00077, 2016.
[18] D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
[19] K. Xu et al., "Show, attend and tell: Neural image caption generation with visual attention," in International Conference on Machine Learning, 2015, pp. 2048-2057.
[20] 張俊林. (2016). On two modes of research innovation, taking the attention model as an example [in Chinese]. Available: http://blog.csdn.net/malefactor/article/details/50583474
[21] L. Shang, Z. Lu, and H. Li, "Neural responding machine for short-text conversation," arXiv preprint arXiv:1503.02364, 2015.
[22] M.-T. Luong, H. Pham, and C. D. Manning, "Effective approaches to attention-based neural machine translation," arXiv preprint arXiv:1508.04025, 2015.
[23] W.-N. Hsu, Y. Zhang, and J. Glass, "Recurrent Neural Network Encoder with Attention for Community Question Answering," arXiv preprint arXiv:1603.07044, 2016.
[24] Y. Cui, Z. Chen, S. Wei, S. Wang, T. Liu, and G. Hu, "Attention-over-attention neural networks for reading comprehension," arXiv preprint arXiv:1607.04423, 2016.
[25] C.-J. Lee, S.-K. Jung, K.-D. Kim, D.-H. Lee, and G. G.-B. Lee, "Recent approaches to dialog management for spoken dialog systems," Journal of Computing Science and Engineering, vol. 4, no. 1, pp. 1-22, 2010.
[26] M. F. McTear, "Modelling spoken dialogues with state transition diagrams: Experiences with the CSLU toolkit," in Proc. 5th International Conference on Spoken Language Processing (ICSLP), 1998.
[27] E. Levin, R. Pieraccini, and W. Eckert, "Using Markov decision process for learning dialogue strategies," in Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, 1998, vol. 1, pp. 201-204: IEEE.
[28] S. Young, M. Gašić, B. Thomson, and J. D. Williams, "POMDP-based statistical spoken dialog systems: A review," Proceedings of the IEEE, vol. 101, no. 5, pp. 1160-1179, 2013.
[29] V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529-533, 2015.
[30] S. Gu, T. Lillicrap, I. Sutskever, and S. Levine, "Continuous deep q-learning with model-based acceleration," arXiv preprint arXiv:1603.00748, 2016.
[31] Z. Wang, T. Schaul, M. Hessel, H. van Hasselt, M. Lanctot, and N. de Freitas, "Dueling network architectures for deep reinforcement learning," arXiv preprint arXiv:1511.06581, 2015.
[32] H. Van Hasselt, A. Guez, and D. Silver, "Deep Reinforcement Learning with Double Q-Learning," in AAAI, 2016, pp. 2094-2100.
[33] I. Szita and A. Lörincz, "Learning Tetris using the noisy cross-entropy method," Neural computation, vol. 18, no. 12, pp. 2936-2941, 2006.
[34] T. P. Lillicrap et al., "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, 2015.
[35] V. Mnih et al., "Asynchronous methods for deep reinforcement learning," in International Conference on Machine Learning, 2016, pp. 1928-1937.
[36] H. Cuayáhuitl, "Simpleds: A simple deep reinforcement learning dialogue system," in Dialogues with Social Robots: Springer, 2017, pp. 109-118.
[37] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003.
[38] Wikipedia contributors. Latent Dirichlet allocation. Available: https://en.wikipedia.org/w/index.php?title=Latent_Dirichlet_allocation&oldid=786924256
[39] yangliuy. (2012). Probabilistic language models and their variants: LDA and Gibbs sampling [in Chinese]. Available: http://www.52nlp.cn/%E6%A6%82%E7%8E%87%E8%AF%AD%E8%A8%80%E6%A8%A1%E5%9E%8B%E5%8F%8A%E5%85%B6%E5%8F%98%E5%BD%A2%E7%B3%BB%E5%88%97-lda%E5%8F%8Agibbs-sampling
[40] X. Qiu and X. Huang, "Convolutional Neural Tensor Network Architecture for Community-Based Question Answering," in IJCAI, 2015, pp. 1305-1311.
[41] R. Socher, D. Chen, C. D. Manning, and A. Ng, "Reasoning with neural tensor networks for knowledge base completion," in Advances in neural information processing systems, 2013, pp. 926-934.
Full-Text Availability
  • On-campus browsing/printing of the electronic full text authorized, available from 2019-08-31.
  • Off-campus browsing/printing of the electronic full text authorized, available from 2019-08-31.