Attention-based Response Template Generation with Deep Reinforcement Learning for Conversational Systems
Institute of Computer Science and Information Engineering
deep reinforcement learning
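This thesis collected a travel dialogue corpus and a chitchat corpus, and annotated the travel dialogue corpus with user intents and system actions. To capture the semantic relations in a user sentence, we use the CKIP parser to obtain its semantic tree, convert the tree into a semantic graph with rules, replace each word with its word cluster, and apply a preorder traversal to obtain a semantic relation sequence. The user-turn and context representations produced by a GRU autoencoder are concatenated with the user intent and the dialogue state and fed into the dialogue policy network (Parallel Double Q Network), which decides the dialogue act. Finally, the user sentence and the chosen act are fed into an attention-based sentence generation network (Transformer) to generate a response template, which is then filled in by rules to produce the final response.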
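This thesis collected 418 travel-related dialogues and 447 chitchat question-answer pairs, and used five-fold cross-validation for experimental evaluation. The results show that using semantic relation information instead of word information improves intent detection accuracy by 4.3% and improves the task success rate of the dialogue policy by 1.4%. In addition, the Parallel Double Q Network reaches a task success rate of 87.6%, which is 13.9% higher than the Double Q Network. Finally, generating templates with the Transformer model yields a BLEU score of 13.6, which is 7.8 higher than the Sequence-to-Sequence model. In the subjective evaluation, the dialogue policy and sentence generation also scored higher than the baseline on appropriateness and grammatical correctness: the two modules received a +1 appropriateness score in 55.2% and 62.0% of cases, respectively, and sentence generation received a +1 grammatical correctness score in 86.0% of cases.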
Conversational systems have recently become a popular research topic. Many researchers have contributed to the field and built applications, but several modules of these systems still need better performance. If a conversational system gives appropriate responses, people are more willing to talk to it. This thesis develops a conversational system for the travel domain: it takes into account the descriptions of and questions about tours that arise in conversation, and gives the user appropriate responses. The system can also serve as a chitchat bot, which enriches the conversation and encourages users to interact with it more.
We collected a task-oriented corpus in the travel domain, tagged with user intents and system dialogue acts, together with a chitchat corpus. To capture the semantic dependencies in a user's input sentence, the CKIP parser is used to derive the sentence's semantic tree. The semantic tree is then converted into a semantic graph, in which words are replaced by their word clusters, and a preorder traversal of the graph yields a semantic dependency sequence. A GRU autoencoder derives the user's turn representation and the context representation; these, together with the user's intent and the dialogue state, form the input to a Parallel Double Q Network that decides the dialogue act. Finally, the user's input and the dialogue act are fed into an attention-based Transformer model to generate a response template. Filling the slots of the generated template with their corresponding values gives the response returned to the user.
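The step from a parsed sentence to a semantic dependency sequence can be illustrated with a minimal sketch. It assumes the semantic structure is a simple tree whose nodes carry a relation label and a word. The `Node` class, the relation labels, the cluster identifiers, and the example sentence are all hypothetical; they are not the CKIP parser's output format or the clusters used in this thesis.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of a hypothetical semantic tree: a relation label, a word, and children."""
    relation: str
    word: str
    children: List["Node"] = field(default_factory=list)


def semantic_sequence(node: Node, clusters: Dict[str, str]) -> List[str]:
    """Preorder traversal: emit the node itself first, then each child from left to right."""
    # Replace the word with its cluster id; keep the word if it has no cluster (assumption).
    token = clusters.get(node.word, node.word)
    sequence = [f"{node.relation}:{token}"]
    for child in node.children:
        sequence.extend(semantic_sequence(child, clusters))
    return sequence


# Toy structure for a sentence such as "I want to go to Tainan" (illustrative only).
tree = Node("head", "want", [
    Node("agent", "I"),
    Node("goal", "go", [Node("location", "Tainan")]),
])
clusters = {"go": "C_MOVE", "Tainan": "C_CITY"}  # hypothetical word clusters

sequence = semantic_sequence(tree, clusters)
print(sequence)
# ['head:want', 'agent:I', 'goal:C_MOVE', 'location:C_CITY']
```

Replacing words with cluster identifiers before the traversal means that sentences with the same structure but different place names or verbs map to the same sequence, which is one plausible reason such a representation could help intent detection on a small corpus.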
The evaluation corpus consists of 418 dialogues in the travel domain and 447 chitchat question-answer pairs, and five-fold cross-validation was used for performance evaluation. Experimental results showed that using semantic dependencies performs better than using the words of the input sentence: intent detection accuracy increased by 4.3%, and the task success rate of the dialogue policy increased by 1.4%. The Parallel Double Q Network achieved a task success rate of 87.6%, which was 13.9% higher than the Double Q Network model. Using the attention-based Transformer model for response template generation gave a BLEU score of 13.6, an improvement of 7.8 over the Sequence-to-Sequence model. In the subjective evaluation, both the dialogue policy and the sentence generation model achieved higher appropriateness and grammatical correctness scores than the baseline system. The two modules received a +1 appropriateness score in 55.2% and 62.0% of cases, respectively, and response generation received a +1 grammatical correctness score in 86.0% of cases.
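The Double Q Network baseline mentioned above builds on Double Q-learning, in which the online network selects the next action and a separate target network evaluates it. The following is a minimal sketch of that target computation only. It does not reproduce the Parallel Double Q Network proposed in this thesis, and the function name, argument names, and toy values are assumptions made for illustration.

```python
from typing import Sequence


def double_q_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Double Q-learning target: r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # The online network chooses the action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network supplies the value of that action.
    return reward + gamma * next_q_target[best_action]


# Toy Q-values for three dialogue acts in the next state (illustrative numbers).
q_online = [0.2, 0.9, 0.5]
q_target = [0.4, 0.3, 0.8]

target = double_q_target(1.0, q_online, q_target, gamma=0.5)
# Action 1 is selected by the online network, so the target is 1.0 + 0.5 * 0.3.
# A plain max over q_target would instead use 0.8, which is the overestimation
# that Double Q-learning is designed to reduce.
print(target)
```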
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Background
1.2 Motivation
1.3 Literature Review
1.3.1 Sentence Feature Extraction
1.3.2 Dialogue Policy
1.3.3 Response Generation
1.4 Problems and Proposed Methods
1.5 Research Framework
Chapter 2 Corpus Design and Collection
2.1 Corpus Collection
2.1.1 Travel Domain Multi-turn Dialogue Collection
2.1.2 Single-turn Chitchat Collection
2.2 Corpus Arrangement and Analysis
2.2.1 Multi-turn Travel Domain Dialogue Corpus
2.2.2 Single-turn Chitchat Corpus
Chapter 3 Proposed Methods
3.1 Spoken Language Understanding
3.1.1 Semantic Embedding
3.1.2 User Intent Detection
3.1.3 Slot Extraction
3.2 Turn Embedding
3.2.1 GRU
3.2.2 GRU Autoencoder
3.3 Context Tracking
3.4 Dialogue Policy
3.4.1 Double Q Network
3.4.2 Parallel Double Q Network
3.4.3 Reward Functions and Training
3.5 Response Generation
3.5.1 Preprocessing
3.5.2 Template Generation
3.5.3 Slot Filling
3.5.4 Chitchat Response Generation
Chapter 4 Experimental Results and Discussion
4.1 Turn Embedding and Context Tracking
4.2 Overall System Performance Evaluation
4.2.1 Sentence Features for Intent Detection
4.2.2 Evaluation on Dialogue Policy Decision Methods
4.2.3 Sentence Generation Method Evaluation
4.3 Discussion
Chapter 5 Conclusion and Future Work