Empathetic Response Generation Using Dialogue Situation and Empathy Analysis in Conditional Transformer
Institute of Computer Science and Information Engineering
Empathetic dialogue system
In recent years, human-machine dialogue systems have become a popular research topic. Making their responses more human-like requires a communication skill: empathy. An empathetic dialogue system can give lonely people in modern society a way to share what is on their minds. Such a system should generate responses that are relevant to the dialogue and keep the conversation going without becoming boring. The goal of this thesis is therefore to build an open-domain, multi-turn empathetic dialogue system.
The main contribution of this study is a situation vector for the dialogue. During training, SBERT, which performs well at sentence embedding, is used to derive the situation vector from the user's previous sentences. This feature serves as one of the conditions in the conditional Transformer, the generation model of the proposed system. This study also proposes an empathy analysis for fine-tuning the generation model. The analysis is based on the definition of advanced-level empathy and considers two factors: the change in the user's emotion valence and the change in the amount of information in the user's sentences. If the system generates an empathetic response, both emotion valence and sentence information should change positively. Empathy analysis is therefore used to fine-tune the trained conditional Transformer.
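The two factors of the empathy analysis can be illustrated with a minimal sketch. It assumes that a valence score and an information score are already available for the user's utterance before and after the system response; the names `TurnScores` and `empathy_change` and the score ranges are hypothetical and are not taken from the thesis, which does not specify in this abstract how the two changes are combined during fine-tuning.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TurnScores:
    """Scores for one user utterance (hypothetical container)."""
    valence: float      # emotion valence, assumed here to lie in [0, 1]
    information: float  # amount of sentence information, assumed in [0, 1]


def empathy_change(before: TurnScores, after: TurnScores) -> Tuple[float, float, bool]:
    """Return (valence change, information change, both changes positive).

    'before' scores the user utterance preceding the system response,
    'after' scores the user utterance following it.
    """
    d_valence = after.valence - before.valence
    d_information = after.information - before.information
    both_positive = d_valence > 0 and d_information > 0
    return d_valence, d_information, both_positive
```

Under this reading, a response counts as empathetic only when the user's next utterance is both more positive and more informative than the previous one.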
This study uses the EmpatheticDialogues dataset as the training corpus. To generate empathetic responses, the system uses the user's emotion, the dialogue topic, and the dialogue situation as conditions in the conditional Transformer. In the experiments, adding these conditions raises the BLEU score to 7.747, an improvement of 0.65 over the baseline. Fine-tuning with empathy analysis raises the BLEU score further to 7.821. Both results improve on the baseline. In the human subjective evaluation, the system also scores higher than the baseline on all three criteria: empathy, relevance, and fluency. The proposed situation vector and empathy analysis are therefore effective in helping the dialogue system generate empathetic responses.
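The reported BLEU figures imply a baseline score and a gain from fine-tuning that the abstract does not state directly. The short calculation below derives them from the three reported numbers; the derived values are approximate, since the reported improvement of 0.65 may itself be rounded, and the variable names are illustrative only.

```python
# BLEU figures reported in the abstract.
bleu_with_conditions = 7.747  # conditional Transformer with all three conditions
gain_over_baseline = 0.65     # reported improvement over the baseline
bleu_fine_tuned = 7.821       # after fine-tuning with empathy analysis

# Values implied by the figures above (approximate: the gain may be rounded).
implied_baseline = round(bleu_with_conditions - gain_over_baseline, 3)
fine_tuning_gain = round(bleu_fine_tuned - bleu_with_conditions, 3)
total_gain = round(bleu_fine_tuned - implied_baseline, 3)

print(implied_baseline, fine_tuning_gain, total_gain)
```

Read this way, most of the improvement comes from the added conditions, and fine-tuning with empathy analysis contributes a smaller additional gain.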
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Background
1.2 Motivation
1.3 Literature Review
1.3.1 Empathy Dialogue System
1.3.2 Emotion Detection and Dialogue Topic Detection in Text
1.3.3 Sentence Embedding Model
1.3.4 Natural Language Generation
1.4 Problems
1.5 Proposed Methods
Chapter 2 System Framework
2.1 Natural Language Understanding
2.1.1 Emotion Detection and Dialogue Topic Detection
2.1.2 BERT
2.2 Dialogue State Tracking
2.2.1 Dialogue Situation Detection Model
2.3 Response Generation Model
2.3.1 Transformer
2.3.2 Information
2.3.3 Emotion Valence
2.3.4 Conditional Transformer with Empathy Analysis
Chapter 3 Experimental Results and Discussion
3.1 Evaluation Metrics
3.1.1 BLEU score
3.1.2 Human Subjective Evaluation
3.2 Dataset
3.2.1 EmpatheticDialogues Corpus
3.2.2 DailyDialog Corpus
3.3 Experimental Results and Discussion
3.3.1 Emotion Detection Model
3.3.2 Dialogue Topic Detection Model
3.3.3 Dialogue Situation Detection Model
3.3.4 Information Detection Model
3.3.5 Emotion Valence and Information Regression Model
3.3.6 Response Generation Using Conditional Transformer
Chapter 4 Conclusion and Future Work