

   The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.
(Note: if the thesis cannot be found, or its holding status shows "closed stacks, not public," it is not in the stacks and cannot be accessed.)
System ID U0026-3007202011392400
Title (Chinese) 情境感知知識編碼之類神經對話模型
Title (English) Context-Aware Knowledge Encoding for Neural Conversation Model
University National Cheng Kung University (成功大學)
Department (Chinese) 資訊工程學系
Department (English) Institute of Computer Science and Information Engineering
Academic Year 108 (ROC calendar; 2019–2020)
Semester 2
Publication Year 109 (2020)
Author (Chinese) 潘昌義
Author (English) Chang-Yi Pan
Student ID P76074575
Degree Master's
Language English
Pages 48
Committee Advisor: 高宏宇
Committee Member: 謝孫源
Committee Member: 李政德
Committee Member: 王惠嘉
Committee Member: 蔡宗翰
Keywords (Chinese) 對話模型, 自然語言生成, 注意力機制
Keywords (English) conversation model, natural language generation, attention mechanism
Subject Classification
Abstract (Chinese) Intelligent assistant systems capable of multi-turn conversation with human users have become increasingly popular. Backed by natural language processing, this technology is widely realized through conversation models in fields such as chatbots and customer-service systems. Existing research divides conversation models into two main categories: retrieval-based and generation-based. A retrieval-based model selects, by text matching from an existing dialogue corpus, the response of the question–response pair that best matches the current input; a generation-based model instead produces an entirely new response via natural language generation in a sequence-to-sequence manner.
Although generation-based models can provide responses more tailored to the input sentence, those responses still tend to be uninformative and off-topic. In this thesis, we propose the Context-Aware Knowledge Encoder (CAKE) for generation-based neural conversation models with retrieved information to address these problems. CAKE consists of a two-level attention encoder: a context-aware encoder that judges the importance of the dialogue history with a convolutional neural network to build the full context, and a knowledge encoder that uses the full context to extract important but uncommon keywords from external information. Experimental results show that both the sentence-level context-aware encoder and the word-level knowledge encoder improve the performance of the conversation model. We hope the findings of this study offer new insight into generation-based conversation models and reveal their potential from different perspectives.
Abstract (English) Intelligent assistant systems that can hold multi-turn conversations with humans have become popular. With the power of natural language processing, conversation models have been widely used to realize this technology in various fields, such as chatbots and customer service. Recent research studies the conversation model via either retrieval-based or generation-based methods. A retrieval-based model chooses the best-matching response for the input sentence from an existing repository via word matching, while a generation-based model produces a new response through natural language generation.
Although a generation-based conversation model can provide responses tailored to the input sentence, the responses still lack information and context. To address these issues, we propose the Context-Aware Knowledge Encoder (CAKE) for generation-based conversation models with retrieved information. CAKE is composed of a two-level attention-oriented encoder, which includes a context-aware encoder for the full context and a knowledge encoder for keywords. Experimental results demonstrate that both the sentence-level context encoder and the word-level knowledge encoder improve performance. We hope the findings in this study provide new insights into the potential of generation-based conversation models.
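The two-level design described in the abstract — sentence-level attention over the dialogue history to form a full-context vector, then word-level attention over retrieved external words to form a keyword distribution — can be sketched as follows. This is a minimal NumPy illustration, not the thesis implementation: the projection matrix standing in for the Context-CNN, the embedding dimensions, and the random inputs are all assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def context_aware_encode(history, query, proj):
    """Sentence-level history attention: score each past utterance against the
    current query and build the full-context vector as a weighted sum.
    `proj` is a stand-in for the Context-CNN feature extractor (an assumption;
    the thesis uses convolution filters over the utterances)."""
    feats = np.tanh(history @ proj)   # (n, d) per-utterance features
    scores = feats @ query            # (n,)  importance of each utterance
    weights = softmax(scores)         # history attention distribution
    context = weights @ history       # (d,)  full-context representation
    return context, weights

def keyword_distribution(context, fact_word_embs):
    """Word-level knowledge attention: use the full context to weight words
    retrieved from external facts, yielding a keyword distribution."""
    return softmax(fact_word_embs @ context)

rng = np.random.default_rng(0)
d = 8
history = rng.normal(size=(5, d))   # embeddings of 5 past utterances
query = rng.normal(size=d)          # embedding of the current input sentence
proj = rng.normal(size=(d, d))
context, w = context_aware_encode(history, query, proj)
kw = keyword_distribution(context, rng.normal(size=(12, d)))  # 12 retrieved words
```

Both attention distributions are proper probability vectors, so the decoder can consume the context vector and bias generation toward high-weight keywords.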
Table of Contents
Chinese Abstract I
Abstract II
Acknowledgements III
Table Listing VII
Figure Listing VIII
1. Introduction 1
1.1. Background 1
1.2. Motivation 4
1.3. Our Approach 8
1.4. Paper Structure 11
2. Related Work 12
2.1. Sequence-to-sequence model 12
2.1.1. Encoder-decoder structure 12
2.1.2. Attention mechanism 14
2.2. Neural conversation model 16
2.2.1. Generation-based method 16
2.2.2. Generation module in the hybrid architecture 17
3. Proposed Method 20
3.1. Preliminary 20
3.2. Model Overview 20
3.3. Preprocessing 21
3.4. Context-Aware Encoder 22
3.4.1. Context Convolution Neural Network (Context-CNN) 22
3.4.2. History attention with context representation 23
3.5. Knowledge Encoder 24
3.5.1. Word-level attention for keyword distribution 25
3.5.2. Keyword dictionary building 26
3.6. Fact Encoder 27
3.7. Response Decoder 27
4. Experiments 29
4.1. Dataset 29
4.2. Evaluation Metrics 30
4.3. Experiment setup 31
4.3.1. Preprocessing dataset for context-aware encoder 31
4.3.2. Competing methods 31
4.3.3. Parameter settings 32
4.4. Performance comparison 33
5. Analysis 35
5.1. Contribution of conversation history 35
5.2. Context selection of Context-aware encoder 36
5.3. Contribution of keyword distribution 38
5.4. Focus strategy for knowledge encoder 39
5.5. Cross-method study on context-aware encoder and knowledge encoder 40
5.6. Ablation study 41
5.7. Case study 42
6. Conclusion 44
6.1. Future work 45
7. References 46
References
[1] Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In International Conference on Learning Representations (ICLR).
[2] Baheti, A., Ritter, A., Li, J., & Dolan, B. (2018). Generating more interesting responses in neural conversation models with distributional constraints. In Proceedings of the 2018 Conference on empirical methods in natural language processing (EMNLP) (pp. 3970-3980).
[3] Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent dirichlet allocation. Journal of machine Learning research, 3(Jan), 993-1022.
[4] Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1724-1734).
[5] Gao, C., & Ren, J. (2019). A topic-driven language model for learning to generate diverse sentences. Neurocomputing, 333, 374-380.
[6] Gao, J., Galley, M., & Li, L. (2018, June). Neural approaches to conversational AI. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (pp. 1371-1374).
[7] Ghazvininejad, M., Brockett, C., Chang, M. W., Dolan, B., Gao, J., Yih, W. T., & Galley, M. (2018, April). A knowledge-grounded neural conversation model. In Thirty-Second AAAI Conference on Artificial Intelligence.
[8] Ji, Z., Lu, Z., & Li, H. (2014). An information retrieval approach to short text conversation. CoRR.
[9] Kim, Y. (2014). Convolutional neural networks for sentence classification. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1746-1751).
[10] Kingma, D. P., & Welling, M. (2013). Auto-encoding variational bayes. In International Conference on Learning Representations (ICLR).
[11] Lin, C. Y. (2004, July). Rouge: A package for automatic evaluation of summaries. In Text summarization branches out (pp. 74-81).
[12] Lowe, R., Pow, N., Serban, I., & Pineau, J. (2015). The Ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. In Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL).
[13] Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002, July). BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics (pp. 311-318).
[14] Pennington, J., Socher, R., & Manning, C. D. (2014, October). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).
[15] Qu, C., Yang, L., Qiu, M., Croft, W. B., Zhang, Y., & Iyyer, M. (2019, July). BERT with history answer embedding for conversational question answering. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1133-1136).
[16] Qu, C., Yang, L., Chen, C., Qiu, M., Croft, W. B., & Iyyer, M. (2020). Open-Retrieval Conversational Question Answering. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval.
[17] Robertson, S. E., & Walker, S. (1994). Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In SIGIR’94 (pp. 232-241). Springer, London.
[18] Song, Y., Yan, R., Li, C. T., Nie, J. Y., Zhang, M., & Zhao, D. (2018). An Ensemble of Retrieval-Based and Generation-Based Human-Computer Conversation Systems. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence Main track. (pp. 4382-4388).
[19] Sordoni, A., Galley, M., Auli, M., Brockett, C., Ji, Y., Mitchell, M., … & Dolan, B. (2015). A neural network approach to context-sensitive generation of conversational responses. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
[20] Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104-3112).
[21] Tian, Z., Yan, R., Mou, L., Song, Y., Feng, Y., & Zhao, D. (2017, July). How to make context more useful? an empirical study on context-aware neural conversational models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 231-236).
[22] Vinyals, O., & Le, Q. (2015). A neural conversational model. ICML Deep Learning Workshop.
[23] Sukhbaatar, S., Weston, J., & Fergus, R. (2015). End-to-end memory networks. In Advances in neural information processing systems (pp. 2440-2448).
[24] Yang, L., Hu, J., Qiu, M., Qu, C., Gao, J., Croft, W. B., … & Liu, J. (2019, November). A hybrid retrieval-generation neural conversation model. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (pp. 1341-1350).
Full-Text Access Authorization
  • On-campus browsing/printing of the electronic full text is authorized, public from 2021-09-01.
  • Off-campus browsing/printing of the electronic full text is authorized, public from 2021-09-01.

