

   The electronic full text has not yet been released for public access; for the print copy, please check the library catalog.
(Note: if no record is found, or the holding status shows "closed stacks, not for public use", the thesis is not in the stacks and cannot be accessed.)
System ID  U0026-2910201908372700
Title (Chinese)  階層式自注意力增強中文問題分類編碼器
Title (English)  HAEE: Question Classification Using Hierarchical Intra-Attention Enhancement Encoder
University  National Cheng Kung University
Department (Chinese)  電腦與通信工程研究所
Department (English)  Institute of Computer and Communication Engineering
Academic year  108 (2019-2020, ROC calendar)
Semester  1
Year of publication  108 (2019)
Author (Chinese)  王仁暐
Author (English)  Jen-Wei Wang
Student ID  Q36064052
Degree  Master's
Language  Chinese
Pages  44
Committee  高宏宇 (Hung-Yu Kao)
           陳宜欣 (Yi-Shin Chen)
           盧文祥 (Wen-Hsiang Lu)
Advisor    黃仁暐 (Jen-Wei Huang)
Keywords (Chinese)  問題分類, 雙向閘門控制循環單元, 注意力機制
Keywords (English)  Question Classification, Bidirectional Gated Recurrent Unit, Attention Mechanism
Subject classification
Abstract (Chinese)  With the growth of e-commerce, automated question answering plays an important role in customer service systems by reducing the need for human labor. Question classification, one of the tasks in a question-answering system, assigns a label to each question according to its answer type. Earlier approaches typically used hand-crafted features such as named entity recognition, which require predefined dictionaries or external tools; in recent years, machine-learning methods have been applied to this task and achieved high accuracy. In this thesis we propose HAEE, a hierarchical intra-attention enhancement encoder composed of bidirectional gated recurrent units and intra-attention mechanisms. We adopt character-level input to resolve the out-of-vocabulary problem, and we build multiple intra-attention mechanisms to model the relationships among characters (in Chinese) or words (in English), strengthening the influence of each character on the sentence. We evaluate HAEE in a real-world corporate environment and on several datasets. Experimental results show that HAEE outperforms existing state-of-the-art models on classification tasks, especially on Chinese datasets.
Abstract (English)  Automated question-answering systems play an important role in e-commerce customer service. Question classification assigns a label to each question according to the type of answer it requires. Most previous approaches rely on hand-crafted features such as named entity recognition, which in turn require a predefined dictionary or external tools; more recent machine-learning methods have achieved high accuracy on this task. In this paper, we propose HAEE, a hierarchical intra-attention enhancement encoder featuring bidirectional gated recurrent units and character-level input to address the out-of-vocabulary problem. We also create multiple intra-attentions to model the relationships among characters (in Chinese) or words (in English), enhancing the influence of each token within a sentence. In experiments conducted in a real-world corporate setting with several datasets, the proposed HAEE outperformed existing state-of-the-art models on question classification tasks, particularly when applied to a Chinese corpus.
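
Neither abstract goes below the block-diagram level, so the following is a minimal PyTorch sketch of the described pipeline, assuming scaled dot-product self-attention for the intra-attention modules and a two-level (character, then semantic) stack. Every class name, dimension, the averaging of the attention outputs, and the mean-pooling classifier head are illustrative assumptions, not the thesis implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class IntraAttention(nn.Module):
    """Scaled dot-product self-attention over the tokens of one sentence."""

    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = dim ** 0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); scores relate every token to every other.
        scores = self.query(x) @ self.key(x).transpose(1, 2) / self.scale
        return F.softmax(scores, dim=-1) @ self.value(x)


class HAEESketch(nn.Module):
    """Hypothetical two-level encoder: character level, then semantic level."""

    def __init__(self, vocab_size: int, dim: int = 128,
                 num_attn: int = 4, num_classes: int = 6):
        super().__init__()
        # Character-level embedding sidesteps out-of-vocabulary words.
        self.embed = nn.Embedding(vocab_size, dim)
        # Multiple intra-attentions whose outputs are averaged (an assumption;
        # the thesis may combine them differently).
        self.char_attn = nn.ModuleList(
            IntraAttention(dim) for _ in range(num_attn))
        self.char_gru = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.sem_attn = IntraAttention(2 * dim)
        self.sem_gru = nn.GRU(2 * dim, dim, batch_first=True,
                              bidirectional=True)
        self.classify = nn.Linear(2 * dim, num_classes)

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(char_ids)                                 # (B, T, dim)
        x = torch.stack([attn(x) for attn in self.char_attn]).mean(dim=0)
        x, _ = self.char_gru(x)                                  # character level
        x, _ = self.sem_gru(self.sem_attn(x))                    # semantic level
        return self.classify(x.mean(dim=1))                      # pool -> labels


# Tiny smoke test: a batch of two 20-character questions, six answer types.
logits = HAEESketch(vocab_size=5000)(torch.randint(0, 5000, (2, 20)))
print(logits.shape)  # torch.Size([2, 6])

Under these assumptions, the character vocabulary stays small and no word segmenter is needed, which matches the abstract's motivation for character-level input on Chinese text.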
Table of contents
Chinese Abstract
Abstract
Acknowledgment
Table of Contents
List of Tables
List of Figures
1 Introduction
2 Related Work
3 Preliminaries
3.1 Sentence Preprocessing
3.1.1 Tokenization
3.1.2 Stemming and Lemmatization
3.2 Information Retrieval (IR)
3.3 Classification
4 Methodology
4.1 Preprocessing
4.2 Character Level
4.2.1 Character Level Intra-Attention
4.2.2 Character Level Encoder
4.3 Semantic Level
4.3.1 Semantic Level Intra-Attention
4.3.2 Semantic Level Encoder
4.4 Deep Semantic Level
4.5 Prediction
5 Experiments
5.1 Dataset Description
5.2 Model Setup
5.3 Comparison Methods
5.4 Evaluation Metric
5.5 Experiment Results and Analysis
5.5.1 Number of intra-attentions
5.5.2 Number of levels
5.5.3 Results for all datasets
5.5.4 Speed of convergence
5.5.5 Ablation analysis
5.5.6 The need for word segmentation
5.6 Conclusions
6 Future Work
7 Case Study
7.1 CSQC dataset analysis
7.2 Case study for the CSQC dataset
References
Full-text usage authorization
  • On-campus browsing/printing of the electronic full text authorized, publicly available from 2024-10-30.
  • Off-campus browsing/printing of the electronic full text authorized, publicly available from 2024-10-30.

