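System ID: U0026-2406201522460600
Title (Chinese): 多情感辭典特徵層級情感分析之評論自動摘要
Title (English): Using Multi-Lexicons in Feature-Level Sentiment Analysis for Reviews Summarization
University: National Cheng Kung University
Department (Chinese): 資訊管理研究所
Department (English): Institute of Information Management
Academic year: 103 (ROC calendar)
Semester: 2
Publication year: 104 (ROC calendar; 2015)
Graduate student (Chinese): 孫義峰
Graduate student (English): Yi-Feng Sun
Student ID: R76024132
Degree: Master's
Language: Chinese
Pages: 54
Oral defense committee: Advisor: 王惠嘉
Keywords (Chinese): 文字探勘  產品特徵擷取  情感分析
Keywords (English): text mining  product feature extraction  sentiment analysis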
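Chinese abstract (translated): Travel has become a part of everyday life in modern society, and travelers inevitably need somewhere to stay, so the accommodation service industry has flourished accordingly. With information readily available on the Internet, consumers planning a trip no longer need to visit a travel agency in person for travel information; they can browse information on attractions and hotels from a personal computer at home.
Before deciding which hotel to stay at, many consumers need hotel-related information to support their decision, and the most important source of that information is the reviews written by previous guests. However, the prevalence of Web 2.0 has caused the number of online reviews to grow rapidly, and consumers can no longer easily read all of them manually. A system that analyzes reviews automatically can therefore help users understand review content more efficiently.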
English abstract: Tourism has become a part of everyday life. Before reserving a hotel, customers need information about it to help them decide, and the most important source of that information is online reviews. Because the number of online reviews has grown dramatically, customers cannot realistically read them all manually. An automatic review analysis system that summarizes reviews is therefore needed. The main purpose of such a system is to determine whether the opinions expressed in reviews are positive or negative; in other words, to analyze whether customers who stayed at a hotel liked it. Sentiment analysis methods serve this purpose. In sentiment analysis, the target of an opinion (which we call a “feature”) must be identified, because the polarity of an opinion may be ambiguous without it. We therefore propose an unsupervised method that uses part-of-speech patterns and multi-lexicon sentiment analysis to summarize reviews, helping customers learn about hotels and make decisions efficiently. Experimental results show that our method outperforms state-of-the-art methods, with an F-measure of 0.628.
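Illustrative sketch (not taken from the thesis): the abstract describes pairing opinion targets ("features") with opinion words through part-of-speech patterns and then scoring polarity with several lexicons. The Python below is one possible minimal reading of that idea. The single NOUN-then-ADJ pattern, the two-token window, the majority vote across lexicons, the toy English lexicons, and all function names are assumptions made for illustration; the thesis itself works on Chinese hotel reviews, and this record does not specify its patterns or lexicons.

```python
from collections import defaultdict

# Hypothetical toy lexicons (word -> +1 positive, -1 negative).
# The lexicons actually used in the thesis are not named in this record.
LEXICON_A = {"clean": 1, "friendly": 1, "noisy": -1}
LEXICON_B = {"clean": 1, "small": -1, "noisy": -1}


def extract_pairs(tagged):
    """Pair each noun with the first adjective among the next two tokens.

    `tagged` is a list of (word, pos_tag) tuples. The NOUN ... ADJ pattern
    is an assumed stand-in for the thesis's part-of-speech patterns.
    """
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag != "NOUN":
            continue
        for next_word, next_tag in tagged[i + 1 : i + 3]:
            if next_tag == "ADJ":
                pairs.append((word, next_word))
                break
    return pairs


def polarity(opinion, lexicons):
    """Combine lexicons by summing their votes; return +1, -1, or 0."""
    votes = [lex[opinion] for lex in lexicons if opinion in lex]
    total = sum(votes)
    return (total > 0) - (total < 0)


def summarize(reviews, lexicons):
    """Count positive and negative opinions per extracted feature."""
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for tagged in reviews:
        for feature, opinion in extract_pairs(tagged):
            score = polarity(opinion, lexicons)
            if score > 0:
                summary[feature]["positive"] += 1
            elif score < 0:
                summary[feature]["negative"] += 1
    return dict(summary)


# Toy, pre-tagged reviews; a real pipeline would segment and tag raw text first.
REVIEWS = [
    [("room", "NOUN"), ("is", "VERB"), ("clean", "ADJ")],
    [("room", "NOUN"), ("was", "VERB"), ("small", "ADJ")],
    [("staff", "NOUN"), ("very", "ADV"), ("friendly", "ADJ")],
]

result = summarize(REVIEWS, [LEXICON_A, LEXICON_B])
print(result)
```

Consulting more than one lexicon lets a word missing from one list ("small" above) still receive a polarity from another, which is one plausible motivation for a multi-lexicon design.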
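Table of contents: Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Scope and Limitations
1.4 Research Process
1.5 Thesis Outline
Chapter 2 Literature Review
2.1 Natural Language Processing
2.1.1 Part-of-Speech Tagging
2.1.2 Vector Space Model
2.1.3 Chinese Word Segmentation
Machine Learning Methods Based on Statistical Information
Dictionary-Based Methods
2.2 Machine Learning
2.3 Sentiment Analysis
2.3.1 E-HowNet Ontology
2.4 Product Feature Extraction
2.5 Summary
Chapter 3 Research Method
3.1 Research Framework
3.2 Data Preprocessing Module
3.3 Feature Extraction Module
3.4 Sentiment Analysis Module
3.5 Summary
Chapter 4 System Implementation and Validation
4.1 System Environment Setup
4.2 Experimental Method
4.2.1 Data Sources
4.2.2 Evaluation Metrics
4.3 Parameter Settings
4.4 Experimental Results
4.4.1 Experiment 1
4.4.2 Experiment 2
4.4.3 Experiment 3
4.4.4 Experiment 4
4.4.5 Experiment 5
Chapter 5 Conclusions and Future Directions
5.1 Research Results
5.2 Future Research Directions
References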
  • Authorized for on-campus browsing and printing of the electronic full text; publicly available from 2020-06-29.
