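System ID U0026-0309201510440600
Thesis title (Chinese) 以文字探勘方法建構專利地圖並探測潛力技術機會之研究
Thesis title (English) Constructing patent map and detecting potential technological opportunities using text mining techniques
University National Cheng Kung University (成功大學)
Department (Chinese) 資訊管理研究所
Department (English) Institute of Information Management
Academic year 103 (ROC calendar)
Semester 2
Publication year 104 (ROC calendar; 2015)
Graduate student (Chinese) 夏平倫
Graduate student (English) Ping-Lun Hsia
Student ID R76021118
Degree Master's
Language Chinese
Pages 38
Oral examination committee Advisor - 王惠嘉
Keywords (Chinese) 文字探勘  專利地圖  資訊檢索
Keywords (English) Text mining  Patent map  Information retrieval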
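Abstract (Chinese, translated) When patent rights are infringed, the patentee can demand compensation from the infringer for the losses suffered, so a company drawn into a patent infringement case often pays a heavy price in time and money. With the arrival of the knowledge economy, competition among companies is in fact competition over intellectual property rights; securing patent rights early in technological fields that have potential or may become popular helps a company gain a future competitive advantage. The importance of patents in business therefore must not be underestimated. However, with the rapid development of technology and the passage of time, the number of patent documents has become enormous. How to manage this enormous number of patent documents effectively is now a pressing issue.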
Abstract (English) When a patent is infringed, the patentee can claim compensation for the resulting losses, so a company embroiled in a patent infringement dispute can incur significant costs in time and money. With the arrival of the knowledge-based economy, competition among companies is largely competition over intellectual property rights. Securing patents early in promising technological fields helps a company gain a competitive advantage in the future. Patents therefore play an important role in the marketplace. However, as science and technology advance and time passes, the number of patent documents keeps growing. How to manage this large volume of patent documents effectively is now an important issue.
A patent map is a visualization of the results of statistical analysis applied to patent documents. This study proposes a method for constructing a patent map and recommending technological vacancies. When companies formulate research and development strategies, the patent map lets them see how similar patents are to one another and helps them avoid developing similar techniques. The recommendation function helps companies assess whether to occupy a technological vacancy early in order to gain a competitive advantage in the future.
One feature of this study is its method for reducing the dimensionality of the terms. In text mining, the sparse matrix generated by a large number of terms usually consumes substantial computational resources. Reducing the number of term dimensions saves storage space and improves execution efficiency.
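The abstract does not spell out how the term dimensions are reduced. The following is only a minimal sketch of one plausible reading, assuming that similar terms are first grouped into clusters (section 3.3 of the table of contents concerns term similarity and term clustering) and that each document's term counts are then summed per cluster. The function name `merge_terms_by_cluster`, the cluster labels, and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def merge_terms_by_cluster(doc_term_counts, term_to_cluster):
    """Collapse per-term counts into per-cluster counts for each document.

    doc_term_counts: list of dicts mapping term -> count (one dict per document).
    term_to_cluster: dict mapping term -> cluster label (assumed to be given).
    """
    reduced = []
    for counts in doc_term_counts:
        merged = defaultdict(int)
        for term, count in counts.items():
            # A term with no cluster assignment keeps its own dimension.
            key = term_to_cluster.get(term, term)
            merged[key] += count
        reduced.append(dict(merged))
    return reduced


# Hypothetical toy data: two documents and a hand-made term clustering.
docs = [
    {"car": 2, "automobile": 1, "engine": 1},
    {"vehicle": 3, "motor": 2},
]
clusters = {
    "car": "C_vehicle",
    "automobile": "C_vehicle",
    "vehicle": "C_vehicle",
    "engine": "C_engine",
    "motor": "C_engine",
}

reduced = merge_terms_by_cluster(docs, clusters)
vocab_before = {term for doc in docs for term in doc}
vocab_after = {term for doc in reduced for term in doc}
```

In this toy case five term dimensions collapse into two cluster dimensions, which illustrates the storage and efficiency argument made above. How the clusters themselves are obtained is outside the scope of this sketch.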
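Table of contents Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Objectives 3
1.3 Research Scope 3
1.4 Research Process 4
1.5 Thesis Outline 5
Chapter 2 Literature Review 6
2.1 Natural Language Processing 6
2.1.1 Semantic Network – WordNet 6
2.1.2 Part-of-Speech Tagging 8
2.1.3 Stemming 8
2.1.4 Stop Words 9
2.2 Information Retrieval 9
2.3 Multidimensional Scaling 11
2.4 Clustering Algorithms 13
2.4.1 Partitional Clustering 13
2.4.2 Hierarchical Clustering 14
2.4.3 Cluster Validity Evaluation 15
Chapter 3 Research Method 17
3.1 Research Framework 17
3.2 Document Collection and Preprocessing Module 19
3.3 Term Similarity Calculation and Term Clustering Module 21
3.4 Document Similarity Calculation Module 23
3.5 Multidimensional Scaling Dimension Reduction Module 24
3.6 Document Outlier Degree Calculation Module 25
Chapter 4 System Implementation and Validation 27
4.1 System Implementation 27
4.2 Experimental Design and Analysis of Results 27
4.2.1 Dataset 27
4.2.2 Experiment 1: Effect of the Semantic Network on Patent Retrieval Performance 28
4.2.3 Experiment 2: Patent Documents with a Higher Outlier Degree 29
Chapter 5 Conclusions and Future Research Directions 33
5.1 Research Results 33
5.2 Future Research Directions 34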
References Banerjee, S., & Pedersen, T. (2003). Extended gloss overlaps as a measure of semantic relatedness. Paper presented at the IJCAI.
Bergmann, I., Möhrle, M. G., Walter, L., Butzke, D., Erdmann, V. A., & Fürste, J. P. (2007). The use of semantic maps for recognition of patent infringements: A case study in biotechnology. Zeitschrift für Betriebswirtschaft–Special(4), 69-86.
Breunig, M. M., Kriegel, H.-P., Ng, R. T., & Sander, J. (2000). LOF: identifying density-based local outliers. Paper presented at the ACM SIGMOD Record.
Chen, Y.-L., & Chang, Y.-C. (2012). A three-phase method for patent classification. Information Processing & Management, 48(6), 1017-1030. doi: http://dx.doi.org/10.1016/j.ipm.2011.11.001
Chen, Y.-L., & Chiu, Y.-T. (2011). An IPC-based vector space model for patent retrieval. Information Processing & Management, 47(3), 309-322. doi: http://dx.doi.org/10.1016/j.ipm.2010.06.001
Cordon, O., Herrera-Viedma, E., Lopez-Pujalte, C., Luque, M., & Zarco, C. (2003). A review on the application of evolutionary computation to information retrieval. International Journal of Approximate Reasoning, 34(2-3), 241-264. doi: 10.1016/j.ijar.2003.07.010
Davies, D. L., & Bouldin, D. W. (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1(2), 224-227.
Dorr, B. J. (2001). Review of Natural Language Processing in R.A. Wilson and F.C. Keil (Eds.), The MIT Encyclopedia of the Cognitive Sciences. Artificial Intelligence, 130(2), 185-189. doi: http://dx.doi.org/10.1016/S0004-3702(01)00096-0
Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis (Vol. 3): Wiley New York.
Ernst, H. (1998). Patent portfolios for strategic R&D planning. Journal of Engineering and Technology Management, 15(4), 279-308. doi: http://dx.doi.org/10.1016/S0923-4748(98)00018-6
Ernst, H. (2003). Patent information for strategic technology management. World Patent Information, 25(3), 233-242. doi: http://dx.doi.org/10.1016/S0172-2190(03)00077-2
Hall, B. H., & Ziedonis, R. H. (2001). The patent paradox revisited: an empirical study of patenting in the US semiconductor industry, 1979-1995. Rand Journal of Economics, 32(1), 101-128. doi: 10.2307/2696400
Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data: Prentice-Hall, Inc.
Kaufman, L., & Rousseeuw, P. (1987). Clustering by means of medoids: North-Holland.
Krovetz, R. T. (2000). Viewing morphology as an inference process. Artificial Intelligence, 118(1-2), 277-294. doi: 10.1016/s0004-3702(99)00101-0
Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1), 1-27. doi: 10.1007/BF02289565
Lee, S., Yoon, B., & Park, Y. (2009). An approach to discovering new technology opportunities: Keyword-based patent map approach. Technovation, 29(6–7), 481-497. doi: http://dx.doi.org/10.1016/j.technovation.2008.10.006
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Paper presented at the Proceedings of the fifth Berkeley symposium on mathematical statistics and probability.
Miller, G. A. (1995). WordNet: a lexical database for English. Commun. ACM, 38(11), 39-41. doi: 10.1145/219717.219748
Mukherjea, S., Bamba, B., & Kankar, P. (2005). Information retrieval and knowledge discovery utilizing a biomedical patent semantic web. IEEE Transactions on Knowledge and Data Engineering, 17(8), 1099-1110.
Paice, C. D. (1990). Another stemmer. SIGIR Forum, 24(3), 56-61. doi: 10.1145/101306.101310
Paquet, E. (2004). Exploring anthropometric data through cluster analysis.
Park, H., Yoon, J., & Kim, K. (2012). Identifying patent infringement using SAO based semantic technological similarities. Scientometrics, 90(2), 515-529. doi: 10.1007/s11192-011-0522-7
Park, H., Yoon, J., & Kim, K. (2013). Identification and evaluation of corporations for merger and acquisition strategies using patent information and text mining. Scientometrics, 97(3), 883-909. doi: 10.1007/s11192-013-1010-z
Porter, M. F. (1980). An algorithm for suffix stripping. Program-Automated Library and Information Systems, 14(3), 130-137. doi: 10.1108/eb046814
Rada, R., Mili, H., Bicknell, E., & Blettner, M. (1989). Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man, and Cybernetics, 19(1), 17-30. doi: 10.1109/21.24528
Rosso, P., Correa, S., & Buscaldi, D. (2011). Passage retrieval in legal texts. The Journal of Logic and Algebraic Programming, 80(3–5), 139-153. doi: http://dx.doi.org/10.1016/j.jlap.2011.02.001
Rousseeuw, P. J. (1987). Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of computational and applied mathematics, 20, 53-65.
Salton, G., Wong, A., & Yang, C. S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11), 613-620. doi: 10.1145/361219.361220
Schuh, G., & Grawatsch, M. (2004). TRIZ-based technology intelligence. Paper presented at the 16th International Conference on Management of Technology. Anais eletrônicos… Washington: IAMOT.
Trappey, A. J., & Trappey, C. V. (2008). An R&D knowledge management method for patent document summarization. Industrial Management & Data Systems, 108(2), 245-257.
Trappey, C. V., Wu, H.-Y., Taghaboni-Dutta, F., & Trappey, A. J. C. (2011). Using patent data for technology forecasting: China RFID patent analysis. Advanced Engineering Informatics, 25(1), 53-64. doi: http://dx.doi.org/10.1016/j.aei.2010.05.007
Tseng, Y.-H., Lin, C.-J., & Lin, Y.-I. (2007). Text mining techniques for patent analysis. Information Processing & Management, 43(5), 1216-1247. doi: http://dx.doi.org/10.1016/j.ipm.2006.11.011
Wan, X. (2007). A novel document similarity measure based on earth mover’s distance. Information Sciences, 177(18), 3718-3730. doi: http://dx.doi.org/10.1016/j.ins.2007.02.045
Wang, M.-Y., Fang, S.-C., & Chang, Y.-H. (2015). Exploring technological opportunities by mining the gaps between science and technology: Microalgal biofuels. Technological Forecasting and Social Change, 92, 182-195. doi: http://dx.doi.org/10.1016/j.techfore.2014.07.008
Wu, Z., & Palmer, M. (1994). Verbs semantics and lexical selection. Paper presented at the Proceedings of the 32nd annual meeting on Association for Computational Linguistics, Las Cruces, New Mexico.
Yoon, J., & Kim, K. (2012). Detecting signals of new technological opportunities using semantic patent analysis and outlier detection. Scientometrics, 90(2), 445-461. doi: 10.1007/s11192-011-0543-2
  • Authorized for on-campus browsing and printing of the electronic full text; publicly available from 2020-09-09.
