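The electronic full text has not yet been authorized for public release; for the print copy, please check the library catalog.
(Note: if the thesis cannot be found, or its holding status shows "閉架不公開" (closed stacks, not public), the copy is not in the stacks and cannot be retrieved.)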
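System ID: U0026-0309201510440600
Title (Chinese): 以文字探勘方法建構專利地圖並探測潛力技術機會之研究
Title (English): Constructing patent map and detecting potential technological opportunities using text mining techniques
Institution: National Cheng Kung University (成功大學)
Department (Chinese): 資訊管理研究所
Department (English): Institute of Information Management
Academic year: 103 (ROC calendar)
Semester: 2
Year of publication: 104 (ROC calendar; 2015 CE)
Author (Chinese): 夏平倫
Author (English): Ping-Lun Hsia
Student ID: R76021118
Degree: Master's
Language: Chinese
Pages: 38
Oral defense committee: Advisor - 王惠嘉
Committee member - 高宏宇
Committee member - 盧文祥
Committee member - 劉任修
Keywords (Chinese): 文字探勘, 專利地圖, 資訊檢索
Keywords (English): Text mining, Patent map, Information retrieval
Subject classification: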
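Abstract (translated from Chinese): When a patent is infringed, the patentee can claim compensation from the infringer for the losses suffered, so a company drawn into a patent infringement case often pays a heavy price in time and money. With the arrival of the knowledge economy, competition between companies is in effect competition over intellectual property rights: securing patents early in technology fields that are promising or likely to become popular helps a company gain a future competitive advantage. The importance of patents in business therefore must not be underestimated. However, as technology develops rapidly and filings accumulate over time, the number of patent documents has become very large. How to manage this volume of patent documents effectively is now a pressing topic.
A patent map takes the results returned by a patent retrieval system, analyzes them with various statistical methods, and presents the outcome as charts so that users can read it easily. This study proposes a method for constructing a patent map and recommending technology gaps or technological opportunities. A patent map that distinguishes patents by similarity helps companies understand how a technology field is distributed when they set R&D strategy, and helps them avoid infringement cases caused by developing the same technology. Recommending technology gaps or opportunities helps companies assess early whether to secure patents related to that field in advance, in order to gain a future competitive advantage.
The main feature of this study is a method, within the area of text mining, for reducing the dimensionality of terms. In data mining and text mining, the sparse matrix formed by a very large number of features or terms often greatly reduces overall execution efficiency. The method proposed in this study reduces the dimensionality formed by terms, which saves storage space and speeds up retrieval.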
Abstract (English): A patentee can claim compensation for losses when a patent is infringed. If a company becomes embroiled in a legal dispute over patent infringement, it can incur significant losses of time and money. With the arrival of the knowledge-based economy, companies frequently compete over intellectual property rights. Securing patents in promising technological fields in advance helps a company gain a competitive advantage in the future. As a result, patents play an important role in the marketplace. With the advance of science and technology, the number of patents keeps growing over time. How to manage this large body of patents effectively is currently an important issue.
A patent map is a visualization of the results of statistical analysis applied to patent documents. This study proposes a method for constructing a patent map and recommending technological vacancies. When companies formulate research and development strategies, a patent map lets them distinguish patents by similarity and helps them avoid developing similar techniques. The recommendation function helps companies assess whether to occupy a technological vacancy in advance in order to gain a future competitive advantage.
One feature of this study is its method for reducing the dimensionality of terms. In text mining, the sparse matrix generated by a large number of terms usually consumes considerable computational resources. Reducing the dimensionality of the terms saves storage space and increases execution efficiency.
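The abstract does not spell out how the term dimensionality is reduced. As a rough illustration only, the sketch below assumes that terms have already been grouped into clusters of similar meaning (the table of contents lists a term similarity and term clustering module) and merges each document's term counts into one count per cluster. The function names `collapse_terms` and `vocabulary_size`, the cluster labels, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def collapse_terms(doc_term_counts, term_to_cluster):
    """Merge per-term counts into per-cluster counts for every document.

    doc_term_counts: list of dicts mapping term -> count (one dict per document).
    term_to_cluster: dict mapping term -> cluster label (assumed to be given).
    """
    reduced = []
    for counts in doc_term_counts:
        merged = defaultdict(int)
        for term, count in counts.items():
            # A term without a cluster keeps its own dimension.
            merged[term_to_cluster.get(term, term)] += count
        reduced.append(dict(merged))
    return reduced


def vocabulary_size(docs):
    """Number of distinct dimensions used across all documents."""
    return len({key for counts in docs for key in counts})


# Hypothetical toy corpus: three documents as sparse term-count dicts.
docs = [
    {"car": 2, "automobile": 1, "engine": 1},
    {"vehicle": 3, "motor": 2},
    {"engine": 1, "battery": 4},
]

# Hypothetical term clusters; in practice these would come from a
# term-similarity measure followed by a clustering step.
term_to_cluster = {
    "car": "c_vehicle",
    "automobile": "c_vehicle",
    "vehicle": "c_vehicle",
    "engine": "c_engine",
    "motor": "c_engine",
}

reduced_docs = collapse_terms(docs, term_to_cluster)
print(vocabulary_size(docs), "->", vocabulary_size(reduced_docs))
```

On this toy corpus the six original term dimensions collapse to three, which is the kind of saving in storage and comparison cost the abstract refers to; how much is saved on real patent text depends entirely on the clustering used.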
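Table of contents: Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Scope
1.4 Research Process
1.5 Thesis Outline
Chapter 2 Literature Review
2.1 Natural Language Processing
2.1.1 Semantic Network: WordNet
2.1.2 Part-of-Speech Tagging
2.1.3 Stemming
2.1.4 Stop Words
2.2 Information Retrieval
2.3 Multidimensional Scaling
2.4 Clustering Algorithms
2.4.1 Partitional Clustering
2.4.2 Hierarchical Clustering
2.4.3 Cluster Validity Evaluation
Chapter 3 Research Method
3.1 Research Framework
3.2 Document Collection and Preprocessing Module
3.3 Term Similarity Calculation and Term Clustering Module
3.4 Document Similarity Calculation Module
3.5 Multidimensional Scaling Dimension Reduction Module
3.6 Document Outlier Degree Calculation Module
Chapter 4 System Implementation and Validation
4.1 System Implementation
4.2 Experimental Design and Analysis of Results
4.2.1 Dataset
4.2.2 Experiment 1: Effect of the Semantic Network on Patent Retrieval Performance
4.2.3 Experiment 2: Patent Documents with a High Outlier Degree
Chapter 5 Conclusions and Future Research Directions
5.1 Research Results
5.2 Future Research Directions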
Full-text usage rights:
  • Authorized for on-campus browsing and printing of the electronic full text; publicly available from 2020-09-09.

