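System ID U0026-0812200911560542
Title (Chinese) 支援屬性權重與遺失值填補之案例式推理模式建立
Title (English) A Case-Based Reasoning Model for Supporting Feature Weight and Missing Value Completion
Institution National Cheng Kung University
Department (Chinese) 工業與資訊管理學系專班
Department (English) Department of Industrial and Information Management (on the job class)
Academic year 94 (ROC calendar)
Semester 2
Year of publication 95 (ROC calendar)
Author (Chinese) 黃建彰
Author (English) Chien-Chang Huang
Student ID r3793108
Degree Master's
Language Chinese
Number of pages 53
Oral defense committee Advisor - 王惠嘉
Committee member - 李健興
Committee member - 盧文祥
Committee member - 李昇暾
Keywords (Chinese) 遺失值  權重  案例式推理  關聯規則
Keywords (English) case-based reasoning  feature weights  missing values  association rules
Subject classification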
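Chinese abstract (translated) To solve all kinds of problems, enterprises often invest substantial funds, train many people, and accumulate a considerable body of research results. To keep these results from being lost through staff turnover, many companies build knowledge management systems, which both preserve expert knowledge and quickly provide problem-solving suggestions. Case-based reasoning (CBR) is one of the inference methods commonly used in such systems.
In a CBR model, the key to success is retrieving the correct recommended cases. In general, the case whose attributes match best is regarded as the most likely answer, which usually assumes that the case data in the case base is complete. In practice, however, the case base may contain many missing values, which lowers inference accuracy. Missing values arise for many reasons. For example, when cases come from different sources (such as books, articles, and technical documents), each source defines its fields according to its own needs, so the integrated case base cannot record every attribute value for every case. In addition, respondents may decline to answer in surveys on sensitive topics, users may forget to enter data, or an entered value may fall outside the range the system accepts. Because missing values often affect the accuracy of CBR inference, handling them properly has become an important research topic.
Case bases typically keep growing over time, and once enough cases have accumulated, general rules can be expected to emerge from them. This study therefore holds that mining the association rules hidden in the data with data mining techniques should make it possible to fill in the missing values in the case base, and thereby improve how well retrieved cases match. After repeated experiments, and to increase accuracy, this study proposes a CBR model that supports feature weights and missing value completion. First, association rule mining is used to fill in missing values and improve case quality. Next, the strength of the association rules serves as a quantitative indicator of how accurate the filled values are, and feature weights are adjusted accordingly. Finally, case similarity is computed with the K nearest neighbor (KNN) method. Before association rules are mined, the cases are grouped by class, to reduce biased rules caused by differences between classes.
Finally, this study validates the proposed method on data from a public UCI repository and compares it with other related methods. The experimental results show that the proposed method does improve the ability of the CBR model to retrieve similar cases.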
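As a rough illustration of the fill-in step, the following minimal sketch mines single-antecedent rules within each class and uses each rule's confidence as the strength of a filled value. The function names, the thresholds, the toy data, and the choice of confidence as the strength measure are all assumptions of this sketch; the thesis itself may define rules and strength differently.

```python
from collections import Counter, defaultdict


def mine_rules(group, min_support=2, min_confidence=0.6):
    """Mine rules (a = va) -> (b = vb) from the cases of one class.

    Hypothetical helper: returns {(a, va, b): (vb, confidence)}, keeping the
    most confident consequent for each antecedent and target attribute.
    """
    antecedent_count = Counter()
    pair_count = Counter()
    for features in group:
        known = [(a, v) for a, v in features.items() if v is not None]
        for a, va in known:
            antecedent_count[(a, va)] += 1
            for b, vb in known:
                if a != b:
                    pair_count[(a, va, b, vb)] += 1
    rules = {}
    for (a, va, b, vb), n in pair_count.items():
        confidence = n / antecedent_count[(a, va)]
        if n >= min_support and confidence >= min_confidence:
            key = (a, va, b)
            if key not in rules or confidence > rules[key][1]:
                rules[key] = (vb, confidence)
    return rules


def fill_missing(features, rules):
    """Fill None values; strength is 1.0 for observed values,
    the rule confidence for filled values, and 0.0 if no rule applies."""
    filled = dict(features)
    strength = {}
    for b, vb in features.items():
        if vb is not None:
            strength[b] = 1.0
            continue
        candidates = [rules[(a, va, b)] for a, va in features.items()
                      if va is not None and (a, va, b) in rules]
        if candidates:
            filled[b], strength[b] = max(candidates, key=lambda r: r[1])
        else:
            strength[b] = 0.0
    return filled, strength


def fill_case_base(cases):
    """Group cases by class, mine rules per class, then fill each case."""
    by_class = defaultdict(list)
    for features, label in cases:
        by_class[label].append(features)
    rules_by_class = {label: mine_rules(group)
                      for label, group in by_class.items()}
    return [(*fill_missing(features, rules_by_class[label]), label)
            for features, label in cases]


# Toy case base (made up for illustration); None marks a missing value.
cases = [
    ({"color": "red", "shape": "round"}, "apple"),
    ({"color": "red", "shape": "round"}, "apple"),
    ({"color": "red", "shape": None}, "apple"),
    ({"color": "yellow", "shape": "long"}, "banana"),
]
filled_cases = fill_case_base(cases)
```

In this toy data the third case has its missing `shape` filled from the rule `color = red -> shape = round`, which holds in two of the three "apple" cases, so the filled value carries a strength of 2/3 rather than 1.0.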



English abstract Case-based reasoning (CBR) is regarded as an effective approach to problem solving and learning, and is a generic methodology for building knowledge-based systems. A CBR system must be able to assess the similarity between the cases in the case base and the current problem description. Retrieving similar cases is therefore a primary step in CBR, and the similarity measure plays a very important role in case retrieval.
However, the retrieval step assumes that complete attribute information is stored in the case base. In reality, both the stored cases and the target case commonly contain missing or null attribute values. Missing values can arise in many situations. For example, when cases are collected from many different sources (e.g., books, papers, or technical reports), different sources may describe cases with different attributes according to their own requirements. Under such circumstances, many missing values occur when knowledge engineers integrate this information into a case base. These incomplete descriptions may seriously affect reasoning accuracy. Handling missing values is therefore an important issue in similarity measurement for CBR.
To improve the accuracy of a CBR system with missing values, we propose an adapted CBR model that supports feature weights and missing value completion by data mining. First, we fill in the missing values with the candidate answers produced by association rule mining, and treat the strength of the association rules as a quantitative measure of completion accuracy. Second, we adjust the feature weights according to the strength of the association rules, and then assess the similarity of cases using K nearest neighbor (KNN). In addition, before mining association rules, we group the cases by their inherent class to reduce biased rules.
To demonstrate the accuracy of the proposed model, data from the UCI Machine Learning Repository is used for evaluation. The results show that the proposed method achieves higher prediction accuracy than the other methods.
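The weight adjustment and retrieval steps can be pictured with a small sketch. It assumes categorical attributes, an overlap (exact match) similarity, and a simple rule in which each attribute's weight is multiplied by the strength of the stored value (1.0 for observed values, lower for filled ones). These choices, the function names, and the sample data are assumptions made for illustration, not the formulas used in the thesis.

```python
def similarity(query, features, weights, strength):
    """Weighted overlap similarity in [0, 1].

    Each attribute weight is scaled by the strength of the stored value,
    so attributes filled in by weak rules contribute less.
    """
    matched = total = 0.0
    for attr, weight in weights.items():
        effective = weight * strength.get(attr, 1.0)
        total += effective
        value = query.get(attr)
        if value is not None and value == features.get(attr):
            matched += effective
    return matched / total if total else 0.0


def retrieve(query, case_base, weights, k=1):
    """Return the k most similar cases as (score, label) pairs."""
    scored = [(similarity(query, features, weights, strength), label)
              for features, strength, label in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical case base: (features, strength per attribute, class label).
# The "shape" of the first case is assumed to have been filled by a rule
# of strength 0.5; all other values were observed directly.
case_base = [
    ({"color": "red", "shape": "round"}, {"color": 1.0, "shape": 0.5}, "apple"),
    ({"color": "red", "shape": "long"}, {"color": 1.0, "shape": 1.0}, "pepper"),
]
weights = {"color": 1.0, "shape": 1.0}

best = retrieve({"color": "red", "shape": "round"}, case_base, weights, k=1)
partial = similarity({"color": "green", "shape": "round"},
                     case_base[0][0], weights, case_base[0][1])
```

Here `best` ranks the "apple" case first, and `partial` shows the effect of the discount: a match on the filled `shape` attribute alone scores 0.5 / 1.5 rather than the 0.5 it would score if the value had been observed.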



Table of contents Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Scope and Limitations
1.4 Research Process
1.5 Thesis Outline
Chapter 2 Literature Review
2.1 Case-Based Reasoning (CBR)
2.1.1 The Case-Based Reasoning Process
2.1.2 Case Retrieval
2.1.3 Hybrid Case-Based Reasoning
2.2 Data Mining
2.2.1 Association Rules
2.2.2 Association Rule Algorithms
2.2.2.1 The Apriori Algorithm
2.2.2.2 The OPUS Algorithm
2.2.2.3 The Robust Association Rule Algorithm
2.3 The Missing Value Problem
2.3.1 Imputation with Statistical Methods
2.3.2 Imputation with Machine Learning Methods
2.4 Similarity Computation
2.4.1 Qualitative Similarity
2.4.2 Quantitative Similarity
2.5 Summary
Chapter 3 Research Method
3.1 Research Concept
3.2 Model Construction
3.2.1 Case Classification
3.2.2 Mining Association Rules
3.2.3 Filling Missing Values and Updating Weights
3.2.4 Similarity Computation
3.3 Evaluation Metrics
3.4 Summary
Chapter 4 Empirical Analysis
4.1 Experiment Design
4.2 Experimental Model Construction
4.3 Experimental Results and Discussion
4.4 Summary
Chapter 5 Conclusions and Suggestions
5.1 Research Conclusions and Suggestions
5.2 Future Research Directions
References

References Aamodt, A. and Plaza, E. (1994). Case-based reasoning: foundational issues, methodological variations, and system approaches. Artificial Intelligence Communications, 7, 39-59.
Acock, A. C. (2005). Working With Missing Values. Journal of Marriage and Family, 67, 1012-1028.
Adriaans, P. and Zantinge, D. (1996). Data Mining. Harlow: Addison Wesley.
Agrawal, R., Imielinski, T. and Swami, A. (1993). Mining Association Rules between Sets of Items in Large Databases. In Proceedings of the ACM SIGMOD Conference on Management of Data, 207-216.
Agrawal, R. and Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the VLDB conference, 487-499.
Arshadi, N. and Jurisica, I. (2005). Data Mining for Case-Based Reasoning in High-Dimensional Biological Domains. IEEE Transactions on Knowledge and Data Engineering, 17(8), 1127-1137.
Becker, L. and Jazayeri, K. (1989). A Connectionist Approach to Case-Based Reasoning. In Proceedings of the Case-Based Reasoning Workshop, Morgan Kaufmann, 213-217.
Bradburn, C. and Zeleznikow, J. (1993). The Application of Case-Based Reasoning to the Tasks of Health Care Planning. In Proceeding of European Workshop on CBR, Springer, Berlin, 365-378.
Chan, F.T.S. (2005). Application of a hybrid case-based reasoning approach in electroplating industry. Expert Systems with Applications, 29, 121-130.
Chiu, C. (2002). A case-based customer classification approach for direct marketing. Expert Systems with Applications, 22, 163-168.
Deng, P.-S. (1996). Using case-based reasoning approach to the support of ill-structured decisions. European Journal of Operational Research, 93, 511-521.
Gonzalez, A. J., Xu, Li. and Gupta, U. M. (1998). Validation Techniques for Case-Based Reasoning Systems. IEEE Transactions on Systems, Man, and Cybernetics, 28(4), 465-477.
Gardingen, D. and Watson, I. (1999). A web based CBR system for heating ventilation and air conditioning systems sales support. Knowledge-Based Systems, 12, 207-214.
Garrell, J. M., Golobardes, E., Bernado, E. and Llora, X. (1999). Automatic diagnosis with genetic algorithms and case-based reasoning. Artificial Intelligence in Engineering, 13, 367-372.
Graham, D., Smith, S.D. and Crapper, M. (2004). Improving concrete placement simulation with a case-based reasoning input. Civil Engineering and Environmental Systems, 21(2), 137-150.
Grupe, F. H. and Owrang, M. M. (1995). Database Mining Discovering New Knowledge and Cooperative Advantage. Information Systems Management, 12, 26-31.
Hall, C. (1995). The Devil's in The Details: Techniques, Tools, and Application for Database Mining and Knowledge Discovery Part 1. Intelligent Software Strategies, 11(9), 1-16.
Han, J. and Kamber, M. (2000). Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers.
Hongkyu, J. and Ingoo, H. (1996). Integration of Case-Based Forecasting, Neural Network and Discriminant Analysis for bankruptcy Prediction. Expert Systems with Applications, 11(4), 415-422.
Hsu, C.-C. and Ho, C.-S. (2004). A new hybrid case-based architecture for medical diagnosis. Information Sciences, 166, 231-247.
Hui, S.C., Fong, A.C.M. and Jha, G. (2001). A web-based intelligent fault diagnosis system for customer service support. Engineering Applications of Artificial Intelligence, 14, 537-548.
Hunt, J. and Miles, R. (1994). Hybrid Case-based Reasoning. The Knowledge Engineering Review, 9(4), 383-397.
Kim, J.S. (2003). Customized Recommendation Mechanism Based on Web Data Mining and Case-Based Reasoning. In Proceedings of the International Conference on Intelligent Agents, Web Technologies and Internet Commerce (IAWTIC), Vienna (Austria), 242-250.
Kim, K.-S. and Han, I. (2001). The cluster-indexing method for case-based reasoning using self-organizing maps and learning vector quantization for bond rating cases. Expert Systems with Applications, 21, 147-156.
Koton, P. (1989). SMARTPlan: A case-based resource allocation and scheduling system. In Proceedings of the Case-Based Reasoning Workshop, Pensacola Beach, Florida, San Mateo, CA: Morgan Kaufmann, 285-289.
Lakshminarayan, K., Harp, S.A. and Samad, T. (1999). Imputation of Missing Data in Industrial Databases. Applied Intelligence, 11, 259-275.
Li, L. L. X. (1999). A hybrid knowledge-based system applied to epidemic screening. Expert Systems, 16(4), 248-256.
Liu, W.Z., White, A.P., Thompson, S.G. and Bramer, M.A. (1997). Techniques for Dealing with Missing Values in Classification. International Symposium on Intelligent Data Analysis, 527-536.

Miyashita, K. (1995). Case-based knowledge acquisition for schedule optimization. Artificial Intelligence in Engineering, 9, 277-287.
Park, C-S. and Han, I. (2002). A case-based reasoning with the feature weights derived by analytic hierarchy process for bankruptcy prediction. Expert Systems with Applications, 23(3), 255-264.
Ragel, A. and Cremilleux, B. (1999). MVC-a Preprocessing Method to Deal with Missing Values. Knowledge-Based Systems, 12, 285-291.
Ragel, A. and Cremilleux, B. (1998). Treatment of Missing Values for Association Rules. Proceeding of the Second Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98), Melbourne, Australia, 258-270.
Riesbeck, C. and Schank, R. (1989). Inside Case-based reasoning. Hillsdale, N.J.: Lawrence Erlbaum.
Shin, K.-S. and Han, I. (1999). Case-based reasoning supported by genetic algorithms for corporate bond rating. Expert Systems with Applications, 16, 85-95.
Shin, K.-S. and Han, I. (2001). A case-based approach using inductive indexing for corporate bond rating. Decision Support Systems, 32, 41-52.
Slonim, T. Y. and Schneider, M. (2001). Design issues in fuzzy case-based reasoning. Fuzzy Sets and Systems, 117, 251-267.
Suh, M.S., Jhee, W.C., Ko, Y.K. and Lee, A. (1998). A case-based expert system approach for quality design. Expert Systems With Applications, 15, 181-190.
Wang, H.-C. and Wang, H.-S. (2005). A hybrid expert system for equipment failure analysis. Expert Systems with Applications, 28, 615-622.
Watson, I. (1999). Case-based reasoning is a methodology not a technology. Knowledge-Based Systems, 12, 303-308.
Webb, G.I. (1995). OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research, 3, 431-465.
Webb, G.I. (2000). Efficient Search for Association Rules. Proceedings of the 6th International Conference on Knowledge Discovery and Data Mining, 99-107.
Witten, I.H. and Frank, E. (2000). Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, San Francisco: Morgan Kaufmann.
Xu, L. (1996). An integrated rule-and case-based approach to AIDS initial assessment. International Journal of Bio-Medical Computing, 40(3), 197-207.
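Full-text access rights
  • Authorized for on-campus browsing and printing of the electronic full text, available from 2007-07-11.
  • Authorized for off-campus browsing and printing of the electronic full text, available from 2007-07-11.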

