System ID: U0026-2906201919101200
Title (Chinese): 具名屬性的小樣本學習
Title (English): Learning from small datasets containing nominal attributes
University: National Cheng Kung University (成功大學)
Department: Institute of Information Management (資訊管理研究所)
Academic year: 107 (ROC calendar; 2018-19)
Semester: 2
Year of publication: 108 (ROC calendar; 2019)
Author (Chinese): 陳泓佑
Author (English): Hung-Yu Chen
Student ID: R78031046
Degree: Doctoral
Language: English
Pages: 43
Committee: Advisor: 利德江; Convener: 吳植森; Members: 王維聰, 蔡長鈞, 黃信豪
Keywords (Chinese): 小樣本; 名目型輸入值; 連續型輸出值; 虛擬樣本
Keywords (English): small data; nominal input; continuous output; virtual sample
Abstract: In many small-data learning problems, the incomplete data structure limits the explicit information available to decision makers. Although machine learning algorithms are widely applied to extract knowledge, most are developed without considering whether the training set fully represents the properties of the population. Focusing on small datasets that contain nominal inputs and continuous outputs, this dissertation develops an effective sample-generation procedure based on fuzzy theory that tackles the learning problem through data preprocessing. From the derived fuzzy relations between categories and continuous outputs, the possibility of each new combination of categories (a virtual sample) can be aggregated for a given continuous output value. Suitable virtual samples are then selected by applying a fuzzy alpha-cut to the possibility distributions, and these samples are added to the training set to form a new one. In the experiments, sixteen datasets taken from the UC Irvine Machine Learning Repository are examined with back-propagation neural networks and support vector regression. The results show that the forecasting accuracy of both models improves significantly when they are built with the proposed training sets. The results also indicate that the proposed method outperforms bootstrap aggregating (bagging) and the synthetic minority over-sampling technique for nominal-continuous data (SMOTE-NC), with statistically significant support.
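The abstract outlines a three-step pipeline: derive fuzzy relations between each nominal category and the continuous output, score candidate category combinations by their aggregated possibility at a given output value, and keep the combinations that survive a fuzzy alpha-cut. The Python sketch below illustrates that flow under simplifying assumptions rather than the thesis's exact formulation: triangular membership functions fitted per (attribute, category) pair stand in for the relations the thesis derives through its possibility assessment mechanism, per-attribute possibilities are min-aggregated, and alpha is fixed. All function names here are hypothetical.

```python
import itertools
import numpy as np

def triangular(y, low, peak, high):
    """Triangular membership value of y for the fuzzy set (low, peak, high)."""
    if y <= low or y >= high:
        return 0.0
    if y <= peak:
        return (y - low) / (peak - low)
    return (high - y) / (high - peak)

def fit_memberships(X, y, spread=0.1):
    """For each (attribute, category) pair, fit a triangular fuzzy set over
    the outputs observed with that category, slightly widened -- a crude
    stand-in for the thesis's mega-trend-diffusion-style domain estimation."""
    mfs = {}
    for j in range(X.shape[1]):
        for c in np.unique(X[:, j]):
            ys = y[X[:, j] == c]
            pad = spread * (ys.max() - ys.min() + 1e-9)
            mfs[(j, c)] = (ys.min() - pad, ys.mean(), ys.max() + pad)
    return mfs

def generate_virtual_samples(X, y, y_star, alpha=0.5):
    """Enumerate category combinations, score each combination's possibility
    at the output value y_star (min-aggregation across attributes), and keep
    those passing the alpha-cut."""
    mfs = fit_memberships(X, y)
    domains = [np.unique(X[:, j]) for j in range(X.shape[1])]
    virtual = []
    for combo in itertools.product(*domains):
        poss = min(triangular(y_star, *mfs[(j, c)]) for j, c in enumerate(combo))
        if poss >= alpha:  # fuzzy alpha-cut on the possibility distribution
            virtual.append((np.array(combo), y_star, poss))
    return virtual
```

Sweeping y_star over a grid of output values and pooling the retained pairs would produce the augmented training set; the thesis's actual relation extraction, generation, and filtering rules are the subject of Sections 3.2-3.4 of the outline below.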
Table of contents:
Chinese abstract i
Abstract ii
Acknowledgements iii
List of tables v
List of figures vi
1. Introduction 1
2. Related studies 8
2.1 The nominal input preprocessing in the M5’ 8
2.2 The mega-trend-diffusion technique 9
2.3 The possibility theory 10
2.4 The possibility assessment mechanism (PAM) 12
2.5 SMOTE-NC 13
3. The proposed method 15
3.1 Definition of notations 15
3.2 The fuzzy relation extraction 15
3.3 The sample generation 18
3.4 The sample filtering 22
3.5 The implementation outline 24
4. Experimental results and discussion 26
4.1 The datasets examined 26
4.2 The designs of the experiments 27
4.3 Results of the experiments 29
4.4 Findings from the experimental results 34
5. Conclusions 36
REFERENCES 38
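Section 4 of this outline compares models built with and without the generated samples. The sketch below shows one such comparison under stated assumptions: scikit-learn's SVR and one-hot encoding stand in for the support vector regression setup used in the thesis, and generate_virtual_samples refers to the hypothetical helper sketched above. The thesis additionally evaluates back-propagation neural networks and applies significance tests over repeated trials, which are omitted here.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVR

def compare_once(X_train, y_train, X_virtual, y_virtual, X_test, y_test):
    """Train an SVR on the original small training set and another on the
    set augmented with virtual samples, then report test MAE for both."""
    # Encode nominal inputs; unseen test categories are mapped to all-zeros.
    enc = OneHotEncoder(handle_unknown="ignore").fit(X_train)
    base = SVR().fit(enc.transform(X_train), y_train)
    X_aug = np.vstack([X_train, X_virtual])
    y_aug = np.concatenate([y_train, y_virtual])
    aug = SVR().fit(enc.transform(X_aug), y_aug)
    return (mean_absolute_error(y_test, base.predict(enc.transform(X_test))),
            mean_absolute_error(y_test, aug.predict(enc.transform(X_test))))
```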

References:
1. Błaszczyński, J., & Stefanowski, J. (2015). Neighbourhood sampling in bagging for imbalanced data. Neurocomputing, 150, 529-542.
2. Byon, E., Shrivastava, A. K., & Ding, Y. (2010). A classification procedure for highly imbalanced class sizes. IIE Transactions, 42(4), 288-303. doi:10.1080/07408170903228967
3. Chang, C.-J., Dai, W.-L., & Chen, C.-C. (2015). A novel procedure for multimodel development using the grey silhouette coefficient for small-data-set forecasting. Journal of the Operational Research Society, 66(11), 1887-1894.
4. Chao, G. Y., Tsai, T. I., Lu, T. J., Hsu, H. C., Bao, B. Y., Wu, W. Y., . . . Lu, T. L. (2011). A new approach to prediction of radiotherapy of bladder cancer cells in small dataset analysis. Expert Systems with Applications, 38(7), 7963-7969. doi:10.1016/j.eswa.2010.12.035
5. Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321-357.
6. Wong, R. C.-W., Pei, J., Fu, A. W.-C., & Wang, K. (2009). Online skyline analysis with dynamic preferences on nominal attributes. IEEE Transactions on Knowledge and Data Engineering, 21(1), 35-49. doi:10.1109/TKDE.2008.115
7. Coppersmith, D., Hong, S. J., & Hosking, J. R. (1999). Partitioning nominal attributes in decision trees. Data Mining and Knowledge Discovery, 3(2), 197-217.
8. Cortez, P., & Morais, A. (2007). A Data Mining Approach to Predict Forest Fires using Meteorological Data. Paper presented at the Proceedings of the 13th EPIA 2007 - Portuguese Conference on Artificial Intelligence, Guimaraes, Portugal.
9. Cost, S., & Salzberg, S. (1993). A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features. Machine Learning, 10(1), 57-78. doi:10.1023/a:1022664626993
10. Domingo-Ferrer, J., & Solanas, A. (2008). A measure of variance for hierarchical nominal attributes. Information Sciences, 178(24), 4644-4655.
11. Dubois, D., Foulloy, L., Mauris, G., & Prade, H. (2004). Probability-possibility transformations, triangular fuzzy sets, and probabilistic inequalities. Reliable Computing, 10(4), 273-297.
12. Dubois, D., Prade, H., & Sandri, S. (1993). On possibility/probability transformations. Fuzzy Logic, 103-112.
13. Efron, B. (1979). Computers and the theory of statistics: thinking the unthinkable. SIAM Review, 21(4), 460-480.
14. Fard, M. J., Wang, P., Chawla, S., & Reddy, C. K. (2016). A Bayesian perspective on early stage event prediction in longitudinal data. IEEE Transactions on Knowledge and Data Engineering, 28(12), 3126-3139.
15. Flage, R., Baraldi, P., Zio, E., & Aven, T. (2013). Probability and possibility-based representations of uncertainty in fault tree analysis. Risk Analysis, 33(1), 121-133.
16. Gosset, W. S. (1908). The probable error of a mean. Biometrika, 6(1), 1-25. doi:10.1093/biomet/6.1.1
17. Huang, C.-J., Wang, H.-F., Chiu, H.-J., Lan, T.-H., Hu, T.-M., & Loh, E.-W. (2010). Prediction of the Period of Psychotic Episode in Individual Schizophrenics by Simulation-Data Construction Approach. Journal of Medical Systems, 34(5), 799-808. doi:10.1007/s10916-009-9294-5
18. Huang, C. (1997). Principle of information diffusion. Fuzzy Sets and Systems, 91(1), 69-90.
19. Huang, C. F., & Moraga, C. (2004). A diffusion-neural-network for learning from small samples. International Journal of Approximate Reasoning, 35(2), 137-161. doi:10.1016/j.ijar.2003.06.001
20. Jiang, L., Li, C., & Wang, S. (2014). Cost-sensitive Bayesian network classifiers. Pattern Recognition Letters, 45, 211-216. doi:10.1016/j.patrec.2014.04.017
21. Jiang, L., Qiu, C., & Li, C. (2015). A novel minority cloning technique for cost-sensitive learning. International Journal of Pattern Recognition and Artificial Intelligence, 29(4), 1551004. doi:10.1142/s0218001415510040
22. Krejcie, R. V., & Morgan, D. W. (1970). Determining sample size for research activities. Educational and Psychological Measurement, 30(3), 607-610.
23. Li, D.-C., Lin, W.-K., Chen, C.-C., Chen, H.-Y., & Lin, L.-S. (2018). Rebuilding sample distributions for small dataset learning. Decision Support Systems, 105, 66-76. doi:10.1016/j.dss.2017.10.013
24. Li, D.-C., Lin, W.-K., Lin, L.-S., Chen, C.-C., & Huang, W.-T. (2017). The attribute-trend-similarity method to improve learning performance for small datasets. International Journal of Production Research, 55(7), 1898-1913. doi:10.1080/00207543.2016.1213447
25. Li, D. C., Wu, C. S., Tsai, T. I., & Lin, Y. S. (2007). Using mega-trend-diffusion and artificial samples in small data set learning for early flexible manufacturing system scheduling knowledge. Computers & Operations Research, 34(4), 966-982. doi:10.1016/j.cor.2005.05.019
26. Mirzaei, A., Mohsenzadeh, Y., & Sheikhzadeh, H. (2017). Variational Relevant Sample-Feature Machine: A fully Bayesian approach for embedded feature selection. Neurocomputing, 241, 181-190.
27. Niyogi, P., Girosi, F., & Poggio, T. (1998). Incorporating prior information in machine learning by creating virtual examples. Proceedings of the IEEE, 86(11), 2196-2209.
28. Quinlan, J. R. (1986). Induction of Decision Trees. Machine Learning, 1(1), 81-106.
29. Raudys, S. J., & Jain, A. K. (1991). Small sample size effects in statistical pattern recognition: recommendations for practitioners. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(3), 252-264.
30. Sezer, E. A., Nefeslioglu, H. A., & Gokceoglu, C. (2014). An assessment on producing synthetic samples by fuzzy C-means for limited number of data in prediction models. Applied Soft Computing, 24, 126-134.
31. Shao, C., Song, X., Yang, X., & Wu, X. (2016). Extended minimum-squared error algorithm for robust face recognition via auxiliary mirror samples. Soft Computing, 20(8), 3177-3187.
32. Skurichina, M., & Duin, R. P. (2002). Bagging, boosting and the random subspace method for linear classifiers. Pattern Analysis & Applications, 5(2), 121-135.
33. Tang, D., Zhu, N., Yu, F., Chen, W., & Tang, T. (2014). A novel sparse representation method based on virtual samples for face recognition. Neural Computing and Applications, 24(3-4), 513-519.
34. Wang, Y., & Witten, I. (1997). Inducing Model Trees for Continuous Classes. Paper presented at the Proceedings of the Poster Papers of the European Conference on Machine Learning, Prague, Czech Republic.
35. Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338-353.
36. Zadeh, L. A. (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1(1), 3-28.
Full-text access authorization:
  • On-campus electronic full-text browsing/printing authorized, open from 2019-09-03.
  • Off-campus electronic full-text browsing/printing authorized, open from 2019-09-03.

