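The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.
(Note: if the thesis cannot be found, or its holdings status reads "closed stacks, not public," the copy is not in the stacks and cannot be retrieved.)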
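System ID U0026-0208201623305700
Title (Chinese) 利用新的資料轉換模型應用於小樣本學習演算法
Title (English) A New Data Transformation Model for Small Dataset Learning
Institution National Cheng Kung University
Department (Chinese) 工業與資訊管理學系
Department (English) Department of Industrial and Information Management
Academic year 104 (ROC calendar)
Semester 2
Year of publication 105 (ROC calendar)
Author (Chinese) 溫怡翔
Author (English) I-Hsiang Wen
Student ID R38001035
Degree Doctoral
Language English
Number of pages 46
Oral defense committee Advisor: 利德江
Convener: 吳植森
Committee member: 黃信豪
Committee member: 蔡長鈞
Committee member: 王維聰
Keywords (Chinese) 應用統計  分配系統  建模  小樣本學習  虛擬樣本產生法
Keywords (English) Applied statistics  distribution systems  modelling  small dataset learning  virtual sample generation
Subject classification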
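Abstract (Chinese) Under the intense competitive pressure of an industrialized society, product life cycles keep getting shorter. In the engineering phase of early product development, customers push to reach mass production very quickly, so very little data is available. This makes it very difficult for engineers to improve product quality before mass production. Past research has shown that virtual sample generation is an effective way to handle small samples; moreover, if the generated virtual samples follow a known distribution, they are more convincing. The Johnson transformation can map data to a normal distribution so that the data satisfy statistical assumptions, but it is suitable only for large samples. How to construct a Johnson transformation that works for small samples is the motivation of this study. Accordingly, this study proposes a new transformation function that maps a very small number of samples to a normal distribution and then uses that normal distribution to generate virtual samples. Finally, this study compares four different methods, using the display panel industry as an example. The results show that the new data transformation model preserves the behavioral pattern of the small sample, improves learning performance, and extracts more information; in the method comparison it also outperforms MTD.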
Abstract (English) In most highly competitive manufacturing industries, sample sizes in pilot runs are usually very small because new products must be launched quickly. It is therefore difficult for engineers to improve quality in mass production runs based on the limited data obtained in this way. Past research has demonstrated that adding artificial samples can be an effective approach when learning with small datasets. However, a prior analysis of the data is needed to deduce the appropriate distributions from which the artificial samples are generated. The Johnson transformation is a well-known model that can bring data close to a normal distribution so that certain statistical assumptions are satisfied. The sample size required for such data transformation methods is usually large, which motivates the current study to develop a new method suitable for small datasets. Accordingly, this research proposes the Small-Johnson Data Transformation (SJDT) method, which transforms small raw datasets to normal distributions in order to generate virtual samples. When compared with four other methods, the results obtained with a real small dataset drawn from the Thin Film Transistor Liquid Crystal Display (TFT-LCD) industry in Taiwan demonstrate that the proposed method effectively improves forecasting ability with small sample sizes.
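The transform-then-sample idea in the abstract can be illustrated with a minimal sketch. This record does not give the SJDT functions, so the sketch uses the classical Johnson S_U transformation (Johnson, 1949) instead, and the parameter values are hand-picked assumptions rather than values fitted to any dataset. The function names are hypothetical.

```python
import math
import random


def johnson_su_to_normal(x, gamma, delta, xi, lam):
    """Classical Johnson S_U map: z = gamma + delta * asinh((x - xi) / lam)."""
    return gamma + delta * math.asinh((x - xi) / lam)


def normal_to_johnson_su(z, gamma, delta, xi, lam):
    """Inverse map: x = xi + lam * sinh((z - gamma) / delta)."""
    return xi + lam * math.sinh((z - gamma) / delta)


def generate_virtual_samples(n, gamma, delta, xi, lam, seed=0):
    """Draw z ~ N(0, 1) and map each draw back to the original data scale."""
    rng = random.Random(seed)
    return [
        normal_to_johnson_su(rng.gauss(0.0, 1.0), gamma, delta, xi, lam)
        for _ in range(n)
    ]


# Assumed, illustrative parameters (not estimated from the thesis data).
params = dict(gamma=-0.5, delta=1.5, xi=10.0, lam=2.0)
virtual = generate_virtual_samples(5, **params)
```

In practice the four parameters would be estimated from the observed data; how to do that reliably with very few observations is the problem the thesis addresses.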
Table of contents CONTENTS
Abstract (in Chinese)
Abstract
Acknowledgements
LIST OF FIGURES
LIST OF TABLES
1. Introduction
1.1 Backgrounds
1.2 Research Motivation
1.3 Research Purposes
1.4 Research Structure
2. Literature Review
2.1 Related Works
2.1.1 Virtual Sample Generation Methods
2.1.2 Small Dataset Learning Methods
2.2 Johnson Transformation
3. Methodology
3.1 Preliminary system
3.2 Constructing the SJDT functions
3.3 SJDT-based virtual sample generation
3.4 The screening method
3.5 Implementation steps
4. An example and computational results
4.1 Case description
4.2 Mathematical model
4.3 Experimental result and analysis
5. The conclusions and recommendations
References

References
Chan, K. Y., Kwong, C. K., & Tsim, Y. C. (2010). A genetic programming based fuzzy regression approach to modelling manufacturing processes. International Journal of Production Research, 48(7), 1967-1982.

Chao, G. Y., Tsai, T. I., Lu, T. J., Hsu, H. C., Bao, B. Y., & Wu, W. Y. (2011). A new approach to prediction of radiotherapy of bladder cancer cells in small dataset analysis. Expert Systems with Applications, 38(7), 7963-7969.

Cressie, N. (2006). Block kriging for lognormal spatial processes. Mathematical Geology, 38(4), 413-430.

Efron, B., & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. New York: Chapman & Hall.

Guo, G. D., & Dyer, C. R. (2005). Learning from examples in the small sample case: Face expression recognition. IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics, 35(3), 477-488.

Hong, T. P., Tseng, L. H., & Chien, B. C. (2010). Mining from incomplete quantitative data by fuzzy rough sets. Expert Systems with Applications, 37(3), 2644-2653.

Huang, C. F. (1997). Principle of information. Fuzzy Sets and Systems, 91(1), 69-90.

Huang, C. F., & Moraga, C. (2004). A diffusion-neural-network for learning from small samples. International Journal of Approximate Reasoning, 35(2), 137-161.

Huang, C. J., Wang, H. F., Chiu, H. J., Lan, T. H., Hu, T. M., & Loh, E. W. (2010). Prediction of the period of psychotic episode in individual schizophrenics by simulation-data construction approach. Journal of Medical Systems, 34(5), 799-808.

Ivănescu, V. C., Bertrand, J. W. M., Fransoo, J. C., & Kleijnen, J. P. C. (2006). Bootstrapping to solve the limited data problem in production control: an application in batch process industries. Journal of the Operational Research Society, 57(1), 2-9.

Jang, J. S. R. (1993). ANFIS: adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man and Cybernetics, 23(3), 665-685.

Johnson, N. L. (1949). Systems of frequency curves generated by methods of translation. Biometrika, 36, 149-176.

Kuo, Y., Yang, T., Peters, B. A., & Chang, I. (2007). Simulation metamodel development using uniform design and neural networks for automated material handling systems in semiconductor wafer fabrication. Simulation Modelling Practice and Theory, 15(8), 1002-1015.

Lanouette, R., Thibault, J., & Valade, J. L. (1999). Process modeling with neural networks using small experimental datasets. Computers & Chemical Engineering, 23(9), 1167-1176.

Li, D. C., Chang, F. M., & Chen, K. C. (2010b). Building reliability growth model using sequential experiments and the Bayesian theorem for small datasets. Expert Systems with Applications, 37(4), 3434-3443.

Li, D. C., Chen, L. S., & Lin, Y. S. (2003). Using Functional Virtual Population as assistance to learn scheduling knowledge in dynamic manufacturing environments. International Journal of Production Research, 41(17), 4011-4024.

Li, D. C., Chen, W. C., Chang, C. J., Chen, C. C., & Wen, I. H. (2015). Practical information diffusion techniques to accelerate new product pilot runs. International Journal of Production Research, 53(7), 5310-5319.

Li, D. C., Fang, Y. H., Lai, Y. Y., & Hu, S. C. (2009a). Utilization of virtual samples to facilitate cancer identification for DNA microarray data in the early stages of an investigation. Information Sciences, 179(16), 2740-2753.

Li, D. C., Gu, H., & Zhang, L. Y. (2010a). A fuzzy c-means clustering algorithm based on nearest-neighbor intervals for incomplete data. Expert Systems with Applications, 37(10), 6942-6947.

Li, D. C., Hsu, H. C., Tsai, T. I., Lu, T. J., & Hu, S. C. (2007a). A new method to help diagnose cancers for small sample size. Expert Systems with Applications, 33(2), 420-424.

Li, D. C., & Lin, L. S. (2013). A new approach to assess product lifetime performance for small data sets. European Journal of Operational Research, 230, 290-298.

Li, D. C., Lin, L. S., & Peng, L. J. (2014). Improving learning accuracy by using synthetic samples for small datasets with non-linear attribute dependency. Decision Support Systems, 59, 286-295.

Li, D. C., & Lin, Y. S. (2006). Using virtual sample generation to build up management knowledge in the early manufacturing stages. European Journal of Operational Research, 175(1), 413-434.

Li, D. C., Lin, Y. S., & Huang, Y. C. (2009b). Constructing marketing decision support systems using data diffusion technology: A case study of gas station diversification. Expert Systems with Applications, 36(2), 2525-2533.

Li, D. C., Liu, C. W., & Hu, S. C. (2010c). A learning method for the class imbalance problem with medical data sets. Computers in Biology and Medicine, 40(5), 509-518.

Li, D. C., & Wen, I. H. (2014). A genetic algorithm-based virtual sample generation technique to improve small data set learning. Neurocomputing, 143, 220-230.

Li, D. C., Wen, I. H., & Chen, W. C. (2016). A novel data transformation model for small dataset learning. International Journal of Production Research (in press).

Li, D. C., Wu, C. S., & Chang, F. M. (2005). Using data-fuzzification technology in small data set learning to improve FMS scheduling accuracy. International Journal of Advanced Manufacturing Technology, 27(3-4), 321-328.

Li, D. C., Wu, C. S., Tsai, T. I., & Chang, F. M. (2006). Using mega-fuzzification and data trend estimation in small data set learning for early FMS scheduling knowledge. Computers & Operations Research, 33(6), 1857-1869.

Li, D. C., Wu, C. S., Tsai, T. I., & Lin, Y. S. (2007b). Using mega-trend-diffusion and artificial samples in small data set learning for early flexible manufacturing system scheduling knowledge. Computers & Operations Research, 34(4), 966-982.

Li, D. C., Yeh, C. W., & Li, Z. Y. (2008). A case study: The prediction of Taiwan's export of polyester fiber using small-data-set learning methods. Expert Systems with Applications, 34(3), 1983-1994.

Li, R. M., Yao, J. Q., Shang, H. J., & Ruan, D. (2010). An optimal model of information diffusion principles to risk and decision analysis of breast cancer morbidity. Soft Computing, 14(12), 1297-1303.

Liu, X. P., Zhang, J. Q., Cai, W. Y., & Tong, Z. J. (2010). Information diffusion-based spatio-temporal risk analysis of grassland fire disaster in northern China. Knowledge-Based Systems, 23(1), 53-60.

Niyogi, P., Girosi, F., & Poggio, T. (1998). Incorporating prior information in machine learning by creating virtual examples. Proceedings of the IEEE, 86(11), 2196-2209.

Oniśko, A., Druzdzel, M. J., & Wasyluk, H. (2001). Learning Bayesian network parameters from small data sets: application of Noisy-OR gates. International Journal of Approximate Reasoning, 27(2), 165-182.

Slifker, J. F., & Shapiro, S. S. (1980). The Johnson system: selection and parameter estimation. Technometrics, 22, 239-246.

Thomas, M., Kanstein, A., & Goser, K. (1997). Rare fault detection by possibilistic reasoning. Computational Intelligence - Theory and Applications, 1226, 294-298.

Tsai, T. I., & Li, D. C. (2008). Utilize bootstrap in small data set learning for pilot run modeling of manufacturing systems. Expert Systems with Applications, 35(3), 1293-1300.

Yang, J., Yu, X., Xie, Z.-Q., & Zhang, J.-P. (2011). A novel virtual sample generation method based on Gaussian distribution. Knowledge-Based Systems, 24(6), 740-748.
Full-text access permissions
  • Authorized for on-campus browsing and printing of the electronic full text, available from 2016-08-17.