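The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.
(Note: if the thesis cannot be found, or its holdings status shows "closed stacks, not public", the copy is not in the stacks and cannot be retrieved.)
System ID U0026-2507201612135200
Title (Chinese) 高維度資料中交互作用的探討-以多階段製程資料為例
Title (English) A Study of Interaction Effect for High Dimensional Data with Application to Manufacturing Data of Multistage Process
Institution National Cheng Kung University
Department (Chinese) 統計學系
Department (English) Department of Statistics
Academic year 104 (ROC calendar; 2015-2016)
Semester 2
Year of publication 105 (ROC calendar; 2016)
Author (Chinese) 歐嘉瑜
Author (English) Ka-U Ao
Student ID R26034014
Degree Master's
Language English
Pages 49
Oral defense committee Advisor: 鄭順林
Convener: 馬瀰嘉
Committee member: 李國榮
Keywords (Chinese) 多階段製程生產資料  動態貝氏網路  高階交互作用  協同作用  類別型時間數列  迴歸樹
Keywords (English) Multistage manufacturing process data  Dynamic Bayesian Network  High-order interactions  Synergy factors  Categorical-valued time series  Regression tree
Subject classification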
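Abstract (translated from Chinese) In statistical analysis, main effects and interaction effects are equally important to the response variable. However, for various reasons, such as variable screening performed because the number of variables exceeds the sample size, the influence of main effects can mask the influence of interaction (synergy) effects, so that interactions are overlooked. Both industrial statistics and bioinformatics face this kind of problem.
In the semiconductor industry, fabs require enormous investment and consume considerable manpower and money, yet after many manufacturing stages the finished products are often defective or of poor quality. To reduce the resulting losses and improve product quality, collecting and analyzing the historical data recorded while products are work in process has become a trend. The main purpose of this thesis is to use several statistical and machine learning methods to find the tools, or combinations of tools, that lead to poor yield, and to provide suspected factors of poor quality to help locate root causes.
During manufacturing, the tool used for each product at each stage is recorded. To find problem tools that may cause poor yield, this study uses Dynamic Bayesian Network (Dean and Kanazawa, 1989) and Learned Pattern Similarity (Baydogan and Runger, 2015) to look for the factors behind poor yield (especially interactions and synergy effects), and compares the results of these two methods with those obtained from traditional data mining methods.
The main contribution of this thesis is a proposed procedure that overcomes the first-order Markov chain limitation of the Dynamic Bayesian Network, making it more flexible in application. In addition, a modified method based on the idea of Learned Pattern Similarity is used to find suspected tool combinations that lower yield. Finally, the results of this study can assist in analyzing the suspected factors that cause poor yield.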
Abstract (English) In statistical analysis, both main effects and interaction effects have an important influence on the response variable. However, steps such as variable filtering, performed because of computational burden, can cause interaction terms (sometimes called synergy factors) to be masked. The same problem arises in both industrial statistics and bioinformatics.
The semiconductor manufacturing industry requires huge investment and consumes substantial human and financial resources. However, after a great number of manufacturing stages, the final products often have defects or perform poorly. To reduce the resulting losses and improve product quality, collecting and analyzing historical work-in-process (WIP) data has become a trend. The aim of this thesis is to use statistical and data mining methods to identify the tool or tools that affect yield; these results can help find the possible root causes of defective products.
The tools used at each stage are recorded during the manufacturing process. To find suspected tools (especially interactions or synergy factors), Dynamic Bayesian Network (Dean and Kanazawa, 1989) and Learned Pattern Similarity (Baydogan and Runger, 2015) are considered in this thesis. Finally, the results of these methods are compared with those of traditional data mining strategies.
One of the major contributions of this research is a framework for finding suspected tools with Dynamic Bayesian Network. The framework relaxes the assumption of a first-order Markov process with fixed transition probabilities. In addition, an approach based on the concept of Learned Pattern Similarity is proposed. The results of these approaches identify the tools whose use could reduce yield rates.
Table of contents Abstract (in Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1 Background and Motivation
1.2 Literature Review
1.2.1 Time Series Regression Tree
1.2.2 Bayesian Network and Dynamic Bayesian Network
1.3 Data Description
Chapter 2. Methodology
2.1 Learned Pattern Similarity (LPS)
2.2 Bayesian Network
2.3 Dynamic Bayesian Network (DBN)
2.4 Proposed Methods
2.4.1 Modification of LPS for Categorical-valued Time Series (mLPS)
2.4.2 Modification of DBN for Multi-stage Manufacturing Data (mDBN)
2.5 Traditional Data Mining Strategies for Paths Finding
2.5.1 Association Rule
2.5.2 Recursive Partitioning Methods
Chapter 3. Real Data Application
3.1 Synergy Factors Finding with mLPS
3.2 The mLPSfast: an Improvement of mLPS
3.3 Suspected Tool Paths Finding with mDBN
3.4 Traditional Methods for Finding Tools Paths
3.5 Comparing Results of Different Methods
Chapter 4. Conclusion and Discussion
References
Appendix A. Table
References Baydogan, M. G. (2013). Learned pattern similarity (LPS). homepage: www.mustafabaydogan.com/learned-pattern-similarity-lps.html/.

Baydogan, M. G. and Runger, G. (2015). Time series representation and similarity based on local autopatterns. Data Mining and Knowledge Discovery, pages 1–34.

Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.

Borgelt, C. (2003). Efficient implementations of apriori and eclat. In FIMI’03: Proceedings of the IEEE ICDM workshop on frequent itemset mining implementations.

Cortina-Borja, M., Smith, A. D., Combarros, O., and Lehmann, D. J. (2009). The synergy factor: a statistic to measure interactions in complex diseases. BMC Research Notes, 2(1):1.

Dean, T. and Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational intelligence, 5(2):142–150.

Ghahramani, Z. (1998). Learning dynamic bayesian networks. In Adaptive processing of sequences and data structures, pages 168–197. Springer.

Hahsler, M., Buchta, C., Gruen, B., and Hornik, K. (2016). arules: Mining Association Rules and Frequent Itemsets. R package version 1.4-1.

Hahsler, M., Grün, B., and Hornik, K. (2007). Introduction to arules–mining association rules and frequent item sets. SIGKDD Explor, 2(4).

Han, K. and Wang, K. (2013). Coordination and control of batch-based multistage processes. Journal of Manufacturing Systems, 32(2):372–381.

Lèbre, S. (2009). Inferring dynamic genetic networks with low order independencies. Statistical applications in genetics and molecular biology, 8(1):1–38.

Lèbre, S., Becq, J., Devaux, F., Stumpf, M. P., and Lelandais, G. (2010). Statistical inference of the time-varying structure of gene-regulation networks. BMC systems biology, 4(1):130.

Lebre, S., original version 1.0 by Sophie Lebre, and contribution of Julien Chiquet to version 2.0 (2013). G1DBN: A package performing Dynamic Bayesian Network inference. R package version 3.1.1.

Liu, S., Chen, F., and Lu, W. (2002). Wafer bin map recognition using a neural network approach. International Journal of production research, 40(10):2207–2223.

Murphy, K. P. (2002). Dynamic bayesian networks: representation, inference and learning. PhD thesis, University of California, Berkeley.

Nagarajan, R., Scutari, M., and Lèbre, S. (2013). Bayesian networks in R. Springer, 122:125–127.

R Core Team (2015). R: A language and environment for statistical computing.

Robinson, J. W. and Hartemink, A. J. (2010). Learning non-stationary dynamic bayesian networks. Journal of Machine Learning Research, 11(Dec):3647–3680.

Russell, S. J., Norvig, P., Canny, J. F., Malik, J. M., and Edwards, D. D. (2003). Artificial intelligence: a modern approach, volume 2. Prentice hall Upper Saddle River.

Strobl, C., Malley, J., and Tutz, G. (2009). An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological methods, 14(4):323.

Therneau, T., Atkinson, B., and Ripley, B. (2015). rpart: Recursive Partitioning and Regression Trees. R package version 4.1-10.

Therneau, T. M. and Atkinson, E. J. (1997). An introduction to recursive partitioning using the rpart routines.

Verron, S., Li, J., and Tiplica, T. (2010). Fault detection and isolation of faults in a multivariate process with bayesian network. Journal of Process Control, 20(8):902–911.

Yang, L. and Lee, J. (2012). Bayesian belief network-based approach for diagnostics and prognostics of semiconductor manufacturing systems. Robotics and Computer-Integrated Manufacturing, 28(1):66–74.

Zhang, Z. and Dong, F. (2014). Fault detection and diagnosis for missing data systems with a three time-slice dynamic Bayesian network approach. Chemometrics and Intelligent Laboratory Systems, 138:30–40.
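Full-text usage rights
  • Authorized for on-campus browsing and printing of the electronic full text; publicly available from 2021-07-25.

  • If you have any questions, please contact the library.
    Phone: (06)2757575#65773
    E-mail: etds@email.ncku.edu.tw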