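System ID U0026-0812200911432796
Title (Chinese) 微陣列資料庫之模糊關連規則探勘
Title (English) Mining Fuzzy Association Patterns in Microarray Databases
Institution National Cheng Kung University
Department (Chinese) 資訊工程學系碩博士班
Department (English) Institute of Computer Science and Information Engineering
Academic year 93 (ROC calendar)
Semester 2
Year of publication 94 (ROC calendar; 2005)
Author (Chinese) 陳彥旭
Author (English) Yen-Hsu Chen
Student ID p7692177
Degree Master's
Language Chinese
Pages 53
Oral defense committee Committee member - 李強
Committee member - 蔣榮先
Committee member - 辛致煒
Advisor - 曾新穆
Committee convener - 洪宗貝
Keywords (Chinese) Association Rules  Fuzzy Theory  Gene Expression Analysis  Microarray  Data Mining
Keywords (English) Association Rule  Gene Expression Analysis  Microarray  Fuzzy theorem  Data Mining
Subject classification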
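Abstract (translated from Chinese)   In this study, we build on association rules, a data mining method, and integrate concepts from fuzzy theory. We first apply this idea to microarray analysis to form the FAGE (Fuzzy Association Gene Expression) algorithm, and then propose a new kind of pattern and rule called REGER (Ripple Effective Gene Expression Rule). Fuzzy theory converts quantitative descriptions into linguistic terms that are easier for people to understand, which also lets us obtain better rules. Recent studies have confirmed that association rules can uncover hidden relationships and interactions among genes, that is, gene patterns that traditional cluster analysis cannot reveal. FAGE first integrates fuzzy theory into microarray analysis and finds the rules among the genes. For example, "G1:L->G2:SH" means that when gene G1 is repressed, gene G2 is slightly activated. A rule found by REGER, such as "WSC4:L->SOK1:SH->HSP12:H", means that when gene WSC4 is repressed, SOK1 is slightly activated and HSP12 is activated; these genes may lie on the same biological pathway. Experimental analysis confirms that the proposed method finds more meaningful rules than traditional association rules, and that the REGER algorithm offers a new kind of rule for biologists to consult.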
Abstract (English)  In this research, we propose a novel method that builds on association rules and incorporates fuzzy theory. We apply fuzzy theory to microarray analysis and call the resulting method FAGE (Fuzzy Association Gene Expression). We further propose a novel pattern and rule named REGER (Ripple Effective Gene Expression Rule). Fuzzy association transforms quantitative values into human linguistic terms, so the rules it finds are more readable. Recent studies have shown that association rules can reveal interactions and relations between genes that traditional clustering methods do not find. Our method finds additional rules that traditional association rule mining misses. For example, the rule "G1:L->G2:SH" indicates that G2 is slightly up-regulated whenever G1 is down-regulated. A REGER rule may take the form "WSC4:L->SOK1:SH->HSP12:H". It shows that these expression states of WSC4, SOK1 and HSP12 occur together and that its fuzzy items, "L->SH->H", are monotone. A possible cause is that the genes lie on the same pathway. Empirical evaluation shows that our method finds more rules than the traditional one, and it provides a novel kind of rule to support biologists in further research and analysis.
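The idea in the abstract, turning expression values into linguistic terms and then scoring a rule such as "G1:L->G2:SH", can be illustrated with a small sketch. This is not the thesis's FAGE or REGER algorithm, which the abstract does not spell out. The term set (including "SL", which the abstract never names), the triangular membership peaks, the min-based fuzzy support, and the non-strict reading of "monotone" are all assumptions made for illustration, and the sample values are made up.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed ordered term set: Low, Slightly Low, Slightly High, High.
# The abstract names only L, SH and H; "SL" is a hypothetical addition.
TERMS: List[str] = ["L", "SL", "SH", "H"]
# Assumed peak position of each term on a log-ratio expression scale.
PEAKS: List[float] = [-2.0, -0.5, 0.5, 2.0]


def fuzzify(x: float) -> Dict[str, float]:
    """Map one expression value to a membership degree for each term.

    Uses triangular functions with flat shoulders at both ends, so the
    degrees for any value always sum to 1.
    """
    degrees = {term: 0.0 for term in TERMS}
    if x <= PEAKS[0]:
        degrees[TERMS[0]] = 1.0
        return degrees
    if x >= PEAKS[-1]:
        degrees[TERMS[-1]] = 1.0
        return degrees
    for i in range(len(PEAKS) - 1):
        lo, hi = PEAKS[i], PEAKS[i + 1]
        if lo <= x <= hi:
            weight = (x - lo) / (hi - lo)  # 0 at lo, 1 at hi
            degrees[TERMS[i]] = 1.0 - weight
            degrees[TERMS[i + 1]] = weight
            break
    return degrees


def fuzzy_support(samples: Sequence[Dict[str, float]],
                  itemset: Sequence[Tuple[str, str]]) -> float:
    """Average, over samples, of the minimum membership among the items.

    Each item is a (gene, term) pair such as ("G1", "L"). Taking the
    minimum as the fuzzy intersection is an assumed, common choice.
    """
    total = 0.0
    for sample in samples:
        total += min(fuzzify(sample[gene])[term] for gene, term in itemset)
    return total / len(samples)


def is_monotone(terms: Sequence[str]) -> bool:
    """True if the terms never reverse direction along the TERMS order."""
    ranks = [TERMS.index(term) for term in terms]
    pairs = list(zip(ranks, ranks[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


if __name__ == "__main__":
    # Made-up expression values for two samples.
    samples = [{"G1": -2.0, "G2": 0.5}, {"G1": 2.0, "G2": 0.5}]
    print(fuzzy_support(samples, [("G1", "L"), ("G2", "SH")]))
    print(is_monotone(["L", "SH", "H"]))
```

A real analysis would pick membership functions to suit the data; the thesis itself compares several (section 4.3 of its table of contents), so the peaks above should be read as placeholders.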
Table of contents Contents
Abstract (English)
Abstract (Chinese)
Acknowledgments
Contents
List of Tables
List of Figures

Chapter 1 Introduction
1.1 Background
1.2 Research Motivation
1.3 Problem Definition
1.4 Research Method
1.5 Contributions
1.6 Thesis Organization

Chapter 2 Related Work
2.1 Related Work in Bioinformatics
2.2 Fuzzy Theory
2.3 Association Rules
2.4 Integration of Association Rules and Fuzzy Theory
2.5 Association Rules and Clustering Methods

Chapter 3 Mining Fuzzy Association Rules in Microarray Databases
3.1 Preliminaries
3.2 The FAGE (Fuzzy Association Gene Expression) Algorithm
3.3 FAGE Example
3.4 The REGER (Ripple Effective Gene Expression Rule) Algorithm
3.5 REGER Example

Chapter 4 Experimental Analysis
4.1 Experimental Data and Environment
4.2 Comparison of FAGE with Traditional Association Rules
4.3 Results with Different Membership Functions
4.4 REGER Results
4.5 Summary of Experiments

Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work

References
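Full-text access permissions
  • Authorized for on-campus browsing/printing of the electronic full text, publicly available from 2005-08-31.
  • Authorized for off-campus browsing/printing of the electronic full text, publicly available from 2005-08-31.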

