A Study on Clustering Techniques for Biomedical Data Mining
Institute of Computer Science and Information Engineering
biomedical data mining
gene expression analysis
fuzzy association rule mining
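In recent years, the demand for and importance of knowledge discovery from biomedical data have grown steadily. For biologists, an analysis typically starts from a few targets of interest (for example, disease-related genes), proceeds to identify further related biological targets, and then moves on to larger volumes of data. Among data mining techniques, clustering analysis is one of the most important and is widely used in biology and medicine, for example in gene expression microarray analysis. In this dissertation, we propose three clustering-based mining methods for different analysis needs: query-driven fuzzy biclustering, integrated clustering, and fuzzy association rule mining, with the aim of giving biologists mining analyses that scale from small to large.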
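First, for the problem of finding biclusters in microarray data whose expression is similar to that of a gene the user is interested in, we propose Weighted Fuzzy-based Maximum Similarity Biclustering (WF-MSB). The user supplies a gene of interest (the reference gene), and the method finds other genes whose expression levels are similar to the reference gene over a subset of the sample dimensions. Unlike conventional biclustering methods, WF-MSB uses fuzzy theory to find biclusters at different similarity levels; in particular, it can find the biclusters most similar and most dissimilar to the reference gene. Gene Ontology (GO) annotations confirm that the genes within a bicluster are highly related biologically. Experiments on simulated and real microarray data show that the results of WF-MSB carry more significant biological meaning and expression signals than those of other methods.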
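Next, to analyze microarray data of different types together, we propose a method that integrates time-series and group-comparative (for example, drug-treated versus untreated samples) gene expression microarray data, called the mixture of Time-series and Group-comparative analysis algorithm (TGmix). With it, we can identify genes that show similar expression patterns in both the time-series part and the group-comparative part. The method works as follows. For each gene, the time-series and group-comparative expression values are combined into an integrated expression profile. We then propose a new similarity measure for two integrated profiles and use a density-based clustering algorithm to partition the integrated profiles into clusters. Finally, a filtering mechanism selects the most significantly related gene sets. In experiments on real microarray data from rat oral wound healing, TGmix finds many biologically meaningful results.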
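Finally, we propose a cluster-based fuzzy association rule mining algorithm, the Cluster-based Divide-and-conquer Genetic-Fuzzy approach with Multiple Minimum Supports (CDGFMMS). It uses a genetic algorithm, clustering analysis, and fuzzy theory to analyze transaction-type biomedical data and to find, for each biomedical item, the best minimum support threshold and membership functions, together with fuzzy association rules. Experiments verify that CDGFMMS finds the most suitable minimum support thresholds, membership functions, and fuzzy association rules for the items while substantially reducing execution time compared with other methods.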
The importance of discovering knowledge from biomedical data has grown rapidly in recent years. In general, the analysis of biomedical data proceeds from the study of a few targeted biomarkers, such as disease-related genes, to the analysis of relationships among a large number of biomarkers. Among data mining techniques, clustering analysis is one of the most important methods applied to biomedical problems such as gene expression microarray analysis. In this dissertation, we propose three clustering-based mining algorithms with different analysis purposes, namely query-driven fuzzy biclustering, integrated clustering, and fuzzy association rule mining, so that biologists can investigate anything from a single biomarker to a large number of them.
First, we propose a query-driven fuzzy biclustering (or co-clustering) algorithm, Weighted Fuzzy-based Maximum Similarity Biclustering (WF-MSB), which extracts biclusters at different similarity levels with respect to a user-defined reference gene. In particular, it can extract the biclusters most similar and most dissimilar to the reference gene, and both are functionally meaningful according to Gene Ontology (GO) annotations. In experiments on simulated and real gene expression datasets, WF-MSB substantially outperformed previous query-driven biclustering methods, in the sense that its biclusters contained more significant expression signals.
Second, we propose an integrated approach, called mixture of Time-series and Group-comparative analysis (TGmix), that works on both time-series and two-group comparative (drug-treated versus untreated samples) microarray datasets. It finds significant genes that are simultaneously co-expressed in the time-series part and differentially expressed in the group-comparative part. For each gene, the time-series profile and the two-group comparative profile are combined into an integrated gene profile. A novel similarity measure calculates the similarity between two integrated gene profiles. A density-based clustering algorithm then groups co-expressed genes into the same cluster. Finally, a filtering process selects the significant gene sets. In experiments on rat wound healing microarray datasets, TGmix was shown to be effective in finding gene clusters with biological meaning.
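The TGmix steps above (integrated profile, pairwise similarity, density-based clustering) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the novel similarity measure, so the `distance` function below is a hypothetical stand-in that averages a Pearson-based term on the time-series part and a relative difference on a single fold-change value; the clustering is a plain DBSCAN-style procedure; the final filtering step is omitted; and the gene values are made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def distance(p, q):
    """Hypothetical distance between integrated profiles (time_series, fold_change).

    NOT the measure from the dissertation; an assumed stand-in for illustration.
    """
    ts_term = (1.0 - pearson(p[0], q[0])) / 2.0                    # 0 when co-expressed
    fc_term = abs(p[1] - q[1]) / (abs(p[1]) + abs(q[1]) + 1e-12)   # lies in [0, 1]
    return 0.5 * ts_term + 0.5 * fc_term

def dbscan(profiles, eps, min_pts):
    """DBSCAN-style clustering; returns one label per profile, -1 meaning noise."""
    n = len(profiles)
    labels = [None] * n  # None = not yet visited

    def neighbors(i):
        return [j for j in range(n) if distance(profiles[i], profiles[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:       # not a core point (count includes i itself)
            labels[i] = -1
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:        # earlier noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:     # expand only through core points
                queue.extend(nb)
        cluster += 1
    return labels

# Made-up integrated profiles: (time-series expression, treated-vs-untreated log fold change)
genes = [
    ([1, 2, 3, 4], 2.0),
    ([2, 4, 6, 8], 1.8),
    ([1, 2, 3, 5], 2.2),
    ([4, 3, 2, 1], -2.0),
    ([8, 6, 4, 2], -1.9),
    ([1, 3, 2, 1], 0.1),
]
labels = dbscan(genes, eps=0.2, min_pts=2)
print(labels)
```

With these toy values, the three rising, up-regulated genes fall into one cluster, the two falling, down-regulated genes into another, and the last gene is left as noise. The thresholds `eps` and `min_pts` are illustrative choices, not values taken from the dissertation.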
Finally, we propose an efficient cluster-based fuzzy association rule mining algorithm, called Cluster-based Divide-and-conquer Genetic-Fuzzy approach with Multiple Minimum Supports (CDGFMMS), for discovering associated items in biomedical data. CDGFMMS combines a genetic algorithm (GA), a clustering technique, and fuzzy concepts to discover suitable minimum supports, membership functions, and useful fuzzy association rules from quantitative transactions. In experiments, CDGFMMS ran more efficiently than existing algorithms.
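To make the terms "membership function" and "multiple minimum supports" concrete, the following minimal sketch computes the fuzzy support of each fuzzy region of an item and keeps the regions that meet that item's own minimum support. It assumes triangular membership functions and the usual definition of fuzzy support as the mean membership over all transactions. The item names, region parameters, and thresholds are invented for illustration; in CDGFMMS these parameters are what the genetic search is meant to find, and that search is not shown here.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzy_support(quantities, region):
    """Mean membership of the quantities in one fuzzy region (a, b, c)."""
    a, b, c = region
    return sum(triangular(q, a, b, c) for q in quantities) / len(quantities)

def large_regions(transactions, regions, min_supports):
    """Return {(item, region_name): support} for regions meeting the item's threshold."""
    result = {}
    for item, item_regions in regions.items():
        # An item missing from a transaction counts as quantity 0
        quantities = [t.get(item, 0) for t in transactions]
        for name, region in item_regions.items():
            support = fuzzy_support(quantities, region)
            if support >= min_supports[item]:
                result[(item, name)] = support
    return result

# Hypothetical quantitative transactions over two items
transactions = [{"A": 2, "B": 8}, {"A": 4}, {"A": 6, "B": 9}, {"B": 7}]
regions = {
    "A": {"low": (0, 2, 6), "high": (2, 6, 10)},
    "B": {"low": (0, 2, 6), "high": (2, 6, 10)},
}
min_supports = {"A": 0.3, "B": 0.4}   # one threshold per item

large = large_regions(transactions, regions, min_supports)
print(large)
```

Here both regions of item A reach a support of 0.375 and pass A's threshold of 0.3, while B's "high" region has the same support but fails B's stricter threshold of 0.4, which is the effect of using per-item minimum supports.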
In summary, we propose a set of clustering-based algorithms with different mining purposes for the analysis of biomedical data. Performance evaluations on various simulated and real datasets show that the proposed methods effectively address the targeted problems in biomedical data mining.
ABSTRACT (IN CHINESE)
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
1.1 MOTIVATION
1.2 OVERVIEW OF THE DISSERTATION
1.2.1 A Weighted Fuzzy Biclustering Method for Gene Expression Data
1.2.2 Discovering Gene Clusters via Integrated Analysis on Time-Series and Group-Comparative Microarray Datasets
1.2.3 A Cluster-based Divide-and-conquer Genetic-Fuzzy Approach with Multiple Minimum Supports
1.3 ORGANIZATION OF THE DISSERTATION
CHAPTER 2 A WEIGHTED FUZZY BICLUSTERING METHOD FOR GENE EXPRESSION DATA ANALYSIS
2.1 INTRODUCTION
2.2 THE PROBLEM DEFINITIONS
2.3 THE PROPOSED METHOD: WF-MSB
2.4 EXPERIMENTAL EVALUATION
2.4.1 Synthetic Data
2.4.2 Experiments on Real Yeast Data
2.5 SUMMARY
CHAPTER 3 DISCOVERING GENE CLUSTERS VIA INTEGRATED ANALYSIS ON TIME-SERIES AND GROUP-COMPARATIVE MICROARRAY DATASETS
3.1 INTRODUCTION
3.2 PROPOSED METHOD: TGMIX
3.3 EXPERIMENTAL EVALUATION
3.4 SUMMARY
CHAPTER 4 A CLUSTER-BASED DIVIDE-AND-CONQUER GENETIC-FUZZY APPROACH WITH MULTIPLE MINIMUM SUPPORTS
4.1 INTRODUCTION
4.2 RELATED WORK
4.2.1 Fuzzy Data Mining Approaches
4.2.2 Genetic-Fuzzy Mining Approaches
4.3 THE PROPOSED FRAMEWORK
4.4 THE COMPONENTS OF THE PROPOSED GENETIC-FUZZY MINING APPROACH
4.4.1 Chromosome Representation
4.4.2 Initial Population
4.4.3 The Required Strength of Fuzzy Regions
4.4.4 Fitness Function and Selection
4.4.5 Clustering Chromosomes
4.4.6 Genetic Operators
4.4.7 The Proposed Genetic-Fuzzy Mining Algorithm
4.5 EXPERIMENTAL RESULTS
4.5.1 Description of the Experimental Datasets
4.5.2 Performance of the Proposed Approach
4.6 SUMMARY
CHAPTER 5 CONCLUSIONS AND FUTURE WORK