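System ID U0026-0908201609022700
Title (Chinese) 藉由文獻探勘預測合成致死關係:大腸癌的個案探討
Title (English) Prediction of Synthetic Lethality by Literature Mining: Case Studies of Colon Cancer
University National Cheng Kung University (成功大學)
Department (Chinese) Department of Computer Science and Information Engineering (資訊工程學系)
Department (English) Institute of Computer Science and Information Engineering
Academic year 104 (ROC calendar, 2015-2016)
Semester 2
Publication year 105 (ROC calendar, 2016)
Student (Chinese) 簡名昱
Student (English) Ming-Yu Chien
Email imwilly37@iir.csie.ncku.edu.tw
Student ID P76034698
Degree Master's
Language English
Pages 43
Oral defense committee Advisor: 蔣榮先
Co-advisor: 楊士德
Committee member: 曾大千
Committee member: 林鵬展
Committee member: 李宗儒
Keywords (Chinese) synthetic lethality  literature mining  screening experiments  cancer  colon cancer  essential gene  mutant gene 
Keywords (English) synthetic lethality  literature mining  text-mining  screening data  cancer  colon cancer  essential gene  mutant gene 
Subject classification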
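Abstract (Chinese) Synthetic lethality is a relationship between two genes: a cell survives when either one of the two genes fails to function normally, but dies when both genes are non-functional at the same time. The phenomenon was discovered as early as 1946, but it has recently been found to be useful for treating cancer. Because cell death requires both genes to fail together, it can be used to target cancer cells while having almost no effect on normal cells. However, discovering new synthetic lethal relationships is very complicated, and the experiments are hard to carry out because of their cost, so a more convenient system for predicting synthetic lethality is needed.
This study designs a system for predicting synthetic lethality. It first uses literature mining to obtain more potential essential genes, then uses these essential genes together with known mutant genes to predict possible synthetic lethal relationships, and finally uses gene co-expression and the number of times two genes co-occur in articles as a filtering mechanism to find the most likely synthetic lethal relationships. Because known essential genes are quite scarce compared with mutant genes, using literature mining to expand the number of essential genes is very important; otherwise the predicted synthetic lethal relationships would be limited to only a few cancers.
Finally, this study compares its results with screening experiments, attempting to identify the more likely synthetic lethal relationships within the screening data. Some synthetic lethality screening experiments have found large numbers of possible synthetic lethal relationships, but the correctness of these relationships cannot be determined further. Our system can help biomedical researchers find the more likely synthetic lethal relationships in screening experiments. This study also presents case studies of several particularly likely synthetic lethal relationships and finds some information suggesting that these relationships in colon cancer may be correct and newly discovered.
This study combines biological gene data with literature mining to predict more possible synthetic lethal relationships. Literature mining supplies additional essential gene data to make up for gaps in the biological data, a prediction model then predicts possible synthetic lethal relationships, and these predictions are used to pick out the most likely synthetic lethal relationships from screening experiments, so that medical researchers can quickly find potential synthetic lethal relationships of interest.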
Abstract (English) Synthetic lethality (SL) is an interaction between two genes in which the cell survives if either one of the two genes is disabled but dies if both are disabled. SL was discovered in 1946 but has become more popular recently because some cancer therapies are based on this interaction. Thanks to this interaction, we can kill cancer cells while sparing normal cells by targeting the synthetic lethal partner of a gene that is mutated in the cancer. However, finding an SL interaction with traditional experiments is costly. Therefore, we propose a new SL prediction system that makes it easy to predict potential SL interactions in silico.
In this study, we design an SL prediction system based on a text-mining method and an inference model. First, we extend the set of essential genes using the text-mining method. Compared with mutant genes, only a few essential genes, in a few cancers, are recorded in databases. With the potential essential genes added by text mining, we can predict more SL interactions without being restricted by the scant essential gene data. Second, we predict SL interactions from mutant genes and the extended essential genes, and filter out wrong gene pairs using gene co-expression and the co-occurrence of the two genes in the literature.
We also compare our prediction system with experimental screening data. Recently, many SL interactions have been discovered through a few screening experiments, but it is hard for biological researchers to identify the more important ones in these screening data. With our prediction system, biological researchers can find potential SL interactions more easily. We also study some cases in colon cancer and find information suggesting that some novel potential SL interactions might be valid.
This research predicts potential SL interactions by combining a text-mining method and gene data. We extract more essential genes through literature mining to compensate for the lack of essential gene data. We then predict potential SL interactions using an inference model and compare the results with screening data to allow biological researchers to find interesting SL interactions quickly.
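The pairing-and-filtering step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the gene labels, the Pearson correlation measure, the thresholds, and the direction of the filter (keeping pairs with strong co-expression and at least one literature co-occurrence) are all assumptions made for illustration, and the record does not specify any of them.

```python
from itertools import product


def pearson(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (var_x * var_y) ** 0.5


def predict_sl_pairs(mutant_genes, essential_genes, expression,
                     cooccurrence, min_abs_corr=0.5, min_cooccurrence=1):
    """Pair each mutant gene with each essential gene, then filter.

    Both thresholds are hypothetical; the source only says that
    co-expression and literature co-occurrence are used as filters.
    """
    kept = []
    for mutant, essential in product(sorted(mutant_genes),
                                     sorted(essential_genes)):
        if mutant == essential:
            continue
        if mutant not in expression or essential not in expression:
            continue
        corr = pearson(expression[mutant], expression[essential])
        count = cooccurrence.get(frozenset((mutant, essential)), 0)
        if abs(corr) >= min_abs_corr and count >= min_cooccurrence:
            kept.append((mutant, essential, corr, count))
    # Rank by co-occurrence count, then by correlation strength.
    kept.sort(key=lambda r: (-r[3], -abs(r[2]), r[0], r[1]))
    return kept


# Toy data with made-up gene labels and values.
expression = {
    "MUT1": [1, 2, 3, 4],
    "ESS1": [2, 4, 6, 8],
    "ESS2": [5, 1, 4, 2],
}
cooccurrence = {
    frozenset(("MUT1", "ESS1")): 3,
    frozenset(("MUT1", "ESS2")): 5,
}
pairs = predict_sl_pairs({"MUT1"}, {"ESS1", "ESS2"}, expression, cooccurrence)
for mutant, essential, corr, count in pairs:
    print(mutant, essential, round(corr, 3), count)
```

In this toy run only the MUT1/ESS1 pair survives, because the MUT1/ESS2 profiles are too weakly correlated under the assumed threshold even though that pair co-occurs more often.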
Table of Contents Chinese Abstract III
Abstract V
Contents VIII
List of Tables X
List of Figures XI
Chapter 1 Introduction 1
1.1 Background 1
1.2 Brief Introduction of Methods 2
1.3 Research Objective and Specific Aims 4
1.4 Thesis Organization 5
Chapter 2 Related Work 6
2.1 Text Mining in Biomedical Literature 6
2.2 Relation Extraction in Biomedical Literature 7
2.3 Experiments on Synthetic Lethality in Human Cells 9
2.4 Prediction of Human Synthetic Lethality 10
2.5 Related Database of Human Synthetic Lethality 11
Chapter 3 Predicting Potential Synthetic Lethality by Literature Mining 13
3.1 Prediction Inference Model 13
3.2 Prediction Approach Architecture 15
3.3 Data Pre-processing 18
3.4 Extracting and Ranking Trigger Terms 20
3.5 Extracting and Ranking Essential Genes 25
3.6 Predicting and Ranking Synthetic Lethality 26
Chapter 4 Experiments 28
4.1 Experimental design 28
4.2 Experiments about essential gene extraction 29
4.3 Experiments on synthetic lethality prediction 31
4.4 Case study 35
Chapter 5 Conclusions and Future Work 37
5.1 Conclusions 37
5.2 Future Work 38
References 40
References Andronis, C., A. Sharma, V. Virvilis, S. Deftereos and A. Persidis (2011). "Literature mining, ontologies and information visualization for drug repurposing." Briefings in bioinformatics 12(4): 357-368.
Azorsa, D. O., I. M. Gonzales, G. D. Basu, A. Choudhary, S. Arora, K. M. Bisanz, J. A. Kiefer, M. C. Henderson, J. M. Trent and D. D. Von Hoff (2009). "Synthetic lethal RNAi screening identifies sensitizing targets for gemcitabine therapy in pancreatic cancer." J Transl Med 7(43).
Barbosa-Silva, A., J.-F. Fontaine, E. R. Donnard, F. Stussi, J. M. Ortega and M. A. Andrade-Navarro (2011). "PESCADOR, a web-based tool to assist text-mining of biointeractions extracted from PubMed queries." BMC bioinformatics 12(1): 435.
Barretina, J., G. Caponigro, N. Stransky, K. Venkatesan, A. A. Margolin, S. Kim, C. J. Wilson, J. Lehár, G. V. Kryukov and D. Sonkin (2012). "The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity." Nature 483(7391): 603-607.
Blank, J. L., X. J. Liu, K. Cosmopoulos, D. C. Bouck, K. Garcia, H. Bernard, O. Tayber, G. Hather, R. Liu and U. Narayanan (2013). "Novel DNA damage checkpoints mediating cell death induced by the NEDD8-activating enzyme inhibitor MLN4924." Cancer research 73(1): 225-234.
Bui, Q.-C., S. Katrenko and P. M. Sloot (2011). "A hybrid approach to extract protein–protein interactions." Bioinformatics 27(2): 259-265.
Bui, Q.-C., P. M. Sloot, E. M. van Mulligen and J. A. Kors (2014). "A novel feature-based approach to extract drug–drug interactions from biomedical text." Bioinformatics 30(23): 3365-3371.
Campos, D., S. Matos and J. L. Oliveira (2013). "Gimli: open source and high-performance biomedical name recognition." BMC bioinformatics 14(1): 54.
Cer, D. M., M.-C. De Marneffe, D. Jurafsky and C. D. Manning (2010). Parsing to Stanford Dependencies: Trade-offs between Speed and Accuracy. LREC.
Davis, A. P., C. J. Grondin, K. Lennon-Hopkins, C. Saraceni-Richards, D. Sciaky, B. L. King, T. C. Wiegers and C. J. Mattingly (2014). "The Comparative Toxicogenomics Database's 10th year anniversary: update 2015." Nucleic acids research: gku935.
Dobzhansky, T. (1946). "Genetics of natural populations. XIII. Recombination and variability in populations of Drosophila pseudoobscura." Genetics 31(3): 269-290.
Felth, J., L. Rickardson, J. Rosén, M. Wickström, M. Fryknäs, M. Lindskog, L. Bohlin and J. Gullbo (2009). "Cytotoxic effects of cardiac glycosides in colon cancer cells, alone and in combination with standard chemotherapeutic drugs." Journal of natural products 72(11): 1969-1974.
Firth, H. V., S. M. Richards, A. P. Bevan, S. Clayton, M. Corpas, D. Rajan, S. Van Vooren, Y. Moreau, R. M. Pettett and N. P. Carter (2009). "DECIPHER: database of chromosomal imbalance and phenotype in humans using ensembl resources." The American Journal of Human Genetics 84(4): 524-533.
Fong, P., D. Boss, T. A. Yap, A. Tutt, P. Wu, M. Mergui-Reolvink, P. Mortimer, H. Swaisland, A. Lau, M. J. O'Connor, A. Ashworth, J. Carmichael, S. B. Kaye, J. H. M. Schellens and J. S. d. Bono (2009). "Inhibition of poly (ADP-ribose) polymerase in tumors from BRCA mutation carriers." The New England journal of medicine 361(2): 123-134.
Fontaine, J.-F., A. Barbosa-Silva, M. Schaefer, M. R. Huska, E. M. Muro and M. A. Andrade-Navarro (2009). "MedlineRanker: flexible ranking of biomedical literature." Nucleic acids research 37(suppl 2): W141-W146.
Forbes, S. A., D. Beare, P. Gunasekaran, K. Leung, N. Bindal, H. Boutselakis, M. Ding, S. Bamford, C. Cole and S. Ward (2015). "COSMIC: exploring the world's knowledge of somatic mutations in human cancer." Nucleic acids research 43(D1): D805-D811.
Guo, J., H. Liu and J. Zheng (2016). "SynLethDB: synthetic lethality database toward discovery of selective and sensitive anticancer drug targets." Nucleic acids research 44(D1): D1011-D1017.
Helleday, T. (2011). "The underlying mechanism for the PARP and BRCA synthetic lethality: clearing up the misunderstandings." Molecular oncology 5(4): 387-393.
Huang, M., X. Zhu, Y. Hao, D. G. Payan, K. Qu and M. Li (2004). "Discovering patterns to extract protein–protein interactions from full texts." Bioinformatics 20(18): 3604-3612.
Jerby-Arnon, L., N. Pfetzer, Y. Y. Waldman, L. McGarry, D. James, E. Shanks, B. Seashore-Ludlow, A. Weinstock, T. Geiger, P. A. Clemons, E. Gottlieb and E. Ruppin (2014). "Predicting cancer-specific vulnerability via data-driven detection of synthetic lethality." Cell 158(5): 1199-1209.
Kaewphan, S., S. Van Landeghem, T. Ohta, Y. Van de Peer, F. Ginter and S. Pyysalo (2016). "Cell line name recognition in support of the identification of synthetic lethality in cancer from text." Bioinformatics 32(2): 276-282.
Kilicoglu, H. and S. Bergler (2009). Syntactic dependency based heuristics for biological event extraction. Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing: Shared Task, Association for Computational Linguistics.
Koh, J. L. Y., K. R. Brown, A. Sayad, D. Kasimer, T. Ketela and J. Moffat (2012). "COLT-Cancer: functional genetic screening resource for essential genes in human cancer cell lines." Nucleic acids research 40(Database issue): D957-963.
Leaman, R., R. I. Doğan and Z. Lu (2013). "DNorm: disease name normalization with pairwise learning to rank." Bioinformatics 29(22): 2909-2917.
Li, X.-j., S. K. Mishra, M. Wu, F. Zhang and J. Zheng (2014). "Syn-lethality: an integrative knowledge base of synthetic lethality towards discovery of selective anticancer therapies." BioMed research international 2014: 196034.
Luo, J., M. J. Emanuele, D. Li, C. J. Creighton, M. R. Schlabach, T. F. Westbrook, K.-K. Wong and S. J. Elledge (2009). "A genome-wide RNAi screen identifies multiple synthetic lethal interactions with the Ras oncogene." Cell 137(5): 835-848.
Nijman, S. M. B. (2011). "Synthetic lethality: general principles, utility and detection using genetic screens in human cells." FEBS letters 585(1): 1-6.
Ono, T., H. Hishigaki, A. Tanigami and T. Takagi (2001). "Automated extraction of information on protein–protein interactions from the biological literature." Bioinformatics 17(2): 155-161.
Peng, Y., M. Torii, C. H. Wu and Vijay-Shanker (2014). "A generalizable NLP framework for fast development of pattern-based biomedical relation extraction systems." BMC bioinformatics 15: 285.
Reinhold, W. C., M. Sunshine, H. Liu, S. Varma, K. W. Kohn, J. Morris, J. Doroshow and Y. Pommier (2012). "CellMiner: a web-based suite of genomic and pharmacologic tools to explore transcript and drug patterns in the NCI-60 cell line set." Cancer research 72(14): 3499-3511.
Ryan, C. J., C. J. Lord and A. Ashworth (2014). "DAISY: picking synthetic lethals from cancer genomes." Cancer cell 26(3): 306-308.
Schmidt, E. E., O. Pelz, S. Buhlmann, G. Kerr, T. Horn and M. Boutros (2013). "GenomeRNAi: a database for cell-based and in vivo RNAi phenotypes, 2013 update." Nucleic acids research 41(D1): D1021-D1026.
Settles, B. (2005). "ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text." Bioinformatics 21(14): 3191-3192.
Spangler, S., A. D. Wilkins, B. J. Bachman, M. Nagarajan, T. Dayaram, P. Haas, S. Regenbogen, C. R. Pickering, A. Comer and J. N. Myers (2014). Automated hypothesis generation based on mining scientific literature. Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM.
Turner, N. C., C. J. Lord, E. Iorns, R. Brough, S. Swift, R. Elliott, S. Rayter, A. N. Tutt and A. Ashworth (2008). "A synthetic lethal siRNA screen identifying genes mediating sensitivity to a PARP inhibitor." The EMBO journal 27(9): 1368-1377.
Wei, C.-H., H.-Y. Kao and Z. Lu (2013). "PubTator: a web-based text mining tool for assisting biocuration." Nucleic acids research: gkt441.
Wei, C.-H., H.-Y. Kao and Z. Lu (2015). "GNormPlus: An Integrative Approach for Tagging Genes, Gene Families, and Protein Domains." BioMed research international 2015.
Wu, M., X. Li, F. Zhang, X. Li, C.-K. Kwoh and J. Zheng (2014). "In Silico Prediction of Synthetic Lethality by Meta-Analysis of Genetic Interactions, Functions, and Pathways in Yeast and Human Cancer." Cancer informatics 13(Suppl 3): 71-80.
Xie, B., Q. Ding, H. Han and D. Wu (2013). "miRCancer: a microRNA–cancer association database constructed by text mining on literature." Bioinformatics: btt014.
Xu, R., L. Li and Q. Wang (2013). "Towards building a disease-phenotype knowledge base: extracting disease-manifestation relationship from literature." Bioinformatics (Oxford, England) 29(17): 2186-2194.
Xu, R. and Q. Wang (2013). "Large-scale extraction of accurate drug-disease treatment pairs from biomedical literature for drug repurposing." BMC bioinformatics 14: 181.
Xu, R. and Q. Wang (2013). "A semi-supervised approach to extract pharmacogenomics-specific drug–gene pairs from biomedical literature for personalized medicine." Journal of biomedical informatics 46(4): 585-593.
Xu, R. and Q. Wang (2013). "A semi-supervised pattern-learning approach to extract pharmacogenomics-specific drug-gene pairs from biomedical literature." Journal of biomedical informatics 46(4): 585-593.
Yang, H.-T., J.-H. Ju, Y.-T. Wong, I. Shmulevich and J.-H. Chiang (2016). "Literature-based discovery of new candidates for drug repurposing." Briefings in bioinformatics: bbw030.
Zhou, D., D. Zhong and Y. He (2014). "Event trigger identification for biomedical events extraction using domain knowledge." Bioinformatics 30(11): 1587-1594.
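Full-Text Access Rights
  • Authorized for on-campus browsing/printing of the electronic full text, publicly available from 2016-08-22.
  • Authorized for off-campus browsing/printing of the electronic full text, publicly available from 2018-08-09.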

