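System ID U0026-3107201823405800
Title (Chinese) 蛋白質結合親合度與癌症病人臨床資料之關係
Title (English) The Relationship between Protein Binding Affinity and the Clinical Data of Cancer Patients
University National Cheng Kung University
Department (Chinese) 資訊工程學系
Department (English) Institute of Computer Science and Information Engineering
Academic year 106 (ROC calendar; 2017-2018)
Semester 2
Year of publication 107 (ROC calendar; 2018)
Student (Chinese) 簡立銘
Student (English) Li-Ming Chien
Student ID P76054177
Degree Master's
Language English
Pages 35
Oral defense committee Advisor: 蔣榮先
Co-advisor: 林鵬展
Committee member: 劉宗霖
Committee member: 張文綺
Committee member: 沈孟儒
Co-advisor: 楊士德
Keywords (translated from Chinese) next-generation sequencing, single-nucleotide variant, cancer, protein simulation, protein-protein interaction, clinical data
Keywords (English) NGS, SNP, cancer, protein-protein interaction, clinical
Subject classification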
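Abstract (translated from Chinese) As next-generation sequencing (NGS) technology matures, research based on NGS data has proliferated. Some studies search DNA sequence data for single-nucleotide polymorphisms (SNPs) and use statistical or machine-learning methods to associate these mutations with disease. Others go further and convert SNPs into mutational signatures, attempting to infer a disease's mutational pathways from them. Beyond SNPs, there are also studies based on other forms of mutation (e.g., CNV, INDEL, structural variation).
However, the key actors in the body's chemical reactions and signal transduction are proteins, which are produced from DNA. Many studies have found that proteins whose structure or sequence is altered by DNA mutations are closely associated with the onset, and even the progression, of cancer.
This study therefore focuses on collecting SNPs in protein-coding regions and uses protein structure simulation and protein docking simulation to reconstruct changes in protein interactions at the level of three-dimensional structure. These changes are quantified to build a patient profile based on protein-protein interactions (PPI). The aim is to use computational methods to find associations between this profile and patient phenotypes or clinical records.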
Abstract (English) As next-generation sequencing (NGS) technology has matured in recent years, a growing body of research has been built on the analysis of NGS data. Some studies correlate diseases with single-nucleotide polymorphisms (SNPs) found in NGS data. Others go further, transforming these SNPs into mutational signatures in an attempt to explain the mutational pathways of certain diseases. Beyond these strategies, there is also research based on other mutation types, e.g., CNV, INDEL, and structural variation.
Chemical reactions and physiological signal transmission rely on the participation of proteins, which are built according to our DNA sequences. Accordingly, many studies have reported that the occurrence or progression of cancer is strongly related to structural or sequence alterations in proteins that are attributable to mutations in the DNA sequence.
This research focuses on collecting SNPs in protein-coding regions and on revealing changes in the relationships among proteins' 3D structures through protein structure simulation and protein docking. These changes are quantified to build a protein-protein interaction (PPI) profile for each patient. The profiles are intended to reveal, through in silico evaluation, relationships with the clinical status or phenotypes of patients.
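As a rough illustration of the profile-building idea in the abstract, the following Python sketch computes a per-patient PPI profile and tests one interaction against a binary clinical label. Everything specific here is an assumption rather than the thesis's method: this record does not define the PPI score, so the profile is taken to be the mutated docking score minus the wild-type score for each protein pair, and the function names, protein pairs, and numbers are hypothetical.

```python
from math import comb


def ppi_profile(wild_type, mutated):
    """Per-interaction score change (mutated minus wild-type) for one patient.

    Assumption: a pair missing from `mutated` is unaffected by the patient's
    variants, so its change is zero.
    """
    return {pair: mutated.get(pair, wt) - wt for pair, wt in wild_type.items()}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical docking scores for one patient (illustrative values only).
wild_type_scores = {("P1", "P2"): -10.0, ("P1", "P3"): -5.0}
mutated_scores = {("P1", "P2"): -7.5}
profile = ppi_profile(wild_type_scores, mutated_scores)

# Hypothetical counts: rows = interaction altered / not altered,
# columns = clinical group A / clinical group B.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(profile, round(p_value, 4))
```

The test is written out with `math.comb` only to keep the sketch dependency-free; a statistics library's implementation would normally be preferable in practice.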
Table of contents Chapter 1: Introduction
1.1 Background
1.2 Aims
1.3 Organization
Chapter 2: Related Works
2.1 Previous Studies about Endometrial Carcinoma
2.2 InterPred
2.3 Related Databases
2.3.1 STRING
2.3.2 RCSB PDB
2.3.3 NCI GDC
Chapter 3: Methods and Materials
3.1 Overview
3.2 PPI Retrieving Pipeline
3.3 Patients Information
3.3.1 Genome Variant for patient
3.3.2 Patient Clinical Data
3.4 Protein Simulation Pipeline
3.4.1 Basic simulation pipeline
3.4.2 BLAST
3.4.3 Modeller
3.4.4 TM-align
3.4.5 FiberDock
3.4.6 PPI score and PPI profile
3.4.7 Wild-type PPI score and the Mutated PPI score
3.4.8 PPI score transformation
3.5 Analysis strategies
3.5.1 Fisher’s Exact Test
3.5.2 Kaplan-Meier Estimator
3.5.3 Weka Feature Selection Module
3.5.4 Cluster Patients with K-means
Chapter 4: Experimental Results
4.1 Overview of Patient and PPI Panel
4.2 PPI Mutations and Histology type
4.3 PPI Mutations and Recurrence Free Survival
4.4 Re-group Patients with Multiple PPI
Chapter 5: Conclusions and future work
5.1 Conclusions
5.2 Future work
Reference



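Full-text access rights
  • Authorized for on-campus browsing/printing of the electronic full text, publicly available from 2019-09-01.
  • Authorized for off-campus browsing/printing of the electronic full text, publicly available from 2019-09-01.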

