Large-Scale Biomedical Literature Mining for Cross-Document Relation Extraction toward Drug Repurposing
Institute of Computer Science and Information Engineering
Diseases are generally caused by mutations of genes in the human body. For example, an oncogene, a normal gene that has been abnormally mutated, transforms normal cells into tumor cells. Conversely, mutation of a tumor suppressor gene, a gene that prevents a normal cell from becoming a tumor cell, causes normal cells to malfunction. The role of a drug in disease treatment is essentially to counter the effects of these gene mutations; therefore, drugs, genes, and diseases are closely bound up with each other. Bringing a new drug to market costs hundreds of millions of U.S. dollars and takes more than ten years. Much of this time and cost can be saved by repositioning approved drugs using existing resources, an approach known as drug repurposing. Although drug repurposing could play an important role in future drug development and disease therapeutics, the complexity of biological mechanisms and the spread of information technology have led to rapid growth in publicly accessible biomedical resources, which makes the relevant evidence difficult to find by hand. Hence, the objective of this research is to extract the relationships between drugs, genes, and diseases in order to explore new indications for existing drugs.
In this dissertation, we proposed a novel method to identify protein-protein interactions through semantic similarity measures among protein mentions. Moreover, we reduced a large volume of biomedical literature with a machine learning approach, using features generated by information retrieval techniques, to make important documents easier to find. Finally, we used natural language processing methods to infer indirect drug-disease relationships from large-scale biomedical literature, and we confirmed the suitability of the identified candidates for repurposing as anticancer drugs through a manual review of the literature and of clinical trials.
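The idea of inferring an indirect drug-disease relationship can be illustrated with a minimal sketch. It assumes that drug-gene and gene-disease relations have already been extracted as simple pairs, and it links a drug to a disease whenever they share at least one gene. The function names, the pair representation, and the ranking by number of shared genes are assumptions made for illustration only; they are not the dissertation's actual extraction or prioritization method, and the entity labels are placeholders rather than real findings.

```python
from collections import defaultdict


def infer_indirect_pairs(drug_gene, gene_disease, known=frozenset()):
    """Link each drug to each disease through the genes they share.

    drug_gene    -- iterable of (drug, gene) pairs (hypothetical input format)
    gene_disease -- iterable of (gene, disease) pairs (hypothetical input format)
    known        -- (drug, disease) pairs that are already established and
                    should therefore be excluded from the candidates
    Returns a dict mapping (drug, disease) to a sorted list of linking genes.
    """
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    candidates = defaultdict(set)
    for drug, gene in drug_gene:
        for disease in diseases_by_gene.get(gene, ()):
            if (drug, disease) not in known:
                candidates[(drug, disease)].add(gene)
    return {pair: sorted(genes) for pair, genes in candidates.items()}


def rank_candidates(candidates):
    """Order candidates by number of linking genes (descending), then by name.

    Counting shared genes is an illustrative scoring choice, not the
    prioritization scheme described in the dissertation.
    """
    return sorted(candidates.items(), key=lambda item: (-len(item[1]), item[0]))


# Placeholder relations; the labels carry no biomedical meaning.
example_drug_gene = [("drug_A", "gene_1"), ("drug_A", "gene_2"), ("drug_B", "gene_2")]
example_gene_disease = [("gene_1", "disease_X"), ("gene_2", "disease_X"), ("gene_2", "disease_Y")]
example_known = {("drug_B", "disease_Y")}

example_ranked = rank_candidates(
    infer_indirect_pairs(example_drug_gene, example_gene_disease, example_known)
)
```

In this toy input, the pair linked through two genes ranks ahead of the pairs linked through one, and the already known pair is left out. A real pipeline would also need entity normalization and a confidence estimate for each extracted relation.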
Abstract
LIST OF TABLES
LIST OF FIGURES
Chapter 1. INTRODUCTION
1.1 Motivation
1.2 Objective
1.3 Organization of Dissertation
Chapter 2. RELATED WORKS
2.1 Tools for Text Mining
2.2 Models for Relation Extraction
2.3 Databases for Relation Inference
Chapter 3. IDENTIFICATION OF NOVEL PROTEIN-PROTEIN INTERACTIONS
3.1 Background
3.2 Materials and Methods
3.2.1 Page count based semantic similarity (PCBSS)
3.2.2 Relation-words reinforcement (PCBSS-RWR)
3.2.3 Classification
3.3 Experiments and Results
3.3.1 Datasets
3.3.2 Evaluation metrics
3.3.3 Performance of the proposed approaches
3.3.4 Feature ablation
3.3.5 Comparison with other approaches
3.4 Discussion of the novel PPIs
3.5 Summary
Chapter 4. DOCUMENT TRIAGE FOR CHEMICAL-GENE-DISEASE INFORMATION
4.1 Background
4.2 Materials and Methods
4.2.1 Pre-processing
4.2.2 Named entities recognition (NER) modules
4.2.3 Learning-to-rank approach and feature extraction
4.3 Experiments and Results
4.3.1 Dataset
4.3.2 Performance metrics
4.3.3 Evaluation of individual features
4.3.4 Comparison with other approaches
4.3.5 The web application (ARTs)
4.4 Summary
Chapter 5. EXTRACTION OF INDIRECT DISEASE-DRUG RELATIONSHIPS TOWARD DRUG REPURPOSING
5.1 Background
5.2 Materials and Methods
5.2.1 Target document collection
5.2.2 Relationship extraction
5.2.3 Repurposed drug prioritization
5.3 Experiments and Results
5.3.1 Relationship extraction evaluation
5.3.2 Drug similarity evaluation
5.3.3 Drug repurposing evaluation
5.4 Discussion of the Suitability of Candidates
5.4.1 Evaluation of literature review
5.4.2 Evaluation of ClinicalTrials.gov
5.5 Summary
Chapter 6. CONCLUSION AND FUTURE WORKS