

   The electronic full text has not been authorized for public release; please check the library catalog for the print copy.
(Note: if the thesis cannot be found, or its holding status reads "closed stacks, not public," it is not in the stacks and cannot be accessed.)
System ID: U0026-3101201910171000
Title (Chinese): 利用遷移學習、卷積神經網路與具種族多樣性之影像資料集開發乳房X光攝影的全自動乳癌分類系統
Title (English): Development of automatic breast cancer classification for mammography with convolutional neural network, transfer learning, and racially diverse datasets
University: National Cheng Kung University
Department (Chinese): 生物醫學工程學系
Department (English): Department of BioMedical Engineering
Academic year: 107
Semester: 1
Year of publication: 108 (2019)
Author (Chinese): 楊濰澤
Author (English): Wei-Tse Yang
Student ID: P86074080
Degree: Master's
Language: English
Pages: 68
Committee: Advisor: 方佑華
Committee member: 孫永年
Committee member: 王士豪
Committee member: 楊境睿
Keywords (Chinese): 深度學習, 遷移學習, 卷積神經網路, 乳房X光線攝影, 乳癌, 召回率
Keywords (English): Deep Learning, Convolutional Neural Network, Mammography, Transfer Learning, Breast Cancer, Recall Rate
Subject classification:
Abstract (Chinese): Breast cancer is the most common cancer among women. In Taiwan and Asia, the average age at diagnosis is 10 years younger than in Western countries. Mammography is the most widely used imaging modality for detecting early-stage breast cancer, but screening suffers from a very high recall rate. To address this problem, researchers have spent decades building computer-aided diagnosis systems with image-processing methods, and more recently with deep learning. These deep learning studies have already shown promising results in classifying whole mammograms. In practice, however, radiologists usually know the location of a lesion but cannot tell, for some hard cases, whether the lesion is cancerous. In our study, we therefore built a system that relies on manually delineated ROIs rather than whole images. In addition, to examine whether the model can reduce the high recall rate, we analyzed its performance on BI-RADS 3 and 4 cases. Finally, we tested whether a model trained on Western data can be applied to an Asian population.

We collected three public datasets (CBIS-DDSM, BCDR, and INbreast) and one dataset from National Cheng Kung University (NCKU) Hospital. To predict ROIs of various sizes, we proposed a patch-classification-based model and ROI pooling. In the patch-classification-based model, a CNN is extended to process larger images; our implementation handles ROIs in four different sizes, while ROI pooling handles ROIs of any size. To address data imbalance, we adjusted the data composition within each mini-batch and also used class weights. The patch-classification-based model achieved an overall accuracy of 73% with an AUC above 0.70, whereas ROI pooling reached only 61% accuracy. We found that when negative cases outnumbered positive cases 20 to 1, class weights achieved only 55% accuracy, but adjusting the batch composition reached about 65%. When compared with human experts, our experiments showed that experts achieved only 50% accuracy on BI-RADS 3 and 4 cases, while our model maintained 67%. Moreover, our model reached 78% accuracy on the NCKU Hospital dataset. These results demonstrate that deep learning has great potential to reduce the high recall rate, and that a model trained on Western datasets appears applicable to an Asian population without retraining. Although more data are still needed to validate our model, the current results already show that deep learning performs well in reducing the recall rate and in application to an Asian population.
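The mini-batch composition adjustment described in the abstract can be sketched as follows. This is an illustrative sketch only, not the thesis's actual training code: the function name, the 50/50 composition, and sampling with replacement are all assumptions.

```python
import random

def balanced_batches(positives, negatives, batch_size=32,
                     pos_fraction=0.5, n_batches=10, seed=0):
    """Yield mini-batches with a fixed positive/negative composition,
    regardless of how imbalanced the full dataset is (e.g. 20x more
    negatives than positives)."""
    rng = random.Random(seed)
    n_pos = int(batch_size * pos_fraction)  # positives per batch
    n_neg = batch_size - n_pos              # negatives per batch
    for _ in range(n_batches):
        # Sample each class separately (with replacement, since the
        # minority class may be smaller than n_pos), then shuffle.
        batch = (rng.choices(positives, k=n_pos) +
                 rng.choices(negatives, k=n_neg))
        rng.shuffle(batch)
        yield batch

# Even with a 20:1 negative-to-positive ratio, every batch stays balanced.
pos = [("pos", i) for i in range(50)]
neg = [("neg", i) for i in range(1000)]
for batch in balanced_batches(pos, neg, batch_size=32, n_batches=2):
    print(sum(1 for label, _ in batch if label == "pos"))  # 16
```

In contrast, class weighting keeps the natural batch composition and instead scales each example's loss; the abstract's comparison (55% vs. about 65% accuracy at a 20:1 imbalance) suggests resampling the batch worked better here.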
Abstract (English): Breast cancer is the most common cancer in women. In Taiwan and other Asian countries, the average age at diagnosis is 10 years younger than in Western countries. To detect breast cancer at an early stage, mammography is the most widely used modality. However, such screening usually results in a high recall rate, i.e., a high false-positive rate. For decades, researchers have tried to solve this problem by building computer-aided diagnosis (CAD) systems with image-processing methods; more recently, they have turned to deep learning. Deep learning research has shown promising results in building classifiers for whole X-ray images. Nonetheless, experts generally know the location of a lesion but find some cases difficult to diagnose, especially BI-RADS 3 and 4. Therefore, in our research, we built a system that depends on manually extracted ROIs rather than whole images. Furthermore, to examine whether the model can reduce the high recall rate, we analyzed its performance on BI-RADS 3 and 4. Lastly, we examined whether a model trained on Western data could be applied to an Asian population.

We collected three public datasets (CBIS-DDSM, BCDR, and INbreast) and one private dataset from NCKU Hospital. To predict ROIs of various sizes, we adopted a patch-classification-based model and ROI pooling. In the patch-classification-based model, a standard convolutional neural network is extended to process larger images; it can process ROIs in four different sizes, while ROI pooling can process ROIs of any size. To address data imbalance, we adjusted the data composition within each mini-batch and also utilized class weights. The patch-classification-based model achieved an overall accuracy of 73.1% and an AUC above 0.70 for every ROI size, whereas ROI pooling reached only 61% accuracy. In addition, we found that when negative cases outnumbered positive cases 20 to 1, class weights achieved only 55% accuracy, while adjusting the batch composition reached about 65%. Compared with human experts, our experiments showed that experts achieved only 50% accuracy on BI-RADS 3 and 4, whereas our models maintained 67%. Moreover, our model achieved 78% accuracy when applied to the NCKU Hospital dataset. These results show that deep learning has the potential to reduce the high recall rate in the clinic, and that a model trained on Western datasets appears applicable to an Asian population without any fine-tuning. Although more clinical data are still needed to verify these results, the proposed model shows promising results for reducing the recall rate and for application to Asian populations.
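As a rough illustration of how ROI pooling lets one network accept ROIs of any size, the following NumPy sketch max-pools a variable-size feature map down to a fixed grid. The function name and the 2x2 output grid are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def roi_max_pool(feature_map, output_size=(2, 2)):
    """Max-pool a variable-size 2-D feature map (an ROI crop) into a
    fixed output grid, so ROIs of any size yield equal-length features."""
    h, w = feature_map.shape
    out_h, out_w = output_size
    pooled = np.empty((out_h, out_w), dtype=feature_map.dtype)
    # Split the ROI into an out_h x out_w grid of roughly equal bins
    # and take the maximum inside each bin.
    row_edges = np.linspace(0, h, out_h + 1).astype(int)
    col_edges = np.linspace(0, w, out_w + 1).astype(int)
    for i in range(out_h):
        for j in range(out_w):
            cell = feature_map[row_edges[i]:row_edges[i + 1],
                               col_edges[j]:col_edges[j + 1]]
            pooled[i, j] = cell.max()
    return pooled

# ROIs of different sizes map to the same fixed 2x2 output shape.
roi_a = np.arange(16).reshape(4, 4)
roi_b = np.arange(35).reshape(5, 7)
print(roi_max_pool(roi_a).shape)  # (2, 2)
print(roi_max_pool(roi_b).shape)  # (2, 2)
```

Because the output shape is fixed, the pooled features can feed a fully connected classifier directly; this is the property that lets ROI pooling handle arbitrary ROI sizes, where the patch-classification-based model is limited to its four predefined sizes.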
Table of Contents: Chapter 1 Overview 1
Chapter 2 Introduction 2
2.1 Breast Cancer 2
2.1.1 Mammography 3
2.1.2 The High Recall Rate 5
2.2 Computer Aided Diagnosis (CAD) System 7
2.2.1 The Benefit of the CAD System 7
2.2.2 Image Processing Technology 7
2.2.2.1 Texture Analysis 8
2.2.2.2 Detection for Masses 8
2.2.2.3 Detection for Microcalcification 11
2.2.2.4 The Advantage of Image Processing Technology 12
2.2.3 The Current CAD system 12
2.2.4 The CAD System on the Asian population 13
2.3 Deep Learning 15
2.3.1 Transfer Learning 16
2.3.2 Current Deep Learning Research on Mammography 18
2.3.2.1 Multiple-Stage Classification 19
2.3.2.2 Methods for Segmentation 20
2.3.2.3 Methods for Object Detection 21
2.3.2.4 Patch-Classification-Based Methods 22
2.4 Specific Aims 24
Chapter 3 Material and Methods 26
3.1 Data Acquisition 26
3.1.1 CBIS-DDSM 27
3.1.2 INbreast and Breast Cancer Digital Repository 27
3.1.3 National Cheng Kung University Hospital 27
3.2 Data Preprocessing and the Preparation for Training 30
3.2.1 Preprocessing 30
3.2.2 The Definition of the Size of Region of Interest 30
3.2.3 The Way to Split the Data 30
3.2.4 Data Augmentation 33
3.2.5 The Extraction from Background 33
3.3 Proposed Architecture 34
3.3.1 The Extension of the Convolutional Neural Network 36
3.3.1.1 Block Design 36
3.3.2 ROI Pooling 37
3.4 Data Generator 38
3.5 Imbalanced Classes 39
3.5.1 Class Weights 39
3.5.2 The Adjustment of Data Composition in One Mini-Batch Size 39
3.6 Training Detail 40
3.7 Li Shen’s Research 43
3.7.1 The Dataset 43
3.7.2 Data Augmentation 43
3.7.3 The Architecture of the Method 43
3.7.4 Training Detail 44
3.8 Operating Environment 45
Chapter 4 Results and Discussion 46
4.1 Class Weights and the Adjustment of Data Composition in One Mini-Batch Size 46
4.2 Classification with 3 Classes and 2 Classes 47
4.3 The Performance of Each Channel of the Classification with 2 Classes 49
4.4 The Comparison between ROI Pooling and the Patch-Classification-Based Model 52
4.5 The Analysis with BI-RADS and CBIS-DDSM 53
4.6 The Comparison between Shen's Research and My Research 54
4.6.1 The Number of Iterations 55
4.6.2 The Resizing 56
4.6.3 The Bug in Their Program 56
4.7 The Results from the NCKU Hospital Data 57
4.8 The Summary 59
4.8.1 Class Weight vs. the Adjustment of Data Composition in One Mini-Batch Size 59
4.8.2 The Classification with 2 Classes vs. 3 Classes 59
4.8.3 Patch-Classification-Based Model vs. ROI Pooling 59
4.8.4 The Deep Learning Model vs. Human Experts 60
4.8.5 Li Shen's Research vs. My Model 60
4.8.6 The Performance on the NCKU Hospital Data 60
4.9 Future Works 61
Chapter 5 Conclusion 62
References 63
References: [1] American Cancer Society, “Breast Cancer Facts & Figures,” 2018. [Online]. Available: https://www.cancer.org/research/cancer-facts-statistics/breast-cancer-facts-figures.html.
[2] Centers for Disease Control and Prevention, “What Are the Risk Factors for Breast Cancer?” [Online]. Available: https://www.cdc.gov/cancer/breast/basic_info/risk_factors.htm. [Accessed: 13-Jan-2019].
[3] Y. P. Chen, Y.-W. Lu, and C.-C. Yang, “Breast cancer trend in Taiwan,” 2014.
[4] A. Oliver et al., “A review of automatic mass detection and segmentation in mammographic images,” Med. Image Anal., vol. 14, no. 2, pp. 87–110, Apr. 2010.
[5] “Breast Cancer Screening Methods - Our Bodies Ourselves.” [Online]. Available: https://www.ourbodiesourselves.org/book-excerpts/health-article/breast-cancer-screening-methods/. [Accessed: 14-Jan-2019].
[6] The Radiology Assistant, “BI-RADS for Mammography and Ultrasound 2013.” [Online]. Available: http://www.radiologyassistant.nl/en/p53b4082c92130/bi-rads-for-mammography-and-ultrasound-2013.html. [Accessed: 20-Dec-2018].
[7] American College of Radiology BI-RADS Committee, ACR BI-RADS Atlas: Breast Imaging Reporting and Data System. American College of Radiology, 2013.
[8] T. Kooi et al., “Large scale deep learning for computer aided detection of mammographic lesions,” Med. Image Anal., vol. 35, pp. 303–312, Jan. 2017.
[9] R. Sawyer Lee, F. Gimenez, A. Hoogi, and D. Rubin, “CBIS-DDSM,” 2016. [Online]. Available: https://wiki.cancerimagingarchive.net/display/Public/CBIS-DDSM#688d8158918a466dbb57df5af4fd4e11.
[10] T. Norman, J. Guinney, and G. Stolovitzky, “The Digital Mammography DREAM Challenge,” 2017. [Online]. Available: https://www.synapse.org/#!Synapse:syn4224222/wiki/401743.
[11] F. M. Hall, J. M. Storella, D. Z. Silverstone, and G. Wyshak, “Nonpalpable breast lesions: recommendations for biopsy based on suspicion of carcinoma at mammography,” Radiology, vol. 167, no. 2, pp. 353–358, May 1988.
[12] K. M. Yee, “Radiologist factors influence mammography recall rates.” [Online]. Available: https://www.auntminnie.com/index.aspx?sec=sup&sub=wom&pag=dis&ItemID=121365. [Accessed: 20-Dec-2018].
[13] J. Cavallo, “Study Finds Deep Learning Can Distinguish Recalled-Benign Mammogram Images From Malignant and Negative Cases.” [Online]. Available: http://www.ascopost.com/News/59370.
[14] G. Castellano, L. Bonilha, L. M. Li, and F. Cendes, “Texture analysis of medical images,” Clin. Radiol., vol. 59, no. 12, pp. 1061–1069, Dec. 2004.
[15] R. M. Haralick, K. Shanmugam, and I. Dinstein, “Textural Features for Image Classification,” IEEE Trans. Syst. Man Cybern., vol. SMC-3, no. 6, pp. 610–621, 1973.
[16] C. Varela, P. G. Tahoces, A. J. Méndez, M. Souto, and J. J. Vidal, “Computerized detection of breast masses in digitized mammograms,” Comput. Biol. Med., vol. 37, no. 2, pp. 214–226, Feb. 2007.
[17] N. Karssemeijer and G. M. te Brake, “Detection of stellate distortions in mammograms,” IEEE Trans. Med. Imaging, vol. 15, no. 5, pp. 611–619, 1996.
[18] T. C. Wang and N. B. Karayiannis, “Detection of microcalcifications in digital mammograms using wavelets,” IEEE Trans. Med. Imaging, vol. 17, no. 4, pp. 498–509, 1998.
[19] T. W. Freer and M. J. Ulissey, “Screening mammography with computer-aided detection: Prospective study of 12,860 patients in a community breast center,” Radiology, vol. 220, no. 3, pp. 781–786, 2001.
[20] L. A. L. Khoo, P. Taylor, and R. M. Given-Wilson, “Computer-aided detection in the United Kingdom National Breast Screening Programme: Prospective study,” Radiology, vol. 237, no. 2, pp. 444–449, 2005.
[21] P. Taylor, J. Champness, R. Given-Wilson, K. Johnston, and H. Potts, “Impact of computer-aided detection prompts on the sensitivity and specificity of screening mammography,” Health Technol. Assess., vol. 9, no. 6, 2005.
[22] J. J. Fenton et al., “Effectiveness of Computer-Aided Detection in Community Mammography Practice,” JNCI J. Natl. Cancer Inst., vol. 103, no. 15, pp. 1152–1161, Aug. 2011.
[23] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A Fast Learning Algorithm for Deep Belief Nets,” Neural Comput., vol. 18, no. 7, pp. 1527–1554, Jul. 2006.
[24] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, “Greedy Layer-Wise Training of Deep Networks.”
[25] A. Beam, “Deep Learning 101 - Part 1: History and Background.” [Online]. Available: https://beamandrew.github.io/deeplearning/2017/02/23/deep_learning_101_part1.html. [Accessed: 23-Dec-2018].
[26] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A Large-Scale Hierarchical Image Database.”
[27] M. Z. Alom et al., “The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches,” Mar. 2018.
[28] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” Sep. 2014.
[29] C. Szegedy et al., “Going Deeper with Convolutions,” 2015.
[30] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” Dec. 2015.
[31] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” Feb. 2015.
[32] M. Jacobs, “Deep Learning in Five and a Half Minutes.” [Online]. Available: https://www.embedded-vision.com/industry-analysis/blog/deep-learning-five-and-half-minutes. [Accessed: 24-Dec-2018].
[33] S. J. Pan and Q. Yang, “A Survey on Transfer Learning,” 2009.
[34] N. Tajbakhsh et al., “Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning?,” IEEE Trans. Med. Imaging, vol. 35, no. 5, pp. 1299–1312, May 2016.
[35] S. Kornblith, J. Shlens, and Q. V. Le, “Do Better ImageNet Models Transfer Better?,” May 2018.
[36] L. Shen, “End-to-end Training for Whole Image Breast Cancer Diagnosis using An All Convolutional Design,” pp. 1–12, 2017.
[37] E. L. Ridley, “AI reduces false positives in screening mammography.” [Online]. Available: https://www.auntminnie.com/index.aspx?sec=ser&sub=def&pag=dis&ItemID=122130. [Accessed: 22-Dec-2018].
[38] M. Kallenberg et al., “Unsupervised Deep Learning Applied to Breast Density Segmentation and Mammographic Risk Scoring,” IEEE Trans. Med. Imaging, vol. 35, no. 5, pp. 1322–1331, May 2016.
[39] N. Dhungel, G. Carneiro, and A. P. Bradley, “Automated Mass Detection in Mammograms Using Cascaded Deep Learning and Random Forests,” in 2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2015, pp. 1–8.
[40] J. Melendez, C. I. Sánchez, B. van Ginneken, and N. Karssemeijer, “Improving mass candidate detection in mammograms via feature maxima propagation and local feature selection,” Med. Phys., vol. 41, no. 8, p. 081904, Jul. 2014.
[41] Y. Guan, “GuanLab’s Solution to 2017 Digital Mammography Challenge,” 2017. [Online]. Available: https://www.synapse.org/#!Synapse:syn7221819/wiki/411277.
[42] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” May 2015.
[43] “U-Net — DeepLearning 0.1 documentation.” [Online]. Available: http://deeplearning.net/tutorial/unet.html. [Accessed: 26-Dec-2018].
[44] J. Long, E. Shelhamer, and T. Darrell, “Fully Convolutional Networks for Semantic Segmentation.”
[45] S. Chilamkurthy, “A 2017 Guide to Semantic Segmentation with Deep Learning.” [Online]. Available: http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review. [Accessed: 26-Dec-2018].
[46] H. Li, D. Chen, W. H. Nailon, M. E. Davies, and D. Laurenson, “Improved Breast Mass Segmentation in Mammograms with Conditional Residual U-Net,” Springer, Cham, 2018, pp. 81–89.
[47] D. Ribli, A. Horváth, Z. Unger, P. Pollner, and I. Csabai, “Detecting and classifying lesions in mammograms with Deep Learning,” Sci. Rep., vol. 8, no. 1, p. 4165, 2018.
[48] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” pp. 91–99, 2015.
[49] J. Sulam and P. Kisilev, “Maximizing AUC with Deep Learning for Classification of Imbalanced Mammogram Datasets,” Eurographics Workshop on Visual Computing for Biology and Medicine, 2017.
[50] M. A. Guevara Lopez and the BCDR Consortium, “Breast Cancer Digital Repository,” 2012. [Online]. Available: http://bcdr.inegi.up.pt.
[51] I. C. Moreira, I. Amaral, I. Domingues, A. Cardoso, M. J. Cardoso, and J. S. Cardoso, “INbreast: Toward a Full-field Digital Mammographic Database,” Acad. Radiol., vol. 19, no. 2, pp. 236–248, Feb. 2012.
[52] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition,” in Computer Vision – ECCV 2014, 2014, pp. 346–361.
[53] T. Grel, “Region of interest pooling explained.” [Online]. Available: https://deepsense.ai/region-of-interest-pooling-explained/. [Accessed: 31-Dec-2018].
Full-Text Availability
  • The author has authorized on-campus browsing/printing of the electronic full text, publicly available from 2019-02-12.
  • For questions, please contact the library.
    Phone: (06) 275-7575 ext. 65773
    E-mail: etds@email.ncku.edu.tw