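   The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.
(Note: If the thesis cannot be found, or its holding status shows "closed stacks, not public", the thesis is not in the stacks and cannot be retrieved.)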
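System ID U0026-3107202014481800
Title (Chinese) 藉由注意力網路區塊模組驅動深度類神經網路與其應用
Title (English) Attention Network Block Module Drives Deep Neural Network and Their Applications
Institution National Cheng Kung University
Department (Chinese) 資訊工程學系
Department (English) Institute of Computer Science and Information Engineering
Academic year 108 (ROC calendar; 2019-2020)
Semester 2
Year of publication 109 (ROC calendar; 2020)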
Author (Chinese) 王麒詳
Author (English) Chi-Shiang Wang
E-mail hyaline0317@iir.csie.ncku.edu.tw
Student ID P78041049
Degree Doctoral
Language English
Pages 62
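Oral defense committee Advisor: 蔣榮先
Committee member: 孫永年
Committee member: 李宗儒
Committee member: 賀保羅
Committee member: 張育誌
Committee member: 郝沛毅
Committee member: 楊家融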
Keywords (Chinese) 注意力機制  深度學習  不確定性資訊  醫療影像  推薦系統 
Keywords (English) Attention Mechanism  Deep Learning  Uncertainty Information  Medical Image  Recommender System 
Subject classification
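Abstract (Chinese) In recent years, deep neural networks have been applied successfully in many fields, such as image processing, natural language processing, and recommender systems. To improve the feature extraction ability and predictive performance of deep neural models, the design of the network architecture naturally becomes a decisive factor. Studies generally analyze and try to strengthen neural networks from three directions: (1) building deeper and wider network architectures, (2) automatic architecture search, and (3) strengthening features with attention-mechanism modules. Of these, approach (3) has a relatively low development cost and can be applied effectively to a variety of well-known existing architectures.
This study therefore focuses on exploring and building novel attention network modules that strengthen existing neural network architectures. We design different types of attention network modules for recommender systems and for images in order to improve model performance and feature extraction. For recommender systems, we develop two attention network modules that strengthen the model's ability to capture user preferences and thereby improve its predictive performance. We also design a novel Uncertainty Attention (UA) module for computer vision tasks. It lets the model attend not only to the important information in the original feature map but also to other, uncertain information, so that regions with potentially useful information are discovered and predictive performance improves.
We apply the designed attention network modules to recommender system and medical imaging tasks. In experiments on different datasets, models using our attention network modules show larger performance gains than the original models and other attention modules. The results further indicate that attention network modules can strengthen a model's feature extraction ability and performance with minimal changes.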
Abstract (English) In recent years, deep neural networks (DNNs) have been applied successfully to many different tasks, such as computer vision (CV), natural language processing (NLP), and recommender systems (RS). Many studies therefore work on improving model performance in practice. In general, three ideas can improve a DNN: (1) deeper and wider network architectures, (2) automatic architecture search, and (3) the convolutional attention block. The attention network block module is a flexible, lower-cost approach that helps the model extract more effective features.
We therefore designed novel attention network block modules to improve models on RS and CV tasks. For RS, we developed two attention network modules, FuzzAttention and Topic Diversity Discovering (TDD), which improve the user preference representation and thereby the recommendation ability. For CV tasks, we designed Uncertainty Attention (UA), an attention network that discovers potential information in the uncertain regions of feature maps.
We evaluated our attention network blocks by combining them with existing models on RS and CV tasks. In the experimental results, our attention modules outperformed the original models and other attention approaches. The results show that the attention network is an efficient way to increase a model's performance.
Table of Contents Abstract (in Chinese) i
Abstract ii
Acknowledgement iii
Contents v
List of Tables viii
List of Figures ix
Chapter 1. Introduction 1
1.1 Background and Motivation 1
1.2 Purpose and Specific Aims 5
1.3 Organization of the Dissertation 6
Chapter 2. Related Work 7
2.1 Deeper Network Architecture 7
2.2 Automatic Architecture Search 8
2.3 Attention Block Module 9
Chapter 3. Attention Network in RS 10
3.1 Attention Mechanism 10
3.2 FuzzAttention Module in Session-based RS 12
3.3 Topic Diversity Discovering Module in BPR 14
3.3.1 Multi-Embedding 14
3.3.2 Topic Diversity Discovering Module 15
3.3.3 TDD Module in Bayesian Personalized Ranking 17
3.4 Summary 19
Chapter 4. Attention Network in CV 20
4.1 Convolutional Attention Block 20
4.1.1 Squeeze-and-Excitation 20
4.1.2 Convolutional Block Attention Module 22
4.2 Uncertainty Attention Block 24
4.2.1 Uncertainty Estimation 24
4.2.2 Uncertainty Attention Block 25
4.2.3 UA Block in Network 29
4.3 Summary 30
Chapter 5. Experimental Results 32
5.1 Attention Network Block Module in RS 32
5.1.1 FuzzAttention in Session-based RS 32
5.1.2 TDD Block in Bayesian Personalized Ranking 35
5.1.3 Hyperparameters in TDD Block 39
5.2 Uncertainty Attention Block Module in CV 41
5.2.1 Pneumonia Detection by Segmentation Model 41
5.2.2 Pneumothorax Segmentation 46
5.2.3 Hyperparameters in UA Block 49
5.2.4 The Position of UA Block in Network 50
Chapter 6. Conclusion and Future Work 52
REFERENCES 55
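Full-text access rights
  • On-campus viewing/printing of the electronic full text is authorized; release date not specified.
  • Off-campus viewing/printing of the electronic full text is authorized; release date not specified.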

