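System ID U0026-1307201018510100
Title (Chinese) 整合詞頻、亂度與關聯探勘之高效性影像語意註解
Title (English) Effective Semantic Image Annotation Using Term Frequency, Entropy and Association Mining
University National Cheng Kung University (成功大學)
Department (Chinese) 資訊工程學系碩博士班
Department (English) Institute of Computer Science and Information Engineering
Academic year 98 (ROC calendar)
Semester 2
Publication year 99 (ROC calendar)
Author (Chinese) 周建利
Author (English) Chien-Li Chou
Student ID p7697430
Degree Master's
Language English
Number of pages 83
Oral defense committee Advisor: 曾新穆
Committee member: 吳宗憲
Committee member: 謝孫源
Committee member: 廖弘源
Committee member: 鄭卜壬
Keywords (Chinese) 影像註解  詞頻與反向文件頻率  亂度  關聯規則探勘  視覺樣式  支持向量機
Keywords (English) image annotation  term frequency and inverse document frequency (TFIDF)  entropy  association mining  visual pattern  Support Vector Machine
Subject classification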
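Abstract (translated from Chinese) With the rapid development of digital image capture technology, the number of images people encounter has grown explosively. Such a large volume of images needs sufficient associated information before it can be retrieved efficiently, yet using models or algorithms to find the associations between low-level image features and image semantics for automatic image annotation is a difficult task. In recent years many studies have proposed models for automatic image annotation, but the gap between image features and their semantic expression has kept automatic annotation from becoming truly practical. In this study, we propose an effective semantic image annotation system that integrates term frequency, entropy and association mining. Using term frequency, inverse document frequency, entropy and association mining, we discover and clarify the associations between image features and image semantics from the distribution of image semantics, and thereby achieve effective automatic image annotation. The experimental results show that the proposed method effectively clarifies the associations between image features and image semantics and, compared with other existing image annotation systems, provides more precise and more complete annotations.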
Abstract (English) The primary goal of research on automatic image annotation is to understand the semantics of images automatically, from the viewpoint of human perception. To this end, many studies have examined how to conceptualize images by modeling a set of visual features. Contemporary studies are still far from reaching this goal, however, because of critical problems such as the diverse regularities between visual features and human concepts, which make it hard to annotate image semantics correctly. In this thesis, we propose a novel approach called AICDM (Annotation by Image-Concept Distribution Model), which annotates images by discovering the associations between visual features and human concepts from an image-concept distribution model. In this way the uncertain regularities between visual features and human concepts can be clarified to achieve high-quality image annotation. The empirical evaluation shows that the proposed AICDM method effectively alleviates the uncertain-regularity problem and yields better annotation results than other existing approaches in terms of precision, recall and the area under the precision-recall curve.
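The abstract and keywords name TF-IDF and entropy as the statistics used to relate visual patterns to concepts, without giving formulas. The following is only a minimal sketch of how such weights could be computed from a pattern-concept co-occurrence table. The toy data, the function names (`cooccurrence`, `entropy`, `tf_idf`) and the choice to treat each concept as a "document" and each visual pattern as a "term" are assumptions made for illustration, not the thesis's actual definitions.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set: (visual pattern ids, concept labels) per image.
training = [
    (["p1", "p2"], ["sky"]),
    (["p1", "p3"], ["sky"]),
    (["p2", "p3"], ["grass"]),
    (["p3"], ["water"]),
]

def cooccurrence(images):
    """Count how often each visual pattern co-occurs with each concept."""
    counts = defaultdict(Counter)
    for patterns, concepts in images:
        for p in set(patterns):
            for c in set(concepts):
                counts[p][c] += 1
    return counts

def entropy(counts, pattern):
    """Shannon entropy (bits) of the concept distribution of one pattern.

    Low entropy means the pattern points to few concepts (discriminative).
    Assumes the pattern occurs in the training data.
    """
    total = sum(counts[pattern].values())
    return -sum((n / total) * math.log2(n / total)
                for n in counts[pattern].values())

def tf_idf(counts, pattern, concept):
    """TF-IDF analogue: concepts play the role of documents, patterns of terms.

    Assumes both the pattern and the concept occur in the training data.
    """
    all_concepts = {c for per_pattern in counts.values() for c in per_pattern}
    in_concept = sum(per_pattern[concept] for per_pattern in counts.values())
    tf = counts[pattern][concept] / in_concept   # share of the concept's patterns
    df = len(counts[pattern])                    # concepts the pattern occurs with
    return tf * math.log(len(all_concepts) / df)

if __name__ == "__main__":
    counts = cooccurrence(training)
    for p in sorted(counts):
        print(p, "entropy:", round(entropy(counts, p), 3))
    print("tf-idf(p1, sky):", round(tf_idf(counts, "p1", "sky"), 3))
```

In this toy setting a pattern that appears with every concept receives a TF-IDF weight of zero and the highest entropy, which matches the intuition that it says little about any single concept.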
Table of contents ABSTRACT I
Chinese Abstract III
Acknowledgements IV
CONTENTS V
List of Tables VI
List of Figures VII
Chapter 1 Introduction 1
1.1 Background 1
1.2 Motivation 3
1.3 Overview of Proposed Method 5
1.4 Contributions 6
1.5 Thesis Organization 8
Chapter 2 Related Work 9
2.1 Classification-Based Annotation 9
2.2 Probabilistic-Based Annotation 10
2.3 Retrieval-Based Annotation 11
Chapter 3 Proposed Image Annotation Technique 14
3.1 Overview of the Proposed Image Annotator 14
3.2 Feature Extraction 17
3.3 Offline Learning 22
3.4 Online Prediction 33
Chapter 4 Experimental Evaluations 41
4.1 Experimental Settings 41
4.2 Experiments on Visual Features 52
4.3 Study of Experimental Evaluations 54
4.4 Experimental Discussions 66
Chapter 5 Conclusions and Future Work 68
5.1 Conclusions 68
5.2 Future Work 69
References 71
VITA 82
Publications 83
References [1] Luis von Ahn and Laura Dabbish. “Labeling Images with a Computer Game.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 319-326, April 2004. http://www.espgame.org/
[2] Bing Images, http://www.bing.com/images
[3] Abdelmajid Bouajila, Christopher Claus and Andreas Herkersdorf. “MPEG-7 eXperimentation Model (XM).” Software available at: http://www.lis.e-technik.tu-muenchen.de/research/bv/topics/mmdb/e_mpeg7.html.
[4] Kobus Barnard, Pinar Duygulu, David Forsyth, Nando de Freitas, David M. Blei and Michael I. Jordan. “Matching Words and Pictures.” The Journal of Machine Learning Research, Vol. 3, pages 1107-1135, March 2003.
[5] Roberto Brunelli and Ornella Mich. “Image Retrieval by Examples.” IEEE Transactions on Multimedia, Vol. 2, No. 3, pages 164-171, September 2000.
[6] Gustavo Carneiro, Antoni B. Chan, Pedro J. Moreno and Nuno Vasconcelos. “Supervised Learning of Semantic Classes for Image Annotation and Retrieval.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 3, pages 394-410, March 2007.
[7] Claudio Cusano, Gianluigi Ciocca and Raimondo Schettini. “Image Annotation Using SVM.” In Proceedings of Internet Imaging IV, SPIE, Vol. 5304, pages 330-338, January 2003.
[8] Chih-Chung Chang and Chih-Jen Lin. “LIBSVM : a library for support vector machines.” 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[9] Liangliang Cao, Jiebo Luo, Henry Kautz and Thomas S. Huang. “Image Annotation Within the Context of Personal Photo Collections Using Hierarchical Event and Scene Models.” IEEE Transactions on Multimedia, Vol. 11, No. 2, pages 208-219, February 2009.
[10] Matthew Cooper. “Image Categorization Combining Neighborhood Methods and Boosting.” In Proceedings of the 1st ACM workshop on Large-Scale Multimedia Retrieval and Mining in conjunction with ACM Multimedia, pages 11-18, October 2009.
[11] Ingrid Daubechies. “Ten Lectures on Wavelets.” Capital City Press, 1992.
[12] Jesse Davis and Mark Goadrich. “The Relationship Between Precision-Recall and ROC Curves.” In Proceedings of the 23rd International Conference on Machine Learning, pages 233-240, June 2006.
[13] Ritendra Datta, Dhiraj Joshi, Jia Li and James Z. Wang. “Image Retrieval: Ideas, Influences, and Trends of the New Age.” ACM Computing Surveys, Vol. 40, No. 2, Article 5, pages 1-60, April 2008.
[14] Arthur P. Dempster, Nan M. Laird and Donald B. Rubin. “Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society. Series B (Methodological), Vol. 39, Issue 1, pages 1-38, 1977.
[15] Marie Dumont, Raphaël Marée, Louis Wehenkel and Pierre Geurts. “Fast Multi-Class Image Annotation with Random Subwindows and Multiple Output Randomized Trees.” In Proceedings of the 2009 International Conference on Computer Vision Theory and Applications, Vol. 2, pages 196-203, February 2009.
[16] Exchangeable Image File Format for Digital Still Cameras: Exif Version 2.2 JEITA CP-3451, Technical Standardization Committee on AV and IT Storage Systems and Equipment and Standard of Japan Electronics and Information Technology Industries Association, April 2002.
[17] Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John Winn and Andrew Zisserman. “The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results.” http://www.pascal-network.org/challenges/VOC/voc2007/workshopindex.html, 2009.
[18] Facebook, http://www.facebook.com/
[19] Flickr, http://www.flickr.com/
[20] Jianping Fan, Yuli Gao and Hangzai Luo. “Hierarchical Classification for Automatic Image Annotation.” In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 111-118, July 2007.
[21] Shaolei Feng, Raghavan Manmatha and Victor Lavrenko. “Multiple Bernoulli Relevance Models for Image and Video Annotation.” In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1002-1009, June 2004.
[22] Huamin Feng, Rui Shi and Tat-Seng Chua. “A Bootstrapping Framework for Annotating and Retrieving WWW Images.” In Proceedings of the 12th Annual ACM International Conference on Multimedia, pages 960-967, October 2004.
[23] Google Images, http://images.google.com/
[24] Chuanghua Gui, Jing Liu, Changsheng Xu and Hanqing Lu. “Web Image Retrieval via Learning Semantics of Query Image.” In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, pages 1476-1479, June 2009.
[25] Nicolas Herve and Nozha Boujemaa. “Visual Word Pairs for Automatic Image Annotation.” In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, pages 430-433, June 2009.
[26] Kyoji Hirata and Toshikazu Kato. “Query by Visual Example - Content Based Image Retrieval.” In Proceedings of the 3rd International Conference on Extending Database Technology: Advances in Database Technology, pages 56–71, March 1992.
[27] Jiawei Han and Micheline Kamber. “Data Mining: Concepts and Techniques.” Morgan Kaufmann Publisher, 1999.
[28] John Anthony Hartigan and Manchek Anthony Wong. “A K-means Clustering Algorithm.” Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 28, No. 1, pages 100-108, 1979.
[29] ISO/IEC 15938-3: MPEG-7 Visual, 2002-03-10.
[30] Anca Loredana Ion. “Image Annotation Based on Semantic Rules.” Human-Computer Systems Interaction, AISC 60, pages 83-94, 2009.
[31] Yohan Jin, Kibum Jin, Latifur Khan and Balakrishnan Prabhakaran. “The Randomized Approximating Graph Algorithm for Image Annotation Refinement Problem.” In Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pages 1-8, June 2008.
[32] Yohan Jin, Latifur Khan and Balakrishnan Prabhakaran. “Knowledge Based Image Annotation Refinement.” Journal of Signal Processing Systems, Vol. 58, Issue 3, pages 387-406, March 2010.
[33] Jiwoon Jeon, Victor Lavrenko and Raghavan Manmatha. “Automatic Image Annotation and Retrieval using Cross-Media Relevance Models.” In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 119-126, July 2003.
[34] Jiwoon Jeon and Raghavan Manmatha. “Using Maximum Entropy for Automatic Image Annotation.” In Proceedings of the 3rd International Conference on Image and Video Retrieval, pages 24-32, July 2004.
[35] Leonard Kaufman and Peter J. Rousseeuw. “Finding Groups in Data: an Introduction to Cluster Analysis.” John Wiley & Sons, 1990.
[36] Toshikazu Kato. “Database Architecture for Content-based Image Retrieval.” In Proceedings of the International Society for Optical Engineering, SPIE, Vol. 1662, pages 112-123, 1992.
[37] Xirong Li, Le Chen, Lei Zhang, Fuzong Lin and Wei-Ying Ma. “Image Annotation by Large-Scale Content-based Image Retrieval.” In Proceedings of the 14th Annual ACM International Conference on Multimedia, pages 607-610, October 2006.
[38] Victor Lavrenko, Shaolei Feng and Raghavan Manmatha. “Statistical Models for Automatic Video Annotation and Retrieval.” In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 17-21, May 2004.
[39] Jing Liu, Mingjing Li, Wei-Ying Ma, Qingshan Liu and Hanqing Lu. “An Adaptive Graph Model for Automatic Image Annotation.” In Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval in Conjunction with ACM Multimedia, pages 61-70, October 2006.
[40] Zhixin Li, Xi Liu, Zhiping Shi and Zhongzhi Shi. “Learning Image Semantics with Latent Aspect Model.” In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, pages 366-369, June 2009.
[41] Victor Lavrenko, Raghavan Manmatha and Jiwoon Jeon. “A Model for Learning the Semantics of Pictures.” In Proceedings of the 16th Conference on Advances in Neural Information Processing Systems 16, December 2003.
[42] Zhixin Li, Huifang Ma, Zhiping Shi and Zhongzhi Shi. “A Probabilistic Model for Automatic Image Annotation and Retrieval.” In Proceedings of the 9th IEEE International Conference on Computer and Information Technology, pages 14-19, October 2009.
[43] Teng Li, Tao Mei, Shuicheng Yan, In-So Kweon and Chilwoo Lee. “Contextual Decomposition of Multi-Label Images.” In Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 2270-2277, June 2009.
[44] Wei Li and Maosong Sun. “Automatic Image Annotation Using Maximum Entropy Model.” In Proceedings of the 2nd International Joint Conference on Natural Language Processing, pages 34-45, October 2005.
[45] Xirong Li, Cees G.M. Snoek and Marcel Worring. “Learning Tag Relevance by Neighbor Voting for Social Image Retrieval.” In Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval, pages 180-187, October 2008.
[46] Jia Li and James Z. Wang. “Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 25, No. 9, pages 1075-1088, September 2003.
[47] Jia Li and James Z. Wang. “Real-Time Computerized Annotation of Pictures.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 6, pages 985-1002, June 2008.
[48] Florent Monay and Daniel Gatica-Perez. “PLSA-based Image Auto-Annotation: Constraining the Latent Space.” In Proceedings of the 12th Annual ACM International Conference on Multimedia, pages 348-351, October 2004.
[49] Bangalore S. Manjunath, Jens-Rainer Ohm, Vinod V. Vasudevan and Akio Yamada. “Color and Texture Descriptors.” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 6, pages 703-715, June 2001.
[50] Ameesh Makadia, Vladimir Pavlovic and Sanjiv Kumar. “Baselines for Image Annotation.” International Journal of Computer Vision, May 2010. DOI: 10.1007/s11263-010-0338-6. Published online: http://www.springerlink.com/content/g43x5630014323v0/
[51] Tao Mei, Yong Wang, Xian-Sheng Hua, Shaogang Gong and Shipeng Li. “Coherent Image Annotation by Learning Semantic Distance.” In Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1-8, June 2008.
[52] Hideki Nakayama, Tatsuya Harada and Yasuo Kuniyoshi. “Canonical Contextual Distance for Large-Scale Image Annotation and Retrieval.” In Proceedings of the 1st ACM workshop on Large-Scale Multimedia Retrieval and Mining in conjunction with ACM Multimedia, pages 3-10, October 2009.
[53] Milind Naphade, John R. Smith, Jelena Tesic, Shih-Fu Chang, Winston Hsu, Lyndon Kennedy, Alexander Hauptmann and Jon Curtis. “Large-Scale Concept Ontology for Multimedia.” IEEE MultiMedia, Vol. 13, No. 3, pages 86-91, July-September 2006.
[54] Gulisong Nasierding, Grigorios Tsoumakas and Abbas Z. Kouzani. “Clustering Based Multi-Label Classification for Image Annotation and Retrieval.” In Proceedings of the 2009 IEEE International Conference on Systems, Man and Cybernetics, October 2009.
[55] Luong-Dong Nguyen, Ghim-Eng Yap, Ying Liu, Ah-Hwee Tan, Liang-Tien Chia and Joo-Hwee Lim. “A Bayesian Approach Integrating Regional and Global Features for Image Semantic Learning.” In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, pages 546-549, June 2009.
[56] Picasa, http://picasaweb.google.com/
[57] Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos and Pinar Duygulu. “Automatic Multimedia Cross-modal Correlation Discovery.” In Proceedings of the 10th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 653-658, August 2004.
[58] Xiaojun Qi and Yutao Han. “Incorporating Multiple SVMs for Automatic Image Annotation.” Pattern Recognition, Vol. 40, Issue 2, pages 728-741, February 2007.
[59] Vijay Raghavan, Peter Bollmann and Gwang S. Jung. “A Critical Investigation of Recall and Precision as Measures of Retrieval System Performance.” ACM Transactions on Information Systems, Vol. 7, Issue 3, pages 205-229, 1989.
[60] Xiaoguang Rui, Mingjing Li, Zhiwei Li, Wei-Ying Ma and Nenghai Yu. “Bipartite Graph Reinforcement Model for Web Image Annotation.” In Proceedings of the 15th Annual ACM International Conference on Multimedia, pages 585-594, September 2007.
[61] Ja-Hwung Su, Yu-Ting Huang, Hsin-Ho Yeh and Vincent S. Tseng. “Effective Content-based Video Retrieval Using Pattern Indexing and Matching Techniques.” Expert Systems with Applications, Vol. 37, Issue 7, pages 5068-5085, July 2010.
[62] Jianbo Shi and Jitendra Malik. “Normalized Cuts and Image Segmentation.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, pages 888-905, August 2000.
[63] Arnold W.M. Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta and Ramesh Jain. “Content-based Image Retrieval at the End of the Early Years.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 12, pages 1349-1380, December 2000.
[64] Chih-Fong Tsai and Chihli Hung. “Automatically Annotating Images with Keywords: A Review of Image Annotation Systems.” Recent Patents on Computer Science, No. 1, pages 55-68, January 2008.
[65] Chih-Fong Tsai, Ken McGarry and John Tait. “CLAIRE: A Modular Support Vector Image Indexing and Classification System.” ACM Transactions on Information Systems, Vol. 24, No. 3, pages 353-379, July 2006.
[66] Vincent S. Tseng, Ja-Hwung Su, Bo-Wen Wang and Yu-Ming Lin. “Web Image Annotation by Fusing Visual Features and Textual Information.” In Proceedings of the 22nd Annual ACM Symposium on Applied Computing, pages 1056-1060, March 2007.
[67] Jakob Verbeek, Matthieu Guillaumin, Thomas Mensink and Cordelia Schmid. “Image Annotation with TagProp on the MIRFLICKR Set.” In Proceedings of the 11th ACM International Conference on Multimedia Information Retrieval, pages 537-546, March 2010.
[68] Khanh Vu, Kien A. Hua and Ning Jiang. “Improving Image Retrieval Effectiveness in Query-by-Example Environment.” In Proceedings of the 2003 ACM Symposium on Applied Computing, pages 774-781, March 2003.
[69] Lei Wu, Steven C.H. Hoi and Nenghai Yu. “Semantics-Preserving Bag-of-Words Models for Efficient Image Annotation.” In Proceedings of the 1st ACM workshop on Large-Scale Multimedia Retrieval and Mining in Conjunction with ACM Multimedia, pages 19-26, October 2009.
[70] Changhu Wang, Feng Jing, Lei Zhang and Hong-Jiang Zhang. “Scalable Search-based Image Annotation.” ACM Multimedia Systems Journal, Vol. 14, No. 4, pages 205-220, 2008.
[71] Roger C. F. Wong and Clement H. C. Leung. “Automatic Semantic Annotation of Real-World Web Images.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 11, pages 1933-1944, November 2008.
[72] Fei Wu, Dingyi Xia, Yueting Zhuang, Hanwang Zhang and Wenhao Liu. “Web Image Interpretation: Semi-Supervised Mining Annotated Words.” In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, pages 1512-1515, June 2009.
[73] Xin-Jing Wang, Lei Zhang, Feng Jing and Wei-Ying Ma. “Annosearch: Image Auto-Annotation by Search.” In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1483-1490, June 2006.
[74] Xin-Jing Wang, Lei Zhang, Xirong Li and Wei-Ying Ma. “Annotating Images by Mining Image Search Results.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 11, pages 1919-1932, November 2008.
[75] Changhu Wang, Lei Zhang and Hong-Jiang Zhang. “Learning to Reduce the Semantic Gap in Web Image Retrieval and Annotation.” In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 355-362, July 2008.
[76] Hongtao Xu, Xiangdong Zhou and Lan Lin. “WISA: A Novel Web Image Semantic Analysis System.” In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 777-778, July 2008.
[77] Yahoo! Image Search, http://images.search.yahoo.com/
[78] Changbo Yang, Ming Dong and Jing Hua. “Region-based Image Annotation using Asymmetrical Support Vector Machine-based Multiple-Instance Learning.” In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 2057-2063, June 2006.
[79] Akira Yanagawa, Winston Hsu and Shih-Fu Chang. “Brief Descriptions of Visual Features for Baseline TRECVID Concept Detectors.” Columbia University ADVENT Technical Report #219-2006-5, July 2006.
[80] Oksana Yakhnenko and Vasant Honavar. “Multiple Label Prediction for Image Annotation with Multiple Kernel Correlation Models.” In Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pages 8-15, June 2009.
[81] Hua-Jun Zeng, Qi-Cai He, Zheng Chen, Wei-Ying Ma and Jinwen Ma. “Learning to Cluster Web Search Results.” In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 210-217, July 2004.
[82] Jianke Zhu, Steven C. H. Hoi, Michael R. Lyu and Shuicheng Yan. “Near-Duplicate Keyframe Retrieval by Nonrigid Image Matching.” In Proceedings of the 16th Annual ACM International Conference on Multimedia, pages 41-50, October 2008.
[83] Zheng-Jun Zha, Xian-Sheng Hua, Tao Mei, Jingdong Wang, Guo-Jun Qi and Zengfu Wang. “Joint Multi-Label Multi-Instance Learning for Image Classification.” In Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1-8, June 2008.
[84] Ruofei Zhang, Zhongfei (Mark) Zhang, Mingjing Li, Wei-Ying Ma and Hong-Jiang Zhang. “A Probabilistic Semantic Model for Image Annotation and Multi-Modal Image Retrieval.” ACM Multimedia Systems Journal, Vol. 12, No. 1, pages 27-33, August 2006.
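Full-text access rights
  • Authorized for on-campus browsing and printing of the electronic full text, publicly available from 2012-07-29.
  • Authorized for off-campus browsing and printing of the electronic full text, publicly available from 2012-07-29.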

