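The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.
(※ If no record is found, or the holdings status shows "closed stacks, not public", the thesis is not in the stacks and cannot be retrieved.)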
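System ID U0026-1607202003252500
Title (Chinese) 視頻重排序的感知流形學習
Title (English) Learning a Perceptual Manifold for Animation Video Resequencing
Institution National Cheng Kung University
Department (Chinese) 資訊工程學系
Department (English) Institute of Computer Science and Information Engineering
Academic year 108 (ROC calendar)
Semester 2
Year of publication 109 (ROC calendar)
Author (Chinese) 查爾斯
Author (English) Charles C. Morace
Student ID P76037010
Degree Master's
Language English
Pages 32
Oral defense committee Advisor - 李同益
Committee member - 孫永年
Committee member - 林昭宏
Committee member - 紀明德
Committee member - 林士勛
Keywords (Chinese) none
Keywords (English) Computer Graphics, Video Re-sequencing, Transfer Learning, Manifold Learning
Subject classification
Abstract (Chinese) none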
Abstract (English) This work proposes a framework for animation video re-sequencing using deep learning and optimal graph traversal techniques. The proposed system produces new animation sequences by reordering a collection of animation images or an existing animation video. To maintain temporal coherence in the generated animation sequences, a perceptual distance is used so that adjacent frames in the re-sequenced animations are as perceptually similar as possible. To measure perceptual distance, we extract image features from the activations of deep convolutional neural networks and learn a perceptual distance by training a small network on these activation features, using data consisting of human perceptual judgments. With this perceptual metric and graph-based manifold learning techniques, the framework can produce smooth and visually appealing animation results for a variety of animation styles. In contrast to previous work on animation re-sequencing, the proposed framework applies to a broader range of image styles and does not require hand-crafted feature extraction, background subtraction, or feature correspondence. The framework has additional applications to sequencing unstructured collections of images.
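The full text is not available in this record, so the following is only an illustrative sketch of the re-sequencing idea stated in the abstract: given pairwise distances between frames, order the frames so that adjacent ones are close. It assumes a precomputed distance matrix and uses a simple greedy nearest-neighbour traversal, which is a stand-in and not necessarily the graph traversal the thesis uses. The names `greedy_resequence`, `path_cost`, and the toy data are hypothetical.

```python
from typing import List, Sequence


def greedy_resequence(dist: Sequence[Sequence[float]], start: int = 0) -> List[int]:
    """Order frames so each next frame is the closest unvisited one.

    `dist[i][j]` is assumed to hold a (perceptual) distance between
    frames i and j; how that distance is learned is outside this sketch.
    """
    n = len(dist)
    if n == 0:
        return []
    order = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        current = order[-1]
        # Choose the nearest unvisited frame; ties are broken by index.
        nxt = min(unvisited, key=lambda j: (dist[current][j], j))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


def path_cost(dist: Sequence[Sequence[float]], order: Sequence[int]) -> float:
    """Sum of distances between consecutive frames in the ordering."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


# Toy data (an assumption): four "frames" at 1-D positions,
# with distance defined as the absolute difference of positions.
positions = [0.0, 3.0, 1.0, 2.0]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
order = greedy_resequence(toy_dist, start=0)
print(order, path_cost(toy_dist, order))
```

A greedy traversal is cheap but can end with long jumps on harder inputs; an approximate travelling-salesman solver over the same matrix would be one alternative under the same assumptions.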
Table of contents Abstract i
Acknowledgements ii
Table of Contents iii
List of Tables v
List of Figures vi
Chapter 1. Introduction 1
Chapter 2. Related Work 3
Chapter 3. System Overview 7
Chapter 4. Method 9
Chapter 5. Results 15
Chapter 6. Conclusion 28
References 29
Appendix 32
References [1] Hadar Averbuch-Elor and Daniel Cohen-Or. Ringit: Ring-ordering casual photos of a temporal event. ACM Trans. Graph., 34(3):33:1–33:11, May 2015.
[2] Hadar Averbuch-Elor, Daniel Cohen-Or, and Johannes Kopf. Smooth image sequences for data-driven morphing. Comput. Graph. Forum, 35(2):203–213, May 2016.
[3] Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput., 15(6):1373–1396, June 2003.
[4] Qifeng Chen and Vladlen Koltun. Photographic image synthesis with cascaded refinement networks. CoRR, abs/1707.09405, 2017.
[5] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Third Edition. The MIT Press, 3rd edition, 2009.
[6] Christina de Juan and Bobby Bodenheimer. Cartoon textures. In Proceedings of the 2004 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA ’04, pages 267–276, Aire-la-Ville, Switzerland, 2004. Eurographics Association.
[7] Alexey Dosovitskiy and Thomas Brox. Generating images with perceptual similarity metrics based on deep networks. CoRR, abs/1602.02644, 2016.
[8] Ohad Fried, Shai Avidan, and Daniel Cohen-Or. Patch2Vec: Globally Consistent Image Patch Representation. Computer Graphics Forum, 2016.
[9] Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York, NY, USA, 1990.
[10] L. A. Gatys, A. S. Ecker, and M. Bethge. Image style transfer using convolutional neural networks. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2414–2423, June 2016.
[11] Daniel Holden, Jun Saito, Taku Komura, and Thomas Joyce. Learning motion manifolds with convolutional autoencoders. In SIGGRAPH Asia 2015 Technical Briefs, SA ’15, pages 18:1–18:4, New York, NY, USA, 2015. ACM.
[12] D. P. Huttenlocher, G. A. Klanderman, and W. J. Rucklidge. Comparing images using the Hausdorff distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(9):850–863, Sep 1993.
[13] Justin Johnson, Alexandre Alahi, and Fei-Fei Li. Perceptual losses for real-time style transfer and super-resolution. CoRR, abs/1603.08155, 2016.
[14] Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. Reinforcement learning: A survey. J. Artif. Int. Res., 4(1):237–285, May 1996.
[15] M. G. Kendall. A new measure of rank correlation. Biometrika, 30(1/2):81–93, 1938.
[16] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, Lake Tahoe, Nevada, United States, pages 1106–1114, 2012.
[17] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, NIPS’12, pages 1097–1105, USA, 2012. Curran Associates Inc.
[18] J. B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. In Proceedings of the American Mathematical Society, 7, 1956.
[19] J. B. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1):1–27, Mar 1964.
[20] A. Kushal, B. Self, Y. Furukawa, D. Gallup, C. Hernandez, B. Curless, and S. M. Seitz. Photo tours. In 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission, pages 57–64, Oct 2012.
[21] Gilbert Laporte. The traveling salesman problem: An overview of exact and approximate algorithms. European Journal of Operational Research, 59(2):231–247, 1992.
[22] Haibin Ling and David W. Jacobs. Shape classification using the inner-distance. IEEE Trans. Pattern Anal. Mach. Intell., 29(2):286–299, February 2007.
[23] Margarita Osadchy, Yann Le Cun, and Matthew L. Miller. Synergistic face detection and pose estimation with energy-based models. J. Mach. Learn. Res., 8:1197–1215, May 2007.
[24] Sam T. Roweis and Lawrence K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.
[25] Arno Schödl and Irfan A. Essa. Machine learning for video-based rendering. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 1002–1008. MIT Press, 2001.
[26] Arno Schödl and Irfan A. Essa. Controlled animation of video sprites. In Proceedings of the 2002 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA ’02, pages 121–127, New York, NY, USA, 2002. ACM.
[27] Arno Schödl, Richard Szeliski, David H. Salesin, and Irfan Essa. Video textures. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’00, pages 489–498, New York, NY, USA, 2000. ACM Press/Addison-Wesley Publishing Co.
[28] K. Schoeffmann and D. Ahlstrom. Similarity-based visualization for image browsing revisited. In 2011 IEEE International Symposium on Multimedia, pages 422–427, Dec 2011.
[29] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
[30] E. W. Stacy. A generalization of the gamma distribution. The Annals of Mathematical Statistics, 33(3):1187–1192, 1962.
[31] Wolfram Research, Inc. Mathematica 11.3, 2018.
[32] J. Yu, D. Liu, D. Tao, and H. S. Seah. On combining multiple features for cartoon character retrieval and clip synthesis. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 42(5):1413–1427, Oct 2012.
[33] J. Yu, M. Wang, and D. Tao. Semisupervised multiview distance metric learning for cartoon synthesis. IEEE Transactions on Image Processing, 21(11):4636–4648, Nov 2012.
[34] Jun Yu, Jun Cheng, and Dacheng Tao. Interactive cartoon reusing by transfer learning. Signal Process., 92(9):2147–2158, September 2012.
[35] Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. CoRR, abs/1801.03924, 2018.
[36] Shang-Wei Zhang, Charles C. Morace, Thi Ngoc Hanh Le, Chih-Kuo Yeh, Sheng-Yi Yao, Shih-Syun Lin, and Tong-Yee Lee. Animation video resequencing with a convolutional autoencoder. In SIGGRAPH Asia 2019 Posters, SA 2019, Brisbane, QLD, Australia, November 17-20, 2019, pages 19:1–19:2. ACM, 2019.
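Full-text usage rights
  • Authorized for on-campus browsing/printing of the electronic full text, publicly available from 2021-07-29.
  • Authorized for off-campus browsing/printing of the electronic full text, publicly available from 2023-10-15.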


  • If you have any questions, please contact the library.
    Phone: (06)2757575#65773
    E-mail: etds@email.ncku.edu.tw