

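   The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.
(※ If the search returns no result, or the holdings status shows "closed stacks, not public", the thesis is not in the stacks and cannot be retrieved.)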
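System ID U0026-1402201616270500
Title (Chinese) 考量社群互動樣態及發文演變趨勢之社群媒體資料分析方法
Title (English) Methods of Data Analytics for Social Media by Considering Social Interaction Patterns and Post Evolving Trends
University National Cheng Kung University
Department (Chinese) 資訊工程學系
Department (English) Institute of Computer Science and Information Engineering
Academic year 104 (ROC calendar)
Semester 1
Year of publication 105 (ROC calendar; 2016)
Student (Chinese) 傅夢璇
Student (English) Meng-Hsuan Fu
Student ID P78971252
Degree Doctoral
Language English
Pages 108
Oral defense committee Convener: 李宗南
Advisor: 郭耀煌
Committee member: 陳培殷
Committee member: 高宏宇
Committee member: 陳俊良
Committee member: 洪盟峰
Committee member: 李健興
Keywords (Chinese) 社群媒體  資料分析法  文章趨勢預測  意見分析 
Keywords (English) social media  data analytics  post trend forecast  sentiment analysis 
Subject classification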
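Abstract (Chinese) In recent years social media services have come into wide use. Personal life sharing, gatherings of like-minded communities, marketing by companies and shops, and information distribution by government agencies all spread messages rapidly through social media, and these messages contain a large amount of meaningful information. This dissertation therefore focuses on developing data analytics techniques for social media: it collects public information from social media platforms, mines the useful information hidden in it, and develops data analytics techniques based on user behavior, with the aim of providing more efficient forecasting methods, more accurate classification techniques, and more suitable services. First, a semantic classification method for textual content captured from social media platforms is combined with the social interaction relationships among social media users; the level of semantic influence between users, the consistency of their emotional expression, and their degree of social enthusiasm are computed to develop an integrated opinion-orientation analysis technique. The technique is applied to discussion posts on social media platforms about the presidential election and food-safety issues, and the experimental results show that its classification accuracy is better than that of opinion-orientation classification methods proposed by earlier researchers. Second, in response to the rapid circulation of information on social media, the posts published by social media users, collected as a time series, form the evolution trajectory of each post. An early-trajectory model of posts is built from historical data on post evolution trends and is used to forecast the trend type from a small amount of early trajectory of a social network post. The method is applied to forecasting the type of early trajectories of newly arriving posts on social media platforms; the experimental results show that its early type-forecast error rate is lower than that of time-series methods proposed by earlier researchers, that it can more precisely divide trend types into four classes, and that it also reduces the time cost of computation. In addition, to ease the management problem of one user owning several social media spaces at the same time, an integrated social data analytics platform is proposed, which comprises an integrated interface for cross-platform social media data management together with integrated data analytics services. In the integrated data management interface, the files a user has placed in multiple social media spaces, including photos, videos, and documents, are automatically backed up to a personal network-attached storage device for file-security reasons, and the user only needs the integrated file management interface to manage files stored in different social media spaces. Through the integrated interface for personalized social media services, the user can connect to several social media platforms at once and use the data analytics services, which include hot post ranking, opinion-orientation classification, and recommendation of users with consistent opinions. Through the private social network space interface, the user can share private files. The experimental results show that the platform helps speed up and simplify cross-platform social media file management while also providing data analytics services. The data analytics techniques mentioned above and their experimental results are presented in full in this dissertation.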
Abstract (English) Social media provides a platform for people to share their life experiences, for friends to form groups, for businesses to advertise products, and for governments to announce information. With the variety of services that social media provides, the number of users continues to grow. However, determining how best to understand users’ demands and perspectives from the information they share on social media is an important challenge. This dissertation therefore focuses on developing data analytic technologies based on the implicit information hidden in post-related content and user behaviours on social media.
First, a sentiment analysis method for social media users is proposed that integrates the textual opinions in their posts with their social interactions. In this method, a social opinion graph representing users’ social actions and relationships is constructed; a sentiment guiding matrix denotes the strength of influence between users’ sentiments; a textual sentiment classifier is built to classify textual opinions; and social enthusiasm is treated as the degree of care between two users. The method is applied to two real cases, Taiwan’s presidential election and hot social issues. In both cases, the integrated sentiment classification achieves better accuracy than previous research.
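To make the integration idea in the paragraph above concrete, here is a minimal Python sketch of how a text-only sentiment score could be adjusted by the scores of socially connected users. It is not the dissertation's algorithm: the function names, the `alpha` blending weight, the edge weights (a stand-in for influence strength and social enthusiasm), and the sample values are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (influenced user, influencing user, weight)


def integrate_sentiment(
    text_scores: Dict[str, float],
    edges: List[Edge],
    alpha: float = 0.5,
    iterations: int = 20,
) -> Dict[str, float]:
    """Blend each user's text-only score with the weighted mean of the
    scores of the users who influence them, repeated for a fixed number
    of rounds. Scores are read as the probability of positive sentiment.
    Every user named in `edges` must also appear in `text_scores`.
    """
    incoming: Dict[str, List[Tuple[str, float]]] = {u: [] for u in text_scores}
    for influenced, influencer, weight in edges:
        incoming[influenced].append((influencer, weight))

    scores = dict(text_scores)
    for _ in range(iterations):
        updated = {}
        for user, prior in text_scores.items():
            total = sum(w for _, w in incoming[user])
            if total == 0:
                # No social evidence: keep the text-only score.
                updated[user] = prior
                continue
            neighbour = sum(w * scores[v] for v, w in incoming[user]) / total
            updated[user] = alpha * prior + (1 - alpha) * neighbour
        scores = updated
    return scores


def classify(scores: Dict[str, float], threshold: float = 0.5) -> Dict[str, str]:
    """Map each score to a label using a fixed (assumed) threshold."""
    return {u: "positive" if s >= threshold else "negative" for u, s in scores.items()}


# Hypothetical data: user "c" looks weakly negative from text alone,
# but is influenced by two clearly positive users.
text_only = {"a": 0.9, "b": 0.8, "c": 0.45}
influence = [("c", "a", 1.0), ("c", "b", 1.0)]
integrated = integrate_sentiment(text_only, influence)
```

In this toy data, the text-only label for "c" flips once the social evidence is blended in, which is the kind of effect an integrated classifier is meant to capture. A fuller treatment would derive the edge weights from observed interactions rather than fixing them by hand.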
Second, an early forecast method for post-based evolution trajectories (TEF) on social media is presented. In the model generation phase, historical data on post forwarding and responding activities on social media are collected over consecutive time periods to form post-based evolution trajectories. A classification function is then defined for each trend type, and each trajectory is labeled as one of four trend types: L-type, B-type, D-type, or G-type. The post-based evolution trajectory model is then generated by random forests for later forecasting. In the trajectory forecast phase, real-time data on post forwarding and responding activities are collected over consecutive time periods to form the target trajectory, which is forecast through three stages: classification estimation, correlation evaluation, and distance calculation between the target trajectory and the post-based evolution trajectory model. Applied to social media posts, TEF forecasts trend types with a lower error rate than earlier time-series methods, distinguishes four trend types, and reduces the time consumed in data processing.
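The correlation and distance stages described above can be illustrated with a short Python sketch that matches the first few points of a new trajectory against the same-length prefix of one prototype per trend type. This is a simplified stand-in, not TEF itself: it omits the random forest model and the classification estimation stage, the ranking rule (highest correlation, then smallest distance) is an assumption, and the prototype values are invented placeholders, since the abstract does not define the shapes of the four types.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def forecast_trend_type(
    early: Sequence[float], prototypes: Dict[str, Sequence[float]]
) -> str:
    """Return the label whose prototype prefix best matches the early trajectory.

    Candidates are ranked by correlation first and by distance second
    (an assumed rule for this sketch).
    """
    k = len(early)
    best_label, best_key = "", None
    for label, proto in prototypes.items():
        prefix = proto[:k]
        key = (pearson(early, prefix), -euclidean(early, prefix))
        if best_key is None or key > best_key:
            best_label, best_key = label, key
    return best_label


# Invented placeholder prototypes: activity counts per time period.
PROTOTYPES = {
    "L": [1, 2, 3, 4, 5, 6],
    "B": [1, 6, 2, 1, 1, 1],
    "D": [6, 5, 4, 3, 2, 1],
    "G": [1, 1, 2, 4, 8, 16],
}

# Only the first three periods of a new post have been observed.
predicted = forecast_trend_type([2, 7, 3], PROTOTYPES)
```

Comparing only the observed prefix is what makes the forecast "early": the label is assigned before the rest of the trajectory exists. In practice, the prototypes would come from a model trained on labeled historical trajectories rather than from hand-written lists.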
Third, an integrated social data analytics hub is built with an integrated content viewer and social data analytic services. In this hub, a user’s files stored on different social media platforms are backed up to local network-attached storage. Files such as videos, pictures, and documents stored in the user’s multiple social media spaces are managed through a single unified file management viewer. A private social network viewer is provided for sharing files with close friends and family rather than in public social media spaces. In addition, microblog information from different social media platforms is displayed together through the personal social media services viewer, which provides three social data analytic services: hotly discussed post analysis, sentiment analysis, and resonance user mining. The experimental results show that the hub makes it easy to manage cross-platform information through the integrated content viewer and provides various social data analytic services.
Table of contents List of Tables X
List of Figures XII
1. Introduction 1
1.1 Background 1
1.2 Motivation 2
1.3 Contribution 4
1.4 Organization 6
2. Related Work 7
2.1 Characters of Microblog 7
2.2 User-Generated Content on Social Media 8
2.3 Textual Opinion Mining of Social Media Content 9
2.4 Non-Textual Social Information Analysis 11
2.5 Time Series Data Analytics 11
3. Integrated Sentiment Analysis from Textual Content and Social Interactions 16
3.1 Data Collection and Social Opinion Graph 17
3.1.1 Social Opinion Graph Construction 17
3.1.2 Problem Formulation 19
3.2 Training Phase and Textual Sentiment Classification 20
3.2.1 Sentiment Guiding Matrix Construction 20
3.2.2 Textual Sentiment Classifier 23
3.2.3 User-Level Textual Sentiment Classification 24
3.3 Integrated Sentiment Analysis 25
3.3.1 Emotion Homophily 25
3.3.2 Social Enthusiasm 26
3.3.3 Relaxation Labeling 27
3.3.4 Integrated User-Level Sentiment Classification 28
3.4 Experimental Results 30
3.4.1 2012 Presidential Election in Taiwan 30
3.4.2 2014 Hot Social Issues in Taiwan 40
4. Early Forecast via Post-based Evolution Trajectory Analytics 49
4.1 Post-based Evolution Trajectory Model Generation 50
4.1.1 Concrete Problem Setting 50
4.1.2 Post-based Evolution Trajectory 51
4.1.3 Setting of the Trend Types 52
4.1.4 Post-based Evolution Trajectory Model Generation 55
4.2 Trajectory Early Forecast Method 58
4.3 Performance Analysis 62
4.3.1 Data Processing 63
4.3.2 Model Generation Results 66
4.3.3 Trajectory Forecast Results 67
5. Integrated Social Data Analytics Hub 74
5.1 System Architecture and Functions 75
5.1.1 System Architecture 75
5.1.2 System Functions 76
5.2 Social Data Analytic Services 77
5.2.1 Hotly Discussed Posts Analysis 78
5.2.2 Sentiment Classification 82
5.2.3 Resonance User Mining 83
5.3 Performance Analysis 86
5.3.1 System Performance 86
5.3.2 Results of Social Data Analytic Services 91
6. Conclusion 93
References 96
Author’s Publications 107
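Full-text access rights
  • Authorized for on-campus viewing/printing of the electronic full text, publicly available from 2021-02-19.
  • Authorized for off-campus viewing/printing of the electronic full text, publicly available from 2026-01-26.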

