Exploring Call Detail Records for Churn Analysis in a Telecommunication Company
Department of Engineering Science (on the job class)
call detail records
social network analysis
As the mobile communication market has become saturated in recent years, competition among telecommunication operators has intensified. Because attracting new subscribers is difficult, reducing customer churn has become a critical issue. In this work, we apply techniques from social network analysis and graph theory to call detail records in order to better understand how users behave within their communities. We also propose an algorithm for discovering the core members of each community. Core members carry significant influence and network value, so when a core member switches to another operator, further churn among the remaining members may follow. Satisfying core members can therefore reduce the likelihood of customer churn. Empirical studies indicate that our approach has a solid theoretical basis and is also feasible in a real telecommunication environment. Consequently, it can serve telecommunication operators as a valuable reference for identifying and reducing customer churn at an early stage.
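The core-member idea can be illustrated with a minimal sketch. It is not the thesis's algorithm, which this record does not reproduce; it assumes only what the outline suggests, namely that member strength is scored in a PageRank style with a damping factor and that members above a threshold are treated as core. The record layout `(caller, callee, seconds)`, the function names, and the choice of expressing the threshold as a multiple of the mean score are all hypothetical.

```python
from collections import defaultdict


def build_call_graph(records):
    """Aggregate (caller, callee, seconds) records into weighted directed edges.

    The record layout is an assumption made for this sketch.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for caller, callee, seconds in records:
        graph[caller][callee] += seconds
        graph[callee]  # register the callee as a node even if it never calls out
    return {user: dict(neighbours) for user, neighbours in graph.items()}


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration; scores sum to 1."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_weight = {u: sum(graph[u].values()) for u in nodes}
    for _ in range(iterations):
        # Score held by users with no outgoing call time is spread uniformly.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                continue
            for v, w in graph[u].items():
                # Each user passes score to callees in proportion to call time.
                new_rank[v] += damping * rank[u] * w / out_weight[u]
        rank = new_rank
    return rank


def core_members(rank, threshold=1.0):
    """Return users whose score is at least `threshold` times the mean score.

    Using a multiple of the mean as the cut-off is a hypothetical choice.
    """
    mean = sum(rank.values()) / len(rank)
    return sorted(user for user, score in rank.items() if score >= threshold * mean)


if __name__ == "__main__":
    # Toy records, invented for illustration only.
    records = [
        ("A", "hub", 120), ("B", "hub", 60), ("C", "hub", 30),
        ("hub", "A", 90), ("C", "B", 10),
    ]
    scores = pagerank(build_call_graph(records))
    print(core_members(scores))
```

Raising `threshold` shrinks the set of core members, and changing `damping` shifts how much score flows along call edges versus being shared uniformly, which is the kind of sensitivity an operator would want to examine before acting on the result.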
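Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Contributions of This Thesis
Chapter 2 Literature Review
2.1 Definition and Causes of Customer Churn
2.2 Data Management in the Telecommunication Environment
2.3 Review of Existing Research on Customer Churn
2.3.1 Data Mining and the Construction of Predictors
2.3.2 Review of Existing Research on Churn Prediction and Detection
2.4 Graph Theory and Representation
2.5 Communities and Social Networks
Chapter 3 Methodology
3.1 Construction of the Call Relationship Graph
3.2 Mining Mobile Phone User Communities
3.2.1 Formation Factors of Mobile Phone User Communities
3.2.2 Formal Definition of Mobile Phone User Communities
3.2.3 Closeness
3.2.4 Clustering Coefficient
3.2.5 An Example of Mobile Phone User Community Discovery
3.3 Discovery of Core Members
3.3.1 The PageRank Algorithm
3.3.2 The Core Member Discovery Algorithm
3.4 Relationship Between Core Members and Churn Analysis and Evaluation
Chapter 4 Experiments and Results
4.1 Experimental Environment and Test Data
4.1.1 Experimental Environment
4.1.2 Call Detail Record Dataset
4.1.3 Preprocessing of the Call Detail Record Dataset
4.2 Presentation of Mobile Phone User Communities
4.2.1 Distribution of Mobile Phone User Communities
4.2.2 Observations on the Community Discovery Process
4.3 Presentation of Core Members
4.3.1 Suitability Evaluation of the Core Member Discovery Algorithm
4.3.2 Effect of the Damping Factor on Member Strength
4.3.3 Effect of the Threshold on the Number of Core Members
Chapter 5 Conclusions and Future Work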