Integrated Churn Prediction and Customer Segmentation Framework for Telco Business


Wu, Shuli and Yau, Wei Chuen and Ong, Thian Song and Chong, Siew Chin (2021) Integrated Churn Prediction and Customer Segmentation Framework for Telco Business. IEEE Access, 9. pp. 62118-62136. ISSN 2169-3536



In the telco industry, attracting new customers is no longer a good strategy, since retaining existing customers costs much less. Churn management has therefore become instrumental for telco operators. Because few studies combine churn prediction with customer segmentation, this paper proposes an integrated customer analytics framework for churn management. The framework has six components: data pre-processing, exploratory data analysis (EDA), churn prediction, factor analysis, customer segmentation, and customer behaviour analytics. By integrating the churn prediction and customer segmentation processes, it gives telco operators a complete churn analysis with which to better manage customer churn. Three datasets are used in the experiments with six machine learning classifiers. First, the churn status of the customers is predicted using multiple machine learning classifiers. The Synthetic Minority Oversampling Technique (SMOTE) is applied to the training set to address the problems of imbalanced datasets. Models are assessed with 10-fold cross-validation and evaluated using accuracy and F1-score. The F1-score is considered an important metric for imbalanced datasets, since the premise of churn management is to be able to identify customers who will churn. Experimental analysis indicates that AdaBoost performed best on Dataset 1, with an accuracy of 77.19% and an F1-score of 63.11%. Random Forest performed best on Dataset 2, with an accuracy of 93.6% and an F1-score of 77.20%. On Dataset 3, Random Forest performed best in terms of accuracy, at 63.09%, while Multi-layer Perceptron performed best in terms of F1-score, at 42.84%. After churn prediction, Bayesian Logistic Regression is used to conduct the factor analysis and to identify important features for churn customer segmentation. Churn customer segmentation is then carried out using K-means clustering.
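
The abstract's point that F1-score matters more than accuracy on imbalanced churn data can be illustrated with a small sketch. The code below is not taken from the paper: the helper `accuracy_and_f1`, the 100-customer toy labels, and both sets of predictions are hypothetical, chosen only to show how a model that never predicts churn can score high accuracy while having an F1-score of zero.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the churn class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: 100 customers, 10 of whom churn (label 1).
y_true = [1] * 10 + [0] * 90

# Baseline that never predicts churn.
never_churn = [0] * 100

# Hypothetical model: finds 6 of the 10 churners and raises 8 false alarms.
some_model = [1] * 6 + [0] * 4 + [1] * 8 + [0] * 82

baseline_acc, baseline_f1 = accuracy_and_f1(y_true, never_churn)
model_acc, model_f1 = accuracy_and_f1(y_true, some_model)

print(f"never-churn baseline: accuracy={baseline_acc:.2f}, F1={baseline_f1:.2f}")
print(f"hypothetical model:   accuracy={model_acc:.2f}, F1={model_f1:.2f}")
```

In this toy case the baseline has the higher accuracy (0.90 against 0.88) yet identifies no churners, whereas the hypothetical model reaches an F1-score of 0.50. This is the situation in which ranking models by accuracy alone would be misleading.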

Item Type: Article
Uncontrolled Keywords: Bayesian analysis
Subjects: Q Science > QA Mathematics > QA273-280 Probabilities. Mathematical statistics
Divisions: Faculty of Information Science and Technology (FIST)
Depositing User: Ms Nurul Iqtiani Ahmad
Date Deposited: 25 May 2021 18:34
Last Modified: 25 May 2021 18:34

