A Hybrid Approach for Customer Segmentation and Loyalty Prediction in E-Commerce
DOI: https://doi.org/10.17010/pijom/2024/v17i10/173996
Keywords: customer segmentation, loyalty prediction, e-commerce, k-means clustering, XGBoost
JEL Classification Codes: C38, C45, C55, L81, M31
Paper Submission Date: September 20, 2023; Paper Sent Back for Revision: May 24, 2024; Paper Acceptance Date: July 15, 2024; Paper Published Online: October 15, 2024
Abstract
Purpose: This study aimed to improve customer segmentation and loyalty prediction in e-commerce by integrating machine learning (ML) with traditional statistical methods.
Design/Methodology/Approach: The research adopted a hybrid approach: k-means clustering segmented customers by recency, frequency, and monetary (RFM) values, and an XGBoost classifier then predicted loyalty. The methodology involved analyzing actual e-commerce data and comparing the results with established industry benchmarks.
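The segmentation stage described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the order tuples, the reference date `today`, the z-score scaling, the farthest-first initialization, and all function names are hypothetical choices made for the example, and the XGBoost loyalty classifier is not reproduced here.

```python
from datetime import date

def rfm_features(orders, today):
    """Map each customer to (recency_days, frequency, monetary).

    `orders` is a list of (customer_id, order_date, amount) tuples
    (a hypothetical input layout, not the paper's dataset schema).
    """
    acc = {}
    for cust, day, amount in orders:
        last, freq, total = acc.get(cust, (day, 0, 0.0))
        acc[cust] = (max(last, day), freq + 1, total + amount)
    return {c: ((today - last).days, freq, total)
            for c, (last, freq, total) in acc.items()}

def standardize(rows):
    """Z-score each column so no single RFM dimension dominates distances."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(r, means, stds)) for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-first initialization
    (an assumption made here to keep the example reproducible)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Toy data: three recent, frequent buyers and three lapsed one-time buyers.
today = date(2024, 1, 31)
orders = (
    [("a", date(2024, 1, d), 120.0) for d in (5, 12, 19, 26)]
    + [("b", date(2024, 1, d), 150.0) for d in (3, 17, 29)]
    + [("c", date(2024, 1, d), 90.0) for d in (8, 15, 22, 24, 30)]
    + [("x", date(2023, 3, 1), 15.0),
       ("y", date(2023, 5, 10), 25.0),
       ("z", date(2023, 2, 14), 10.0)]
)
rfm = rfm_features(orders, today)
customers = list(rfm)
labels, _ = kmeans(standardize([rfm[c] for c in customers]), k=2)
segments = dict(zip(customers, labels))
```

On this toy data the two behavioral groups land in separate segments. In a two-stage pipeline of the kind the abstract describes, such features and segment labels would then be passed to a classifier; how exactly the study connects the two stages is not stated in the abstract.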
Findings: The hybrid model outperformed conventional methods, with higher precision, recall, accuracy, and F1 scores in loyalty prediction, and higher silhouette scores and lower Davies–Bouldin indices in customer segmentation.
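For readers unfamiliar with the metrics named above, the following standard-library Python sketch shows how each is computed from its textbook definition. It is illustrative only: the function names and the toy inputs are assumptions made for this example, and none of the numbers are results from the study.

```python
from math import dist

def classification_scores(y_true, y_pred):
    """Precision, recall, F1, and accuracy for binary labels (1 = loyal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": (tp + tn) / len(pairs)}

def group_by_label(points, labels):
    """Collect the points of each cluster under its label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters

def silhouette(points, labels):
    """Mean silhouette coefficient in [-1, 1]; higher means tighter,
    better-separated clusters. Requires at least two clusters."""
    clusters = group_by_label(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:          # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b) if max(a, b) else 0.0)
    return sum(scores) / len(scores)

def davies_bouldin(points, labels):
    """Davies-Bouldin index; lower means better-separated clusters.
    Requires at least two clusters with distinct centroids."""
    clusters = group_by_label(points, labels)
    cents = {lab: tuple(sum(c) / len(pts) for c in zip(*pts))
             for lab, pts in clusters.items()}
    scatter = {lab: sum(dist(p, cents[lab]) for p in pts) / len(pts)
               for lab, pts in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)

# Toy inputs (hypothetical, not taken from the paper).
scores = classification_scores([1, 1, 1, 0, 0, 1, 0, 0],
                               [1, 1, 0, 0, 1, 1, 0, 0])
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_labels = [0, 0, 1, 1]
sil = silhouette(toy_points, toy_labels)
dbi = davies_bouldin(toy_points, toy_labels)
```

The two clustering metrics move in opposite directions, which is why the abstract reports higher silhouette scores together with lower Davies–Bouldin indices as evidence of better segmentation.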
Practical Implications: The approach gave e-commerce businesses a more generalizable, interpretable, and higher-quality framework for understanding customer behavior and improving retention strategies.
Originality/Value: The research contributed a novel method that combined ML with statistical analysis, offering a more effective solution for customer segmentation and loyalty prediction in e-commerce settings.