A Hybrid Approach for Customer Segmentation and Loyalty Prediction in E-Commerce

Authors

  •   Elamurugan Balasundaram, Associate Professor, Department of Management Studies, Sri Manakula Vinayagar Engineering College, Madagadipet, Mannadipet Commune - 605 107, Puducherry
  •   P. Aranganathan, Associate Professor, Gnanam School of Business, Mary’s Nagar, Trichy-Thanjavur Express Highway, Sengipatty, Thanjavur - 613 402, Tamil Nadu
  •   Krishna Sudhir Annavajjala, Professor & HOD, Department of MBA, Koneru Lakshmaiah Education Foundation, Green Fields, Vaddeswaram, Guntur District, Andhra Pradesh
  •   R. Sivakumar, Assistant Professor, Department of Management Studies, Sri Manakula Vinayagar Engineering College, Madagadipet, Mannadipet Commune - 605 107, Puducherry
  •   Mathiazhagan Arumugam, Assistant Professor, Department of Management Studies, Sri Manakula Vinayagar Engineering College, Madagadipet, Mannadipet Commune - 605 107, Puducherry
  •   A. Vinoth, Assistant Professor, Department of Management Studies, Sri Manakula Vinayagar Engineering College, Madagadipet, Mannadipet Commune - 605 107, Puducherry

DOI:

https://doi.org/10.17010/pijom/2024/v17i10/173996

Keywords:

customer segmentation, loyalty prediction, e-commerce, k-means clustering, XGBoost

JEL Classification Codes:

C38, C45, C55, L81, M31

Paper Submission Date: September 20, 2023
Paper Sent Back for Revision: May 24, 2024
Paper Acceptance Date: July 15
Paper Published Online: October 15, 2024

Abstract

Purpose : This study aimed to enhance e-commerce customer segmentation and loyalty prediction by integrating machine learning (ML) with traditional statistical methods.

Design/Methodology/Approach : The research adopted a hybrid approach: k-means clustering segmented customers on recency, frequency, and monetary (RFM) values, and an XGBoost classifier then predicted loyalty. The methodology involved analyzing actual e-commerce data and comparing the results with established industry benchmarks.
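The segmentation stage described above can be illustrated with a minimal sketch in standard-library Python. It is not the authors' implementation: the transaction records, the snapshot date, the choice of k = 2, and the helper names (`rfm_table`, `standardize`, `kmeans`) are all hypothetical, since the abstract does not state the study's data or settings. The XGBoost stage is not shown here.

```python
from datetime import date
import math
import random

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2024, 1, 5), 120.0),
    ("c1", date(2024, 3, 1), 80.0),
    ("c2", date(2023, 6, 10), 15.0),
    ("c3", date(2024, 2, 20), 300.0),
    ("c3", date(2024, 2, 25), 250.0),
    ("c3", date(2024, 3, 2), 410.0),
    ("c4", date(2023, 5, 1), 20.0),
]
snapshot = date(2024, 3, 10)  # assumed reference date for recency


def rfm_table(rows, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    table = {}
    for cust, day, amount in rows:
        rec, freq, mon = table.get(cust, (math.inf, 0, 0.0))
        table[cust] = (min(rec, (today - day).days), freq + 1, mon + amount)
    return table


def standardize(points):
    """Z-score each column so no single RFM dimension dominates distances."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(p, means, stds))
            for p in points]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return labels


rfm = rfm_table(transactions, snapshot)
customers = sorted(rfm)
labels = kmeans(standardize([rfm[c] for c in customers]), k=2)
segments = dict(zip(customers, labels))
```

In this toy data the two lapsed, low-spend customers (`c2`, `c4`) land in the same segment, separate from the recent high spender `c3`. Standardizing before clustering is a design choice of the sketch; without it, the monetary column would dominate the Euclidean distances.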

Findings : The hybrid model demonstrated superior performance over conventional methods, evidenced by improved precision, recall, accuracy, and F1 scores in loyalty prediction, alongside higher silhouette scores and lower Davies–Bouldin indices for customer segmentation.
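For readers unfamiliar with the evaluation measures named above, the following standard-library Python sketch shows how they are conventionally computed. The labels and points are made-up toy values, not the study's data, and the function names are illustrative only. The definitions are the textbook ones: precision, recall, F1, and accuracy from a binary confusion count, the mean silhouette coefficient (higher is better), and the Davies-Bouldin index (lower is better).

```python
import math


def classification_scores(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary loyal/not-loyal label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def _group(points, labels):
    """Collect points by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters."""
    clusters = _group(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


def davies_bouldin(points, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid
    distance ratio; assumes at least two clusters with distinct centroids."""
    clusters = _group(points, labels)
    centroids = {lab: tuple(sum(c) / len(ps) for c in zip(*ps))
                 for lab, ps in clusters.items()}
    scatter = {lab: sum(math.dist(p, centroids[lab]) for p in ps) / len(ps)
               for lab, ps in clusters.items()}
    worst = [max((scatter[i] + scatter[j])
                 / math.dist(centroids[i], centroids[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


# Toy example: 1 = loyal, 0 = not loyal.
scores = classification_scores([1, 1, 1, 0, 0, 0, 1, 0],
                               [1, 1, 0, 0, 0, 1, 1, 0])

# Two well-separated pairs of points, labelled sensibly and then badly.
pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good, bad = [0, 0, 1, 1], [0, 1, 0, 1]
sil_good, sil_bad = silhouette(pts, good), silhouette(pts, bad)
db_good, db_bad = davies_bouldin(pts, good), davies_bouldin(pts, bad)
```

On the toy points, the sensible labelling yields the higher silhouette score and the lower Davies-Bouldin index, which is the direction of improvement the findings report for the hybrid model.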

Practical Implications : The approach provided a more general, more interpretable, and higher-quality framework for e-commerce businesses to understand customer behavior and enhance retention strategies.

Originality/Value : The research contributed to the field by presenting a novel method that successfully combines ML and statistical analysis, offering a more effective solution for customer segmentation and loyalty prediction in e-commerce settings.

Published

2024-10-15

How to Cite

Balasundaram, E., Aranganathan, P., Annavajjala, K. S., Sivakumar, R., Arumugam, M., & Vinoth, A. (2024). A Hybrid Approach for Customer Segmentation and Loyalty Prediction in E-Commerce. Prabandhan: Indian Journal of Management, 17(10), 56–69. https://doi.org/10.17010/pijom/2024/v17i10/173996
