Predicting Travel Insurance Purchases in an Insurance Firm through Machine Learning Methods after COVID-19

Shiuh Tong Lim
Joe Yee Yuan
Khai Wah Khaw
XinYing Chew

Abstract

Travel insurance is a crucial financial safeguard, offering coverage against unforeseen expenses and losses incurred during travel. With the proliferation of insurance types and the growing demand for COVID-19-related coverage, insurance companies need to predict accurately how likely each customer is to purchase insurance. Such predictions help insurance providers focus on the most lucrative clients and boost sales. This study applies machine learning techniques to forecast which consumer segments are most inclined to buy travel insurance, so that targeted strategies can be developed. The analysis was carried out on a Kaggle dataset of prior clients of a travel insurance firm, using the K-Nearest Neighbors (KNN), Decision Tree Classifier (DT), Support Vector Machines (SVM), Naïve Bayes (NB), Logistic Regression (LR), and Random Forest (RF) models. Extensive data cleaning was done before model building. Performance was then evaluated using accuracy, F1 score, and the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve. Notably, KNN outperformed the other models, achieving an accuracy of 0.81, precision of 0.82, recall of 0.82, F1 score of 0.80, and an AUC of 0.78. The findings of this study offer a useful guide for deploying machine learning algorithms to predict travel insurance purchases, helping insurance companies target the most lucrative clientele and increase revenue.
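
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn, replaces the Kaggle dataset with synthetic data from `make_classification`, and uses default or arbitrarily chosen hyperparameters (for example `n_neighbors=5`). The function name `compare_models` is hypothetical. It fits the six classifiers named above on one train/test split and reports accuracy, F1 score, and AUC for each.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, seed=0):
    """Fit six classifiers on one stratified split and score each of them."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Distance- and margin-based models are scaled first; tree models are not.
    # All hyperparameters here are illustrative assumptions.
    models = {
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "DT": DecisionTreeClassifier(random_state=seed),
        "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=seed)),
        "NB": GaussianNB(),
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        # AUC needs a score for the positive class, not a hard label.
        y_score = model.predict_proba(X_test)[:, 1]
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "f1": f1_score(y_test, y_pred),
            "auc": roc_auc_score(y_test, y_score),
        }
    return results


if __name__ == "__main__":
    # Synthetic stand-in for the real data: 8 features, imbalanced binary target.
    X, y = make_classification(
        n_samples=1000, n_features=8, n_informative=5,
        weights=[0.65, 0.35], random_state=0,
    )
    for name, scores in compare_models(X, y).items():
        print(f"{name}: acc={scores['accuracy']:.2f} "
              f"f1={scores['f1']:.2f} auc={scores['auc']:.2f}")
```

Scores from this sketch depend on the synthetic data and will not reproduce the figures reported in the abstract. A single split is used for brevity; cross-validation would give more stable estimates.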

Article Details

How to Cite
Lim, S. T., Yuan, J. Y., Khaw, K. W., & Chew, X. (2023). Predicting Travel Insurance Purchases in an Insurance Firm through Machine Learning Methods after COVID-19. Journal of Informatics and Web Engineering, 2(2), 43–58. https://doi.org/10.33093/jiwe.2023.2.2.4
Section
Regular issue
