Unveiling the Efficacy of AI-based Algorithms in Phishing Attack Detection

Tajamul Shahzad
Kashif Aman

Abstract

Phishing poses a significant challenge in an ever-evolving world. The increased use of the Internet has given rise to a different kind of theft known as cybercrime: invading privacy and illegitimately obtaining personal information through digital platforms. The primary approach is phishing, in which attackers use spoofed emails or bogus websites to obtain a victim's personal information, such as account credentials or debit and credit card numbers. The objective of this work is to give a brief overview of phishing attacks and their types and to investigate the AI algorithms used to detect them. A detailed literature review identified 14 AI algorithms that are repeatedly used to detect phishing attacks: Random Forest, Convolutional Neural Network, Naïve Bayes, K-Nearest Neighbours, Decision Tree, Long Short-Term Memory, Gated Recurrent Unit, Artificial Neural Network, AdaBoost, Logistic Regression, Gradient Boost, Multi-layer Perceptron, Recurrent Neural Network, Extreme Gradient Boosting, and Support Vector Machine. To verify the effectiveness of these algorithms, an experiment was performed on two datasets. Among all the algorithms, Convolutional Neural Network, Multi-layer Perceptron, and AdaBoost achieved more than 90% accuracy, precision, and sensitivity. The results show that these algorithms are efficient and can achieve high accuracy when applied, with proper planning, to the requirements of a specific scenario. The paper also shows how different AI techniques have been employed in multiple studies to detect and address phishing attacks, and it lists current problems with phishing attacks along with ideas for future studies in this area.
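
The abstract reports accuracy, precision, and sensitivity. As a minimal sketch of how these three metrics are conventionally computed for a binary phishing classifier, the Python below derives them from confusion-matrix counts. The function name `binary_metrics` and the toy labels are illustrative assumptions; they are not taken from the paper or its datasets.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, and sensitivity for binary labels.

    Assumption for illustration: 1 = phishing (positive), 0 = legitimate.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed

    total = tp + tn + fp + fn
    return {
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of samples flagged as phishing that really are phishing.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual phishing samples that were flagged (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Made-up toy labels, not results from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Reporting sensitivity alongside accuracy matters here because a missed phishing sample (a false negative) is usually costlier than a false alarm, and accuracy alone can hide such misses when legitimate samples dominate a dataset.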

Article Details

How to Cite
Shahzad, T., & Aman, K. (2024). Unveiling the Efficacy of AI-based Algorithms in Phishing Attack Detection. Journal of Informatics and Web Engineering, 3(2), 116–133. https://doi.org/10.33093/jiwe.2024.3.2.9
Section
Regular issue
