Ensemble Learning-Powered URL Phishing Detection: A Performance Driven Approach

Shougfta Mushtaq
Tabassum Javed
Mazliham Mohd Su’ud


With the rapid growth in Internet usage, criminals have found new ways to carry out cyber-attacks. The most common and widespread of these attacks is URL phishing. The proposed system focuses on improving phishing website detection using feature selection and ensemble learning. The model is evaluated on two datasets, DS-30 and DS-50, with 30 and 50 features, respectively. Ensemble learning with a voting classifier is applied to train the model and improve accuracy. The combination of HEFS with a random forest classifier achieved 94.6% accuracy while minimizing the number of features used (20.8% of the base feature set). The voting classifier in the proposed model achieves 96% and 98% accuracy on the DS-30 and DS-50 datasets, respectively. The hybrid model uses a combination of different factors to distinguish phishing websites from legitimate websites.
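
The voting idea behind the ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: the three rule-based base classifiers, the feature names (`url_length`, `has_at_symbol`, `num_subdomains`), and the thresholds are hypothetical and are not taken from DS-30 or DS-50. The sketch only shows how hard (majority) voting combines the outputs of several base classifiers into one label.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# Hypothetical feature vector: feature name -> value.
# These names are illustrative and do not come from DS-30 or DS-50.
Features = Dict[str, float]
# A base classifier returns 1 for phishing and 0 for legitimate.
Classifier = Callable[[Features], int]


def long_url_rule(x: Features) -> int:
    # Assumed threshold: flag URLs longer than 75 characters.
    return 1 if x["url_length"] > 75 else 0


def at_symbol_rule(x: Features) -> int:
    # Flag URLs that contain an '@' character.
    return 1 if x["has_at_symbol"] else 0


def subdomain_rule(x: Features) -> int:
    # Assumed threshold: flag URLs with three or more subdomains.
    return 1 if x["num_subdomains"] >= 3 else 0


def hard_vote(classifiers: Sequence[Classifier], x: Features) -> int:
    """Return the label predicted by the majority of base classifiers.

    With an odd number of binary classifiers there can be no tie.
    """
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [long_url_rule, at_symbol_rule, subdomain_rule]

# Two of the three rules fire, so the majority label is phishing.
suspicious = {"url_length": 120, "has_at_symbol": 1, "num_subdomains": 1}
# Only one rule fires, so the majority label is legitimate.
benign = {"url_length": 30, "has_at_symbol": 0, "num_subdomains": 4}

print(hard_vote(ensemble, suspicious))
print(hard_vote(ensemble, benign))
```

In practice the base classifiers would be trained models rather than fixed rules; scikit-learn's `VotingClassifier` offers the same majority-vote combination through its `voting="hard"` option.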

Mushtaq, S., Javed, T., & Mohd Su’ud, M. (2024). Ensemble Learning-Powered URL Phishing Detection: A Performance Driven Approach. Journal of Informatics and Web Engineering, 3(2), 134–145. https://doi.org/10.33093/jiwe.2024.3.2.10
