Enhancing Email Security through Multi-Level Feature Fusion for Malicious URL Detection

Duc-Tho Mai, Quyet-Long Pham Nguyen, Thanh-Nam Pham

Abstract


Phishing attacks that utilize URLs embedded in emails have become increasingly sophisticated, posing significant risks of sensitive data leakage and bypassing traditional security mechanisms. Conventional approaches, which rely on blacklists or manually crafted features, often struggle to keep pace with the rapidly changing characteristics of malicious URLs. In response to this challenge, this study explores the use of URLNet, a deep learning model that learns feature representations at both character and word levels, for detecting malicious URLs in email systems. A large-scale dataset comprising 927,037 URLs (657,736 benign and 269,301 malicious) was constructed to ensure a diverse and realistic sample. Alongside reproducing the original URLNet N-gram model, this research also evaluates Word–Char Fusion and hybrid architectures that integrate convolutional and recurrent neural networks. The experimental results demonstrate that the Word–Char Fusion model significantly surpasses the N-gram baseline, achieving an accuracy of 97.65%, an F1-score of 0.9596, and a ROC-AUC of 0.9954, while reducing inference time to 0.0763 milliseconds per sample. Although hybrid CNN-RNN models provide comparable detection performance, they entail higher computational costs. These findings suggest that the Word–Char Fusion CNN architecture strikes an effective balance between detection accuracy and efficiency, making it well-suited for real-time deployment in practical email security settings.
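
To make the two input levels and the reported figures more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It shows one plausible way to turn a URL into a character-level index sequence and a word-level token list, and it works through the class balance and throughput implied by the numbers in the abstract. The vocabulary, the maximum length of 200, the delimiter rule, and all function names are assumptions introduced here for illustration.

```python
import re

# Hypothetical character vocabulary (an assumption, not taken from the paper).
# Index 0 is shared by padding and out-of-vocabulary characters.
CHAR_VOCAB = {ch: i + 1 for i, ch in enumerate(
    "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%")}


def char_ids(url, max_len=200):
    """Character-level view: map each character to an index, then pad to max_len."""
    ids = [CHAR_VOCAB.get(ch, 0) for ch in url.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def words(url):
    """Word-level view: split on any run of non-alphanumeric characters."""
    return [w for w in re.split(r"[^A-Za-z0-9]+", url) if w]


def class_share(benign, malicious):
    """Return the dataset size and the fraction of malicious samples."""
    total = benign + malicious
    return total, malicious / total


def throughput_per_second(ms_per_sample):
    """Convert a per-sample latency in milliseconds to samples per second."""
    return 1000.0 / ms_per_sample


if __name__ == "__main__":
    url = "http://login.example.com/verify?id=42"
    print(words(url))
    print(char_ids(url)[:10])
    print(class_share(657_736, 269_301))
    print(throughput_per_second(0.0763))
```

Under these assumptions, the two class counts sum to the stated 927,037 URLs, of which about 29% are malicious, and a latency of 0.0763 milliseconds per sample corresponds to roughly 13,000 URLs per second when samples are processed one after another. In a fusion model, the two views would each feed their own embedding and convolution branch before being combined; that part is omitted here because the abstract does not specify it.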






DOI: http://dx.doi.org/10.21553/rev-jec.460

Copyright (c) 2026 REV Journal on Electronics and Communications


ISSN: 1859-378X
