ORIGINAL RESEARCH article

Front. Comput. Sci.

Sec. Computer Security

Volume 7 - 2025 | doi: 10.3389/fcomp.2025.1582206

COMPONENT FEATURES BASED ENHANCED PHISHING WEBSITE DETECTION SYSTEM USING EfficientNet, FH-BERT, AND SELU-CRNN METHODS

Provisionally accepted
Mahmoud Murhej*, G Nallasivan*
  • Vel Tech Rangarajan Dr.Sagunthala R&D Institute of Science and Technology, Chennai, India

The final, formatted version of the article will be published soon.

Phishing is a form of cybercrime in which attackers steal users' sensitive information, so detecting phishing websites is essential. Many existing approaches to Phishing Website Detection (PWD) rely on the Uniform Resource Locator (URL) and the Document Object Model (DOM) tree structure. However, these approaches produce inaccurate results because a phishing website imitates a legitimate one. The proposed PWD system therefore focuses on the important features and components of websites to improve detection efficiency. The work begins by collecting URLs from a phishing website dataset. The Hypertext Markup Language (HTML) representation is then generated for each URL. Next, the DOM tree is constructed from the HTML, and its components are extracted along with Natural Language Processing (NLP), credential, URL, DOM-tree similarity, and component features. Meanwhile, the DOM-tree components are converted into score values using Feature Hasher-Bidirectional Encoder Representations from Transformers (FH-BERT). The component features and the score values of the DOM-tree components are then concatenated in a feature fusion step. The significant features are subsequently selected using an Entropy-based Chameleon Swarm Algorithm (ECSA). Finally, phishing websites are detected using the SELU-CRNN. Simulation results show that the proposed technique improves PWD performance, achieving higher accuracy (98.42%) and a lower training time (63,003 ms) than existing methods.
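The front half of the pipeline described above (HTML to DOM components, handcrafted features, hashing, and fusion) can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the class and function names (`ComponentCollector`, `url_features`, `hash_components`, `fuse`), the choice of URL features, and the bucket count are all assumptions made for the example. The BERT encoder, the ECSA feature selection, and the CRNN classifier are omitted; a plain signed hashing trick stands in for FH-BERT, and only the SELU activation with its standard constants is shown.

```python
import hashlib
import math
from html.parser import HTMLParser
from urllib.parse import urlparse


class ComponentCollector(HTMLParser):
    """Records each element's tag path, giving a flat view of the DOM tree."""

    # Void elements never get a closing tag, so they are not pushed on the stack.
    VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}

    def __init__(self):
        super().__init__()
        self.stack = []           # currently open tags
        self.components = []      # e.g. "html/body/form/input"
        self.password_inputs = 0  # a simple credential-related feature

    def handle_starttag(self, tag, attrs):
        self.components.append("/".join(self.stack + [tag]))
        if tag == "input" and dict(attrs).get("type") == "password":
            self.password_inputs += 1
        if tag not in self.VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def url_features(url):
    """A few hypothetical URL features; the paper's actual feature set is not given here."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return [
        float(len(url)),                           # URL length
        float(host.count(".")),                    # number of dots in the host
        1.0 if parsed.scheme == "https" else 0.0,  # uses HTTPS
        1.0 if "@" in url else 0.0,                # contains '@'
    ]


def hash_components(components, n_buckets=16):
    """Signed hashing trick: map each component string to one of n_buckets slots."""
    vec = [0.0] * n_buckets
    for comp in components:
        digest = hashlib.md5(comp.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % n_buckets
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    return vec


def fuse(url, html, n_buckets=16):
    """Feature fusion by concatenation: handcrafted features + hashed DOM components."""
    collector = ComponentCollector()
    collector.feed(html)
    handcrafted = url_features(url) + [float(collector.password_inputs)]
    return handcrafted + hash_components(collector.components, n_buckets)


def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    """Scaled Exponential Linear Unit, the activation named in SELU-CRNN."""
    return scale * (x if x > 0 else alpha * (math.exp(x) - 1.0))
```

Calling `fuse(url, html)` returns a single flat vector whose first five entries are the handcrafted features and whose remaining entries are the hashed DOM components. In a full system along the lines the abstract describes, such a vector would then pass through feature selection before reaching the classifier.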

Keywords: phishing website detection, cybersecurity, component features, URL, DOM tree, EfficientNet, phishing attacks, SELU-CRNN

Received: 25 Feb 2025; Accepted: 26 Aug 2025.

Copyright: © 2025 Murhej and Nallasivan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Mahmoud Murhej, Vel Tech Rangarajan Dr.Sagunthala R&D Institute of Science and Technology, Chennai, India
G Nallasivan, Vel Tech Rangarajan Dr.Sagunthala R&D Institute of Science and Technology, Chennai, India

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.