ORIGINAL RESEARCH article

Front. Comput. Sci., 21 January 2026

Sec. Computer Security

Volume 7 - 2025 | https://doi.org/10.3389/fcomp.2025.1687867

Identifying key features for phishing website detection through feature selection techniques

  • 1. Faculty of Information Technology, Zarqa University, Zarqa, Jordan

  • 2. Faculty of Engineering and Information Technology, Palestine Ahliya University, Bethlehem, Palestine


Abstract

Over the past few years, phishing has evolved into an increasingly prevalent form of cybercrime as more people use the Internet and its applications. Phishing is a type of social engineering that targets users' sensitive or personal information. This paper seeks to achieve two main objectives: first, to identify the most effective classifier for detecting phishing among 40 classifiers representing six learning strategies; second, to determine which feature selection method performs best on phishing website datasets. By analyzing three unique phishing datasets and evaluating eight metrics, this study found that Random Forest and Random Tree were superior at identifying phishing websites compared with other approaches. Similarly, GainRatioAttributeEval and InfoGainAttributeEval performed better than the alternative feature selection methods considered in this study.

1 Introduction

Due to the widespread use of online services like e-commerce and social media and the increased access afforded by the Internet, users are increasingly susceptible to cyberattacks targeting sensitive information, such as usernames or credit card details. One popular method used by attackers is called phishing, which uses fraudulent websites that appear authentic and trick individuals into divulging their private data (Athulya and Praveen, 2020). This can be accomplished using email or text messages designed solely for this purpose; even communication between clients and companies may contain such deceptive links. Most phishing attempts are motivated by financial gain, the spread of malware to user machines, or identity theft.

Recent findings indicate a dramatic increase in unique reported instances, exceeding 199 thousand detections in December 2020 alone, an alarming statistic compared with the Anti-Phishing Working Group's results from previous years (APWG, 2021). Moreover, since the early days of the COVID-19 pandemic in March 2020, when global fears were high, scammers have frequently issued phony certificates containing the words "COVID" or "corona." These scammers have increasingly relied on digital certification policies and HTTPS protocols rather than on traditional tactics (Warburton, 2020).

Broadly, there are two ways to identify phishing: through user knowledge or anti-phishing software. Due to the realism of phishing emails and websites, many users find it challenging to detect them. Consequently, accurate software solutions for detecting these threats have become increasingly necessary. Software-based detection strategies include blocklisting, heuristics, and machine learning (Athulya and Praveen, 2020). Previous studies using machine learning often relied on numerous features to achieve high accuracy; however, extracting these features is not always possible in real-time scenarios, requiring more resilient solutions.

The purpose of this paper is to support the worldwide effort to combat phishing scams by leveraging advanced machine learning techniques to predict fraudulent websites accurately.

Numerous classification models have been proposed and employed to identify phishing websites, each claiming superiority over other approaches (Alazaidah et al., 2018). Accordingly, this study aims to determine the most suitable classification method (classifier) for phishing datasets. To obtain a comprehensive overview of the findings, more than 40 classifiers across six learning strategies are evaluated using several metrics, including accuracy, precision, recall, and F1-measure.

Feature selection is one of several necessary preprocessing steps when creating any machine learning (ML) model. Its purpose is to identify relevant features that aid in constructing the intended models by selecting non-redundant, consistent attributes that closely align with the target attribute of the dataset (Alluwaici M. et al., 2020).

To achieve the first goal, 40 classifiers from six well-known learning strategies were selected for assessment; this also implicitly identifies the best learning strategy among those considered. The evaluation phase encompasses seven diverse, commonly used metrics: accuracy, precision, recall, F-measure, MCC, PRC area, and ROC area (the area under the receiver operating characteristic curve).

The second objective of this study is to determine the optimal feature selection technique for predicting phishing websites. To achieve this objective, five commonly used feature selection methods were assessed and compared using the same classifiers as in the first goal, across three evaluation metrics: accuracy, precision, and recall.

The remaining sections of the paper are structured as follows: Section 2 reviews the current literature on implementing ML techniques for phishing. In Section 3, we present our methodology, results, and discussion. Finally, concluding remarks and future directions are proposed in Section 4.

2 Related research

In this section, we examine prior research that has used machine learning techniques to detect phishing. In their study on fuzzy rough set feature selection, Zabihimayvan and Doran (2019) used multiple features to construct a model intended to detect fraudulent activity attempts by criminals intentionally sidestepping existing anti-phishing measures on Iranian banking websites. They trained and tested their system using fuzzy experts, achieving an accuracy of around 88%. Still, they acknowledged that there is scope for optimizing feature selection during the training/testing phases, which could increase predictive power while reducing prediction time.

A different approach was taken by Cui (2019), who leveraged data analytics across multiple search engines to identify suspicious URLs, drawing on popular searches, internal links shared between related sites, and frequently visited pages. Twelve distinct characteristics describing the intra-relatedness and popularity of URL structures and components were used to build classifiers, achieving an overall classification rate of nearly 95% with roughly 1.5 false positives per classifying session. However, the approach may overlook obfuscated content when analyzing linked materials, such as algorithmically generated domain name variations, pages hosted solely on malicious web domains, or the link-shortening platforms commonly employed to evade detection.

Gandotra and Gupta (2021) compared various ML techniques using a 30-feature set comprising approximately 5,000 phishing websites and over 6,000 authentic webpages. This study found that incorporating feature selection enables faster creation of effective phishing detection models while maintaining accuracy. Notably, their results highlight that random forest classification (RF) achieves superior accuracy regardless of whether feature selection is used.

Detecting phishing attempts using ML often involves analyzing lexical features of URLs. This method, pioneered by Abutaha et al. (2021), was intended for use as a browser plug-in that scrutinizes a webpage's URL to alert users before they visit it. To test the efficacy of this technique, over one million legitimate and fraudulent URLs were used in experiments that extracted 22 variables, which were reduced to 10 key ones.

Findings revealed an accuracy rate of 99.89% when combined with SVM classification, surpassing the RF classifier, gradient boosting classifier (GBC), and neural network approaches trialed alongside it.

Chapla et al. (2019) proposed a fuzzy-logic-based framework for detecting phishing websites, using a dataset containing both legitimate and fraudulent URLs. The model achieved 91.4% accuracy but was limited by a small sample size of 1,000 and by features focused solely on URL-related attributes; as a result, it is less effective at identifying other bypass techniques.

Tan (2018) improved the performance of a phishing URL detection system by using lexical features. A model proposed in Chiew et al. (2019) achieved high accuracy while being independent of third-party services and source code analysis, thereby requiring less processing time. Meanwhile, the authors in Abdelhamid et al. (2014) sought to enhance the accuracy of phishing detection systems through feature selection and an ensemble learning approach, achieving 95% accuracy in their experiments.

In yet another effort (Su et al., 2023), an innovative approach used seven distinct machine learning algorithms to detect the risks posed by various unwanted attacks, including those utilizing zero-day exploits. The selected security features overcome issues such as language dependency and reliance on external parties during real-time monitoring operations.

Rahman et al.'s research also explored machine learning classifiers' performance on various phishing-related datasets (Gandotra and Gupta, 2021). This initiative likewise demonstrated comparable results, with gradient boosting trees (GBT) outperforming other methods, such as random forest (RF), across all metrics.

OFS-NN, proposed in Sahingoz et al. (2019), combines optimal feature selection with a neural network to mitigate overfitting by using a new metric, the feature validity value (FVV). Experimental results on two datasets demonstrated that FVV outperformed information gain and optimal feature selection across various feature categories, including abnormal, domain, HTML/JavaScript, and address-bar features. The OFS-NN model achieved an overall accuracy of 0.945; among the individual feature types, the highest accuracy, 0.903, was observed with address-bar features, while the lowest, 0.562, was observed with HTML/JavaScript features.

Another phishing detection system was introduced by Sahingoz et al. (2019), which comprises 40 NLP-based traits, along with additional hybrid characteristics derived from word vectorization, totaling about 1,700 more relevant aspects.

In their study, the authors compared seven distinct algorithms and ultimately determined that a random forest implementation using solely natural language processing features delivered the best performance, correctly identifying fraudulent websites nearly 98% of the time, the highest efficacy among all tested methodologies.

In Alazaidah et al. (2024), the authors conducted a comparative analysis of 24 classifiers across two datasets using several evaluation metrics. The results revealed the superiority of the random forest, filtered classifier, and J48 classifiers. The authors suggest considering additional classification models with different learning strategies, as well as more datasets and evaluation metrics.

The research in Aljofey et al. (2025) proposed a hybrid methodology that combines URL character embeddings with several handcrafted features. Three datasets were used in this work: two are benchmarks, and the third was collected and preprocessed by the authors. The results showed excellent performance across accuracy and other evaluation metrics.

Several deep learning optimization techniques were used in Barik et al. (2025) to improve phishing prediction on websites. The authors used standardization and variational autoencoder techniques in the preprocessing step, and an enhanced grid search optimizer to improve accuracy. The results showed superior performance across the accuracy, precision, and F1-score metrics. Unfortunately, utilizing only one dataset does not help in generalizing the findings of the conducted research. Several other related research works can be found in Ganjei and Boostani (2022), Gareth et al. (2023), Ni et al. (2022), Nti et al. (2022), Rashid et al. (2020), Srivastava (2014), and Ubing et al. (2019).

Throughout this literature review, random forests perform comparatively better than their counterparts in detecting phishing using machine learning. However, gradient boosting machines (GBM) were frequently absent from comparisons, and lightweight approaches, such as minimal-input or noise-filtered feature sets, are still in their early phases, indicating that extensive future research remains vital.

3 Research methodology

The methodology employed in this paper is depicted in Figure 1. The first phase in Figure 1 involves collecting the datasets. Afterward, the datasets are cleaned and preprocessed. Then, several feature selection techniques are trained on the pre-processed datasets and evaluated. Next, 40 classification models are trained on the datasets using the selected features from the previous step. These classifiers are compared using several well-known evaluation metrics.

Figure 1

The description of the three website phishing datasets used in this research is provided in Section (A), while Sections (B, C, and D) evaluate the performance of the feature selection and machine learning algorithms on these datasets.

Moreover, Section 4 considers which classification model is most appropriate for phishing website datasets; to this end, three datasets are analyzed in that section.

In addition, Section 5 evaluates and identifies the best among five renowned feature selection methods, as well as the most efficient classifiers, which are outlined in Section 6 before the primary results obtained from these analyses are discussed at length.

In addition, 40 classifiers from six learning strategies are evaluated and contrasted in terms of their predictive efficacy across the three datasets under consideration. These examined classifiers encompass:

  • Trees: RandomTree, RandomForest, REPTree, DecisionStump, HoeffdingTree, LMT, and J48.

  • Bayes: BayesNet, NaiveBayesUpdateable, and NaiveBayes.

  • Functions: Logistic, MultilayerPerceptron, SimpleLogistic, VotedPerceptron, and SMO.

  • Lazy: IBK, KStar, and LWL.

  • Meta: AdaBoostM1, AttributeSelectedClassifier, Bagging, ClassificationViaRegression, FilteredClassifier, IterativeClassifierOptimizer, LogitBoost, MultiClassClassifier, MultiClassClassifierUpdateable, RandomCommittee, RandomizableFilteredClassifier, RandomSubSpace, Stacking, WeightedInstancesHandlerWrapper, Vote, and CVParameterSelection.

  • Rules: DecisionTable, JRip, OneR, PART, and ZeroR.

  • Misc: InputMappedClassifier.

The WEKA software's default settings are utilized for all classification models. WEKA (the Waikato Environment for Knowledge Analysis) is a renowned, frequently used data analysis tool (Rao et al., 2020). The outcome validation process uses 10-fold cross-validation to ensure reliable results.
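WEKA performs the 10-fold cross-validation internally, but the protocol is easy to illustrate. The following is a minimal pure-Python sketch of the splitting step only (the helper name `kfold_splits` is ours, not part of WEKA): each instance appears in exactly one test fold, and a model would be trained on the remaining nine folds.

```python
import random

def kfold_splits(n_instances, k=10, seed=1):
    """Split instance indices into k disjoint train/test folds,
    illustrating the 10-fold cross-validation protocol."""
    rng = random.Random(seed)
    indices = list(range(n_instances))
    rng.shuffle(indices)
    # Round-robin assignment yields folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits
```

Averaging a metric over the k test folds then gives the cross-validated estimate reported in the tables below.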

To compare the considered classification models, seven performance metrics were analyzed: accuracy, precision, recall, F-measure, MCC (Matthews correlation coefficient), ROC area, and PRC area. The definitions needed to calculate these metrics are given below.

Accuracy is a metric that indicates how frequently a machine learning model predicts the correct outcome. It is calculated as the number of correct predictions divided by the total number of predictions (Alzyoud et al., 2024; Alazaidah et al., 2023a,b).

Precision is a metric that indicates how often a machine learning model correctly predicts the positive class. Precision can be calculated as the number of correct positive predictions (true positives) divided by the total number of positive predictions made by the model (including true and false positives).

Recall is a metric that indicates how often a machine learning model accurately detects positive examples (true positives) from all actual positive samples in the dataset. Divide the number of true positives by the number of positive cases to determine recall. The latter includes true positives (correctly identified cases) and false negatives (missed cases) (Al-Batah et al., 2023; Pei et al., 2022).

MCC is the best single-value classification metric for summarizing a confusion or error matrix. A confusion matrix has four entities:

  • True positives (TP)

  • True negatives (TN)

  • False positives (FP)

  • False negatives (FN)

and is calculated by the formula:

MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

F-measure is an alternative machine learning evaluation metric that assesses the predictive skill of a model by elaborating on its class-wise performance rather than its overall performance, as done by accuracy. The F1 score combines two competing metrics—precision and recall—of a model, making it widely used in recent literature.

ROCArea: a metric that graphically assesses classifier performance across varying thresholds by plotting the false positive rate on the x-axis and the true positive rate on the y-axis.

True Positives (TPs): instances in which the model correctly identifies positive examples.

True Negatives (TNs): represent cases where the model correctly recognizes and labels negative examples.

False Positives (FPs): occur when the model mistakenly identifies examples as positive. In other words, these are instances where negative examples are mistakenly labeled as "positive."

False Negatives (FNs): arise when positive examples are incorrectly classified as negative. These are cases in which positive examples are incorrectly labeled as “negative.”
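All of the metrics above can be computed directly from the four confusion-matrix counts. The following illustrative Python sketch (the helper name `confusion_metrics` is ours) applies the standard formulas stated in this section:

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, F-measure, and MCC
    from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # true positives over all actual positives
    f_measure = 2 * precision * recall / (precision + recall)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure, "mcc": mcc}
```

For example, with tp=90, tn=80, fp=20, fn=10 (hypothetical counts), accuracy is 170/200 = 0.85 and recall is 90/100 = 0.9.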

3.1 Description of datasets

In this study, three datasets, all available for download from the UCI repository, are used. The first dataset, a binary classification set, contains 11,055 instances with 30 integer features, most of which are binary. The second dataset supports multi-class classification with three class labels and provides 18 integer-type features and 10,000 examples; the third dataset has two class labels, 13 integer-type features, and 2,670 instances. Table 1 presents the distinguishing qualities of the three sets for quick reference. This research focuses primarily on the first two datasets, which are the largest, while the third dataset is relatively small with only two classes.

Table 1

| Name | Instances | Features | No. of classes | Feature type | References |
| DS1 | 11,055 | 30 | 3 | Integer | Su et al., 2023 |
| DS2 | 10,000 | 18 | 3 | Integer | Alluwaici M. A. et al., 2020 |
| DS3 | 2,670 | 13 | 2 | Integer | Mohammad et al., 2015 |

Dataset characteristics.

This step focused on collecting the datasets and understanding their attributes. Three datasets, denoted DS1 (Su et al., 2023), DS2 (Alluwaici M. A. et al., 2020), and DS3 (Mohammad et al., 2015), were selected, as they have different numbers of features and only some features in common. Table 2 summarizes the feature categories across the datasets. DS1 and DS3 contain both internal features (i.e., derived from webpage URLs and the HTML/JavaScript source code available on the webpage itself) and external features (i.e., obtained by querying third-party services such as DNS, search engines, and WHOIS records), whereas DS2 contains only internal features (Mohammad et al., 2015).

Table 2

| Dataset code | Feature category | Feature examples |
| DS1 | URL based | having_IP_Address, URL_Length, HTTPS_token, etc. |
| | Abnormal based | Request_URL, URL_of_Anchor, Links_in_tags, etc. |
| | HTML/JS based | Redirect, on_mouseover, RightClick, popUpWindow, etc. |
| | Domain based | DNSRecord, web_traffic, Page_Rank, Google_Index, etc. |
| DS2 | URL based | NumDots, UrlLength, AtSymbol, etc. |
| | Abnormal based | AbnormalExtFormAction, ExtMetaScriptLinkRT, etc. |
| | HTML/JS based | RightClickDisabled, ExtFavicon, PopUpWindow, etc. |

Categories of features for the two datasets.

3.2 Data preparation

Data preprocessing involves operations such as handling missing values, removing outliers, and eliminating redundant information. As stated in reference (Alazaidah et al., 2023a), the DS1, DS2, and DS3 datasets were free of missing data but required cleaning before use. For instance, the HttpsInHostname attribute in DS3 had all values set to 0, making it unnecessary for analysis.

To identify common attributes across these datasets (DS1-DS2-DS3), the authors checked their descriptions available in references (Mohammad et al., 2015) and (Alzyoud et al., 2024). The authors' citations for each dataset feature significantly simplified this preprocessing step.

It was noted that some feature pairs captured similar information expressed in different formats, such as UrlLength, which is numeric, and its counterpart, "UrlLengthRT," which is categorical. Such pairs were mapped to a single variable, URL_Length, found in dataset DS1; otherwise, the features remained separate. Ultimately, after scrutinizing these details across variables, a match was found among 18 key attributes across the three aforementioned sources (as shown in Table 3).

Table 3

| DS1 | DS2 | DS1-2-3 |
| having_IP_Address | IpAddress | IP_Address |
| having_Sub_Domain | SubdomainLevel* | Sub_Domain |
| Links_pointing_to_page | PctExtHyperlinks* | Links_to_page |
| Submitting_to_email | SubmitInfoToEmail | Submitting_to_email |
| double_slash_redirecting | DoubleSlashInPath | double_redirecting |
| URL_Length | UrlLength* | URL_Length |
| Favicon | ExtFavicon | Favicon |
| Prefix_Suffix | NumDashInHostname* | Prefix_Suffix |
| SFH | AbnormalFormAction | SFH |
| Iframe | IframeOrFrame | Iframe |
| having_At_Symbol | AtSymbol | At_Symbol |
| SSLfinal_State | NoHttps | SSLfinal_State |
| on_mouseover | FakeLinkInStatusBar | on_mouseover |
| URL_of_Anchor | PctNullSelfRedirectHyperlinks* | URL_of_Anchor |
| popUpWidnow | PopUpWindow | popUpWindow |
| Request_URL | PctExtResourceUrls* | Request_URL |
| RightClick | RightClickDisabled | Right_Click |
| Links_in_tags | ExtMetaScriptLinkRT* | Links_tags |

The matched features between the DS1, DS2, and DS3 datasets, with the features after feature selection.

* indicates numeric features, √ indicates selected features.

3.3 Feature selection

The significance of independent features was assessed using P-values, with a threshold of 0.05 to identify statistically significant features.

To begin with, the Spearman rank-order correlation method assessed collinearity between feature pairs. Figure 2 shows the correlation matrix for the DS1-2-3 matched features; the popUpWindow and on_mouseover pair had the highest observed value at 0.73, followed by the popUpWindow and Favicon pair at 0.66. Most pairs showed small or negligible correlations.

Figure 2
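Spearman's rank-order correlation is simply the Pearson correlation computed on ranks, which makes it robust to monotone but nonlinear relationships. As an illustration, here is a pure-Python sketch (the helper names `_ranks` and `spearman` are ours, not the authors' code; in practice `scipy.stats.spearmanr` does the same):

```python
import math

def _ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

A perfectly monotone increasing pair yields +1, a monotone decreasing pair yields -1, and unrelated features fall near 0.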

To identify multicollinearity, where a variable can be predicted from a combination of the others even when no two variables are highly correlated pairwise, variance inflation factor (VIF) scores were used (Ubing et al., 2019).

Each feature's VIF score was calculated as follows:

VIF_i = 1 / (1 − R_i²)

where R_i² is the unadjusted coefficient of determination obtained by regressing the ith independent variable on the remaining ones.

Based on VIF analysis, in addition to p-values, the combined DS1-2-3 data identified 15 features as noteworthy and independent.

This process used various Python packages, including statsmodels to calculate VIF scores and p-values, scikit-learn to build logistic regression models, and Matplotlib and Seaborn to generate visualizations.
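As a hedged illustration of the VIF computation described above (statsmodels provides a ready-made `variance_inflation_factor`, but a direct least-squares version makes the formula explicit; the function name `vif_scores` is ours):

```python
import numpy as np

def vif_scores(X):
    """Variance inflation factors: VIF_i = 1 / (1 - R_i^2), where R_i^2
    comes from regressing column i on the remaining columns (with intercept)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    scores = []
    for i in range(k):
        y = X[:, i]
        others = np.delete(X, i, axis=1)
        A = np.column_stack([np.ones(n), others])  # add intercept term
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        residuals = y - A @ beta
        r2 = 1.0 - (residuals @ residuals) / ((y - y.mean()) ** 2).sum()
        scores.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return scores
```

A feature independent of the rest yields a VIF near 1, while a near-duplicate of another feature yields a very large VIF, flagging it for removal.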

For the feature selection and ranking step, four techniques have been considered and evaluated. The first technique is the Correlation Attribute Evaluator (CAE), which measures the linear correlation between each input feature and the output feature (class), usually implemented using Pearson's correlation coefficient. The second technique is the Gain Ratio Attribute Evaluator (GRAE), which assesses feature significance by measuring each feature's gain ratio relative to the class label. The third technique is the Information Gain Attribute Evaluator (IGAE), which measures a feature's worth based on the information gain of that feature with respect to the class label. The last technique is Principal Components Analysis (PCA), which aims to reduce data dimensionality by transforming a large dataset into a smaller one with low-correlated features.
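The entropy-based measures behind GRAE and IGAE can be sketched in a few lines. The following pure-Python example (function names are ours, not WEKA's) computes information gain and gain ratio for a categorical feature relative to the class label:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(labels) in bits."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_gain(feature, labels):
    """IG(class, feature) = H(class) - H(class | feature)."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def gain_ratio(feature, labels):
    """Information gain normalized by the feature's own entropy (split info)."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info else 0.0
```

A feature that perfectly predicts the class recovers the full class entropy as gain, while an irrelevant feature scores zero; ranking features by either score mirrors what IGAE and GRAE do.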

4 Comparative analysis amongst the classification models in the domain of website phishing

This section describes the process of determining the ideal classification model for phishing datasets. To attain this objective, three distinct phishing datasets, all easily obtainable from the UCI repository, have been analyzed in detail; their main attributes were outlined in Table 1. Table 4 presents the results for dataset DS1.

Table 4

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
| Tree | Random tree | 90.502 | 0.905 | 0.905 | 0.905 | 0.807 | 0.965 | 0.961 |
| | Random forest | 90.664 | 0.907 | 0.907 | 0.906 | 0.811 | 0.973 | 0.974 |
| | REPTree | 89.561 | 0.897 | 0.896 | 0.895 | 0.789 | 0.961 | 0.962 |
| | DecisionStump | 84.730 | 0.877 | 0.847 | 0.841 | 0.714 | 0.823 | 0.810 |
| | HoeffdingTree | 88.801 | 0.890 | 0.888 | 0.887 | 0.774 | 0.937 | 0.939 |
| | LMT | 90.610 | 0.906 | 0.906 | 0.906 | 0.810 | 0.971 | 0.971 |
| | J48 | 90.031 | 0.901 | 0.900 | 0.900 | 0.798 | 0.960 | 0.958 |
| | Avg | 89.271 | 0.897 | 0.892 | 0.891 | 0.786 | 0.941 | 0.939 |
| Bayes | BayesNet | 87.535 | 0.876 | 0.875 | 0.875 | 0.747 | 0.947 | 0.951 |
| | NaiveBayes | 87.535 | 0.876 | 0.875 | 0.875 | 0.747 | 0.947 | 0.951 |
| | NaiveBayesUpdateable | 55.694 | 0.557 | 1.000 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 76.921 | 0.767 | 0.916 | 0.821 | 0.664 | 0.798 | 0.802 |
| Functions | Logistic | 88.647 | 0.888 | 0.886 | 0.886 | 0.771 | 0.954 | 0.956 |
| | SGD | 88.738 | 0.889 | 0.887 | 0.887 | 0.772 | 0.882 | 0.842 |
| | SimpleLogistic | 88.629 | 0.889 | 0.886 | 0.885 | 0.771 | 0.953 | 0.956 |
| | SMO | 88.955 | 0.891 | 0.890 | 0.889 | 0.777 | 0.883 | 0.845 |
| | VotedPerceptron | 88.358 | 0.886 | 0.884 | 0.883 | 0.765 | 0.880 | 0.840 |
| | Avg | 88.666 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 90.755 | 0.908 | 0.908 | 0.907 | 0.812 | 0.973 | 0.973 |
| | KStar | 90.393 | 0.905 | 0.904 | 0.903 | 0.806 | 0.970 | 0.972 |
| | LWL | 84.730 | 0.877 | 0.847 | 0.841 | 0.714 | 0.945 | 0.947 |
| | Avg | 88.652 | 0.891 | 0.886 | 0.885 | 0.773 | 0.927 | 0.913 |
| Meta | AdaBoostM1 | 87.435 | 0.876 | 0.874 | 0.873 | 0.746 | 0.938 | 0.941 |
| | AttributeSelectedClassifier | 87.363 | 0.876 | 0.874 | 0.873 | 0.745 | 0.935 | 0.936 |
| | Bagging | 89.977 | 0.901 | 0.900 | 0.899 | 0.797 | 0.967 | 0.969 |
| | ClassificationViaRegression | 89.036 | 0.892 | 0.890 | 0.890 | 0.778 | 0.959 | 0.961 |
| | FilteredClassifier | 90.031 | 0.901 | 0.900 | 0.900 | 0.798 | 0.960 | 0.958 |
| | IterativeClassifierOptimizer | 87.806 | 0.880 | 0.878 | 0.877 | 0.754 | 0.948 | 0.951 |
| | LogitBoost | 87.806 | 0.880 | 0.878 | 0.877 | 0.754 | 0.948 | 0.951 |
| | MultiClassClassifier | 88.647 | 0.888 | 0.886 | 0.886 | 0.771 | 0.954 | 0.956 |
| | MultiClassClassifierUpdateable | 88.738 | 0.889 | 0.887 | 0.887 | 0.772 | 0.882 | 0.842 |
| | RandomCommittee | 90.755 | 0.908 | 0.908 | 0.907 | 0.812 | 0.971 | 0.969 |
| | RandomizableFilteredClassifier | 90.230 | 0.902 | 0.902 | 0.902 | 0.802 | 0.966 | 0.966 |
| | RandomSubSpace | 89.027 | 0.893 | 0.890 | 0.889 | 0.779 | 0.957 | 0.959 |
| | Stacking | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | WeightedInstancesHandlerWrapper | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Vote | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | CVParameterSelection | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 80.601 | 0.807 | 0.805 | 0.845 | 0.706 | 0.836 | 0.836 |
| Rules | DecisionTable | 88.177 | 0.883 | 0.882 | 0.881 | 0.760 | 0.950 | 0.952 |
| | JRip | 89.271 | 0.895 | 0.893 | 0.892 | 0.784 | 0.904 | 0.890 |
| | OneR | 84.730 | 0.877 | 0.847 | 0.841 | 0.714 | 0.828 | 0.794 |
| | PART | 90.375 | 0.904 | 0.904 | 0.903 | 0.805 | 0.967 | 0.966 |
| | ZeroR | 55.694 | 0.557 | 0.557 | 0.715 | 0.506 | 0.500 | 0.506 |
| | Avg | 81.644 | 0.823 | 0.816 | 0.846 | 0.713 | 0.829 | 0.821 |
| Misc | InputMappedClassifier | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via CAE, on dataset DS1.

The results of using 40 classifiers on the first phishing website dataset (DS1) are presented in Table 4 and analyzed with respect to the accuracy and precision metrics. The data reveal that IBK achieves the highest accuracy, while RandomCommittee achieves equally outstanding accuracy and precision.

Evaluating learning strategies indicates that Lazy achieves optimal accuracy, while RandomCommittee yields superior precision.

The Recall and MCC metric results for the phishing website dataset after applying 40 classifiers are outlined in Table 5. The table shows that random forest classification models have produced superior results when evaluated against these criteria.

Table 5

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
| Tree | Random tree | 95.911 | 0.959 | 0.959 | 0.959 | 0.917 | 0.978 | 0.969 |
| | Random forest | 96.436 | 0.964 | 0.964 | 0.964 | 0.928 | 0.993 | 0.993 |
| | REPTree | 94.898 | 0.949 | 0.949 | 0.949 | 0.897 | 0.984 | 0.981 |
| | DecisionStump | 88.891 | 0.889 | 0.889 | 0.889 | 0.774 | 0.882 | 0.854 |
| | HoeffdingTree | 94.002 | 0.940 | 0.940 | 0.940 | 0.878 | 0.983 | 0.983 |
| | LMT | 95.766 | 0.958 | 0.958 | 0.958 | 0.914 | 0.989 | 0.988 |
| | J48 | 95.450 | 0.955 | 0.955 | 0.954 | 0.908 | 0.981 | 0.977 |
| | Avg | 94.479 | 0.944 | 0.944 | 0.944 | 0.888 | 0.970 | 0.963 |
| Bayes | BayesNet | 92.772 | 0.928 | 0.928 | 0.928 | 0.853 | 0.981 | 0.982 |
| | NaiveBayes | 92.772 | 0.928 | 0.928 | 0.928 | 0.853 | 0.981 | 0.982 |
| | NaiveBayesUpdateable | 92.772 | 0.928 | 0.928 | 0.928 | 0.853 | 0.981 | 0.982 |
| | Avg | 92.772 | 0.928 | 0.928 | 0.928 | 0.853 | 0.981 | 0.982 |
| Functions | Logistic | 93.369 | 0.934 | 0.934 | 0.934 | 0.866 | 0.985 | 0.986 |
| | SGD | 93.306 | 0.933 | 0.933 | 0.933 | 0.864 | 0.931 | 0.904 |
| | SimpleLogistic | 93.306 | 0.933 | 0.933 | 0.933 | 0.864 | 0.985 | 0.986 |
| | SMO | 93.315 | 0.933 | 0.933 | 0.933 | 0.864 | 0.931 | 0.904 |
| | VotedPerceptron | 93.288 | 0.933 | 0.933 | 0.933 | 0.864 | 0.932 | 0.904 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 96.119 | 0.961 | 0.961 | 0.961 | 0.921 | 0.987 | 0.986 |
| | KStar | 96.128 | 0.962 | 0.961 | 0.961 | 0.922 | 0.995 | 0.995 |
| | LWL | 88.991 | 0.890 | 0.890 | 0.890 | 0.777 | 0.975 | 0.976 |
| | Avg | 88.652 | 0.891 | 0.881 | 0.885 | 0.773 | 0.927 | 0.913 |
| Meta | AdaBoostM1 | 92.582 | 0.926 | 0.926 | 0.926 | 0.850 | 0.981 | 0.982 |
| | AttributeSelectedClassifier | 94.400 | 0.944 | 0.944 | 0.944 | 0.886 | 0.980 | 0.978 |
| | Bagging | 95.486 | 0.955 | 0.955 | 0.955 | 0.908 | 0.990 | 0.990 |
| | ClassificationViaRegression | 94.536 | 0.945 | 0.945 | 0.945 | 0.889 | 0.988 | 0.988 |
| | FilteredClassifier | 95.450 | 0.955 | 0.955 | 0.954 | 0.908 | 0.981 | 0.977 |
| | IterativeClassifierOptimizer | 92.736 | 0.927 | 0.927 | 0.927 | 0.853 | 0.981 | 0.982 |
| | LogitBoost | 92.736 | 0.927 | 0.927 | 0.927 | 0.853 | 0.981 | 0.982 |
| | MultiClassClassifier | 93.369 | 0.934 | 0.934 | 0.934 | 0.866 | 0.985 | 0.986 |
| | MultiClassClassifierUpdateable | 93.306 | 0.933 | 0.933 | 0.933 | 0.864 | 0.931 | 0.904 |
| | RandomCommittee | 96.408 | 0.964 | 0.964 | 0.964 | 0.927 | 0.989 | 0.985 |
| | RandomizableFilteredClassifier | 94.292 | 0.943 | 0.943 | 0.943 | 0.884 | 0.969 | 0.964 |
| | RandomSubSpace | 93.414 | 0.935 | 0.934 | 0.934 | 0.867 | 0.984 | 0.985 |
| | Stacking | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | WeightedInstancesHandlerWrapper | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Vote | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.507 | 0.506 |
| | CVParameterSelection | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 84.467 | 0.844 | 0.844 | 0.885 | 0.784 | 0.859 | 0.857 |
| Rules | DecisionTable | 92.863 | 0.929 | 0.929 | 0.929 | 0.855 | 0.979 | 0.980 |
| | JRip | 94.753 | 0.948 | 0.948 | 0.947 | 0.894 | 0.960 | 0.953 |
| | OneR | 88.891 | 0.889 | 0.889 | 0.889 | 0.774 | 0.886 | 0.845 |
| | PART | 95.585 | 0.956 | 0.956 | 0.956 | 0.911 | 0.985 | 0.966 |
| | ZeroR | 55.694 | 0.557 | 0.557 | 0.715 | 0.506 | 0.511 | 0.506 |
| | Avg | 85.557 | 0.855 | 0.855 | 0.887 | 0.788 | 0.864 | 0.850 |
| Misc | InputMappedClassifier | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.506 | 0.506 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.506 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via CAE, on dataset DS1.

Additionally, Tree outperforms other learning strategies on both precision and MCC metrics in this dataset (DS1).

A comparative analysis of 40 classifiers on the phishing dataset, in terms of accuracy and precision, is presented in Table 5.

Random forest outperforms the other considered classifiers in accuracy and precision on the phishing dataset (DS1), as shown in the table.

Moreover, among the six learning strategies assessed using these two measures, the Trees strategy yields better outcomes than its counterparts.

The precision metrics obtained from applying the 40 classifiers to the phishing dataset are shown in Table 6. According to the table, the RandomCommittee classifier achieves the highest precision among all classification models. Similarly, Table 6 shows that, within the Trees learning strategy, the Random Forest classification model delivers superior outcomes.

Table 6

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 95.640 | 0.956 | 0.956 | 0.956 | 0.912 | 0.981 | 0.974 |
| | Random forest | 96.191 | 0.962 | 0.962 | 0.962 | 0.923 | 0.992 | 0.992 |
| | REPTree | 94.744 | 0.947 | 0.947 | 0.947 | 0.893 | 0.984 | 0.982 |
| | DecisionStump | 88.891 | 0.889 | 0.889 | 0.889 | 0.774 | 0.882 | 0.854 |
| | HoeffdingTree | 93.903 | 0.939 | 0.939 | 0.939 | 0.876 | 0.983 | 0.984 |
| | LMT | 95.676 | 0.957 | 0.957 | 0.957 | 0.912 | 0.988 | 0.986 |
| | J48 | 95.106 | 0.951 | 0.951 | 0.951 | 0.901 | 0.983 | 0.980 |
| | Avg | 94.307 | 0.943 | 0.943 | 0.943 | 0.884 | 0.970 | 0.964 |
| Bayes | BayesNet | 92.636 | 0.927 | 0.926 | 0.926 | 0.851 | 0.980 | 0.981 |
| | NaiveBayes | 92.645 | 0.927 | 0.926 | 0.926 | 0.851 | 0.980 | 0.981 |
| | NaiveBayesUpdateable | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 80.327 | 0.803 | 0.803 | 0.855 | 0.734 | 0.820 | 0.822 |
| Functions | Logistic | 93.378 | 0.934 | 0.934 | 0.934 | 0.866 | 0.985 | 0.986 |
| | SGD | 93.514 | 0.935 | 0.935 | 0.935 | 0.868 | 0.933 | 0.906 |
| | SimpleLogistic | 93.432 | 0.934 | 0.934 | 0.934 | 0.867 | 0.985 | 0.985 |
| | SMO | 93.523 | 0.935 | 0.935 | 0.935 | 0.869 | 0.933 | 0.907 |
| | VotedPerceptron | 93.360 | 0.934 | 0.934 | 0.934 | 0.865 | 0.933 | 0.906 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 95.730 | 0.957 | 0.957 | 0.957 | 0.913 | 0.988 | 0.987 |
| | Kstar | 95.649 | 0.957 | 0.956 | 0.956 | 0.912 | 0.994 | 0.994 |
| | LWL | 89.018 | 0.890 | 0.890 | 0.890 | 0.777 | 0.974 | 0.974 |
| | Avg | 93.465 | 0.934 | 0.934 | 0.934 | 0.867 | 0.985 | 0.985 |
| Meta | AdaBoostM1 | 92.582 | 0.926 | 0.926 | 0.926 | 0.850 | 0.981 | 0.982 |
| | AttributeSelectedClassifier | 94.310 | 0.943 | 0.943 | 0.943 | 0.885 | 0.979 | 0.977 |
| | Bagging | 95.386 | 0.954 | 0.954 | 0.954 | 0.906 | 0.990 | 0.990 |
| | ClassificationViaRegression | 94.635 | 0.946 | 0.946 | 0.946 | 0.891 | 0.989 | 0.989 |
| | FilteredClassifier | 95.106 | 0.951 | 0.951 | 0.951 | 0.901 | 0.983 | 0.980 |
| | IterativeClassifierOptimizer | 92.736 | 0.927 | 0.927 | 0.927 | 0.853 | 0.981 | 0.982 |
| | LogitBoost | 92.736 | 0.927 | 0.927 | 0.927 | 0.853 | 0.981 | 0.982 |
| | MultiClassClassifier | 93.378 | 0.934 | 0.934 | 0.934 | 0.866 | 0.985 | 0.986 |
| | MultiClassClassifierUpdateable | 93.514 | 0.935 | 0.935 | 0.935 | 0.868 | 0.933 | 0.906 |
| | RandomCommittee | 96.408 | 0.964 | 0.964 | 0.964 | 0.927 | 0.989 | 0.985 |
| | RandomizableFilteredClassifier | 94.771 | 0.948 | 0.948 | 0.948 | 0.894 | 0.975 | 0.971 |
| | RandomSubSpace | 93.450 | 0.935 | 0.935 | 0.934 | 0.867 | 0.983 | 0.984 |
| | Stacking | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | WeightedInstancesHandlerWrapper | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Vote | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.507 | 0.506 |
| | CVParameterSelection | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 84.487 | 0.844 | 0.844 | 0.884 | 0.785 | 0.859 | 0.858 |
| Rules | DecisionTable | 92.971 | 0.930 | 0.930 | 0.930 | 0.858 | 0.978 | 0.978 |
| | JRip | 94.563 | 0.946 | 0.946 | 0.946 | 0.890 | 0.959 | 0.952 |
| | OneR | 88.891 | 0.889 | 0.889 | 0.889 | 0.774 | 0.886 | 0.845 |
| | PART | 95.468 | 0.955 | 0.955 | 0.955 | 0.908 | 0.987 | 0.984 |
| | ZeroR | 55.694 | 0.557 | 0.557 | 0.715 | 0.506 | 0.511 | 0.506 |
| | Avg | 85.517 | 0.855 | 0.855 | 0.887 | 0.787 | 0.864 | 0.853 |
| Misc | InputMappedClassifier | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.506 | 0.506 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.506 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via GRAE, on dataset DS1.

In conclusion, with respect to the precision values shown in Table 6, the Tree learning strategy is the preferred approach, yielding the best results among the available strategies.
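GainRatioAttributeEval, one of the two top-performing feature selectors in this study, ranks an attribute by its information gain normalized by the attribute's own entropy (split information). A small illustrative sketch with toy data (not the study's datasets; `gain_ratio` is a name chosen here for illustration):

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy of a list of discrete values, in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` w.r.t. `labels`, divided by the
    feature's split information (the normalization GainRatioAttributeEval adds)."""
    n = len(labels)
    cond = 0.0
    for v in set(feature):                       # H(class | feature)
        subset = [l for f, l in zip(feature, labels) if f == v]
        cond += len(subset) / n * entropy(subset)
    info_gain = entropy(labels) - cond
    split_info = entropy(feature)
    return info_gain / split_info if split_info else 0.0

# Toy example: a binary feature that perfectly separates the two classes
feature = [1, 1, 0, 0]
labels  = ["phish", "phish", "legit", "legit"]
print(gain_ratio(feature, labels))  # → 1.0
```

Dropping the `split_info` division leaves plain information gain, i.e., the score InfoGainAttributeEval assigns; the normalization exists to avoid favoring many-valued attributes.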

In Table 7, the Random Forest classification model achieves the best recall and MCC results on the phishing dataset (DS1). Random Forest belongs to the Tree learning strategy.

Table 7

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 95.649 | 0.956 | 0.956 | 0.956 | 0.912 | 0.978 | 0.969 |
| | Random forest | 96.255 | 0.963 | 0.963 | 0.963 | 0.924 | 0.992 | 0.991 |
| | REPTree | 94.853 | 0.949 | 0.949 | 0.949 | 0.896 | 0.983 | 0.980 |
| | DecisionStump | 88.891 | 0.889 | 0.889 | 0.889 | 0.774 | 0.882 | 0.854 |
| | HoeffdingTree | 93.930 | 0.939 | 0.939 | 0.939 | 0.877 | 0.983 | 0.983 |
| | LMT | 95.829 | 0.958 | 0.958 | 0.958 | 0.915 | 0.989 | 0.988 |
| | J48 | 95.630 | 0.956 | 0.956 | 0.956 | 0.911 | 0.985 | 0.982 |
| | Avg | 94.434 | 0.944 | 0.944 | 0.944 | 0.887 | 0.970 | 0.963 |
| Bayes | BayesNet | 92.781 | 0.928 | 0.928 | 0.928 | 0.854 | 0.981 | 0.982 |
| | NaiveBayes | 92.781 | 0.928 | 0.928 | 0.928 | 0.854 | 0.981 | 0.982 |
| | NaiveBayesUpdateable | 55.694 | 0.559 | 1.000 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 80.419 | 0.805 | 0.952 | 0.857 | 0.736 | 0.820 | 0.823 |
| Functions | Logistic | 93.387 | 0.934 | 0.934 | 0.934 | 0.866 | 0.985 | 0.986 |
| | SGD | 93.351 | 0.932 | 0.934 | 0.933 | 0.865 | 0.932 | 0.904 |
| | SimpleLogistic | 93.351 | 0.934 | 0.934 | 0.933 | 0.865 | 0.985 | 0.986 |
| | SMO | 93.324 | 0.933 | 0.933 | 0.933 | 0.865 | 0.931 | 0.904 |
| | VotedPerceptron | 93.333 | 0.933 | 0.933 | 0.933 | 0.865 | 0.932 | 0.905 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 95.829 | 0.958 | 0.958 | 0.958 | 0.915 | 0.988 | 0.986 |
| | Kstar | 95.983 | 0.960 | 0.960 | 0.960 | 0.919 | 0.994 | 0.994 |
| | LWL | 88.973 | 0.890 | 0.890 | 0.890 | 0.776 | 0.975 | 0.975 |
| | Avg | 88.652 | 0.891 | 0.886 | 0.885 | 0.773 | 0.927 | 0.912 |
| Meta | AdaBoostM1 | 92.582 | 0.926 | 0.926 | 0.926 | 0.850 | 0.981 | 0.982 |
| | AttributeSelectedClassifier | 94.400 | 0.944 | 0.944 | 0.944 | 0.886 | 0.980 | 0.978 |
| | Bagging | 95.404 | 0.954 | 0.954 | 0.954 | 0.907 | 0.990 | 0.990 |
| | ClassificationViaRegression | 94.436 | 0.944 | 0.944 | 0.944 | 0.887 | 0.988 | 0.988 |
| | FilteredClassifier | 95.630 | 0.956 | 0.956 | 0.956 | 0.911 | 0.985 | 0.982 |
| | IterativeClassifierOptimizer | 92.736 | 0.927 | 0.927 | 0.927 | 0.853 | 0.981 | 0.982 |
| | LogitBoost | 92.736 | 0.927 | 0.927 | 0.927 | 0.853 | 0.981 | 0.982 |
| | MultiClassClassifier | 93.387 | 0.934 | 0.934 | 0.934 | 0.866 | 0.985 | 0.986 |
| | MultiClassClassifierUpdateable | 93.351 | 0.934 | 0.934 | 0.933 | 0.865 | 0.932 | 0.904 |
| | RandomCommittee | 90.755 | 0.908 | 0.908 | 0.907 | 0.812 | 0.971 | 0.969 |
| | RandomizableFilteredClassifier | 90.230 | 0.902 | 0.902 | 0.902 | 0.802 | 0.966 | 0.966 |
| | RandomSubSpace | 93.984 | 0.940 | 0.940 | 0.940 | 0.878 | 0.986 | 0.986 |
| | Stacking | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | WeightedInstancesHandlerWrapper | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Vote | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.509 | 0.506 |
| | CVParameterSelection | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 83.901 | 0.839 | 0.839 | 0.878 | 0.773 | 0.858 | 0.857 |
| Rules | DecisionTable | 92.999 | 0.930 | 0.930 | 0.930 | 0.858 | 0.981 | 0.981 |
| | JRip | 94.527 | 0.945 | 0.945 | 0.945 | 0.889 | 0.960 | 0.953 |
| | OneR | 88.892 | 0.889 | 0.889 | 0.889 | 0.774 | 0.886 | 0.845 |
| | PART | 95.459 | 0.955 | 0.955 | 0.955 | 0.908 | 0.986 | 0.983 |
| | ZeroR | 55.694 | 0.557 | 0.557 | 0.715 | 0.506 | 0.500 | 0.506 |
| | Avg | 81.650 | 0.823 | 0.817 | 0.846 | 0.714 | 0.830 | 0.822 |
| Misc | InputMappedClassifier | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via IGAE, on dataset DS1.

Moreover, regarding the best learning strategy, Table 7 shows that the tree learning strategy achieves the best results for the recall and MCC metrics.

According to Table 8, Random Forest, a classifier in the Tree learning strategy, attains the highest precision. The same classifier also performs best on the accuracy metric among the compared models. Furthermore, among the seven learning strategies considered, Tree achieves the best overall results.

Table 8

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 94.219 | 0.942 | 0.942 | 0.942 | 0.883 | 0.982 | 0.978 |
| | Random forest | 94.473 | 0.945 | 0.945 | 0.945 | 0.888 | 0.987 | 0.986 |
| | REPTree | 93.523 | 0.936 | 0.935 | 0.935 | 0.869 | 0.977 | 0.976 |
| | DecisionStump | 88.891 | 0.889 | 0.889 | 0.889 | 0.774 | 0.882 | 0.854 |
| | HoeffdingTree | 93.062 | 0.932 | 0.931 | 0.930 | 0.860 | 0.967 | 0.965 |
| | LMT | 94.373 | 0.944 | 0.944 | 0.944 | 0.886 | 0.986 | 0.986 |
| | J48 | 93.794 | 0.939 | 0.938 | 0.938 | 0.875 | 0.975 | 0.974 |
| | Avg | 93.191 | 0.932 | 0.932 | 0.931 | 0.862 | 0.965 | 0.959 |
| Bayes | BayesNet | 92.356 | 0.924 | 0.924 | 0.923 | 0.845 | 0.972 | 0.974 |
| | NaiveBayes | 92.365 | 0.924 | 0.924 | 0.923 | 0.845 | 0.972 | 0.974 |
| | NaiveBayesUpdateable | 92.365 | 0.924 | 0.924 | 0.923 | 0.845 | 0.972 | 0.974 |
| | Avg | 92.362 | 0.924 | 0.924 | 0.923 | 0.845 | 0.972 | 0.974 |
| Functions | Logistic | 92.682 | 0.927 | 0.927 | 0.927 | 0.852 | 0.976 | 0.977 |
| | SGD | 91.705 | 0.917 | 0.917 | 0.917 | 0.832 | 0.916 | 0.882 |
| | SimpleLogistic | 92.645 | 0.927 | 0.926 | 0.926 | 0.851 | 0.976 | 0.977 |
| | SMO | 91.714 | 0.917 | 0.917 | 0.917 | 0.832 | 0.916 | 0.882 |
| | VotedPerceptron | 92.555 | 0.926 | 0.926 | 0.925 | 0.849 | 0.924 | 0.894 |
| | Avg | 88.665 | 0.888 | 0.887 | 0.886 | 0.771 | 0.910 | 0.888 |
| Lazy | IBK | 94.237 | 0.943 | 0.942 | 0.942 | 0.883 | 0.986 | 0.985 |
| | Kstar | 94.165 | 0.942 | 0.942 | 0.941 | 0.882 | 0.986 | 0.986 |
| | LWL | 88.991 | 0.890 | 0.890 | 0.890 | 0.777 | 0.966 | 0.967 |
| | Avg | 88.652 | 0.891 | 0.886 | 0.885 | 0.773 | 0.927 | 0.913 |
| Meta | AdaBoostM1 | 92.166 | 0.922 | 0.922 | 0.922 | 0.841 | 0.973 | 0.974 |
| | AttributeSelectedClassifier | 92.935 | 0.931 | 0.929 | 0.929 | 0.858 | 0.961 | 0.960 |
| | Bagging | 93.830 | 0.939 | 0.938 | 0.938 | 0.875 | 0.982 | 0.983 |
| | ClassificationViaRegression | 93.188 | 0.932 | 0.932 | 0.932 | 0.862 | 0.980 | 0.981 |
| | FilteredClassifier | 93.794 | 0.939 | 0.938 | 0.938 | 0.875 | 0.975 | 0.974 |
| | IterativeClassifierOptimizer | 92.220 | 0.923 | 0.922 | 0.922 | 0.842 | 0.974 | 0.975 |
| | LogitBoost | 92.437 | 0.925 | 0.924 | 0.924 | 0.847 | 0.974 | 0.975 |
| | MultiClassClassifier | 92.682 | 0.927 | 0.927 | 0.927 | 0.852 | 0.976 | 0.977 |
| | MultiClassClassifierUpdateable | 93.830 | 0.938 | 0.938 | 0.938 | 0.875 | 0.981 | 0.980 |
| | RandomCommittee | 94.409 | 0.944 | 0.944 | 0.944 | 0.887 | 0.986 | 0.984 |
| | RandomizableFilteredClassifier | 90.230 | 0.902 | 0.902 | 0.902 | 0.802 | 0.966 | 0.966 |
| | RandomSubSpace | 92.691 | 0.927 | 0.927 | 0.927 | 0.852 | 0.973 | 0.974 |
| | Stacking | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | WeightedInstancesHandlerWrapper | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Vote | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.509 | 0.506 |
| | CVParameterSelection | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 83.574 | 0.836 | 0.835 | 0.875 | 0.766 | 0.856 | 0.857 |
| Rules | DecisionTable | 93.025 | 0.931 | 0.930 | 0.930 | 0.859 | 0.977 | 0.977 |
| | JRip | 93.306 | 0.934 | 0.933 | 0.933 | 0.864 | 0.945 | 0.936 |
| | OneR | 88.891 | 0.889 | 0.889 | 0.889 | 0.774 | 0.886 | 0.845 |
| | PART | 94.355 | 0.944 | 0.944 | 0.943 | 0.886 | 0.983 | 0.983 |
| | ZeroR | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 85.054 | 0.851 | 0.851 | 0.882 | 0.776 | 0.858 | 0.849 |
| Misc | InputMappedClassifier | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via PC, on dataset DS1.
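The PC ranker scores each attribute by its Pearson correlation with the class label (as in WEKA's CorrelationAttributeEval). A minimal pure-Python sketch, with an invented toy feature rather than the study's data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between a numeric feature and the (0/1) class label,
    i.e., the score a correlation-based attribute ranker assigns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy feature vs. class (illustrative only):
feature = [3, 2, 2, 1, 0]
label   = [1, 1, 1, 0, 0]
print(round(pearson(feature, label), 3))  # → 0.881
```

Because Pearson correlation only captures linear association with the label, it can rank attributes quite differently from the entropy-based evaluators, which is consistent with the gap observed between the PC tables and the GRAE/IGAE tables here.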

The outcomes of the 40 classifiers applied to the phishing website dataset, with respect to recall and MCC, are shown in Table 9.

Table 9

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 88.011 | 0.880 | 0.880 | 0.880 | 0.867 | 0.968 | 0.889 |
| | Random forest | 87.893 | 0.879 | 0.879 | 0.879 | 0.866 | 0.984 | 0.913 |
| | REPTree | 55.255 | 0.550 | 0.553 | 0.550 | 0.499 | 0.925 | 0.613 |
| | DecisionStump | 17.341 | 0.203 | 0.173 | 0.309 | 0.205 | 0.587 | 0.130 |
| | HoeffdingTree | 29.076 | 0.341 | 0.291 | 0.268 | 0.224 | 0.754 | 0.289 |
| | J48 | 60.422 | 0.602 | 0.604 | 0.603 | 0.558 | 0.946 | 0.715 |
| | Avg | 56.333 | 0.575 | 0.563 | 0.581 | 0.536 | 0.860 | 0.591 |
| Bayes | BayesNet | 72.783 | 0.732 | 0.728 | 0.729 | 0.699 | 0.975 | 0.822 |
| | NaiveBayes | 25.797 | 0.295 | 0.258 | 0.242 | 0.187 | 0.740 | 0.270 |
| | NaiveBayesUpdateable | 25.797 | 0.295 | 0.258 | 0.242 | 0.187 | 0.740 | 0.270 |
| | Avg | 41.459 | 0.440 | 0.414 | 0.404 | 0.357 | 0.818 | 0.454 |
| Functions | Logistic | 27.989 | 0.291 | 0.280 | 0.249 | 0.187 | 0.770 | 0.282 |
| | MultilayerPerceptron | 36.445 | 0.358 | 0.364 | 0.353 | 0.285 | 0.819 | 0.385 |
| | SimpleLogistic | 28.107 | 0.294 | 0.281 | 0.250 | 0.188 | 0.769 | 0.281 |
| | SMO | 29.360 | 0.335 | 0.294 | 0.265 | 0.210 | 0.745 | 0.243 |
| | Avg | 30.475 | 0.319 | 0.304 | 0.279 | 0.217 | 0.775 | 0.297 |
| Lazy | IBK | 87.717 | 0.877 | 0.877 | 0.877 | 0.863 | 0.950 | 0.859 |
| | Kstar | 62.781 | 0.642 | 0.628 | 0.625 | 0.590 | 0.941 | 0.698 |
| | LWL | 23.439 | 0.267 | 0.234 | 0.373 | 0.284 | 0.742 | 0.289 |
| | Avg | 57.979 | 0.595 | 0.579 | 0.625 | 0.579 | 0.877 | 0.615 |
| Meta | AdaBoostM1 | 17.341 | 0.203 | 0.173 | 0.309 | 0.205 | 0.587 | 0.130 |
| | AttributeSelectedClassifier | 65.404 | 0.667 | 0.654 | 0.647 | 0.615 | 0.966 | 0.775 |
| | Bagging | 64.425 | 0.642 | 0.644 | 0.642 | 0.602 | 0.953 | 0.718 |
| | ClassificationViaRegression | 56.292 | 0.563 | 0.563 | 0.556 | 0.511 | 0.926 | 0.627 |
| | FilteredClassifier | 74.623 | 0.745 | 0.746 | 0.744 | 0.716 | 0.977 | 0.863 |
| | IterativeClassifierOptimizer | 34.400 | 0.359 | 0.344 | 0.338 | 0.270 | 0.810 | 0.356 |
| | LogitBoost | 34.400 | 0.359 | 0.344 | 0.338 | 0.270 | 0.810 | 0.356 |
| | MultiClassClassifier | 27.128 | 0.272 | 0.271 | 0.236 | 0.173 | 0.765 | 0.276 |
| | MultiClassClassifierUpdateable | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | RandomCommittee | 87.874 | 0.878 | 0.879 | 0.878 | 0.865 | 0.980 | 0.936 |
| | RandomizableFilteredClassifier | 87.336 | 0.873 | 0.873 | 0.873 | 0.859 | 0.949 | 0.860 |
| | RandomSubSpace | 74.584 | 0.745 | 0.746 | 0.743 | 0.716 | 0.974 | 0.822 |
| | Stacking | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | WeightedInstancesHandlerWrapper | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Vote | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | CVParameterSelection | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 43.318 | 0.437 | 0.433 | 0.469 | 0.438 | 0.762 | 0.455 |
| Rules | DecisionTable | 65.551 | 0.667 | 0.656 | 0.648 | 0.616 | 0.966 | 0.748 |
| | JRip | 45.221 | 0.605 | 0.452 | 0.463 | 0.437 | 0.817 | 0.463 |
| | OneR | 63.897 | 0.646 | 0.639 | 0.625 | 0.594 | 0.798 | 0.448 |
| | PART | 59.776 | 0.597 | 0.598 | 0.597 | 0.551 | 0.945 | 0.708 |
| | ZeroR | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 49.661 | 0.531 | 0.497 | 0.515 | 0.488 | 0.805 | 0.494 |
| Misc | InputMappedClassifier | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |

Comparative analysis of 40 classifiers utilizing feature selection via CAE, on dataset DS2.

Analysis of Table 9 indicates that, among all classification models, the Random Tree classifier achieved the highest accuracy and precision on this dataset (DS2). Additionally, compared with the learning strategies represented by the remaining algorithms in Table 9, the Tree strategy outperformed the others.

The results from applying the 40 classifiers to the phishing website dataset (DS2), including the accuracy and precision metrics, are shown in Table 10.

The Random Forest model, a tree-based learning strategy, achieves higher accuracy and precision than other classification models, as shown in Table 10.

Table 10

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 91.886 | 0.919 | 0.919 | 0.919 | 0.910 | 0.971 | 0.903 |
| | Random forest | 92.170 | 0.922 | 0.922 | 0.922 | 0.913 | 0.991 | 0.947 |
| | REPTree | 57.349 | 0.571 | 0.573 | 0.571 | 0.522 | 0.931 | 0.635 |
| | DecisionStump | 18.183 | 0.140 | 0.182 | 0.243 | 0.135 | 0.596 | 0.149 |
| | HoeffdingTree | 27.402 | 0.232 | 0.274 | 0.233 | 0.167 | 0.720 | 0.265 |
| | J48 | 62.301 | 0.621 | 0.623 | 0.621 | 0.577 | 0.951 | 0.741 |
| | Avg | 58.215 | 0.567 | 0.582 | 0.584 | 0.537 | 0.860 | 0.606 |
| Bayes | BayesNet | 70.992 | 0.713 | 0.710 | 0.707 | 0.677 | 0.972 | 0.807 |
| | NaiveBayes | 27.216 | 0.230 | 0.272 | 0.231 | 0.164 | 0.720 | 0.265 |
| | NaiveBayesUpdateable | 27.216 | 0.230 | 0.272 | 0.231 | 0.164 | 0.720 | 0.265 |
| | Avg | 41.808 | 0.391 | 0.418 | 0.389 | 0.335 | 0.804 | 0.445 |
| Functions | Logistic | 26.590 | 0.145 | 0.266 | 0.155 | −0.008 | 0.716 | 0.245 |
| | SGD | 43.933 | 0.351 | 0.439 | 0.325 | 0.211 | 0.767 | 0.401 |
| | SimpleLogistic | 26.561 | 0.146 | 0.266 | 0.155 | 0.006 | 0.715 | 0.244 |
| | SMO | 29.477 | 0.185 | 0.295 | 0.060 | 0.004 | 0.708 | 0.217 |
| | Avg | 31.640 | 0.206 | 0.317 | 0.173 | 0.048 | 0.726 | 0.238 |
| Lazy | IBK | 91.661 | 0.917 | 0.917 | 0.917 | 0.907 | 0.954 | 0.875 |
| | Kstar | 53.210 | 0.543 | 0.532 | 0.517 | 0.474 | 0.914 | 0.569 |
| | LWL | 23.576 | 0.446 | 0.236 | 0.159 | 0.187 | 0.724 | 0.250 |
| | Avg | 56.149 | 0.635 | 0.561 | 0.531 | 0.522 | 0.864 | 0.564 |
| Meta | AdaBoostM1 | 18.183 | 0.140 | 0.182 | 0.243 | 0.135 | 0.596 | 0.149 |
| | AttributeSelectedClassifier | 65.404 | 0.667 | 0.654 | 0.647 | 0.615 | 0.966 | 0.775 |
| | Bagging | 68.036 | 0.677 | 0.680 | 0.678 | 0.641 | 0.961 | 0.754 |
| | ClassificationViaRegression | 58.896 | 0.583 | 0.589 | 0.581 | 0.535 | 0.934 | 0.643 |
| | FilteredClassifier | 73.106 | 0.730 | 0.731 | 0.728 | 0.699 | 0.975 | 0.848 |
| | IterativeClassifierOptimizer | 33.940 | 0.396 | 0.339 | 0.316 | 0.266 | 0.785 | 0.342 |
| | LogitBoost | 33.940 | 0.396 | 0.339 | 0.316 | 0.266 | 0.785 | 0.342 |
| | MultiClassClassifier | 25.944 | 0.142 | 0.259 | 0.152 | 0.053 | 0.715 | 0.239 |
| | MultiClassClassifierUpdateable | 13.857 | 0.139 | 0.139 | 0.243 | 0.499 | 0.499 | 0.105 |
| | RandomCommittee | 91.935 | 0.920 | 0.919 | 0.919 | 0.910 | 0.987 | 0.962 |
| | RandomizableFilteredClassifier | 90.781 | 0.908 | 0.908 | 0.908 | 0.897 | 0.957 | 0.881 |
| | RandomSubSpace | 76.874 | 0.770 | 0.769 | 0.767 | 0.742 | 0.978 | 0.849 |
| | Stacking | 13.857 | 0.139 | 0.139 | 0.243 | 0.139 | 0.499 | 0.105 |
| | WeightedInstancesHandlerWrapper | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |
| | Vote | 13.857 | 0.139 | 0.139 | 0.243 | 0.139 | 0.499 | 0.105 |
| | CVParameterSelection | 13.857 | 0.139 | 0.139 | 0.243 | 0.139 | 0.499 | 0.105 |
| | Avg | 46.760 | 0.465 | 0.467 | 0.496 | 0.448 | 0.758 | 0.470 |
| Rules | DecisionTable | 65.394 | 0.668 | 0.654 | 0.645 | 0.614 | 0.967 | 0.755 |
| | JRip | 44.969 | 0.625 | 0.450 | 0.460 | 0.439 | 0.808 | 0.453 |
| | OneR | 63.897 | 0.646 | 0.639 | 0.625 | 0.594 | 0.798 | 0.448 |
| | PART | 62.771 | 0.625 | 0.628 | 0.625 | 0.582 | 0.950 | 0.738 |
| | ZeroR | 13.857 | 0.139 | 0.139 | 0.243 | 0.139 | 0.499 | 0.105 |
| | Avg | 50.178 | 0.540 | 0.502 | 0.519 | 0.476 | 0.804 | 0.499 |
| Misc | InputMappedClassifier | 13.857 | 0.139 | 0.139 | 0.243 | 0.139 | 0.499 | 0.105 |
| | Avg | 13.857 | 0.139 | 0.139 | 0.243 | 0.139 | 0.499 | 0.105 |

Comparative analysis of 40 classifiers utilizing feature selection via CAE on dataset DS2.

Moreover, when the goal is solely to optimize the precision metric, adopting the Tree learning strategy is the most effective choice.

Table 11 presents the results of applying 40 classifiers to the phishing website dataset (DS2), focusing on recall and MCC.

Table 11

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 93.668 | 0.937 | 0.937 | 0.937 | 0.930 | 0.974 | 0.913 |
| | Random forest | 93.844 | 0.938 | 0.938 | 0.938 | 0.931 | 0.992 | 0.957 |
| | REPTree | 61.460 | 0.611 | 0.615 | 0.611 | 0.568 | 0.942 | 0.678 |
| | DecisionStump | 17.341 | 0.203 | 0.173 | 0.309 | 0.205 | 0.587 | 0.130 |
| | HoeffdingTree | 30.779 | 0.353 | 0.308 | 0.297 | 0.245 | 0.764 | 0.308 |
| | LMT | 77.745 | 0.778 | 0.777 | 0.777 | 0.752 | 0.976 | 0.852 |
| | J48 | 67.165 | 0.672 | 0.672 | 0.670 | 0.634 | 0.957 | 0.772 |
| | Avg | 63.143 | 0.641 | 0.631 | 0.648 | 0.609 | 0.884 | 0.651 |
| Bayes | BayesNet | 72.959 | 0.731 | 0.730 | 0.729 | 0.699 | 0.975 | 0.828 |
| | NaiveBayes | 27.001 | 0.307 | 0.270 | 0.248 | 0.197 | 0.746 | 0.295 |
| | NaiveBayesUpdateable | 27.001 | 0.307 | 0.270 | 0.715 | 0.197 | 0.746 | 0.295 |
| | Avg | 42.320 | 0.448 | 0.423 | 0.564 | 0.364 | 0.822 | 0.477 |
| Functions | Logistic | 30.847 | 0.253 | 0.308 | 0.322 | 0.279 | 0.763 | 0.301 |
| | SimpleLogistic | 30.739 | 0.251 | 0.307 | 0.317 | 0.274 | 0.763 | 0.301 |
| | SMO | 32.530 | 0.401 | 0.325 | 0.308 | 0.247 | 0.748 | 0.262 |
| | MultilayerPerceptron | 43.305 | 0.426 | 0.433 | 0.418 | 0.360 | 0.857 | 0.467 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 93.550 | 0.936 | 0.936 | 0.935 | 0.928 | 0.958 | 0.886 |
| | Kstar | 67.009 | 0.670 | 0.670 | 0.665 | 0.631 | 0.955 | 0.734 |
| | LWL | 23.791 | 0.276 | 0.238 | 0.380 | 0.292 | 0.756 | 0.303 |
| | Avg | 88.652 | 0.891 | 0.886 | 0.885 | 0.773 | 0.927 | 0.912 |
| Meta | AdaBoostM1 | 17.341 | 0.203 | 0.173 | 0.309 | 0.205 | 0.587 | 0.130 |
| | AttributeSelectedClassifier | 65.404 | 0.667 | 0.654 | 0.647 | 0.615 | 0.966 | 0.775 |
| | Bagging | 70.630 | 0.704 | 0.706 | 0.704 | 0.671 | 0.967 | 0.780 |
| | ClassificationViaRegression | 62.634 | 0.621 | 0.626 | 0.620 | 0.579 | 0.944 | 0.680 |
| | FilteredClassifier | 75.543 | 0.754 | 0.755 | 0.754 | 0.727 | 0.978 | 0.873 |
| | IterativeClassifierOptimizer | 35.036 | 0.347 | 0.350 | 0.340 | 0.269 | 0.813 | 0.381 |
| | LogitBoost | 35.036 | 0.347 | 0.350 | 0.340 | 0.269 | 0.813 | 0.381 |
| | MultiClassClassifier | 30.295 | 0.286 | 0.303 | 0.320 | 0.275 | 0.762 | 0.300 |
| | MultiClassClassifierUpdateable | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | RandomCommittee | 93.707 | 0.937 | 0.937 | 0.937 | 0.930 | 0.987 | 0.965 |
| | RandomizableFilteredClassifier | 90.634 | 0.906 | 0.906 | 0.906 | 0.896 | 0.959 | 0.885 |
| | RandomSubSpace | 77.432 | 0.774 | 0.774 | 0.773 | 0.748 | 0.977 | 0.848 |
| | Stacking | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | WeightedInstancesHandlerWrapper | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Vote | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | CVParameterSelection | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 45.186 | 0.452 | 0.451 | 0.491 | 0.462 | 0.765 | 0.470 |
| Rules | DecisionTable | 65.551 | 0.667 | 0.656 | 0.648 | 0.616 | 0.966 | 0.748 |
| | JRip | 52.329 | 0.641 | 0.523 | 0.536 | 0.506 | 0.861 | 0.533 |
| | OneR | 63.897 | 0.646 | 0.639 | 0.625 | 0.594 | 0.798 | 0.448 |
| | PART | 66.901 | 0.671 | 0.669 | 0.668 | 0.631 | 0.956 | 0.764 |
| | ZeroR | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 81.649 | 0.823 | 0.816 | 0.844 | 0.713 | 0.829 | 0.821 |
| Misc | InputMappedClassifier | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via GRAE, on dataset DS2.

According to Table 11, the Random Tree classifier performs exceptionally well on the Recall metric. At the same time, the Random Forest model achieves the best MCC among all considered classification models.

Furthermore, the Tree learning strategy produces superior results to the seven alternative strategies from both the recall and MCC perspectives.

The results obtained from the 40 classifiers applied to the phishing website dataset (DS2) for the recall and MCC metrics are presented in Table 12. The Random Tree classifier demonstrates superior recall, while Random Forest and Random Tree stand out with the best MCC among the classification models considered. Compared with the seven learning strategies under review, the Tree strategy also shows better results on both measures.
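Several learners in these tables (ZeroR, Stacking, Vote, CVParameterSelection, InputMappedClassifier) collapse to a single constant row per dataset, such as 13.857% on DS2 or 55.694% on DS1. That is the signature of a majority-class baseline: its accuracy is simply the share of the most frequent class. A sketch with an illustrative label distribution (the 0.557 figure is chosen here to mirror the recurring 55.694% rows, not taken from the actual datasets):

```python
from collections import Counter

def zeror_accuracy(labels):
    """Accuracy of a ZeroR-style baseline that always predicts the
    most frequent class in the training labels."""
    counts = Counter(labels)
    majority_count = counts.most_common(1)[0][1]
    return majority_count / len(labels)

# Illustrative distribution: 557 legitimate vs. 443 phishing instances
labels = ["legit"] * 557 + ["phish"] * 443
print(zeror_accuracy(labels))  # → 0.557
```

This is a useful sanity check when reading the tables: any classifier whose accuracy matches the constant baseline row has effectively learned nothing from the selected features.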

Table 12

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 93.707 | 0.937 | 0.937 | 0.937 | 0.930 | 0.972 | 0.908 |
| | Random forest | 93.707 | 0.937 | 0.937 | 0.937 | 0.930 | 0.993 | 0.959 |
| | REPTree | 59.238 | 0.591 | 0.592 | 0.509 | 0.543 | 0.935 | 0.652 |
| | DecisionStump | 18.183 | 0.140 | 0.182 | 0.243 | 0.135 | 0.596 | 0.149 |
| | HoeffdingTree | 26.893 | 0.249 | 0.269 | 0.214 | 0.174 | 0.732 | 0.272 |
| | LMT | 78.606 | 0.788 | 0.786 | 0.786 | 0.762 | 0.974 | 0.859 |
| | J48 | 67.175 | 0.671 | 0.672 | 0.671 | 0.633 | 0.956 | 0.781 |
| | Avg | 62.501 | 0.616 | 0.625 | 0.625 | 0.586 | 0.879 | 0.654 |
| Bayes | BayesNet | 72.998 | 0.731 | 0.730 | 0.728 | 0.699 | 0.973 | 0.821 |
| | NaiveBayes | 71.824 | 0.723 | 0.718 | 0.714 | 0.685 | 0.965 | 0.789 |
| | NaiveBayesUpdateable | 26.893 | 0.251 | 0.269 | 0.214 | 0.175 | 0.732 | 0.272 |
| | Avg | 57.238 | 0.568 | 0.572 | 0.552 | 0.519 | 0.890 | 0.633 |
| Functions | Logistic | 27.725 | 0.072 | 0.277 | 0.035 | 0.008 | 0.731 | 0.245 |
| | SimpleLogistic | 27.676 | 0.078 | 0.277 | 0.032 | 0.010 | 0.730 | 0.244 |
| | SMO | 28.831 | 0.157 | 0.288 | 0.064 | 0.051 | 0.713 | 0.213 |
| | MultilayerPerceptron | 41.837 | 0.423 | 0.418 | 0.412 | 0.348 | 0.839 | 0.431 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 93.198 | 0.932 | 0.932 | 0.932 | 0.924 | 0.951 | 0.871 |
| | Kstar | 95.649 | 0.957 | 0.956 | 0.956 | 0.912 | 0.951 | 0.994 |
| | LWL | 23.791 | 0.345 | 0.238 | 0.058 | 0.009 | 0.739 | 0.267 |
| | Avg | 70.879 | 0.744 | 0.706 | 0.648 | 0.642 | 0.880 | 0.710 |
| Meta | AdaBoostM1 | 18.183 | 0.140 | 0.182 | 0.243 | 0.135 | 0.596 | 0.149 |
| | AttributeSelectedClassifier | 65.404 | 0.667 | 0.654 | 0.647 | 0.615 | 0.966 | 0.775 |
| | Bagging | 71.207 | 0.701 | 0.712 | 0.701 | 0.677 | 0.967 | 0.786 |
| | ClassificationViaRegression | 64.083 | 0.634 | 0.641 | 0.634 | 0.594 | 0.943 | 0.695 |
| | FilteredClassifier | 77.363 | 0.771 | 0.774 | 0.771 | 0.746 | 0.979 | 0.885 |
| | IterativeClassifierOptimizer | 36.191 | 0.386 | 0.362 | 0.344 | 0.286 | 0.799 | 0.306 |
| | LogitBoost | 36.191 | 0.386 | 0.362 | 0.344 | 0.286 | 0.799 | 0.306 |
| | MultiClassClassifier | 27.500 | 0.047 | 0.275 | 0.009 | −0.004 | 0.731 | 0.244 |
| | MultiClassClassifierUpdateable | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | RandomCommittee | 93.746 | 0.937 | 0.937 | 0.937 | 0.903 | 0.988 | 0.968 |
| | RandomizableFilteredClassifier | 91.231 | 0.912 | 0.912 | 0.912 | 0.902 | 0.955 | 0.878 |
| | RandomSubSpace | 80.456 | 0.806 | 0.805 | 0.803 | 0.781 | 0.981 | 0.878 |
| | Stacking | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.105 | 0.105 |
| | WeightedInstancesHandlerWrapper | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.105 | 0.105 |
| | Vote | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | CVParameterSelection | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 45.677 | 0.443 | 0.456 | 0.473 | 0.447 | 0.713 | 0.468 |
| Rules | DecisionTable | 65.394 | 0.668 | 0.654 | 0.645 | 0.614 | 0.967 | 0.755 |
| | JRip | 48.375 | 0.657 | 0.484 | 0.494 | 0.476 | 0.826 | 0.492 |
| | OneR | 63.890 | 0.646 | 0.639 | 0.625 | 0.594 | 0.798 | 0.448 |
| | PART | 67.381 | 0.673 | 0.674 | 0.673 | 0.636 | 0.957 | 0.783 |
| | ZeroR | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 51.781 | 0.556 | 0.518 | 0.536 | 0.513 | 0.809 | 0.516 |
| Misc | InputMappedClassifier | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |

Comparative analysis of 40 classifiers utilizing feature selection via IGAE, on dataset DS2.

Additionally, the evaluation scores in Table 12 show that these two classifiers were the most effective on this dataset.

The accuracy and precision metrics for the phishing dataset (DS2) were evaluated using 40 classifiers, and the results are presented in Table 13.

Table 13

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 69.436 | 0.699 | 0.694 | 0.694 | 0.661 | 0.960 | 0.810 |
| | Random forest | 69.397 | 0.699 | 0.694 | 0.694 | 0.661 | 0.965 | 0.812 |
| | REPTree | 58.103 | 0.588 | 0.581 | 0.582 | 0.535 | 0.941 | 0.679 |
| | DecisionStump | 17.341 | 0.203 | 0.173 | 0.309 | 0.205 | 0.587 | 0.130 |
| | HoeffdingTree | 29.536 | 0.352 | 0.295 | 0.274 | 0.203 | 0.765 | 0.293 |
| | LMT | 61.391 | 0.619 | 0.614 | 0.614 | 0.572 | 0.953 | 0.736 |
| | J48 | 59.023 | 0.597 | 0.590 | 0.591 | 0.545 | 0.946 | 0.706 |
| | Avg | 52.032 | 0.536 | 0.520 | 0.536 | 0.487 | 0.873 | 0.595 |
| Bayes | BayesNet | 30.133 | 0.332 | 0.301 | 0.295 | 0.233 | 0.779 | 0.313 |
| | NaiveBayes | 27.353 | 0.374 | 0.274 | 0.258 | 0.219 | 0.713 | 0.282 |
| | NaiveBayesUpdateable | 27.353 | 0.374 | 0.274 | 0.258 | 0.219 | 0.713 | 0.282 |
| | Avg | 28.280 | 0.306 | 0.283 | 0.273 | 0.223 | 0.735 | 0.292 |
| Functions | Logistic | 26.629 | 0.308 | 0.266 | 0.245 | 0.185 | 0.748 | 0.292 |
| | SimpleLogistic | 26.688 | 0.311 | 0.267 | 0.243 | 0.186 | 0.745 | 0.285 |
| | SMO | 30.309 | 0.345 | 0.303 | 0.291 | 0.231 | 0.746 | 0.255 |
| | MultilayerPerceptron | 42.464 | 0.447 | 0.425 | 0.429 | 0.367 | 0.840 | 0.469 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 69.436 | 0.699 | 0.694 | 0.693 | 0.661 | 0.960 | 0.809 |
| | Kstar | 59.630 | 0.617 | 0.596 | 0.602 | 0.559 | 0.941 | 0.678 |
| | LWL | 24.016 | 0.261 | 0.240 | 0.372 | 0.286 | 0.748 | 0.293 |
| | Avg | 88.652 | 0.891 | 0.886 | 0.885 | 0.773 | 0.922 | 0.913 |
| Meta | AdaBoostM1 | 17.341 | 0.203 | 0.173 | 0.309 | 0.205 | 0.587 | 0.130 |
| | AttributeSelectedClassifier | 48.990 | 0.500 | 0.490 | 0.488 | 0.433 | 0.918 | 0.597 |
| | Bagging | 61.489 | 0.622 | 0.615 | 0.617 | 0.574 | 0.952 | 0.727 |
| | ClassificationViaRegression | 55.744 | 0.597 | 0.557 | 0.559 | 0.502 | 0.937 | 0.659 |
| | FilteredClassifier | 53.131 | 0.545 | 0.531 | 0.534 | 0.482 | 0.932 | 0.643 |
| | IterativeClassifierOptimizer | 31.248 | 0.316 | 0.312 | 0.294 | 0.231 | 0.807 | 0.361 |
| | LogitBoost | 31.248 | 0.316 | 0.312 | 0.294 | 0.231 | 0.807 | 0.361 |
| | MultiClassClassifier | 26.668 | 0.125 | 0.267 | 0.182 | 0.121 | 0.704 | 0.273 |
| | MultiClassClassifierUpdateable | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | RandomCommittee | 69.358 | 0.698 | 0.694 | 0.693 | 0.606 | 0.965 | 0.825 |
| | RandomizableFilteredClassifier | 68.408 | 0.688 | 0.684 | 0.683 | 0.649 | 0.958 | 0.825 |
| | RandomSubSpace | 51.712 | 0.537 | 0.517 | 0.517 | 0.468 | 0.903 | 0.563 |
| | Stacking | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | WeightedInstancesHandlerWrapper | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Vote | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | CVParameterSelection | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 38.883 | 0.364 | 0.365 | 0.399 | 0.361 | 0.750 | 0.404 |
| Rules | DecisionTable | 50.166 | 0.531 | 0.502 | 0.507 | 0.455 | 0.913 | 0.568 |
| | JRip | 46.604 | 0.637 | 0.466 | 0.493 | 0.406 | 0.857 | 0.520 |
| | OneR | 21.364 | 0.191 | 0.214 | 0.199 | 0.110 | 0.558 | 0.132 |
| | PART | 58.025 | 0.590 | 0.580 | 0.581 | 0.534 | 0.944 | 0.691 |
| | ZeroR | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 38.003 | 0.417 | 0.380 | 0.404 | 0.360 | 0.754 | 0.402 |
| Misc | InputMappedClassifier | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.500 | 0.500 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via PC, on dataset DS2.

From the table, it is evident that the IBK model under the lazy learning strategy, along with the random tree model under the tree learning approach, achieved the highest accuracy and precision values.

Furthermore, based on the findings in Table 13, the Tree learning strategy should be selected when optimizing the precision metric, given its superior performance.

Table 14 displays the results of the 40 classifiers applied to the phishing website dataset (DS3), evaluated on F-measure and ROC area. Among the classification models, the Random Forest classifier showed the best performance on both metrics, and the Tree learning strategy displayed better outcomes than the other strategies under scrutiny.
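The ROC area reported in these tables can be read as the probability that a randomly chosen positive (phishing) instance is ranked above a randomly chosen negative one. A minimal rank-based sketch of that interpretation, with toy scores rather than model outputs from the study:

```python
def roc_auc(scores, labels):
    """ROC area via the Mann-Whitney rank statistic: the fraction of
    positive/negative pairs where the positive instance scores higher
    (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores (illustrative only): higher score = more likely phishing
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
print(roc_auc(scores, labels))  # → 0.888...
```

This explains why the degenerate baseline rows hover near 0.5 ROC area: a constant or random ranking wins roughly half of all positive/negative pairings regardless of class balance.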

Table 14

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 87.785 | 0.878 | 0.878 | 0.878 | 0.745 | 0.874 | 0.835 |
| | Random forest | 92.219 | 0.922 | 0.922 | 0.922 | 0.837 | 0.972 | 0.972 |
| | REPTree | 89.350 | 0.893 | 0.894 | 0.893 | 0.776 | 0.942 | 0.930 |
| | DecisionStump | 75.527 | 0.753 | 0.755 | 0.750 | 0.476 | 0.747 | 0.729 |
| | HoeffdingTree | 71.897 | 0.720 | 0.719 | 0.720 | 0.414 | 0.753 | 0.754 |
| | J48 | 90.023 | 0.900 | 0.900 | 0.900 | 0.790 | 0.927 | 0.903 |
| | Avg | 84.467 | 0.844 | 0.844 | 0.843 | 0.673 | 0.869 | 0.853 |
| Bayes | BayesNet | 86.589 | 0.866 | 0.866 | 0.865 | 0.717 | 0.938 | 0.942 |
| | NaiveBayes | 72.397 | 0.820 | 0.724 | 0.721 | 0.547 | 0.924 | 0.913 |
| | NaiveBayesUpdateable | 72.397 | 0.820 | 0.724 | 0.721 | 0.547 | 0.924 | 0.913 |
| | Avg | 77.128 | 0.835 | 0.771 | 0.769 | 0.603 | 0.928 | 0.667 |
| Functions | Logistic | 88.133 | 0.881 | 0.881 | 0.880 | 0.750 | 0.945 | 0.944 |
| | MultilayerPerceptron | 88.111 | 0.881 | 0.881 | 0.881 | 0.750 | 0.938 | 0.939 |
| | SGD | 87.589 | 0.877 | 0.876 | 0.874 | 0.738 | 0.860 | 0.825 |
| | SimpleLogistic | 87.980 | 0.880 | 0.880 | 0.879 | 0.746 | 0.944 | 0.944 |
| | SMO | 85.807 | 0.861 | 0.858 | 0.855 | 0.701 | 0.837 | 0.801 |
| | Avg | 87.524 | 0.876 | 0.875 | 0.873 | 0.737 | 0.904 | 0.890 |
| Lazy | IBK | 87.893 | 0.879 | 0.879 | 0.879 | 0.746 | 0.886 | 0.855 |
| | Kstar | 89.284 | 0.895 | 0.893 | 0.891 | 0.775 | 0.950 | 0.952 |
| | LWL | 79.069 | 0.794 | 0.791 | 0.783 | 0.555 | 0.869 | 0.859 |
| | Avg | 85.416 | 0.856 | 0.854 | 0.851 | 0.692 | 0.902 | 0.889 |
| Meta | AdaBoostM1 | 86.046 | 0.860 | 0.860 | 0.860 | 0.707 | 0.929 | 0.929 |
| | AttributeSelectedClassifier | 88.089 | 0.881 | 0.881 | 0.880 | 0.749 | 0.930 | 0.910 |
| | Bagging | 90.415 | 0.904 | 0.904 | 0.904 | 0.799 | 0.963 | 0.964 |
| | ClassificationViaRegression | 88.459 | 0.884 | 0.885 | 0.884 | 0.757 | 0.949 | 0.948 |
| | FilteredClassifier | 89.828 | 0.898 | 0.898 | 0.898 | 0.786 | 0.938 | 0.919 |
| | IterativeClassifierOptimizer | 87.937 | 0.879 | 0.879 | 0.879 | 0.746 | 0.941 | 0.941 |
| | LogitBoost | 87.937 | 0.879 | 0.879 | 0.879 | 0.746 | 0.941 | 0.941 |
| | MultiClassClassifier | 88.133 | 0.881 | 0.881 | 0.880 | 0.750 | 0.945 | 0.944 |
| | MultiClassClassifierUpdateable | 87.589 | 0.877 | 0.876 | 0.874 | 0.738 | 0.806 | 0.825 |
| | RandomCommittee | 91.632 | 0.916 | 0.916 | 0.916 | 0.824 | 0.960 | 0.951 |
| | RandomizableFilteredClassifier | 85.068 | 0.851 | 0.851 | 0.851 | 0.688 | 0.858 | 0.822 |
| | RandomSubSpace | 90.110 | 0.902 | 0.901 | 0.900 | 0.792 | 0.961 | 0.962 |
| | Stacking | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | WeightedInstancesHandlerWrapper | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Vote | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | CVParameterSelection | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 81.476 | 0.814 | 0.814 | 0.851 | 0.756 | 0.825 | 0.821 |
| Rules | DecisionTable | 65.551 | 0.667 | 0.656 | 0.648 | 0.616 | 0.966 | 0.748 |
| | JRip | 45.224 | 0.605 | 0.452 | 0.463 | 0.437 | 0.817 | 0.463 |
| | OneR | 63.897 | 0.646 | 0.639 | 0.625 | 0.594 | 0.798 | 0.448 |
| | PART | 59.776 | 0.597 | 0.598 | 0.597 | 0.551 | 0.945 | 0.708 |
| | ZeroR | 13.857 | 0.139 | 0.139 | 0.243 | 0.243 | 0.499 | 0.105 |
| | Avg | 49.661 | 0.530 | 0.496 | 0.515 | 0.488 | 0.805 | 0.494 |
| Misc | InputMappedClassifier | 60.595 | 0.606 | 0.606 | 0.755 | 0.243 | 0.499 | 0.105 |
| | Avg | 60.595 | 0.606 | 0.606 | 0.755 | 0.243 | 0.499 | 0.105 |

Comparative analysis of 40 classifiers utilizing feature selection via CAE, on dataset DS3.

The evaluation scores in Table 14 confirm that these classifiers were the most effective on this dataset compared with the other methods considered here.

The results of running 40 classifiers on the phishing website dataset (DS3) are shown in Table 15, including F-measure and ROC metrics. According to the table, the random forest classifier outperforms other classification models on both F-measure and ROC for this dataset.

Table 15

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 90.263 | 0.903 | 0.903 | 0.903 | 0.797 | 0.899 | 0.864 |
| | Random forest | 94.262 | 0.943 | 0.943 | 0.942 | 0.879 | 0.981 | 0.981 |
| | REPTree | 91.045 | 0.911 | 0.911 | 0.911 | 0.812 | 0.945 | 0.927 |
| | DecisionStump | 79.091 | 0.794 | 0.791 | 0.792 | 0.568 | 0.778 | 0.744 |
| | HoeffdingTree | 82.764 | 0.827 | 0.828 | 0.827 | 0.638 | 0.865 | 0.848 |
| | J48 | 91.871 | 0.919 | 0.919 | 0.919 | 0.829 | 0.928 | 0.898 |
| | Avg | 88.216 | 0.882 | 0.882 | 0.882 | 0.753 | 0.899 | 0.876 |
| Bayes | BayesNet | 88.350 | 0.884 | 0.884 | 0.882 | 0.754 | 0.946 | 0.950 |
| | NaiveBayes | 86.763 | 0.867 | 0.868 | 0.867 | 0.721 | 0.922 | 0.911 |
| | NaiveBayesUpdateable | 86.763 | 0.867 | 0.868 | 0.867 | 0.721 | 0.922 | 0.911 |
| | Avg | 87.292 | 0.872 | 0.873 | 0.872 | 0.732 | 0.930 | 0.924 |
| Functions | Logistic | 89.632 | 0.897 | 0.896 | 0.895 | 0.782 | 0.952 | 0.949 |
| | SGD | 89.806 | 0.899 | 0.898 | 0.897 | 0.786 | 0.884 | 0.853 |
| | MultilayerPerceptron | 89.567 | 0.896 | 0.896 | 0.896 | 0.782 | 0.941 | 0.937 |
| | SimpleLogistic | 89.611 | 0.896 | 0.896 | 0.895 | 0.781 | 0.952 | 0.949 |
| | SMO | 87.980 | 0.882 | 0.880 | 0.878 | 0.747 | 0.861 | 0.828 |
| | Avg | 89.319 | 0.894 | 0.893 | 0.892 | 0.775 | 0.918 | 0.903 |
| Lazy | IBK | 89.654 | 0.896 | 0.897 | 0.896 | 0.782 | 0.896 | 0.867 |
| | Kstar | 90.436 | 0.910 | 0.904 | 0.902 | 0.802 | 0.954 | 0.955 |
| | LWL | 79.091 | 0.794 | 0.791 | 0.792 | 0.568 | 0.893 | 0.892 |
| | Avg | 86.394 | 0.866 | 0.864 | 0.863 | 0.717 | 0.914 | 0.904 |
| Meta | AdaBoostM1 | 90.197 | 0.902 | 0.902 | 0.902 | 0.794 | 0.957 | 0.957 |
| | AttributeSelectedClassifier | 91.436 | 0.914 | 0.914 | 0.914 | 0.820 | 0.939 | 0.918 |
| | Bagging | 92.501 | 0.925 | 0.925 | 0.925 | 0.843 | 0.969 | 0.965 |
| | ClassificationViaRegression | 91.002 | 0.910 | 0.911 | 0.911 | 0.811 | 0.961 | 0.958 |
| | FilteredClassifier | 91.545 | 0.915 | 0.915 | 0.915 | 0.822 | 0.931 | 0.912 |
| | IterativeClassifierOptimizer | 90.197 | 0.902 | 0.902 | 0.901 | 0.794 | 0.959 | 0.959 |
| | LogitBoost | 90.197 | 0.902 | 0.902 | 0.901 | 0.794 | 0.959 | 0.959 |
| | MultiClassClassifier | 89.632 | 0.897 | 0.896 | 0.895 | 0.782 | 0.952 | 0.949 |
| | MultiClassClassifierUpdateable | 89.806 | 0.899 | 0.898 | 0.897 | 0.786 | 0.884 | 0.853 |
| | RandomCommittee | 93.284 | 0.933 | 0.933 | 0.933 | 0.859 | 0.971 | 0.965 |
| | RandomizableFilteredClassifier | 84.938 | 0.850 | 0.849 | 0.849 | 0.685 | 0.856 | 0.819 |
| | RandomSubSpace | 91.958 | 0.920 | 0.920 | 0.919 | 0.831 | 0.972 | 0.972 |
| | Stacking | 60.595 | 0.606 | 0.606 | 0.606 | 0.755 | 0.499 | 0.522 |
| | WeightedInstancesHandlerWrapper | 60.595 | 0.606 | 0.606 | 0.606 | 0.755 | 0.499 | 0.522 |
| | Vote | 60.595 | 0.606 | 0.606 | 0.606 | 0.755 | 0.499 | 0.522 |
| | CVParameterSelection | 60.595 | 0.606 | 0.606 | 0.606 | 0.755 | 0.499 | 0.522 |
| | Avg | 83.067 | 0.830 | 0.830 | 0.830 | 0.790 | 0.831 | 0.829 |
| Rules | DecisionTable | 89.611 | 0.897 | 0.896 | 0.895 | 0.781 | 0.945 | 0.945 |
| | JRip | 91.784 | 0.918 | 0.918 | 0.918 | 0.827 | 0.926 | 0.916 |
| | OneR | 78.330 | 0.781 | 0.783 | 0.783 | 0.541 | 0.766 | 0.721 |
| | PART | 91.871 | 0.919 | 0.919 | 0.919 | 0.829 | 0.943 | 0.924 |
| | ZeroR | 60.595 | 0.606 | 0.606 | 0.755 | 0.499 | 0.499 | 0.522 |
| | Avg | 82.438 | 0.824 | 0.824 | 0.854 | 0.695 | 0.816 | 0.805 |
| Misc | InputMappedClassifier | 60.595 | 0.606 | 0.606 | 0.755 | 0.499 | 0.499 | 0.522 |
| | Avg | 60.595 | 0.606 | 0.606 | 0.755 | 0.499 | 0.499 | 0.522 |

Comparative analysis of 40 classifiers utilizing feature selection via CAE, on dataset DS3.

Additionally, Trees is the most effective learning strategy for achieving high marks on both evaluation measures among the seven strategies considered here.

Table 16 shows the results of the 40 classifiers applied to the phishing website dataset (DS3) for the F-measure and ROC metrics. According to this table, the Random Forest classifier achieves superior results on both measures among the considered classification models. Moreover, the Tree learning strategy performs best on both evaluation criteria compared with the seven other strategies.

Table 16

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
|---|---|---|---|---|---|---|---|---|
| Tree | Random tree | 91.567 | 0.916 | 0.916 | 0.916 | 0.823 | 0.912 | 0.884 |
| | Random forest | 93.740 | 0.937 | 0.937 | 0.937 | 0.869 | 0.977 | 0.977 |
| | REPTree | 91.567 | 0.916 | 0.916 | 0.916 | 0.823 | 0.912 | 0.884 |
| | DecisionStump | 78.048 | 0.789 | 0.778 | 0.775 | 0.532 | 0.773 | 0.758 |
| | HoeffdingTree | 77.483 | 0.773 | 0.775 | 0.773 | 0.523 | 0.798 | 0.794 |
| | LMT | 92.805 | 0.928 | 0.928 | 0.928 | 0.849 | 0.962 | 0.955 |
| | J48 | 92.110 | 0.921 | 0.921 | 0.921 | 0.834 | 0.949 | 0.935 |
| | Avg | 88.188 | 0.881 | 0.881 | 0.880 | 0.750 | 0.897 | 0.857 |
| Bayes | BayesNet | 91.523 | 0.916 | 0.915 | 0.915 | 0.822 | 0.971 | 0.971 |
| | NaiveBayes | 76.505 | 0.838 | 0.765 | 0.765 | 0.603 | 0.949 | 0.940 |
| | NaiveBayesUpdateable | 76.505 | 0.838 | 0.765 | 0.765 | 0.603 | 0.949 | 0.940 |
| | Avg | 81.511 | 0.864 | 0.815 | 0.815 | 0.676 | 0.956 | 0.950 |
| Functions | Logistic | 91.436 | 0.914 | 0.914 | 0.914 | 0.821 | 0.967 | 0.964 |
| | SimpleLogistic | 91.371 | 0.914 | 0.914 | 0.913 | 0.818 | 0.967 | 0.963 |
| | SMO | 88.154 | 0.885 | 0.882 | 0.879 | 0.752 | 0.862 | 0.830 |
| | MultilayerPerceptron | 92.545 | 0.926 | 0.925 | 0.925 | 0.844 | 0.963 | 0.959 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 90.915 | 0.909 | 0.909 | 0.909 | 0.909 | 0.909 | 0.885 |
| | Kstar | 91.588 | 0.921 | 0.916 | 0.914 | 0.827 | 0.971 | 0.972 |
| | LWL | 77.874 | 0.777 | 0.779 | 0.775 | 0.528 | 0.855 | 0.858 |
| | Avg | 88.652 | 0.891 | 0.886 | 0.885 | 0.773 | 0.927 | 0.913 |
| Meta | AdaBoostM1 | 77.874 | 0.777 | 0.779 | 0.775 | 0.528 | 0.855 | 0.858 |
| | AttributeSelectedClassifier | 92.219 | 0.922 | 0.922 | 0.922 | 0.837 | 0.951 | 0.938 |
| | Bagging | 93.066 | 0.931 | 0.931 | 0.930 | 0.854 | 0.974 | 0.973 |
| | ClassificationViaRegression | 90.893 | 0.909 | 0.909 | 0.908 | 0.808 | 0.965 | 0.962 |
| | FilteredClassifier | 92.916 | 0.929 | 0.929 | 0.929 | 0.851 | 0.942 | 0.928 |
| | IterativeClassifierOptimizer | 90.806 | 0.908 | 0.908 | 0.907 | 0.807 | 0.963 | 0.963 |
| | LogitBoost | 90.806 | 0.908 | 0.908 | 0.907 | 0.807 | 0.963 | 0.963 |
| | MultiClassClassifier | 91.436 | 0.914 | 0.914 | 0.914 | 0.802 | 0.967 | 0.964 |
| | MultiClassClassifierUpdateable | 90.697 | 0.908 | 0.907 | 0.906 | 0.804 | 0.895 | 0.866 |
| | RandomCommittee | 93.544 | 0.935 | 0.935 | 0.935 | 0.864 | 0.961 | 0.951 |
| | RandomizableFilteredClassifier | 90.197 | 0.902 | 0.902 | 0.902 | 0.794 | 0.904 | 0.880 |
| | RandomSubSpace | 92.653 | 0.927 | 0.927 | 0.926 | 0.846 | 0.975 | 0.975 |
| | Stacking | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | WeightedInstancesHandlerWrapper | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Vote | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | CVParameterSelection | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 83.093 | 0.830 | 0.830 | 0.867 | 0.752 | 0.831 | 0.831 |
| Rules | DecisionTable | 90.415 | 0.904 | 0.904 | 0.904 | 0.798 | 0.950 | 0.951 |
| | JRip | 91.154 | 0.911 | 0.912 | 0.911 | 0.814 | 0.925 | 0.917 |
| | OneR | 78.330 | 0.781 | 0.783 | 0.782 | 0.541 | 0.766 | 0.720 |
| | PART | 93.001 | 0.930 | 0.930 | 0.930 | 0.853 | 0.969 | 0.963 |
| | ZeroR | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 81.649 | 0.822 | 0.816 | 0.846 | 0.713 | 0.829 | 0.821 |
| Misc | InputMappedClassifier | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.505 | 0.506 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via GRAE, on dataset DS3.

The results of applying 40 classifiers to the phishing website dataset, with respect to F-measure and ROC metrics, are shown in Table 17. The random forest classifier outperforms the other considered classification models on both measures for this dataset, as shown in Table 17.

Table 17

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Tree | Random tree | 91.154 | 0.912 | 0.912 | 0.912 | 0.815 | 0.908 | 0.876 |
| | Random forest | 94.631 | 0.946 | 0.946 | 0.946 | 0.887 | 0.983 | 0.982 |
| | REPTree | 59.238 | 0.925 | 0.925 | 0.925 | 0.842 | 0.956 | 0.945 |
| | DecisionStump | 78.048 | 0.780 | 0.780 | 0.775 | 0.532 | 0.773 | 0.758 |
| | HoeffdingTree | 85.785 | 0.858 | 0.858 | 0.856 | 0.772 | 0.883 | 0.865 |
| | LMT | 92.719 | 0.927 | 0.927 | 0.927 | 0.847 | 0.965 | 0.962 |
| | J48 | 92.262 | 0.923 | 0.923 | 0.923 | 0.838 | 0.934 | 0.916 |
| | Avg | 84.834 | 0.895 | 0.895 | 0.894 | 0.783 | 0.914 | 0.901 |
| Bayes | BayesNet | 88.502 | 0.885 | 0.885 | 0.884 | 0.758 | 0.953 | 0.955 |
| | NaiveBayes | 88.220 | 0.885 | 0.882 | 0.880 | 0.753 | 0.946 | 0.936 |
| | NaiveBayesUpdateable | 88.220 | 0.885 | 0.882 | 0.880 | 0.753 | 0.946 | 0.936 |
| | Avg | 88.314 | 0.885 | 0.883 | 0.881 | 0.754 | 0.948 | 0.943 |
| Functions | Logistic | 90.697 | 0.907 | 0.907 | 0.906 | 0.804 | 0.963 | 0.961 |
| | SimpleLogistic | 90.632 | 0.906 | 0.906 | 0.906 | 0.803 | 0.963 | 0.961 |
| | SMO | 88.002 | 0.883 | 0.880 | 0.878 | 0.748 | 0.861 | 0.828 |
| | MultilayerPerceptron | 90.806 | 0.908 | 0.908 | 0.908 | 0.807 | 0.958 | 0.954 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.887 |
| Lazy | IBK | 90.241 | 0.902 | 0.902 | 0.902 | 0.795 | 0.900 | 0.872 |
| | Kstar | 89.980 | 0.905 | 0.900 | 0.898 | 0.793 | 0.951 | 0.952 |
| | LWL | 78.417 | 0.783 | 0.784 | 0.780 | 0.540 | 0.862 | 0.867 |
| | Avg | 86.213 | 0.863 | 0.862 | 0.860 | 0.709 | 0.904 | 0.897 |
| Meta | AdaBoostM1 | 89.611 | 0.896 | 0.896 | 0.896 | 0.781 | 0.957 | 0.957 |
| | AttributeSelectedClassifier | 92.240 | 0.922 | 0.922 | 0.922 | 0.837 | 0.947 | 0.930 |
| | Bagging | 93.631 | 0.936 | 0.936 | 0.936 | 0.866 | 0.974 | 0.973 |
| | ClassificationViaRegression | 92.501 | 0.925 | 0.925 | 0.925 | 0.843 | 0.967 | 0.963 |
| | FilteredClassifier | 92.479 | 0.925 | 0.925 | 0.925 | 0.842 | 0.935 | 0.919 |
| | IterativeClassifierOptimizer | 90.980 | 0.901 | 0.901 | 0.909 | 0.801 | 0.964 | 0.964 |
| | LogitBoost | 90.980 | 0.901 | 0.901 | 0.909 | 0.811 | 0.964 | 0.964 |
| | MultiClassClassifier | 90.697 | 0.907 | 0.907 | 0.906 | 0.804 | 0.963 | 0.961 |
| | MultiClassClassifierUpdateable | 90.349 | 0.904 | 0.903 | 0.903 | 0.797 | 0.891 | 0.891 |
| | RandomCommittee | 94.196 | 0.942 | 0.942 | 0.942 | 0.878 | 0.978 | 0.973 |
| | RandomizableFilteredClassifier | 83.438 | 0.834 | 0.834 | 0.834 | 0.652 | 0.839 | 0.802 |
| | RandomSubSpace | 92.979 | 0.934 | 0.934 | 0.934 | 0.852 | 0.975 | 0.975 |
| | Stacking | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | WeightedInstancesHandlerWrapper | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Vote | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | CVParameterSelection | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 83.529 | 0.833 | 0.835 | 0.872 | 0.799 | 0.834 | 0.835 |
| Rules | DecisionTable | 89.676 | 0.898 | 0.897 | 0.895 | 0.783 | 0.946 | 0.946 |
| | JRip | 92.436 | 0.924 | 0.924 | 0.924 | 0.841 | 0.926 | 0.912 |
| | OneR | 78.330 | 0.781 | 0.783 | 0.782 | 0.541 | 0.766 | 0.720 |
| | PART | 92.349 | 0.923 | 0.923 | 0.923 | 0.839 | 0.960 | 0.951 |
| | ZeroR | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 82.677 | 0.826 | 0.826 | 0.855 | 0.751 | 0.819 | 0.812 |
| Misc | InputMappedClassifier | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |

Comparative analysis of 40 classifiers utilizing feature selection via IGAE, on dataset DS3.

Notably, Tree proves to be the superior learning strategy among the seven strategies compared here, based on its performance across all evaluation criteria, particularly the F-measure and ROC metrics.

The results of applying the 40 classifiers to the phishing website detection dataset (DS3) are depicted in Table 18 and analyzed using the accuracy and precision metrics. The data reveal that Random Forest achieves the best F-measure and ROC scores, while the other top-performing methods, Random Committee and J48, also achieve strong F-measure and ROC scores.

Table 18

| Learning strategy | Classifier | Accuracy | Precision | Recall | F-measure | MCC | ROC area | PRC area |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Tree | Random tree | 89.567 | 0.895 | 0.896 | 0.896 | 0.781 | 0.895 | 0.863 |
| | Random forest | 92.784 | 0.928 | 0.928 | 0.928 | 0.848 | 0.971 | 0.971 |
| | REPTree | 90.154 | 0.901 | 0.902 | 0.901 | 0.793 | 0.939 | 0.926 |
| | DecisionStump | 76.266 | 0.811 | 0.763 | 0.735 | 0.522 | 0.693 | 0.699 |
| | HoeffdingTree | 87.068 | 0.871 | 0.871 | 0.870 | 0.728 | 0.909 | 0.891 |
| | LMT | 61.391 | 0.619 | 0.614 | 0.614 | 0.572 | 0.953 | 0.736 |
| | J48 | 91.132 | 0.911 | 0.911 | 0.911 | 0.814 | 0.931 | 0.911 |
| | Avg | 84.052 | 0.847 | 0.840 | 0.836 | 0.722 | 0.898 | 0.714 |
| Bayes | BayesNet | 87.198 | 0.873 | 0.872 | 0.877 | 0.713 | 0.935 | 0.939 |
| | NaiveBayes | 87.198 | 0.873 | 0.872 | 0.877 | 0.713 | 0.934 | 0.924 |
| | NaiveBayesUpdateable | 87.198 | 0.873 | 0.872 | 0.879 | 0.713 | 0.934 | 0.924 |
| | Avg | 87.198 | 0.873 | 0.872 | 0.870 | 0.599 | 0.934 | 0.929 |
| Functions | Logistic | 88.958 | 0.889 | 0.890 | 0.889 | 0.767 | 0.952 | 0.952 |
| | SimpleLogistic | 88.567 | 0.886 | 0.886 | 0.884 | 0.759 | 0.952 | 0.949 |
| | SMO | 86.742 | 0.867 | 0.867 | 0.865 | 0.721 | 0.847 | 0.812 |
| | MultilayerPerceptron | 89.611 | 0.896 | 0.896 | 0.896 | 0.782 | 0.943 | 0.943 |
| | Avg | 88.665 | 0.888 | 0.886 | 0.886 | 0.771 | 0.910 | 0.888 |
| Lazy | IBK | 89.437 | 0.894 | 0.894 | 0.894 | 0.777 | 0.895 | 0.873 |
| | Kstar | 88.611 | 0.893 | 0.886 | 0.883 | 0.765 | 0.938 | 0.941 |
| | LWL | 76.266 | 0.811 | 0.763 | 0.735 | 0.522 | 0.894 | 0.893 |
| | Avg | 88.652 | 0.891 | 0.886 | 0.885 | 0.773 | 0.927 | 0.912 |
| Meta | AdaBoostM1 | 87.524 | 0.875 | 0.875 | 0.874 | 0.737 | 0.940 | 0.942 |
| | AttributeSelectedClassifier | 89.458 | 0.895 | 0.895 | 0.894 | 0.778 | 0.934 | 0.918 |
| | Bagging | 91.588 | 0.916 | 0.916 | 0.916 | 0.823 | 0.962 | 0.961 |
| | ClassificationViaRegression | 88.893 | 0.890 | 0.889 | 0.888 | 0.766 | 0.942 | 0.939 |
| | FilteredClassifier | 90.697 | 0.907 | 0.907 | 0.907 | 0.804 | 0.935 | 0.920 |
| | IterativeClassifierOptimizer | 89.067 | 0.891 | 0.891 | 0.890 | 0.707 | 0.949 | 0.951 |
| | LogitBoost | 89.067 | 0.891 | 0.891 | 0.890 | 0.707 | 0.949 | 0.951 |
| | MultiClassClassifier | 88.958 | 0.890 | 0.889 | 0.889 | 0.767 | 0.952 | 0.952 |
| | MultiClassClassifierUpdateable | 88.133 | 0.883 | 0.881 | 0.880 | 0.765 | 0.864 | 0.831 |
| | RandomCommittee | 92.110 | 0.921 | 0.921 | 0.921 | 0.834 | 0.959 | 0.950 |
| | RandomizableFilteredClassifier | 86.655 | 0.866 | 0.867 | 0.866 | 0.772 | 0.871 | 0.843 |
| | RandomSubSpace | 91.219 | 0.913 | 0.912 | 0.911 | 0.816 | 0.964 | 0.966 |
| | Stacking | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | WeightedInstancesHandlerWrapper | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Vote | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | CVParameterSelection | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 82.234 | 0.822 | 0.822 | 0.859 | 0.775 | 0.826 | 0.825 |
| Rules | DecisionTable | 89.241 | 0.892 | 0.892 | 0.892 | 0.773 | 0.935 | 0.937 |
| | JRip | 90.306 | 0.903 | 0.903 | 0.902 | 0.796 | 0.906 | 0.896 |
| | OneR | 74.570 | 0.743 | 0.746 | 0.742 | 0.458 | 0.723 | 0.679 |
| | PART | 89.915 | 0.899 | 0.899 | 0.898 | 0.788 | 0.944 | 0.934 |
| | ZeroR | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 80.925 | 0.808 | 0.809 | 0.837 | 0.714 | 0.801 | 0.793 |
| Misc | InputMappedClassifier | 60.595 | 0.606 | 0.606 | 0.755 | 0.755 | 0.499 | 0.522 |
| | Avg | 55.694 | 0.557 | 0.557 | 0.715 | 0.055 | 0.505 | 0.506 |

Comparative analysis of 40 classifiers utilizing feature selection via PC, on dataset DS3.

Evaluating the learning strategies indicates that Tree attains the best results on the F-measure and ROC metrics.

Table 19 summarizes the comparative analysis of the 40 classifiers across the three datasets presented in Tables 4–18. In this table, "RC" refers to Random Committee, "RF" denotes Random Forest, and "RT" stands for Random Tree.

Table 19

| Dataset | Accuracy | Precision | Recall | MCC | F-measure | ROC area |
| --- | --- | --- | --- | --- | --- | --- |
| DS1 | IBK, RC | RC, RF | RF | RF | RT | RT |
| DS2 | RT, RF | RT, RF | RT, RF | RT, RF | RT, RF | RF, RF |
| DS3 | RF | RF | RF | RF | RF, REPTree | RF, Logistic |

Best classifier with respect to the evaluation metric and the dataset.

The study revealed that no single classifier delivered superior results across all the considered metrics and datasets. According to Table 19, the random tree classifier achieved superior results on 13 occasions, and the random forest classifier was best seven times. The random committee classifier and IBK each performed best twice.

This indicates that, for phishing datasets, random forest is the preferred option, with the committee-based classifiers ranking second. The random forest classifier excelled on the phishing dataset as well as on the phishing website detection dataset, both of which contain integer-typed attributes. Its ability to perform well regardless of the number and types of attributes makes it evident why random forest remains a preferred choice among classification techniques.
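Random forest's robustness comes from bagging: each tree is trained on a bootstrap resample and the ensemble predicts by majority vote. The sketch below illustrates that idea in plain Python, substituting one-split decision "stumps" for full trees to stay short; the toy data, helper names, and integer feature coding (mimicking the -1/0/1 coding common in phishing datasets) are illustrative assumptions, not the authors' WEKA configuration.

```python
import random
from collections import Counter

# Toy phishing-style dataset: integer-coded attributes, label 1 = phishing, 0 = legitimate.
DATA = [
    ([1, 1, 0], 1), ([1, 0, 1], 1), ([1, 1, 1], 1), ([0, 1, 1], 1),
    ([-1, -1, 0], 0), ([-1, 0, -1], 0), ([0, -1, -1], 0), ([-1, -1, -1], 0),
]

def train_stump(sample):
    """Pick the single (feature, threshold, polarity) split with the fewest errors."""
    best = None
    for f in range(len(sample[0][0])):
        for thr in {row[0][f] for row in sample}:
            for polarity in (0, 1):
                errs = sum(1 for x, y in sample
                           if (polarity if x[f] > thr else 1 - polarity) != y)
                if best is None or errs < best[0]:
                    best = (errs, f, thr, polarity)
    _, f, thr, polarity = best
    return lambda x: polarity if x[f] > thr else 1 - polarity

def bagged_ensemble(data, n_trees=15, rng=random.Random(7)):
    """Bootstrap-aggregated stumps: a rough stand-in for a random forest."""
    models = [train_stump([rng.choice(data) for _ in data]) for _ in range(n_trees)]
    def predict(x):
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return predict

predict = bagged_ensemble(DATA)
acc = sum(predict(x) == y for x, y in DATA) / len(DATA)
```

Because each stump errs on different instances, the majority vote cancels most individual mistakes, which is the same mechanism that lets random forest absorb noisy or heterogeneous attributes.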

5 Best feature selection method to use with phishing website datasets

The objective of this section is to determine the optimal feature selection approach for phishing datasets. To achieve this, five popular methods are assessed and compared: ClassifierAttributeEval (CAE), CorrelationAttributeEval (CAE), GainRatioAttributeEval (GRAE), InfoGainAttributeEval (IGAE), and PrincipalComponents (PC); note that the first two methods are both abbreviated as CAE in the result tables. The default settings and parameters in WEKA were used throughout all evaluations.

These feature selection methods were applied to the phishing dataset (DS1), and the 40 classification models were then trained using only the top-ranked 15 features, i.e., 50% of the 30 available attributes. The resulting models are evaluated using the accuracy, precision, and MCC metrics defined in Section 4. Table 20 presents the accuracy of the 40 classifiers under each of the considered feature selection techniques.
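The ranking step behind IGAE (selecting the top 15 of 30 attributes) can be sketched with a direct implementation of information gain; this is a minimal stand-in for WEKA's evaluator, with illustrative data, not the tool's actual code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a class-label list, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, f):
    """InfoGain(f) = H(class) - sum over values v of p(v) * H(class | f = v)."""
    n = len(rows)
    by_value = {}
    for x, y in zip(rows, labels):
        by_value.setdefault(x[f], []).append(y)
    remainder = sum(len(part) / n * entropy(part) for part in by_value.values())
    return entropy(labels) - remainder

def top_k_features(rows, labels, k):
    """Rank features by information gain and keep the k best, as IGAE does."""
    ranked = sorted(range(len(rows[0])),
                    key=lambda f: info_gain(rows, labels, f), reverse=True)
    return ranked[:k]

# Demo: feature 0 perfectly separates the classes, feature 1 is noise.
rows = [[1, 0], [1, 1], [0, 0], [0, 1]]
labels = [1, 1, 0, 0]
best = top_k_features(rows, labels, k=1)
```

On the real dataset one would pass the 30 phishing attributes and keep `k=15`, then retrain each classifier on the reduced attribute set.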

Table 20

| Learning strategy | Classifier | CAE (Classifier) | CAE (Correlation) | GRAE | IGAE | PC |
| --- | --- | --- | --- | --- | --- | --- |
| Tree | Random tree | 90.502 | 95.911 | 95.649 | 95.640 | 94.219 |
| | Random forest | 90.664 | 96.436 | 96.255 | 96.191 | 94.473 |
| | REPTree | 89.561 | 94.898 | 94.850 | 94.744 | 93.523 |
| | DecisionStump | 84.730 | 88.891 | 88.891 | 88.891 | 88.891 |
| | HoeffdingTree | 88.801 | 94.002 | 93.930 | 93.903 | 93.062 |
| | LMT | 90.610 | 95.766 | 95.829 | 95.676 | 94.373 |
| | J48 | 90.031 | 95.450 | 95.630 | 95.106 | 93.794 |
| | Avg | 89.271 | 0.236 | 94.434 | 94.307 | 93.191 |
| Bayes | BayesNet | 87.535 | 92.772 | 92.781 | 92.636 | 92.356 |
| | NaiveBayes | 87.535 | 92.772 | 92.781 | 92.645 | 92.365 |
| | NaiveBayesUpdateable | 55.694 | 92.772 | 55.694 | 55.694 | 92.365 |
| | Avg | 76.921 | 0.173 | 80.419 | 80.325 | 92.362 |
| Functions | Logistic | 88.647 | 93.369 | 93.387 | 93.378 | 92.682 |
| | SGD | 88.738 | 93.306 | 93.351 | 93.514 | 91.705 |
| | SimpleLogistic | 88.629 | 93.306 | 93.351 | 93.432 | 92.645 |
| | SMO | 88.955 | 93.315 | 93.324 | 93.523 | 91.714 |
| | VotedPerceptron | 88.358 | 93.288 | 93.333 | 93.365 | 92.555 |
| | Avg | 86.187 | 0.241 | 88.665 | 88.665 | 88.665 |
| Lazy | IBK | 90.755 | 96.119 | 95.829 | 95.730 | 94.237 |
| | Kstar | 90.393 | 96.128 | 95.983 | 95.649 | 94.165 |
| | LWL | 84.730 | 88.991 | 88.973 | 89.018 | 88.991 |
| | Avg | 88.652 | 0.205 | 88.652 | 93.465 | 88.652 |
| Meta | AdaBoostM1 | 87.435 | 92.582 | 92.582 | 92.582 | 92.166 |
| | AttributeSelectedClassifier | 87.363 | 94.400 | 94.400 | 94.310 | 92.935 |
| | Bagging | 89.977 | 95.486 | 95.404 | 95.386 | 93.830 |
| | ClassificationViaRegression | 89.036 | 94.536 | 94.436 | 94.635 | 93.188 |
| | FilteredClassifier | 90.031 | 95.450 | 95.630 | 95.106 | 93.794 |
| | IterativeClassifierOptimizer | 87.806 | 92.736 | 92.736 | 92.736 | 92.220 |
| | LogitBoost | 87.806 | 92.736 | 92.736 | 92.736 | 92.437 |
| | MultiClassClassifier | 88.647 | 93.369 | 93.387 | 93.378 | 92.682 |
| | MultiClassClassifierUpdateable | 88.738 | 93.306 | 93.351 | 93.514 | 93.830 |
| | RandomCommittee | 90.755 | 96.408 | 90.755 | 96.408 | 94.409 |
| | RandomizableFilteredClassifier | 90.230 | 94.292 | 90.230 | 94.771 | 90.230 |
| | RandomSubSpace | 89.027 | 93.414 | 93.984 | 93.450 | 92.691 |
| | Stacking | 55.694 | 55.694 | 55.694 | 55.694 | 55.694 |
| | WeightedInstancesHandlerWrapper | 55.694 | 55.694 | 55.694 | 55.694 | 55.694 |
| | Vote | 55.694 | 55.694 | 55.694 | 55.694 | 55.694 |
| | CVParameterSelection | 55.694 | 55.694 | 55.694 | 55.694 | 55.694 |
| | Avg | 80.602 | 79.996 | 87.202 | 84.487 | 86.661 |
| Rules | DecisionTable | 88.177 | 92.863 | 92.998 | 92.971 | 93.025 |
| | JRip | 89.271 | 94.753 | 94.527 | 94.563 | 93.306 |
| | OneR | 84.730 | 88.891 | 88.891 | 88.891 | 88.891 |
| | PART | 90.375 | 95.585 | 95.459 | 95.468 | 94.355 |
| | ZeroR | 55.694 | 55.694 | 55.694 | 55.694 | 55.694 |
| | Avg | 81.649 | 85.557 | 81.649 | 85.517 | 85.054 |
| Misc | InputMappedClassifier | 55.694 | 55.694 | 55.694 | 55.694 | 55.694 |
| | Avg | 55.694 | 55.694 | 55.694 | 55.694 | 55.694 |

Evaluation of the considered feature selection methods on the phishing dataset (DS1) using the accuracy metric.

As shown in Table 20, the random forest and IBK classifiers achieved their highest accuracies with the CAE method, and the Functions strategy also performed well on the accuracy metric for this dataset.

In addition, the Tree strategy performed best when paired with the CAE feature selection method.

Table 20 clearly shows that using only 50% of the features generally improves accuracy.

For instance, the random forest classifier achieved the best accuracy (96.2) when all features were used, while the random tree achieved 96.1. However, both classifiers attained their highest accuracy scores on the phishing dataset by utilizing the GRAE and IGAE feature selection methods on just 50% of the features. This overall improvement, evident in Table 20, suggests that employing a preprocessing step such as feature selection may enhance the predictive performance of various classification models, particularly through adoption of the CAE technique.
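For reference, GRAE and IGAE rank attributes with closely related criteria; these are the standard decision-tree-learning definitions (with $H(S)$ the class entropy of sample set $S$ and $S_v$ the subset where attribute $A$ takes value $v$), not formulas reproduced from this paper:

```latex
\mathrm{InfoGain}(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\, H(S_v),
\qquad
\mathrm{GainRatio}(S, A) = \frac{\mathrm{InfoGain}(S, A)}{-\sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|}}
```

The denominator (split information) penalizes attributes with many distinct values, which explains why GRAE and IGAE produce similar but not identical rankings on these datasets.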

The evaluation results for the phishing dataset under the five feature selection methods are shown in Table 21, with emphasis on the precision metric. The random forest classifier achieved the highest precision with the GRAE method, and it yielded strong results regardless of the feature selection method used.

Table 21

| Learning strategy | Classifier | CAE (Classifier) | CAE (Correlation) | GRAE | IGAE | PC |
| --- | --- | --- | --- | --- | --- | --- |
| Tree | Random tree | 0.880 | 0.919 | 0.937 | 0.937 | 0.810 |
| | Random forest | 0.879 | 0.922 | 0.938 | 0.937 | 0.812 |
| | REPTree | 0.550 | 0.571 | 0.611 | 0.591 | 0.679 |
| | DecisionStump | 0.203 | 0.140 | 0.203 | 0.140 | 0.130 |
| | HoeffdingTree | 0.341 | 0.232 | 0.353 | 0.249 | 0.293 |
| | LMT | 0.602 | 0.621 | 0.778 | 0.788 | 0.736 |
| | J48 | 0.732 | 0.911 | 0.672 | 0.671 | 0.706 |
| | Avg | 0.295 | 0.236 | 0.641 | 0.616 | 0.595 |
| Bayes | BayesNet | 0.295 | 0.713 | 0.731 | 0.731 | 0.313 |
| | NaiveBayes | 0.291 | 0.230 | 0.307 | 0.723 | 0.282 |
| | NaiveBayesUpdateable | 0.358 | 0.230 | 0.307 | 0.251 | 0.282 |
| | Avg | 0.294 | 0.520 | 0.448 | 0.568 | 0.292 |
| Functions | Logistic | 0.335 | 0.145 | 0.253 | 0.072 | 0.292 |
| | SGD | 0.877 | 0.351 | 0.251 | 0.078 | 0.285 |
| | SimpleLogistic | 0.642 | 0.146 | 0.401 | 0.157 | 0.255 |
| | SMO | 0.267 | 0.185 | 0.426 | 0.423 | 0.469 |
| | Avg | 0.667 | 0.241 | 88.665 | 88.665 | 88.676 |
| Lazy | IBK | 0.642 | 0.917 | 0.936 | 0.932 | 0.809 |
| | Kstar | 0.563 | 0.543 | 0.670 | 0.957 | 0.678 |
| | LWL | 0.745 | 0.446 | 0.276 | 0.345 | 0.293 |
| | Avg | 0.359 | 0.205 | 88.657 | 0.744 | 88.652 |
| Meta | AdaBoostM1 | 0.359 | 0.140 | 0.203 | 0.140 | 0.130 |
| | AttributeSelectedClassifier | 0.272 | 0.667 | 0.667 | 0.667 | 0.597 |
| | Bagging | 0.139 | 0.677 | 0.704 | 0.710 | 0.727 |
| | ClassificationViaRegression | 0.878 | 0.583 | 0.621 | 0.634 | 0.659 |
| | FilteredClassifier | 0.873 | 0.730 | 0.754 | 0.771 | 0.643 |
| | IterativeClassifierOptimizer | 0.745 | 0.396 | 0.347 | 0.386 | 0.361 |
| | LogitBoost | 0.139 | 0.396 | 0.347 | 0.386 | 0.361 |
| | MultiClassClassifier | 0.139 | 0.142 | 0.286 | 0.047 | 0.273 |
| | MultiClassClassifierUpdateable | 0.139 | 0.139 | 0.139 | 0.139 | 0.105 |
| | RandomCommittee | 0.139 | 0.920 | 0.937 | 0.937 | 0.825 |
| | RandomizableFilteredClassifier | 0.667 | 0.908 | 0.906 | 0.912 | 0.800 |
| | RandomSubSpace | 0.605 | 0.770 | 0.774 | 0.806 | 0.563 |
| | Stacking | 0.646 | 0.139 | 0.139 | 0.139 | 0.105 |
| | WeightedInstancesHandlerWrapper | 0.597 | 0.557 | 0.139 | 0.139 | 0.105 |
| | Vote | 0.139 | 0.139 | 0.139 | 0.139 | 0.105 |
| | CVParameterSelection | 0.139 | 0.139 | 0.139 | 0.139 | 0.105 |
| | Avg | 80.602 | 0.455 | 7.510 | 0.443 | 7.474 |
| Rules | DecisionTable | 0.667 | 0.668 | 0.667 | 0.668 | 0.568 |
| | JRip | 0.605 | 0.625 | 0.641 | 0.657 | 0.520 |
| | OneR | 0.646 | 0.646 | 0.646 | 0.646 | 0.132 |
| | PART | 0.597 | 0.625 | 0.671 | 0.673 | 0.691 |
| | ZeroR | 0.139 | 0.139 | 0.139 | 0.139 | 0.105 |
| | Avg | 81.649 | 0.540 | 81.649 | 0.556 | 0.403 |
| Misc | InputMappedClassifier | 55.694 | 0.139 | 0.139 | 0.139 | 0.105 |
| | Avg | 55.694 | 0.139 | 55.693 | 0.139 | 55.693 |

Evaluation of the considered feature selection methods on the phishing dataset (DS1) using the precision metric.

Moreover, the Functions and Tree strategies proved to be efficient learning approaches on the precision metric for this dataset. The Tree strategy reached its maximum precision with GRAE.

Comparing the precision results obtained with all features against those in Table 21, which uses only 50% of the features, confirms a general improvement when fewer attributes are used; for instance, although using every attribute yielded a top score of 0.938, the reduced feature sets proved more beneficial overall across the methods examined.

Hence, according to Table 21, using feature selection as a preprocessing step may improve the overall predictive performance of most classification models.


Table 22 displays the assessment results for the five feature selection techniques applied to the phishing dataset, using MCC as the metric. The random forest classifier achieved the highest MCC value with the IGAE technique, the Functions strategy produced the best learning results on this dataset, and the Tree strategy achieved favorable results when IGAE was applied for feature selection.

Table 22

| Learning strategy | Classifier | CAE (Classifier) | CAE (Correlation) | GRAE | IGAE | PC |
| --- | --- | --- | --- | --- | --- | --- |
| Tree | Random tree | 0.745 | 0.797 | 0.823 | 0.815 | 0.781 |
| | Random forest | 0.837 | 0.879 | 0.869 | 0.887 | 0.848 |
| | REPTree | 0.776 | 0.812 | 0.823 | 0.842 | 0.793 |
| | DecisionStump | 0.476 | 0.568 | 0.532 | 0.532 | 0.522 |
| | HoeffdingTree | 0.414 | 0.638 | 0.523 | 0.700 | 0.728 |
| | J48 | 0.790 | 0.829 | 0.834 | 0.838 | 0.814 |
| | Avg | 0.673 | 0.753 | 0.734 | 0.769 | 0.747 |
| Bayes | BayesNet | 0.717 | 0.754 | 0.822 | 0.758 | 0.730 |
| | NaiveBayes | 0.547 | 0.721 | 0.603 | 0.753 | 0.730 |
| | NaiveBayesUpdateable | 0.547 | 0.721 | 0.603 | 0.753 | 0.730 |
| | Avg | 0.603 | 0.732 | 0.676 | 0.754 | 0.730 |
| Functions | Logistic | 0.750 | 0.782 | 0.820 | 0.804 | 0.767 |
| | SGD | 0.738 | 0.782 | 0.818 | 0.803 | 0.759 |
| | SimpleLogistic | 0.746 | 0.781 | 0.752 | 0.748 | 0.721 |
| | SMO | 0.701 | 0.747 | 0.844 | 0.807 | 0.782 |
| | Avg | 0.733 | 0.773 | 0.771 | 0.771 | 0.771 |
| Lazy | IBK | 0.746 | 0.782 | 0.909 | 0.795 | 0.777 |
| | Kstar | 0.775 | 0.802 | 0.827 | 0.793 | 0.765 |
| | LWL | 0.555 | 0.568 | 0.528 | 0.540 | 0.522 |
| | Avg | 0.692 | 0.717 | 0.773 | 0.709 | 0.773 |
| Meta | AdaBoostM1 | 0.707 | 0.794 | 0.528 | 0.781 | 0.737 |
| | AttributeSelectedClassifier | 0.749 | 0.820 | 0.837 | 0.837 | 0.778 |
| | Bagging | 0.799 | 0.843 | 0.854 | 0.866 | 0.823 |
| | ClassificationViaRegression | 0.757 | 0.811 | 0.808 | 0.843 | 0.766 |
| | FilteredClassifier | 0.786 | 0.822 | 0.851 | 0.842 | 0.804 |
| | IterativeClassifierOptimizer | 0.746 | 0.794 | 0.807 | 0.810 | 0.770 |
| | LogitBoost | 0.746 | 0.794 | 0.807 | 0.810 | 0.770 |
| | MultiClassClassifier | 0.750 | 0.782 | 0.820 | 0.804 | 0.767 |
| | MultiClassClassifierUpdateable | 0.738 | 0.786 | 0.804 | 0.797 | 0.750 |
| | RandomCommittee | 0.824 | 0.859 | 0.864 | 0.878 | 0.834 |
| | RandomizableFilteredClassifier | 0.688 | 0.685 | 0.794 | 0.652 | 0.720 |
| | RandomSubSpace | 0.792 | 0.831 | 0.846 | 0.852 | 0.816 |
| | Stacking | 0.755 | 0.755 | 0.755 | 0.755 | 0.755 |
| | WeightedInstancesHandlerWrapper | 0.755 | 0.755 | 0.755 | 0.755 | 0.755 |
| | Vote | 0.755 | 0.755 | 0.755 | 0.755 | 0.755 |
| | CVParameterSelection | 0.755 | 0.755 | 0.755 | 0.755 | 0.755 |
| | Avg | 0.756 | 0.790 | 0.790 | 0.799 | 0.772 |
| Rules | DecisionTable | 0.616 | 0.781 | 0.798 | 0.783 | 0.773 |
| | JRip | 0.437 | 0.827 | 0.814 | 0.841 | 0.796 |
| | OneR | 0.594 | 0.541 | 0.541 | 0.541 | 0.458 |
| | PART | 0.551 | 0.829 | 0.853 | 0.839 | 0.788 |
| | ZeroR | 0.243 | 0.499 | 0.755 | 0.755 | 0.755 |
| | Avg | 0.488 | 0.695 | 0.713 | 0.751 | 0.714 |
| Misc | InputMappedClassifier | 0.243 | 0.499 | 0.755 | 0.755 | 0.755 |
| | Avg | 0.243 | 0.499 | 0.500 | 0.755 | 0.500 |

Evaluation of the considered feature selection methods on the phishing dataset (DS1) using the MCC metric.

Moreover, comparing the use of all features against the use of only 50% of them showed an improvement in overall performance on the MCC metric, exemplified by the best-case scenario in which the random forest classifier yielded a score of 0.887.

According to Table 22, considerable improvement in predictive performance can be expected across various classification models when appropriate feature selection, particularly with methods such as IGAE, is conducted during preprocessing.
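The MCC values reported in Table 22 can be computed directly from the binary confusion-matrix counts; a minimal sketch follows (the counts used in the demo are illustrative, not taken from the paper's experiments).

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (random) to +1 (perfect),
    and, unlike accuracy, stays informative on imbalanced phishing datasets.
    """
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# A classifier that is always right scores +1; one that is always wrong scores -1.
perfect = mcc(tp=50, tn=45, fp=0, fn=0)
inverted = mcc(tp=0, tn=0, fp=45, fn=50)
```

This makes explicit why the constant-prediction classifiers in the tables (ZeroR, Stacking with defaults) are uninformative: predicting one class for everything drives one full row of the confusion matrix to zero.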

Figure 5 demonstrates that IGAE is the best-performing feature selection method overall.

Figures 3–5 reveal that the IGAE and GRAE feature selection methods, together with the Tree learning strategy, exhibit superior performance compared to the other strategies in terms of accuracy, precision, recall, MCC, F-measure, and ROC area across the three datasets. Moreover, the Rules and Misc learning strategies demonstrate subpar results on almost all metrics for the same three datasets.

Figure 3

Figure 4

Figure 5

Consequently, the use of the Rules and Misc learning strategies for phishing detection is strongly discouraged.

6 Conclusion and future research

This research aimed to identify optimal characteristics for creating a stronger machine learning model for detecting phishing websites. Over the past three decades, machine learning has made significant strides and has been implemented in many practical applications, including identifying malicious web pages used in scams or identity theft.

The paper investigated the best classification model for detecting such websites. In exploring which classification method best handles phishing website detection datasets, the authors found that the Random Forest, Random Tree, and IBK classifiers proved most effective. After evaluating several feature selection methods for detecting fraudulent websites, InfoGainAttributeEval and GainRatioAttributeEval were deemed the most reliable options. However, further appraisals covering additional classification methods and evaluation metrics should continue; comparing their performance will provide additional insight into refining detection accuracy for tracing illicit online activity.

Statements

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here: https://data.mendeley.com/datasets/h3cgnj8hft/1.

Author contributions

RA: Writing – original draft, Writing – review & editing. MB: Writing – review & editing, Writing – original draft, Data curation. KA: Writing – original draft, Methodology, Writing – review & editing. AA: Writing – review & editing, Software, Writing – original draft. YH: Writing – review & editing, Writing – original draft, Project administration. FA: Writing – review & editing, Writing – original draft, Visualization. EQ: Writing – review & editing, Writing – original draft.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1

Abdelhamid, N., Ayesh, A., and Thabtah, F. (2014). Phishing detection based associative classification data mining. Expert Syst. Appl. 41, 5948–5959. doi: 10.1016/j.eswa.2014.03.019

  • 2

Abutaha, M., Ababneh, M., Mahmoud, K., and Baddar, S. A. H. (2021). "URL phishing detection using machine learning techniques based on URLs lexical analysis," in 2021 12th International Conference on Information and Communication Systems (ICICS) (Valencia: IEEE), 147–152. doi: 10.1109/ICICS52457.2021.9464539

  • 3

Alazaidah, R., Ahmad, F. K., Mohsen, M. F. M., and Junoh, A. K. (2018). Evaluating conditional and unconditional correlations capturing strategies in multi label classification. J. Telecommun. Electr. Comput. Eng. 10, 47–51.

  • 4

Alazaidah, R., Al-Shaikh, A., Al-Mousa, M. R., Khafajah, H., Samara, G., Alzyoud, M., et al. (2024). Website phishing detection using machine learning techniques. J. Stat. Applic. Probab. 13, 119–129. doi: 10.18576/jsap/130108

  • 5

Alazaidah, R., Alzyoud, M., Al-Shanableh, N., and Alzoubi, H. (2023b). "The significance of capturing the correlations among labels in multi-label classification: an investigative study," in AIP Conference Proceedings, Vol. 2979 (Jordan: AIP Publishing). doi: 10.1063/5.0177340

  • 6

Alazaidah, R., Samara, G., Aljaidi, M., Haj Qasem, M., Alsarhan, A., and Alshammari, M. (2023a). Potential of machine learning for predicting sleep disorders: a comprehensive analysis of regression and classification models. Diagnostics 14:27. doi: 10.3390/diagnostics14010027

  • 7

Al-Batah, M. S., Alzboon, M. S., and Alazaidah, R. (2023). Intelligent heart disease prediction system with applications in Jordanian hospitals. Int. J. Adv. Comput. Sci. Applic. 14, 1151–1159. doi: 10.14569/IJACSA.2023.0140954

  • 8

Aljofey, A., Bello, S. A., Lu, J., and Xu, C. (2025). Comprehensive phishing detection: a multi-channel approach with variants TCN fusion leveraging URL and HTML features. J. Netw. Comput. Applic. 238:104170. doi: 10.1016/j.jnca.2025.104170

  • 9

Alluwaici, M., Junoh, A. K., AlZoubi, W. A., Alazaidah, R., and Al-luwaici, W. (2020). New features selection method for multi-label classification based on the positive dependencies among labels. Solid State Technol. 63.

  • 10

Alluwaici, M. A., Junoh, K., and Alazaidah, R. (2020). New problem transformation method based on the local positive pairwise dependencies among labels. J. Inform. Knowl. Manag. 19:2040017. doi: 10.1142/S0219649220400171

  • 11

Alzyoud, M., Alazaidah, R., Aljaidi, M., Samara, G., Qasem, M., Khalid, M., et al. (2024). Diagnosing diabetes mellitus using machine learning techniques. Int. J. Data Netw. Sci. 8, 179–188. doi: 10.5267/j.ijdns.2023.10.006

  • 12

    APWG (2021). Phishing Activity Trends Reports, 4th Quarter 2020. Anti-Phishing Working Group. Available online at: https://apwg.org/trendsreports/ (Accessed May 09, 2021).

  • 13

Athulya, A. A., and Praveen, K. (2020). "Towards the detection of phishing attacks," in 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI) (Tirunelveli, India: IEEE), 337–343. doi: 10.1109/ICOEI48184.2020.9142967

  • 14

Barik, K., Misra, S., and Mohan, R. (2025). Web-based phishing URL detection model using deep learning optimization techniques. Int. J. Data Sci. Anal. 20, 1–23. doi: 10.1007/s41060-025-00728-9

  • 15

Chapla, H., Kotak, R., and Joiser, M. (2019). "A machine learning approach for URL based web phishing using fuzzy logic as classifier," in 2019 International Conference on Communication and Electronics Systems (ICCES) (Coimbatore: IEEE), 383–388. doi: 10.1109/ICCES45898.2019.9002145

  • 16

Chiew, K. L., Tan, C. L., Wong, K., Yong, K. S. C., and Tiong, W. K. (2019). A new hybrid ensemble feature selection framework for machine learning-based phishing detection system. Inform. Sci. 484, 153–166. doi: 10.1016/j.ins.2019.01.064

  • 17

Cui, Q. (2019). Detection and Analysis of Phishing Attacks. Diss. Université d'Ottawa/University of Ottawa.

  • 18

Gandotra, E., and Gupta, D. (2021). "An efficient approach for phishing detection using machine learning," in Multimedia Security: Algorithm Development, Analysis and Applications, 239–253. doi: 10.1007/978-981-15-8711-5_12

  • 19

Ganjei, M. A., and Boostani, R. (2022). A hybrid feature selection scheme for high-dimensional data. Eng. Appl. Artif. Intell. 113:104894. doi: 10.1016/j.engappai.2022.104894

  • 20

Gareth, J., Witten, D., Hastie, T., Tibshirani, R., and Taylor, J. (2023). "Statistical learning," in An Introduction to Statistical Learning: With Applications in Python (Cham: Springer International Publishing), 15–67.

  • 21

Mohammad, R. M., Thabtah, F., and McCluskey, L. (2015). Phishing Websites Features. School of Computing and Engineering, University of Huddersfield.

  • 22

Ni, J., Shen, K., Chen, Y., Cao, W., and Yang, S. X. (2022). An improved deep network-based scene classification method for self-driving cars. IEEE Trans. Instrument. Measur. 71, 1–14. doi: 10.1109/TIM.2022.3146923

  • 23

Nti, I. N., Narko-Boateng, O., Adekoya, A. F., and Somanathan, A. R. (2022). Stacknet based decision fusion classifier for network intrusion detection. Int. Arab J. Inform. Technol. 19, 478–490. doi: 10.34028/iajit/19/3A/8

  • 24

Pei, M., Feng, Y., Changlong, Z., and Minghua, J. (2022). Smoke detection algorithm based on negative sample mining. Int. Arab J. Inform. Technol. 19, 1–9. doi: 10.34028/iajit/19/4/15

  • 25

Rao, R. S., Vaishnavi, T., and Pais, A. R. (2020). CatchPhish: detection of phishing websites by inspecting URLs. J. Ambient Intell. Hum. Comput. 11, 813–825. doi: 10.1007/s12652-019-01311-4

  • 26

Rashid, J., Mahmood, T. M., Nisar, W., and Nazir, T. (2020). "Phishing detection using machine learning technique," in 2020 First International Conference of Smart Systems and Emerging Technologies (SMARTTECH) (Riyadh: IEEE), 43–46. doi: 10.1109/SMART-TECH49988.2020.00026

  • 27

Sahingoz, O. K., Buber, E., Demir, O., and Diri, B. (2019). Machine learning based phishing detection from URLs. Expert Syst. Applic. 117, 345–357. doi: 10.1016/j.eswa.2018.09.029

  • 28

Srivastava, S. (2014). Weka: a tool for data preprocessing, classification, ensemble, clustering and association rule mining. Int. J. Comput. Applic. 88:10. doi: 10.5120/15389-3809

  • 29

Su, J.-M., Chang, J., Indrayani, N. L. D., and Wang, C. (2023). Machine learning approach to determine the decision rules in ergonomic assessment of working posture in sewing machine operators. J. Saf. Res. 87, 15–26. doi: 10.1016/j.jsr.2023.08.008

  • 30

    TanC. L. (2018). Phishing Dataset for Machine Learning: Feature Evaluation. Mendeley Data. Available online at: https://data.mendeley.com/datasets/h3cgnj8hft/1 (Accessed May 10, 2021).

  • 31

Ubing, A. A., Kamilia, S., Abdullah, A., Jhanjhi, N., and Supramaniam, M. (2019). Phishing website detection: an improved accuracy through feature selection and ensemble learning. Int. J. Adv. Comput. Sci. Appl. 10, 252–257. doi: 10.14569/IJACSA.2019.0100133

  • 32

Vigneswari, T., Vijaya, N., and Kalaiselvi, N. (2021). Early prediction of cervical cancer using machine learning techniques. Turkish J. Physiother. Rehabil. 32, 262–269.

  • 33

Warburton, D. (2020). 2020 Phishing and Fraud Report. F5 Labs.

  • 34

Zabihimayvan, M., and Doran, D. (2019). "Fuzzy rough set feature selection to enhance phishing attack detection," in 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (New Orleans, LA: IEEE). doi: 10.1109/FUZZ-IEEE.2019.8858884

Keywords

classification, phishing websites, machine learning, feature selection, URL analysis

Citation

Alazaidah R, BaniSalman M, Alqawasmi KE, Abu Zaid A, Hazaimeh Y, Alshraiedeh FS and Qumsiyeh E (2026) Identifying key features for phishing website detection through feature selection techniques. Front. Comput. Sci. 7:1687867. doi: 10.3389/fcomp.2025.1687867

Received

18 August 2025

Revised

23 November 2025

Accepted

26 November 2025

Published

21 January 2026

Volume

7 - 2025

Edited by

Zainab Loukil, University of Gloucestershire, United Kingdom

Reviewed by

Faisal Ahmad, Workday Inc., United States

Abdul Karim, Hallym University, Republic of Korea

Copyright

*Correspondence: Emma Qumsiyeh,

