ORIGINAL RESEARCH article

Front. Comput. Sci.

Sec. Software

Volume 7 - 2025 | doi: 10.3389/fcomp.2025.1550453

This article is part of the Research Topic: Machine Learning for Software Engineering

A Hierarchical Multi-Class Classification System for Face and Text Datasets

Provisionally accepted
Ashish Saini1, Nasib Singh Gill1, Preeti Gulia1, KHUSHWANT SINGH1, Fernando Moreira2*
  • 1Maharshi Dayanand University, Rohtak, Haryana, India
  • 2Portucalense University, Porto, Portugal

The final, formatted version of the article will be published soon.

In the era of rapidly growing multimedia data, robust and efficient classification systems have become critical, in particular for identifying both the class name and the pose or style of a sample. This paper first describes the organization of the data, then explains feature selection (i.e., edge features) using the k-means segmentation technique. Linear regression is then employed to optimize the features. The optimized features can be used directly with classifiers, but to reduce noise, outliers are identified and removed from the training data. The classifiers are then trained to recognize the face or text class label. After the class label is predicted, a distance matrix-based technique is used to identify the style or pose name. Finally, experiments are conducted on the ORL dataset (40 classes with 10 poses per class) and a character dataset (36 characters with 10 font styles per character). The experimental results demonstrate that the proposed methodology accurately classifies hierarchically organized data and is superior to KNN- and Bayesian-based classification used in place of SVM. The system achieves up to 100% classification accuracy on outlier-removed data and up to 98% on basic features. Unlike traditional flat classification approaches, our system leverages hierarchical structures to enhance classification accuracy, scalability and interpretability.
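The two-level decision described in the abstract (predict the class label first, then identify the pose or style within that class by distance) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, a plain k-nearest-neighbor vote stands in for the class-level classifier, Euclidean distance is assumed for the style-level step, and the toy feature vectors are synthetic rather than taken from the ORL or character datasets.

```python
import numpy as np


def predict_class_knn(X_train, y_class, x, k=3):
    """Stage 1 (assumed stand-in): majority vote among the k nearest samples."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_class[nearest], return_counts=True)
    return int(labels[np.argmax(counts)])


def predict_style(X_train, y_class, y_style, x, class_label):
    """Stage 2 (assumed): within the predicted class, take the style of the
    training sample with the smallest Euclidean distance to the query."""
    mask = y_class == class_label
    dists = np.linalg.norm(X_train[mask] - x, axis=1)
    return int(y_style[mask][np.argmin(dists)])


def classify_hierarchically(X_train, y_class, y_style, x, k=3):
    """Return (class label, style label) for a single query vector x."""
    class_label = predict_class_knn(X_train, y_class, x, k)
    style_label = predict_style(X_train, y_class, y_style, x, class_label)
    return class_label, style_label


# Synthetic toy data: 3 classes, 4 styles per class, one sample per pair.
# Classes are far apart; styles within a class differ by a small offset.
samples, class_labels, style_labels = [], [], []
for c in range(3):
    for s in range(4):
        samples.append([10.0 * c + 0.5 * s, 10.0 * c])
        class_labels.append(c)
        style_labels.append(s)

X_train = np.array(samples)
y_class = np.array(class_labels)
y_style = np.array(style_labels)

query = np.array([10.9, 10.1])
result = classify_hierarchically(X_train, y_class, y_style, query)
print(result)
```

Restricting the style search to samples of the predicted class is the design choice that makes the scheme hierarchical: the second stage only has to separate the styles of one class rather than every class and style combination at once.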

Keywords: Data mining, Support vector machine, Bayes classifier, K-nearest neighbor, Machine learning

Received: 23 Dec 2024; Accepted: 20 May 2025.

Copyright: © 2025 Saini, Gill, Gulia, SINGH and Moreira. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Fernando Moreira, Portucalense University, Porto, Portugal

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.