- 1School of Civil Engineering and Transportation, Guangzhou University, Guangzhou, Guangdong, China
- 2Guangzhou Testing Center of Construction Quality and Safety Co., Ltd., Guangzhou, Guangdong, China
- 3School of Architecture and Civil Engineering, The University of Adelaide, Adelaide, SA, Australia
- 4Research Center for Wind Engineering and Engineering Vibration, Guangzhou University, Guangzhou, Guangdong, China
Introduction: This paper evaluates the robustness and generalization ability of five recently developed Convolutional Neural Networks (CNNs): Visual Geometry Group 16 (VGG16), Google Inception Net (GoogLeNet), Mobile Network version 3 Large (MobileNetV3-Large), Efficient Network B0 (EfficientNetB0) and Efficient Network version 2 Small (EfficientNetV2-S), on crack recognition and classification.
Methods: This study proposes a semantic segmentation method based on VGG16-U-Net to automatically address background noise in the images, and transfer learning with fine-tuning is used to improve the performance of the CNNs on the bridge crack image dataset and the building crack image dataset (transverse cracks, vertical cracks, oblique cracks and irregular cracks).
Results: The results indicate that the MobileNetV3-Large has the best performance. For the low-resolution building crack image dataset, the accuracy of the crack recognition reaches 99.58% and the F1-score reaches 99.60%. The accuracy of the classification reaches 94.70% and the Macro-F1 reaches 94.71%. For the higher resolution bridge crack image dataset, the accuracy of the classification reaches 95.70% and the Macro-F1 reaches 95.67%.
Discussion: The results show that the MobileNetV3-Large has the best robustness and generalization ability with a small CNN size and the shortest training time.
1 Introduction
Visualization and identification of the crack are important to evaluate the condition of structures (Cheng et al., 2019). For reinforced concrete structures, cracks could appear due to applied loading and environmental conditions, such as stress induced by the temperature difference and seismic action (Zhiguo et al., 2019). Cracks can reduce the safety of a structure by depriving reinforcement of protection (Ma, 2014) and increasing the possibility of reinforcement corrosion (Otieno et al., 2010). Regular or long-term monitoring and measurement of critical cracks can reveal the condition of the structures and evaluate structural safety (Zhang and Stang, 1998). Therefore, it is very important to develop a rapid and efficient technique for the identification and classification of cracks in reinforced concrete structures.
Crack recognition and classification is an important task for preventing further damage and ensuring the structural safety of civil infrastructure (Sharma et al., 2018). At present, the most commonly used method for crack identification is still manual inspection, which is highly subjective (Dais et al., 2021) and can cause misjudgment (Phares et al., 2004). In addition, manual inspection is labor-intensive and inefficient.
The rapid developments of Convolutional Neural Networks (CNNs) and Graphics Processing Units (GPUs) have promoted the application of CNNs in different fields. For instance, in the field of road engineering, Garbowski and Gajewski (2017) utilized 3D point cloud modeling for high-precision automated identification and quantitative assessment of pavement cracks, while Nhat-Duc et al. (2018) employed a CNN-based model to achieve automated crack detection in asphalt pavements. In the domain of bridge structures, Yu et al. (2021) developed a rapid and high-accuracy detection method for bridge cracks by employing the YOLOv4-FPM model integrated with a pruning algorithm. In railway engineering, Aldao et al. (2023) utilized the DeepLab V3+ model coupled with an image segmentation algorithm to achieve automated identification, missing-part detection and fastening condition assessment of railway track bolts and fasteners. For municipal drainage systems, Yin et al. (2020) applied the YOLOv3 object detection model to automatically locate and classify multiple types of pipeline defects. In underwater engineering, Fan et al. (2022) introduced an attention mechanism via the MA-AttUNet model, enabling high-precision automated identification and segmentation of underwater dam cracks.
In recent years, CNNs have been widely adopted in crack recognition and classification. Different from traditional approaches, CNNs can automatically extract image features without manual interpretation and can realize object detection and image classification (Kohlhepp, 2020; Zhao et al., 2019). Visual Geometry Group 16 (VGG16) (Silva and Lucena, 2018) was used to identify a dataset containing 2336 concrete crack images and 1164 non-crack images; the effect of different learning rates on crack detection was investigated and a best overall accuracy of 92.27% was finally obtained. Yusof et al. (2018) binarized the crack images and trained the CNN proposed in their study, which achieved 98% crack recognition accuracy, while the classification accuracy for lateral cracks and vertical cracks was 98% and 97%, respectively. Shengyuan and Xuefeng (2019) applied a modified AlexNet to train on a crack dataset without preprocessing and the results showed that the crack recognition accuracy reached 99.06%. Dais et al. (2021) used CNNs (VGG, ResNet, DenseNet, Inception, MobileNet) to identify cracks in masonry structures, trained without image preprocessing; the results showed that MobileNet had the highest accuracy of 95.3%. Li et al. (2020) constructed CNNs with different receptive fields using convolutional, pooling and fully connected layers and trained them on a road crack dataset (containing non-crack, transverse crack, longitudinal crack, block crack and alligator crack images). Song et al. (2019) established a multi-scale dilated convolution module and introduced an attention mechanism, training on a dataset containing lateral cracks, vertical cracks, massive cracks and crocodile cracks without preprocessing the images; the results showed that the classification accuracy of lateral and vertical cracks was above 95%, while that of massive and crocodile cracks was above 86%. Liu et al. (2022) used deep learning (MobileNet, ResNet, DenseNet and EfficientNet) and infrared thermography to classify the severity of asphalt pavement cracks and established a dataset of asphalt pavement cracks including non-crack, low-severity, medium-severity and high-severity crack images. The results showed that these CNNs perform well on non-crack and low-severity crack images, while classification errors mostly occur on medium-severity and high-severity crack images; among them, EfficientNet-B3 obtains high accuracy for low-severity, medium-severity and high-severity cracks.
The above studies illustrate that different cracks are not equally difficult for a model to identify. Therefore, it is necessary to carry out classification research for various crack types.
In recent years, research on crack identification and classification has focused on deep learning. However, some key challenges remain: i) environmental factors, which affect the robustness of the deep learning technology; ii) data obtained from different scenarios, which makes the generalization ability of CNNs very important; and iii) the demand for online monitoring, which requires a lightweight structure for applying CNNs in practical situations.
1.1 Frontier mainstream of convolutional neural network
At present, CNNs-based crack detection focuses on how CNNs realize crack recognition with high accuracy. However, for practical engineering, the robustness, generalization ability and lightweight design of the CNNs also need to be investigated. In this study, the following five representative CNNs are selected for study.
1. VGG16: In the early stage of CNN research, researchers believed that the depth of the CNN affects its final performance. Thus, VGG16 (Simonyan and Zisserman, 2014) was proposed, which increases the depth of the CNN and uses 3 × 3 convolution kernels to make the CNN more concise and efficient.
2. Google Inception Net (GoogLeNet): When the network grows dramatically, it is challenging to avoid excessive computation while improving the performance of CNNs. To solve this problem, GoogLeNet (Szegedy et al., 2014) was proposed based on the Inception module, which improves the performance of the CNN by using dense matrices without significantly increasing the computational cost.
3. Mobile Network version 3 Large (MobileNetV3-Large): Although GoogLeNet and VGG16 perform well in image classification, it is difficult to apply these CNNs to scenarios requiring fast response and small memory footprints due to their large sizes and slow training. Therefore, MobileNetV3-Large (Howard et al., 2019) was proposed, which uses the Neural Architecture Search (NAS) algorithm and introduces a lightweight attention module with the Squeeze-and-Excitation (SE) (Jie et al., 2017) structure.
4. Efficient Network B0 (EfficientNetB0): Previously, researchers needed to manually scale the CNN's depth, width and input image size. However, only a single dimensional parameter of the CNN could be adjusted at a time, subject to the limitation of computing resources, so the best combination of dimensions was difficult to find. Mingxing and Quoc (2019) argued that the CNN's dimensional parameters influence each other and that the CNN's dimensions can be balanced by utilizing Neural Architecture Search (NAS) to obtain the optimal network structure, EfficientNetB0.
5. Efficient Network version 2 Small (EfficientNetV2-S): In 2021, Tan and Le (2021) replaced the shallow MBConv of EfficientNet with Fused-MBConv and proposed EfficientNetV2-S. EfficientNetV2-S realized accelerated training using the data samples by modifying the progressive learning strategy and proposing the mechanism of adaptive adjustment of regularization parameters.
1.2 Image preprocessing and semantic segmentation
The quality of the crack image preprocessing directly affects the results of crack classification and recognition. To improve the crack classification and recognition performance, efficient image preprocessing is essential. Different from the traditional image preprocessing, the semantic segmentation (Linda and George, 2001) can learn and extract crack image features. Semantic segmentation is an important branch of image processing and computer vision. U-Net (Ronneberger et al., 2015) was developed based on the Fully Convolutional Network (FCN) to realize biomedical image segmentation, in which convolutional coding and convolutional decoding of this network are completely symmetric. The combination of low-level feature maps is constructed into high-level complex features to achieve precise positioning and solve the problem of image segmentation, which is an FCN with good scalability. Jenkins et al. (2018) used U-Net to semantically segment road crack images and found that the semantic segmentation accuracy of vertical crack images is lower than that of lateral cracks. Jacob et al. (2019) embedded the attention mechanism and residual convolution block module on U-Net for the first time and used the residual connection between convolution modules to semantically segment road cracks. According to Piao (2019), U-Net integrating with VGG16 (VGG16-U-Net), in which the encoder has the full advantage of VGG16 and U-Net, can address the under-segmentation phenomenon. Therefore, this study proposes to use VGG16-U-Net to remove image background noise, extract image crack features and achieve automatic image preprocessing.
1.3 Improvement of generalization ability using transfer learning
The generalization ability of the CNNs is also very important for crack identification and classification. Studies (Alipour et al., 2019; Zhang et al., 2019) have shown that the generalization ability of the CNNs cannot be studied using a single dataset. Therefore, one of the objectives of this paper is to verify the generalization of CNNs through two different datasets.
The performance of a CNN relies on a large amount of labeled data. However, the acquisition and labeling of such datasets are very time-consuming. Transfer learning addresses the CNN's dependence on large amounts of labeled data: it transfers the trained CNN parameters (Mohsen et al., 2020) to a new network for training, which allows the CNN to achieve higher accuracy and training speed (Tan et al., 2018). The most commonly used transfer learning technique is fine-tuning (Long et al., 2015). The CNN loads pre-trained weights during the training process and freezes all weights except those of the last convolutional layer and the fully connected layer. Finally, the weights of the last convolutional layer and the fully connected layer are retrained with a new learning rate. Transfer learning has been widely used in the identification of structural damage (Mohsen et al., 2020). Dung et al. (2019) used fine-tuning and data augmentation on VGG16 and performed fatigue crack detection at steel bridge nodes, which proved that data augmentation and fine-tuning improve the accuracy and robustness. Rajadurai and Kang (2021) applied fine-tuning to AlexNet and trained it on a crack dataset; the crack recognition accuracy reached 99%. Therefore, transfer learning is used in this study to improve the CNN accuracy.
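The freezing scheme described above can be sketched in PyTorch. The snippet below is an illustrative toy network, not the paper's actual code; the layer indices and the four-class head are assumptions (in practice, pre-trained ImageNet weights would be loaded into e.g. MobileNetV3-Large first):

```python
import torch
import torch.nn as nn

# Stand-in CNN: early layers play the role of the pre-trained feature
# extractor; the last conv layer and the FC head are the parts fine-tuned.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),   # early layers: frozen
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),  # last conv layer: retrained
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 4),                            # FC head: 4 crack classes (assumed)
)

# Freeze everything, then unfreeze only the last conv layer and the FC head.
for p in model.parameters():
    p.requires_grad = False
for p in list(model[2].parameters()) + list(model[6].parameters()):
    p.requires_grad = True

# Only the unfrozen parameters would be handed to the optimizer
# and retrained with the new learning rate.
trainable = [p for p in model.parameters() if p.requires_grad]
```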
Based on the above studies in the literature, this study proposes VGG16-U-Net to perform semantic segmentation on crack images to eliminate image noise. Five pre-trained CNNs, MobileNetV3-Large, VGG16, GoogLeNet, EfficientNetB0 and EfficientNetV2-S, are adopted for crack identification and classification. To further improve the performance of the CNNs, fine-tuning is used to freeze part of the weights during training. Stochastic Gradient Descent with Warm Restarts (SGDR) is selected as the optimizer to avoid the local optimum issue.
In this study, we 1) study the crack identification ability of CNNs: five different types of CNNs are trained using the building cracks dataset (including crack and non-crack images); 2) verify the crack classification and generalization capabilities of the CNNs: the five CNNs are trained on the same building cracks dataset and on the bridge cracks dataset (including transverse crack, vertical crack, oblique crack and irregular crack images); and 3) study the contribution of transfer learning, focusing on its influence on MobileNetV3-Large, which shows the best performance in crack recognition. In addition, we investigate the contribution of semantic segmentation by comparing and analyzing in detail the classification results of the five CNNs before and after semantic segmentation.
2 Image preprocessing and evaluation indicator
2.1 Image semantic segmentation based on VGG16-U-net
In this study, the encoder of U-Net employs the first 15 layers of VGG16 and dropout layers are added between the convolutional layers to prevent over-fitting. In the decoder, the image is up-sampled using deconvolution layers (Dais, et al., 2021), which gradually restore the features to the original size of the image. After the decoder, a 1 × 1 convolutional layer and a sigmoid activation function are connected to generate a prediction for each pixel in the image. Moreover, the encoder and decoder are connected by skip connections and, finally, the VGG16-U-Net is constructed as shown in Figure 1. The pre-trained VGG16-U-Net is adopted to separate the cracks from the complex picture background in this study. The comparison before and after semantic segmentation is shown in Figure 2, where the background pixels are represented in black and the crack pixels in white.
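The encoder-decoder pattern described above can be illustrated with a reduced two-level sketch in PyTorch. This is an assumption-laden miniature, not the paper's network: the real VGG16-U-Net uses the first 15 VGG16 layers and more encoder/decoder levels, and the channel widths here are placeholders.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    # Two 3x3 convolutions, VGG16-style (a reduced sketch of one encoder stage).
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyVGGUNet(nn.Module):
    """Two-level illustration of the VGG16-U-Net idea: VGG-style encoder,
    symmetric deconvolution decoder, a skip connection, and a
    1x1 convolution + sigmoid head producing a per-pixel prediction."""
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 16)
        self.enc2 = conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)  # deconvolution upsampling
        self.dec1 = conv_block(32, 16)                     # 32 = 16 (skip) + 16 (upsampled)
        self.head = nn.Sequential(nn.Conv2d(16, 1, 1), nn.Sigmoid())

    def forward(self, x):
        e1 = self.enc1(x)                   # feature map kept for the skip connection
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d1)                # per-pixel crack probability in (0, 1)

mask = TinyVGGUNet()(torch.zeros(1, 3, 64, 64))
```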
2.2 CNNs parameters setting and optimization
The original crack datasets and the semantically segmented ones were used to train VGG16, GoogLeNet, MobileNetV3-Large, EfficientNetB0 and EfficientNetV2-S, respectively. In addition, fine-tuning is applied in the training process, employing weights pre-trained on ImageNet.
To achieve a balance between model convergence and training time, each CNN for crack recognition was trained for 20 epochs, while each CNN for crack classification was trained for 100 epochs. Masters and Luschi (2018) proved that, for limited computing resources, the versatility and stability of a CNN trained with a large batch size are worse than those of one trained with a small batch size.
Optimization of deep learning networks updates the weights to minimize a loss with multiple local minima and a global optimal solution. In this study, SGDR (Loshchilov and Hutter, 2016) is utilized as the CNN optimizer to avoid local minima. During the training process, when the loss falls into a local minimum, SGDR increases the learning rate to escape the local minimum and find a path to the global minimum. The formula is shown as Equation 1:

η_t = η_min + (1/2)(η_max − η_min)(1 + cos(π T_cur / T_i)) (1)

where η_t is the learning rate at the current epoch, η_min and η_max are the minimum and maximum learning rates, T_cur is the number of epochs since the last warm restart and T_i is the total number of epochs in the current restart cycle.
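The warm-restart schedule can be sketched in plain Python as a direct transcription of the cosine-annealing rule (variable names are illustrative):

```python
import math

def sgdr_lr(eta_min, eta_max, t_cur, t_i):
    # Cosine-annealed learning rate within one restart cycle (SGDR).
    # t_cur: epochs since the last warm restart; t_i: length of the cycle.
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))

# At a restart (t_cur = 0) the rate jumps back to eta_max, which helps the
# optimizer climb out of a local minimum before annealing down again.
```

PyTorch ships an equivalent built-in scheduler, `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, which could be used instead of hand-rolling the rule.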
The improved cross-entropy loss function is treated as the loss function of the optimizer in this study. Compared with other loss functions, the cross-entropy loss function (Jamin and Humeau-Heurtier, 2019) is more robust under noisy data and converges to a better local minimum under less noisy data (Sga et al., 2021). To improve the computational efficiency, this study activates the output of the fully connected layer with the softmax function and maps the output to the interval (0,1). Then, the mapped output is processed by the cross-entropy loss function. The softmax activation function (see Equation 2) and cross-entropy loss function (see Equation 3) applied in this study are:

Softmax(z_i) = exp(z_i) / (exp(z_1) + exp(z_2) + … + exp(z_n)) (2)

where z_i is the output of the fully connected layer for the i-th category and n is the number of categories;

Loss = −(y_1 log p_1 + y_2 log p_2 + … + y_n log p_n) (3)

where y_i is the true label of the i-th category (1 for the true category and 0 otherwise) and p_i is the probability of the i-th category mapped by the softmax function.
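The softmax mapping and the cross-entropy loss can be sketched in plain Python (an illustrative implementation, not the paper's code):

```python
import math

def softmax(z):
    # Map raw network outputs to probabilities in (0, 1) that sum to 1.
    m = max(z)                          # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_idx):
    # With a one-hot label, only the true-class term of the sum survives.
    return -math.log(probs[true_idx])

p = softmax([2.0, 1.0, 0.1])      # raw scores for three hypothetical classes
loss = cross_entropy(p, 0)        # label says class 0 is the true category
```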
2.3 Evaluation index
2.3.1 Evaluation index for cracks identification
Evaluation Index 1: Accuracy (Dais et al., 2021), which represents the proportion of correctly predicted samples in the total samples, is defined as Equation 4:

Accuracy = (TP + TN) / (TP + FP + TN + FN) (4)
where TP, FP, TN and FN are the numbers of true positives, false positives, true negatives and false negatives, respectively. When the dataset categories are imbalanced, Accuracy can incorrectly evaluate the classification performance of the CNN: if a testing set is dominated by a mainstream category, a CNN that classifies the entire testing set as the mainstream category still obtains a high Accuracy, even though the weak categories are all misidentified as the mainstream category. Therefore, Accuracy alone cannot fully evaluate the performance of the CNN. The F1-score addresses this problem, as it is an evaluation parameter that comprehensively considers precision and recall. The formulas of precision and recall are given in Equations 5, 6:

Precision = TP / (TP + FP) (5)

Recall = TP / (TP + FN) (6)
Evaluation Index 2: The F1-score takes both the precision and the recall of the CNN into account and is mainly used to measure the accuracy of crack recognition. It is defined as Equation 7:

F1-score = 2 × Precision × Recall / (Precision + Recall) (7)
In this study, the accuracy and F1-score are used as the indexes to evaluate the performance of the CNNs in crack recognition.
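As a concrete illustration, the recognition indexes above can be computed directly from the four confusion counts; the sketch below uses made-up example counts:

```python
def recognition_metrics(tp, fp, tn, fn):
    # Accuracy, precision, recall and F1-score from the confusion counts.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Made-up counts: 90 true positives, 10 false positives,
# 95 true negatives, 5 false negatives.
acc, prec, rec, f1 = recognition_metrics(90, 10, 95, 5)
```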
2.3.2 Evaluation index for cracks classifications
To study the crack classifications, Macro-F1 (see Equation 8) is used as the evaluation index. The F1-score of each category is calculated first and the average of all the F1-scores gives Macro-F1:

Macro-F1 = (F1_1 + F1_2 + … + F1_n) / n (8)

where n is the number of categories and F1_i is the F1-score of the i-th category.
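Macro-F1 can be computed from a multi-class confusion matrix as sketched below; this is an illustrative implementation, and the cm[actual][predicted] layout is an assumption:

```python
def per_class_f1(cm, k):
    # F1-score of class k from a confusion matrix cm,
    # assumed here to be laid out as cm[actual][predicted].
    n = len(cm)
    tp = cm[k][k]
    fp = sum(cm[i][k] for i in range(n)) - tp
    fn = sum(cm[k][j] for j in range(n)) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def macro_f1(cm):
    # Mean of the per-class F1-scores over the n classes.
    n = len(cm)
    return sum(per_class_f1(cm, k) for k in range(n)) / n
```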
3 Crack identification and classification results
This section summarizes the results of VGG16, GoogLeNet, MobileNetV3-Large, EfficientNetB0 and EfficientNetV2-S verified on the three datasets. The network performance, especially the robustness and the generalization ability, was compared in terms of accuracy, F1-score, training time and size.
3.1 Dataset construction
This study constructed three datasets with different task objectives based on two public crack datasets to systematically evaluate the performance of mainstream CNNs in crack identification and classification tasks. The images used in this research are sourced from the following two public datasets:
1. The building crack dataset (Lei et al., 2016; Ç.F. and Arzu Gönenç, 2018): This dataset comprises 20,000 crack images and 20,000 non-crack images, each with a resolution of 227 × 227 pixels.
2. The bridge crack dataset (Liangfu et al., 2019): This dataset contains 2,000 crack images with a resolution of 1024 × 1024 pixels.
In this study, crack classification rules were established based on common crack morphologies, as illustrated in Figure 3. Furthermore, notable differences in image characteristics exist between the two datasets (see Figure 4). The building crack dataset features lower-resolution images with uneven illumination and substantial stains, which can interfere with crack identification. Meanwhile, the bridge crack dataset, while offering higher resolution and clearer crack morphology, presents a greater diversity of irregular crack patterns and subtle stains, thereby increasing the difficulty of classification.
Figure 3. Crack categories. (a) Transverse crack. (b) Vertical crack. (c) Oblique crack. (d) Irregular crack.
In accordance with the different research objectives, this study systematically annotated and partitioned the two original crack datasets to construct three datasets, each serving a distinct purpose.
3.1.1 Dataset 1: from the building crack dataset
This dataset comprises a total of 40,000 images, with a balanced distribution of 20,000 crack and 20,000 non-crack samples. It primarily serves to evaluate the performance of the mainstream CNNs in this paper for the binary classification task of crack identification.
3.1.2 Dataset 2: from the building crack dataset
This dataset was created by manually selecting 1,703 crack images covering four crack morphologies: transverse, vertical, oblique and irregular. The dataset was expanded to 6,108 images through data augmentation techniques including flipping, mirroring, cropping, and rotation. It contains four crack morphology categories with the following distribution: 1,506 transverse crack images, 1,027 vertical crack images, 1,327 oblique crack images and 2,248 irregular crack images. This dataset is designated for evaluating the performance of mainstream CNNs in the four-class crack classification task.
3.1.3 Dataset 3: from the bridge crack dataset
To evaluate the models' generalization ability, this dataset was constructed by manually annotating 1,997 high-resolution images from the bridge crack dataset. Following the same augmentation strategy as Dataset 2, the dataset was expanded to 6,532 images. It also contains four crack morphology categories with the following distribution: 1,437 transverse crack images, 1,430 vertical crack images, 1,374 oblique crack images and 2,291 irregular crack images. Compared with Dataset 2, Dataset 3 features higher image resolution and clearer crack morphology, while presenting greater diversity in irregular crack patterns. These characteristics make it particularly suitable for comprehensively evaluating the classification performance and generalization capability of CNNs across different scenarios and imaging conditions.
To objectively evaluate the performance of different models, 4,000 images from Dataset 1 and 1,000 images each from Datasets 2 and 3 were allocated as test sets, which were excluded from the training process. The remaining images in each dataset were then split, with 80% used for training and 20% for validation.
All tests in this study were conducted on a laptop equipped with an Intel Xeon W-2245 CPU, running the Ubuntu 18.04 operating system, and utilizing the PyTorch 1.7.1 deep learning framework.
3.2 Analysis of cracks recognition results in dataset 1
To verify the performance of the five CNNs on crack recognition, they were trained on Dataset 1 for 20 epochs and their performance was then verified with the testing set. To validate the effect of semantic segmentation preprocessing, each CNN was trained both with semantic segmentation preprocessing and without image preprocessing. MobileNetV3-Large was trained with and without transfer learning on Dataset 1 to investigate the impact of transfer learning. The performance is assessed by accuracy, F1-score, training time, etc., and summarized in Table 1.
The test results of the five CNNs on crack recognition were analyzed, which are discussed as follows:
1. Accuracy and F1-score: The five CNNs show superior performance in crack recognition. The accuracy and F1-score are above 99% in all cases except VGG16 without semantic segmentation, for which the accuracy is 98.78% and the F1-score is 98.79%.
2. The influence of transfer learning: Transfer learning significantly improves the performance of CNNs. Without transfer learning, the CNNs need to spend more time learning to extract features. The crack recognition results of MobileNetV3-Large with and without transfer learning are compared; since MobileNetV3-Large has the fastest training speed, the contrast is the most obvious. It should be noted that neither case uses semantic segmentation. The results show that, without pre-training by transfer learning, MobileNetV3-Large needs a training time of 311 s, its size is 32.3 MB and its accuracy is 99.67%. With pre-training by transfer learning, the training efficiency is doubled, as it only needs a training time of 138 s, the size is only 16.2 MB and the accuracy is 99.78%. The results show that transfer learning can effectively improve the performance of CNNs.
3. The influence of semantic segmentation: Semantic segmentation accelerates the training speed and recognition speed of the five CNNs by an average of 17.16% and 17.14%, respectively. The training efficiency of GoogLeNet is improved most significantly by semantic segmentation, with an improvement rate of 33.33% and the training time reduced from 432 s to 324 s. MobileNetV3-Large shows the largest improvement in recognition efficiency, 56.2%, from 50 images per second up to 114.3 images per second. However, preprocessing the datasets with semantic segmentation has little influence on the accuracy of the recognition results, which means the five CNNs are robust in dealing with crack recognition. The accuracy and F1-score of three of the CNNs are improved by semantic segmentation preprocessing, while those of MobileNetV3-Large and GoogLeNet decreased by 0.2%, 0.23% and 0.1%, 0.15%, respectively. The reason is that semantic segmentation cannot accurately segment non-crack images. Although, in theory, an image without cracks should be processed into a pure black image, in fact some images were segmented incorrectly due to the presence of light shadows, water stains, oil stains, etc., which were recognized as non-background. Non-crack images before and after semantic segmentation are shown in Figure 5.
4. Comprehensive performance: MobileNetV3-Large with transfer learning and semantic segmentation has the best comprehensive performance: it has the shortest training time of 114 s, an accuracy of 99.58%, an F1-score of 99.60% and a model size of only 16.2 MB. EfficientNetV2-S with semantic segmentation has the highest accuracy of 99.83% and the highest F1-score of 99.80%. MobileNetV3-Large with semantic segmentation has the fastest training (114 s) and the fastest recognition speed of 114.3 images per second. In terms of size, EfficientNetB0 is the smallest at only 15.5 MB.
3.3 Analysis of crack classification results in dataset 2
As discussed in the previous section, transfer learning is beneficial for improving the performance of CNNs, so transfer learning was used in all five CNNs. To further analyze the performance, the five CNNs were trained for 100 epochs to classify the four types of cracks in Dataset 2. To verify the influence of semantic segmentation, each CNN was trained both with semantic segmentation preprocessing and without image preprocessing. Accuracy, Macro-F1, training time, recognition speed, learning rate (Lr) and batch size were calculated and summarized in Table 2.
The test results of the five CNNs for classifications in Dataset 2 were analyzed and the analysis results are as follows:
1. Comparison of the performance of the CNNs in crack recognition and classification: Compared with Dataset 1, the accuracy of the five CNNs on Dataset 2 dropped by an average of 13.38% and the Macro-F1 by an average of 13.47%. The reason is that, in crack classification, the CNNs are interfered with by background noise, and the various morphologies of the cracks affect the accuracy of the CNNs in multi-classification.
2. Accuracy and Macro-F1: Among the five CNNs, GoogLeNet can extract crack features and classify the cracks under the influence of background noise regardless of whether the images are preprocessed by semantic segmentation. Without image preprocessing, the accuracy is 93.30% and the Macro-F1 is 93.28%; with semantic segmentation preprocessing, the accuracy is 94.60% and the Macro-F1 is 94.31%. The reason is that two additional auxiliary classifiers are added in the middle layers of GoogLeNet, which also give it a strong recognition ability: the auxiliary classifiers extract middle-layer features during the training process and use them for updating the training weights. However, its training time is long and its classification speed is slow, only 9.1 images/s, which makes it difficult to use in practical applications.
3. The influence of semantic segmentation: Classifying the four crack categories in Dataset 2 with the five CNNs after semantic segmentation effectively improves the accuracy, Macro-F1 (as shown in Figure 6), training speed and classification speed without increasing the size. After semantic segmentation preprocessing of Dataset 2, the classification accuracy of each CNN increases by an average of 4.72% and the Macro-F1 by an average of 4.34%. MobileNetV3-Large shows the largest increase, with accuracy improved by 8.20% and Macro-F1 by 8.26%. The training speed of each CNN increased by an average of 6.76%, with EfficientNetB0 showing the highest improvement of 9.89%. Simultaneously, the classification speed of each CNN improved by an average of 35.90%, with MobileNetV3-Large increasing most, by 58.67%. This shows that semantic segmentation effectively improves the performance of the CNNs in crack classification.
4. Comprehensive performance: MobileNetV3-Large with semantic segmentation has the best comprehensive performance, with an accuracy of 94.70%, a Macro-F1 of 94.71%, the shortest training time of 49 s and a size of 16.2 MB. Thus, MobileNetV3-Large achieves a good balance between accuracy and size, which is valuable for practical applications.
To further discuss MobileNetV3-Large, which has the best comprehensive performance on Dataset 2, a confusion matrix is adopted to visualize its four-category crack classification results, as shown in Figure 7. In the confusion matrix, each column represents the actual category and each row represents the predicted category, in the order transverse crack, vertical crack, oblique crack and irregular crack. Thus, the 212 in the first row and first column means that 212 transverse crack samples are correctly predicted as transverse cracks; the 28 in the first row and third column means that 28 oblique crack samples are incorrectly predicted as transverse cracks.
Figure 7. Confusion matrix by MobileNetV3-Large in Dataset 2: (A) the results without semantic segmentation, (B) the results with semantic segmentation. 1, 2, 3 and 4 represent transverse crack, vertical crack, oblique crack and irregular crack, respectively.
In addition, compared with the other four CNNs, MobileNetV3-Large introduces a lightweight Squeeze-and-Excitation (SE) structure (Jie et al., 2017), as shown in Figure 8. The SE module compresses the convolutional feature map along the spatial dimensions to obtain a 1D vector whose length matches the number of channels (e.g., an RGB image has three channels), and this vector is then applied to the feature map channel by channel to recalibrate the original features. On the other hand, MobileNetV3-Large applies the h-swish activation function, which maps the input of a neuron non-linearly to its output, reduces accuracy loss and improves efficiency by approximately 15% (Howard et al., 2019). The h-swish formula is given in Equation 9:
where x is the input of the convolutional layer using the activation function, i.e., h-swish(x) = x·ReLU6(x + 3)/6, and ReLU6(y) = min(max(y, 0), 6). Thus, if x + 3 ≥ 6, then ReLU6(x + 3) = 6; if 0 ≤ x + 3 < 6, then ReLU6(x + 3) = x + 3; and if x + 3 < 0, then ReLU6(x + 3) = 0.
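As a minimal numerical sketch (not the authors' implementation), both mechanisms can be reproduced with NumPy: h-swish exactly as in Equation 9, and the SE idea as squeeze (global average pooling), excitation (two small fully connected layers, here with arbitrary all-ones weights for illustration) and channel-wise rescaling.

```python
import numpy as np

def relu6(x):
    """ReLU6(y) = min(max(y, 0), 6)."""
    return np.minimum(np.maximum(x, 0.0), 6.0)

def h_swish(x):
    """h-swish(x) = x * ReLU6(x + 3) / 6 (Equation 9)."""
    return x * relu6(x + 3.0) / 6.0

def se_recalibrate(feature_map, w1, w2):
    """Squeeze-and-Excitation: squeeze an H x W x C map to a C-vector by
    global average pooling, excite it through two small FC layers, then
    rescale each channel of the original feature map."""
    squeezed = feature_map.mean(axis=(0, 1))          # squeeze: C-vector
    hidden = np.maximum(squeezed @ w1, 0.0)           # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2)))      # sigmoid gate per channel
    return feature_map * scale[None, None, :]         # channel-wise rescaling

# h-swish matches its piecewise form: x <= -3 -> 0, x >= 3 -> x.
vals = h_swish(np.array([-4.0, 0.0, 3.0, 10.0]))

# SE demo on a 2 x 2 x 3 feature map with arbitrary all-ones FC weights.
fmap = np.ones((2, 2, 3))
scaled = se_recalibrate(fmap, np.ones((3, 2)), np.ones((2, 3)))
```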
5. Preferred CNN: The newest CNN adopted in this paper, EfficientNetV2-S (2021), has a slow training speed, and its accuracy and Macro-F1 do not reach the expected values. This indicates that emerging CNNs should not be selected blindly for different application scenarios and image types without justification. In other words, the performance of CNNs should be evaluated from multiple perspectives, such as robustness and generalization ability, according to the actual application and using indexes such as accuracy, Macro-F1, training time and model size, so that the appropriate CNN can be chosen; this is also the original intention and goal of this study.
3.4 Analysis of crack classification results in dataset 3
To further verify the generalization capability of the five CNNs, each CNN was trained for 100 epochs on Dataset 3, both with semantic segmentation preprocessing and without any image preprocessing. Accuracy, F1-score, training time and size were calculated and summarized in Table 3. The test results of the five CNNs for classification in Dataset 3 are analyzed as follows:
1. Accuracy and Macro-F1: Without semantic segmentation, GoogLeNet has the highest accuracy of 95.70% and Macro-F1 of 95.70%, while VGG16 has the lowest accuracy of 86.20% and Macro-F1 of 86.61%. With semantic segmentation preprocessing, MobileNetV3-Large has the highest accuracy of 95.70% and Macro-F1 of 95.70%, as shown in Figure 9.
2. The influence of semantic segmentation: The bridge crack images in Dataset 3 have a higher resolution, and semantic segmentation is useful for the CNNs to extract crack features (Figure 10). With semantic segmentation preprocessing, the accuracy and Macro-F1 of each CNN improve by 5.12% and 4.85% on average, respectively, except for a slight decrease with GoogLeNet.
Figure 10. Comparison of the Macro-F1 for crack classification by the five CNNs: (A) the results without semantic segmentation; (B) the results with semantic segmentation.
Without semantic segmentation preprocessing, the F1-score of irregular cracks is significantly lower than that of the other three crack types. The reason is that irregular cracks are hard samples to classify. The CNNs can extract the features of simple cracks (such as transverse and oblique cracks) and classify them effectively, but they are affected by the background noise of the irregular crack images, so the extracted crack features are insufficient. With semantic segmentation preprocessing, the F1-score of the irregular cracks increases significantly, as shown in Figure 10B. Therefore, background noise can be removed effectively by semantic segmentation, which is beneficial for extracting the image features. The results show that MobileNetV3-Large has the highest F1-score for irregular cracks, at 93.4%.
3. Comprehensive performance: MobileNetV3-Large with semantic segmentation achieves the best comprehensive performance when considering model size, the feasibility of deployment on mobile devices and training time. It should be noted that MobileNetV3-Large has the shortest training time of 44 s, an accuracy of 95.70%, a Macro-F1 of 95.69% and the fastest classification speed of 47.6 images per second, with a size of 16.2 MB. MobileNetV3-Large also performed excellently on Datasets 1 and 2 and is robust, which indicates that it has practical application value.
To further discuss MobileNetV3-Large, which has the best comprehensive performance, its four-category classification results for Dataset 3 are shown as a confusion matrix in Figure 11. The results show that the number of correctly predicted images increases with semantic segmentation. For example, the number of correctly predicted transverse cracks is 218 without semantic segmentation, which increases to 238 with semantic segmentation. Similarly, the number of vertical cracks increases from 228 to 245, oblique cracks from 205 to 242 and irregular cracks from 217 to 232. In addition, Figure 12 shows the loss convergence curve of MobileNetV3-Large: with semantic segmentation, the network converges better and its final loss is lower than without it.
4. Robustness and generalization ability of the CNNs: Comparing the performance of the five CNNs on Dataset 2 and Dataset 3, each CNN performs better on both datasets with semantic segmentation. This illustrates that the proposed semantic segmentation preprocessing with VGG16-U-Net is an effective way to enhance the robustness of the five CNNs. On the other hand, with regard to the generalization ability of the five CNNs when facing crack recognition and classification problems, semantic segmentation and transfer learning play an important role.
Figure 11. Confusion matrix by MobileNetV3-Large in Dataset 3: (A) the results without semantic segmentation; (B) the results with semantic segmentation. The labels 1, 2, 3 and 4 represent transverse crack, vertical crack, oblique crack and irregular crack, respectively.
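The transfer-learning strategy referred to above can be illustrated with a self-contained sketch: a pretrained backbone is kept frozen and only a newly attached classification head is trained on the target crack dataset. The random "features" below stand in for frozen backbone outputs (e.g., the penultimate layer of a pretrained CNN); the dimensions, learning rate and label rule are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Stand-in for frozen backbone outputs: 40 samples of 8-dim features.
features = rng.normal(size=(40, 8))
# Synthetic 4-class labels that a linear head can learn from the features.
labels = features[:, :4].argmax(axis=1)

# Fine-tuning here = training only a fresh linear head by gradient descent;
# the backbone (the feature extractor) stays frozen throughout.
W = np.zeros((8, 4))
one_hot = np.eye(4)[labels]
for _ in range(200):
    probs = softmax(features @ W)
    grad = features.T @ (probs - one_hot) / len(labels)  # cross-entropy grad
    W -= 0.5 * grad

train_acc = (softmax(features @ W).argmax(axis=1) == labels).mean()
```

In a real pipeline the same idea applies: load pretrained weights, replace the final classifier with a four-class head, and train only (or mostly) that head.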
The efficiency of recognition and classification by the five CNNs in this study is higher in Dataset 3 than in Dataset 2, because Dataset 3 has a higher resolution of 1024 × 1024 while Dataset 2 has a lower resolution of 227 × 227. Dataset 2 was input into the CNNs as a 227 × 227 × 3 RGB matrix, while Dataset 3 was treated as a 1024 × 1024 × 3 matrix, which contains more information and helps the CNNs extract more image features. Thus, a higher-resolution camera is recommended for obtaining a high-quality dataset under good lighting conditions when using deep learning for crack detection.
5. Over-fitting phenomenon: The over-fitting phenomenon (Babyak, 2004), in which a CNN shows a large performance gap between the training and testing processes, exists in each CNN, but semantic segmentation can help reduce it. The comparison of accuracy for Dataset 3 on the training and testing sets is shown in Figure 13. Without semantic segmentation, the training-set accuracy of VGG16 and EfficientNetB0 is 98.80% and 97.50%, respectively, while their testing-set accuracy is only 86.20% and 88.80%, revealing serious over-fitting. When the images are preprocessed with semantic segmentation, the accuracy of each CNN on the training and testing sets differs only slightly; MobileNetV3-Large performs best, with the smallest gap of 0.9% between the testing set and the training set. Thus, the over-fitting problem is alleviated by semantic segmentation.
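Using the accuracy figures quoted above, the over-fitting gap can be made explicit as a simple train-minus-test difference; the dictionary below just restates those numbers for the no-segmentation case.

```python
# Train/test accuracies without semantic segmentation, as quoted in the text.
results = {
    "VGG16":          (0.9880, 0.8620),
    "EfficientNetB0": (0.9750, 0.8880),
}

def gap(train_acc, test_acc):
    """Over-fitting shows up as a large train/test accuracy gap."""
    return train_acc - test_acc

gaps = {name: gap(tr, te) for name, (tr, te) in results.items()}
worst = max(gaps, key=gaps.get)   # the CNN with the most severe over-fitting
```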
6. Insufficient ability of the CNNs to extract edge crack features: It is difficult to extract crack features when the crack lies at the edge or corner of the image. The incorrectly predicted images are visualized with TensorBoard, as shown in Figure 14. In these four images, a single morphological crack occupies the main body of the image, while other cracks extend from the edges or corners like irregular cracks. For instance, in Figure 14d, an oblique crack is the primary feature, but a secondary vertical crack extending along the right edge of the image introduces complexity. As a result of this interference, the CNN correctly identified the "oblique crack" class, albeit with a relatively low prediction probability of 62.3%.
Figure 14. Prediction of original images with semantic segmentation (Pred is the predicted label and Prob is the probability of the prediction): (a) the predicted probability of the transverse crack is 96.20%; (b) the predicted probability of the transverse crack is 50.16%; (c) the predicted probability of the vertical crack is 96.20%; (d) the predicted probability of the oblique crack is 62.30%.
4 Conclusion
This paper has adopted VGG16-U-Net to perform automatic image preprocessing with semantic segmentation and has compared the performance of VGG16, GoogLeNet, MobileNetV3-Large, EfficientNetB0 and EfficientNetV2-S in crack recognition and classification. The fine-tuning technique of transfer learning used in the training process can reduce training costs and improve CNN performance. The conclusions are as follows:
1. Performance comparison in crack recognition and classification: In crack recognition, the CNNs used in this paper recognized cracks with an accuracy of more than 99% (except for VGG16). However, the CNNs have limitations in four-category crack classification: without semantic segmentation, they cannot maintain both high accuracy and short training times.
2. The role of semantic segmentation: Semantic segmentation can significantly improve the performance of the CNNs and accelerate training; MobileNetV3-Large with semantic segmentation has the best performance, with 95.70% accuracy and 95.69% Macro-F1 in Dataset 3. Even in Dataset 2, which has lower-resolution images, MobileNetV3-Large still achieves 94.70% accuracy and 94.71% Macro-F1, which proves that MobileNetV3-Large combined with semantic segmentation by VGG16-U-Net has the best robustness. The newest EfficientNetV2-S (proposed in 2021) and EfficientNetB0 (proposed in 2019) have also been studied in this paper, and it has been found that these two CNNs do not meet the expected performance. This indicates that emerging CNNs should be selected according to the needs of the actual application, and that the use of the latest deep learning technology from computer science for civil engineering problems should be carefully considered and investigated.
3. Using a high-resolution image dataset: The results of this study show that high-resolution data are useful for CNNs to extract features. Therefore, when deep learning is applied to crack inspection in practice, a high-resolution camera should be used under good lighting conditions.
4. Alleviating over-fitting by semantic segmentation: Semantic segmentation can reduce the over-fitting phenomenon during the training of CNNs. MobileNetV3-Large with semantic segmentation by VGG16-U-Net fits best, with only a 0.9% accuracy gap between the training set and the testing set.
5. Irregular cracks are hard samples to classify: This study has found that irregular cracks are difficult to classify and are often accompanied by greater image noise. Meanwhile, it is difficult to extract the features of cracks at the edges and corners of images. Therefore, in future studies, we intend to investigate how to handle hard classification samples such as irregular cracks and how to extract edge cracks.
The findings of this study can provide references for surface crack identification and classification using CNNs in other fields. To further improve the classification and recognition performance of the CNNs, the retention and effective extraction of image edge features need to be considered.
Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: https://data.mendeley.com/datasets/5y9wdsg2zt/1.
Author contributions
LC: Formal Analysis, Data curation, Writing – review and editing, Methodology, Validation, Writing – original draft, Investigation, Resources, Funding acquisition, Conceptualization, Software. HY: Conceptualization, Validation, Software, Writing – original draft, Methodology, Visualization. KG: Conceptualization, Data curation, Investigation, Resources, Writing – review and editing. ZH: Validation, Writing – review and editing, Formal Analysis. JZ: Project administration, Writing – review and editing, Supervision, Formal Analysis. C-TN: Investigation, Writing – review and editing, Validation. JF: Supervision, Writing – review and editing, Project administration, Funding acquisition.
Funding
The authors declare that financial support was received for the research and/or publication of this article. The research was sponsored by the Key Program of the National Natural Science Foundation of China (Grant No. 52538010, Fu), the National Natural Science Foundation of China (Grant No. 52108276, Chen), Basic and Applied Basic Research Foundation of Guangdong Province (Grant No. 2020A1515110870, Chen), Science and Technology Projects in Guangzhou (Grant No. 202102010447, Chen) and Tertiary Education Scientific research project of Guangzhou Municipal Education Bureau (Grant No. 2024312401; Chen).
Conflict of interest
Author KG was employed by Guangzhou Testing Center of Construction Quality and Safety Co., Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Aldao, E., Fernández-Pardo, L., González-deSantos, L. M., and González-Jorge, H. (2023). Comparison of deep learning and analytic image processing methods for autonomous inspection of railway bolts and clips. Constr. Build. Mater. 384, 131472. doi:10.1016/j.conbuildmat.2023.131472
Alipour, M., Harris, D. K., and Miller, G. (2019). Robust pixel-level crack detection using deep fully convolutional neural networks. J. Comput. Civ. Eng. 33 (6), 04019040. doi:10.1061/(asce)cp.1943-5487.0000854
Babyak, M. A. (2004). What you see May not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosom. Med. 66 (3), 411–421. doi:10.1097/00006842-200405000-00021
Cheng, X., Ji, X., Henry, R. S., and Xu, M. (2019). Coupled axial tension-flexure behavior of slender reinforced concrete walls. Eng. Struct. 188, 261–276. doi:10.1016/j.engstruct.2019.03.026
Özgenel, Ç. F., and Sorguç, A. G. (2018). "Performance comparison of pretrained convolutional neural networks on crack detection in buildings," in Proc., 35th international symposium on automation and robotics in construction (ISARC 2018) (IAARC Publications), 693–700.
Dais, D., Bal, İ. E., Smyrou, E., and Sarhosis, V. (2021). Automatic crack classification and segmentation on masonry surfaces using convolutional neural networks and transfer learning. Autom. Constr. 125, 103606. doi:10.1016/j.autcon.2021.103606
Dung, C. V., Sekiya, H., Hirano, S., Okatani, T., and Miki, C. (2019). A vision-based method for crack detection in gusset plate welded joints of steel bridges using deep convolutional neural networks. Autom. Constr. 102, 217–229. doi:10.1016/j.autcon.2019.02.013
Fan, X., Cao, P., Shi, P., Chen, X., Zhou, X., and Gong, Q. (2022). An underwater dam crack image segmentation method based on multi-level adversarial transfer learning. Neurocomputing 505, 19–29. doi:10.1016/j.neucom.2022.07.036
Garbowski, T., and Gajewski, T. (2017). Semi-automatic inspection tool of pavement condition from three-dimensional profile scans. Procedia Eng. 172, 310–318. doi:10.1016/j.proeng.2017.02.004
Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B., Tan, M., et al. (2019). “Searching for MobileNetV3,” in Proc., proceedings of the IEEE/CVF international conference on computer vision, 1314–1324.
Jamin, A., and Humeau-Heurtier, A. (2019). (multiscale) cross-entropy methods: a review. Entropy 22 (1), 45. doi:10.3390/e22010045
Jenkins, M., Carr, T. A., Iglesias, M. I., Buggy, T., and Morison, G. (2018). "A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks," in Proc., 2018 26th European signal processing conference (EUSIPCO).
Jacob, K., Mark David, J., Peter, B., Mike, M., and Gordon, M. (2019). "A convolutional neural network for pavement surface crack segmentation using residual connections and attention gating," in Proc., 2019 IEEE international conference on image processing (ICIP).
Jie, H., Li, S., Samuel, A., Gang, S., and Enhua, W. (2017). "Squeeze-and-Excitation networks," in Proceedings of the IEEE conference on computer vision and pattern recognition.
Lei, Z., Fan, Y., Yimin Daniel, Z., and Ying Julie, Z. (2016). "Road crack detection using deep convolutional neural network," in Proc., 2016 IEEE international conference on image processing (ICIP).
Liangfu, L., Weifei, M., and Cheng, L. (2019). Research on bridge crack detection algorithm based on deep learning. J. Autom. 45 (9), 1727–1742.
Li, B., Wang, K. C., Zhang, A., Yang, E., and Wang, G. (2020). Automatic classification of pavement crack using deep convolutional neural network. Int. J. Pavement Eng. 21 (4), 457–463. doi:10.1080/10298436.2018.1485917
Liu, F. Y., Wang, L. B., and Liu, J. (2022). Deep learning and infrared thermography for asphalt pavement crack severity classification. Autom. Constr. 140, 104383. doi:10.1016/j.autcon.2022.104383
Long, J., Shelhamer, E., and Darrell, T. (2015). “Fully convolutional networks for semantic segmentation,” in Proc., 2015 IEEE conference on computer vision and pattern recognition (CVPR).
Loshchilov, I., and Hutter, F. (2016). “SGDR: stochastic gradient descent with warm restarts,” in Proc., ICLR 2017 (5th International Conference on Learning Representations).
Ma, J. (2014). A Summary of academic research on Chinese Bridge Engineering·2014. J. Chin. Highw. 27 (5), 1–96.
Masters, D., and Luschi, C. (2018). Revisiting small batch training for deep neural networks. arXiv Preprint arXiv:1804.07612, 07612. doi:10.48550/arXiv.1804.07612
Mingxing, T., and Quoc, V. L. (2019). "EfficientNet: rethinking model scaling for convolutional neural networks," in Proc., 36th international conference on machine learning (ICML) (Long Beach, CA: PMLR), 6105–6114.
Mohsen, A., Eslamlou, A. D., and Pekcan, G. (2020). Data-driven structural health monitoring and damage detection through deep learning: state-of-the-art review. Sensors 20 (10), 2778. doi:10.3390/s20102778
Nhat-Duc, H., Nguyen, Q.-L., and Tran, V.-D. (2018). Automatic recognition of asphalt pavement cracks using metaheuristic optimized edge detection algorithms and convolution neural network. Autom. Constr. 94, 203–213. doi:10.1016/j.autcon.2018.07.008
Otieno, M. B., Alexander, M. G., and Beushausen, H. D. (2010). Corrosion in cracked and uncracked concrete-influence of crack width, concrete quality and crack reopening. Mag. Concr. Res. 62 (6), 393–404. doi:10.1680/macr.2010.62.6.393
Phares, B. M., Washer, G. A., Rolander, D., Graybeal, B. A., and Moore, M. (2004). Routine highway Bridge inspection condition documentation accuracy and reliability. J. Bridge Eng. 9 (4), 403–413. doi:10.1061/(asce)1084-0702(2004)9:4(403)
Piao, W. (2019). Research on pavement crack segmentation algorithm in complex environment. Zhengzhou University.
Rajadurai, R. S., and Kang, S. T. J. A. S. (2021). Automated vision-based crack detection on concrete surfaces using deep learning. Appl. Sci. 11 (11), 5229. doi:10.3390/app11115229
Ronneberger, O., Fischer, P., and Brox, T. J. S. (2015). “U-Net: Convolutional networks for biomedical image segmentation,” in Proc., International Conference on Medical image computing and computer-assisted intervention (Springer), 234–241.
Sga, B., Mgta, B., and Hsyb, C. (2021). Robustness of convolutional neural network models in hyperspectral noisy datasets with loss functions. Comput. Electr. Eng. 90, 107009. doi:10.1016/j.compeleceng.2021.107009
Sharma, M., Anotaipaiboon, W., and Chaiyasarn, K. (2018). Concrete crack detection using the integration of convolutional neural network and support vector machine. Sci. Technol. Asia. 30, 19–28. Available online at: https://tci-thaijo.org/index.php/SciTechAsia.
Shengyuan, L., and Xuefeng, Z. (2019). Image-Based concrete crack detection using convolutional neural network and exhaustive search technique. Adv. Civ. Eng. 2019, 1–12. doi:10.1155/2019/6520620
Silva, W. R. L. d., and Lucena, D. S. d. (2018). Concrete cracks detection based on deep learning image classification. Proceedings 2 (8), 489. doi:10.3390/icem18-05387
Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. Comput. Sci. doi:10.48550/arXiv.1409.1556
Song, W., Jia, G., Jia, D., and Zhu, H. (2019). Automatic pavement crack detection and classification using multiscale feature attention network. IEEE Access 7, 171001–171012. doi:10.1109/access.2019.2956191
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., and Rabinovich, A. (2014). “Going deeper with convolutions,” in Proc., Proceedings of the IEEE conference on computer vision and pattern recognition, 1–9.
Tan, M., and Le, Q. (2021). “EfficientNetV2: smaller models and faster training,” in Proceedings of the 38th international conference on machine learning, PMLR, proceedings of machine learning research, 10096–10106.
Tan, C., Sun, F., Kong, T., Zhang, W., Yang, C., and Liu, C. (2018). “A Survey on deep transfer learning,” in Proc., International conference on artificial neural networks, 270–279.
Yin, X., Chen, Y., Bouferguene, A., Zaman, H., Al-Hussein, M., and Kurach, L. (2020). A deep learning-based framework for an automated defect detection system for sewer pipes. Autom. Constr. 109, 102967. doi:10.1016/j.autcon.2019.102967
Yu, Z., Shen, Y., and Shen, C. (2021). A real-time detection approach for bridge cracks based on YOLOv4-FPM. Automation Constr. 122, 103514. doi:10.1016/j.autcon.2020.103514
Yusof, N., Osman, M. K., Noor, M., Ibrahim, A., and Yusof, N. M. (2018). “Crack detection and classification in asphalt pavement images using deep convolution neural network,” in Proc., 2018 8th IEEE international conference on control System, computing and engineering (ICCSCE), 227–232.
Zhang, J., and Stang, H. (1998). Applications of stress crack Width relationship in predicting the flexural behavior of fibre-reinforced concrete. Cem. Concr. Res. 28 (3), 439–452. doi:10.1016/s0008-8846(97)00275-5
Zhang, J., Lu, C., Wang, J., Wang, L., and Yue, X. G. (2019). Concrete cracks detection based on FCN with dilated convolution. Appl. Sci. 9 (13), 2686. doi:10.3390/app9132686
Zhao, Z. Q., Zheng, P., Xu, S. T., and Wu, X. (2019). Object detection with deep learning: a review. IEEE Trans. Neural Netw. Learn Syst. 30, 3212–3232. doi:10.1109/tnnls.2018.2876865
Keywords: convolutional neural networks (CNNs), crack recognition and classification, semantic segmentation, transfer learning, robustness, generalization ability
Citation: Chen L, Yao H, Gan K, Huang Z, Zhang J, Ng C-T and Fu J (2025) Evaluation of MobileNetV3-Large for crack classification across low- and high-resolution images. Front. Built Environ. 11:1724879. doi: 10.3389/fbuil.2025.1724879
Received: 14 October 2025; Accepted: 18 November 2025;
Published: 10 December 2025.
Edited by:
Tomasz Garbowski, Poznan University of Life Sciences, Poland
Reviewed by:
Anna Knitter-Piątkowska, Poznań University of Technology, Poland
Tomasz Gajewski, Poznań University of Technology, Poland
Copyright © 2025 Chen, Yao, Gan, Huang, Zhang, Ng and Fu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jing Zhang, zhangjing@gzhu.edu.cn; Jiyang Fu, jiyangfu@gzhu.edu.cn