EDITORIAL article
Front. Artif. Intell.
Sec. Machine Learning and Artificial Intelligence
This article is part of the Research Topic "Advances and Challenges in AI-Driven Visual Intelligence: Bridging Theory and Practice".
Editorial: Advances and Challenges in AI-Driven Visual Intelligence: Bridging Theory and Practice
Provisionally accepted
- 1Chongqing University, Chongqing, China
- 2Zhejiang Normal University, Jinhua, China
- 3Chongqing Normal University, Chongqing, China
As AI systems increasingly operate in dynamic, open-world scenarios, their vulnerability to adversarial manipulation is a significant concern. Pi et al. [1] investigate test-time poisoning attacks targeting open-world test-time training (OWTTT) models. Their findings reveal that these models can be compromised with as few as 100 queries using single-step query-based attack strategies. More critically, the affected models show limited recovery capability even when subsequently exposed to normal samples. This work serves as a crucial reminder that as we pursue adaptive AI systems, we must simultaneously address their security vulnerabilities to ensure safe deployment in critical applications.

The scarcity of labeled training data remains a persistent bottleneck in visual intelligence applications. Two articles tackle this limitation through innovative semi-supervised approaches. Yu et al. [2] propose an enhanced YOLOv8 framework that embeds the CBAM attention mechanism in high-level networks while incorporating Mean Teacher semi-supervised learning to address data-labeling challenges. Their method achieves robust detection of micron-scale defects in industrial polymer films. Guo et al. [3] introduce a weight-aware semi-supervised self-ensembling framework (WSSL) for interior decoration style classification. By employing an adaptive weighting module based on truncated Gaussian functions, their approach selectively leverages reliable unlabeled data while mitigating confirmation bias from unreliable pseudo-labels. Both works demonstrate that semi-supervised learning can substantially reduce annotation costs without compromising performance; the sketch below illustrates the general flavor of these ingredients.
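To make the mechanisms concrete, the following minimal sketch (ours, not code from either article) shows two of the ingredients mentioned above in roughly the form they are commonly implemented: a Mean Teacher update, in which the teacher's weights track an exponential moving average of the student's, and a truncated-Gaussian weighting of pseudo-labels, in which predictions below a confidence threshold are discarded and the rest are down-weighted according to their confidence gap. The threshold tau, width sigma, and all function names are illustrative assumptions rather than values or interfaces reported by the authors.

```python
# Minimal sketch (illustrative only, assuming PyTorch): Mean Teacher EMA update
# plus truncated-Gaussian weighting of pseudo-labels for unlabeled data.
import torch
import torch.nn as nn
import torch.nn.functional as F


def ema_update(teacher, student, momentum=0.99):
    """Mean Teacher: teacher weights follow an exponential moving average of the student."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(momentum).add_(s, alpha=1.0 - momentum)


def truncated_gaussian_weights(teacher_logits, tau=0.7, sigma=0.1):
    """Per-sample weights in [0, 1] from teacher confidence (tau and sigma are assumed values)."""
    probs = F.softmax(teacher_logits, dim=1)
    conf, pseudo = probs.max(dim=1)                        # confidence and pseudo-label
    w = torch.exp(-((1.0 - conf) ** 2) / (2.0 * sigma ** 2))
    w = torch.where(conf >= tau, w, torch.zeros_like(w))   # truncate unreliable predictions
    return w, pseudo


def weighted_pseudo_label_loss(student_logits, teacher_logits):
    """Unlabeled-data loss: per-sample cross-entropy against pseudo-labels, scaled by weight."""
    w, pseudo = truncated_gaussian_weights(teacher_logits)
    ce = F.cross_entropy(student_logits, pseudo, reduction="none")
    return (w * ce).mean()


if __name__ == "__main__":
    # Toy example: a batch of 8 unlabeled samples with 5 classes.
    student_logits = torch.randn(8, 5)
    teacher_logits = torch.randn(8, 5)
    print(weighted_pseudo_label_loss(student_logits, teacher_logits))

    # EMA update on a pair of small hypothetical networks.
    student_net, teacher_net = nn.Linear(16, 5), nn.Linear(16, 5)
    ema_update(teacher_net, student_net)
```

In practice, both articles build considerably more machinery around these primitives (detection heads, attention modules, adaptive weighting schedules), so the snippet should be read as orientation rather than a reference implementation.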
The preservation of cultural landscapes presents unique challenges that require accurate 3D reconstruction capabilities. Chen et al. [4] develop DGA-Net, which combines deep feature extraction with graph structure representation for reconstructing historical garden landscapes. The architecture incorporates attention mechanisms to emphasize ecologically significant features, enabling more accurate restoration planning and cultural heritage protection. This work exemplifies how AI-driven visual intelligence can contribute to environmental conservation and the preservation of cultural heritage sites.

Two articles demonstrate the potential of visual intelligence for accessible healthcare and agricultural monitoring. Navarro-Cabrera et al. [5] apply DenseNet169 to detect iron-deficiency anemia in university students through fingernail image analysis. Their approach achieves 71.08% accuracy with 74.09% AUC, offering a cost-effective alternative to conventional blood tests that is particularly relevant for developing regions. Aronés et al. [6] present App2, a mobile application combining CNN and SVM architectures for apple leaf disease detection. The system achieves 95% accuracy on clear images and maintains 80% performance under real-field conditions after model adaptation. Notably, their inclusion of validation filters to verify leaf presence reduces false detections, demonstrating attention to practical deployment considerations. Both works highlight how AI-based visual intelligence can address accessibility gaps in healthcare and agriculture.

The collected works reveal several persistent challenges in visual intelligence research. First, the security of adaptive systems requires greater attention, particularly as models operate in adversarial environments. The difficulty in recovering from poisoning attacks suggests that defensive mechanisms must be incorporated at the architectural level. Second, efficient learning from limited labeled data continues to demand innovation, with semi-supervised and self-supervised methods showing promise but requiring careful handling of pseudo-label reliability. Third, the integration of domain knowledge with deep learning architectures remains an underexplored avenue that could enhance both performance and interpretability. Finally, the gap between laboratory performance and real-world deployment, exemplified by the accuracy drop under field conditions, indicates that robustness to environmental variation needs greater emphasis during model development.

The progress documented in this Research Topic demonstrates that effective visual intelligence systems require more than algorithmic sophistication. Success depends equally on addressing practical constraints, including computational efficiency, data availability, security vulnerabilities, and deployment accessibility. As the field advances, bridging the gap between theoretical capabilities and practical requirements will necessitate continued collaboration across disciplines and careful consideration of real-world operating conditions.
Keywords: AI-driven visual intelligence, adversarial attacks, semi-supervised learning, 3D reconstruction, non-invasive diagnostics
Received: 05 Nov 2025; Accepted: 17 Nov 2025.
Copyright: © 2025 Huang, Zhang and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Bo Huang, huangbo0326@cqu.edu.cn
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
