AUTHOR=He Tao, Jin Xinyuan, Zou Yiming TITLE=Deep learning-based action recognition for joining and welding processes of dissimilar materials JOURNAL=Frontiers in Materials VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/materials/articles/10.3389/fmats.2025.1560419 DOI=10.3389/fmats.2025.1560419 ISSN=2296-8016 ABSTRACT=Introduction: Joining and welding processes for dissimilar materials present unique challenges due to the need for precise monitoring and analysis of complex physical and chemical interactions. These processes are influenced by variations in material behavior, dynamic changes in process parameters, and environmental factors, making real-time action recognition a critical tool for ensuring consistent quality, efficiency, and reliability. Traditional methods for analyzing such processes often fail to effectively capture the multi-scale spatiotemporal dependencies and adapt to the inherent variability of these operations. To address these limitations, we propose a novel deep learning-based framework specifically designed for action recognition in joining and welding tasks involving dissimilar materials. Methods: Our proposed model, the Multi-Scale Spatiotemporal Attention Network (MS-STAN), leverages advanced hierarchical feature extraction techniques and attention mechanisms to capture fine-grained spatiotemporal patterns across varying scales. The model simultaneously suppresses irrelevant or noisy regions within the input data to enhance its robustness. The framework integrates adaptive frame sampling and lightweight temporal modeling to ensure computational efficiency, making it practical for real-time applications without sacrificing accuracy.
Additionally, domain-specific knowledge is embedded into the framework to enhance its interpretability and improve its ability to generalize across diverse joining and welding scenarios. Results and Discussion: Experimental results highlight the model's superior performance in recognizing critical process actions. The MS-STAN framework outperforms traditional approaches in terms of accuracy and adaptability, effectively capturing the complex dependencies within joining and welding processes. The results demonstrate its potential for robust real-time monitoring, quality assurance, and optimization of joining and welding workflows. By integrating intelligent recognition capabilities into manufacturing systems, this work paves the way for more adaptive and efficient production environments.