AUTHOR=Naseer Aysha, Almudawi Naif, Aljuaid Hanan, Alazeb Abdulwahab, AlQahtani Yahay, Algarni Asaad, Jalal Ahmad, Liu Hui
TITLE=Multi-modal remote sensory learning for multi-objects over autonomous devices
JOURNAL=Frontiers in Bioengineering and Biotechnology
VOLUME=13
YEAR=2025
URL=https://www.frontiersin.org/journals/bioengineering-and-biotechnology/articles/10.3389/fbioe.2025.1430222
DOI=10.3389/fbioe.2025.1430222
ISSN=2296-4185
ABSTRACT=Introduction: Object segmentation in remote sensing images has drawn increasing attention in recent years, driven by advances in remote sensing technology and the growing significance of such imagery in both military and civilian domains. In these settings, it is critical to identify a wide variety of objects accurately and quickly; scene recognition in aerial remote sensing imagery is a common problem across many computer vision applications. Method: Several challenging factors make this task especially difficult: (i) different objects have different pixel densities; (ii) objects are not evenly distributed across remote sensing images; (iii) objects can appear differently depending on viewing angle and lighting conditions; and (iv) the number of objects, even of the same type, fluctuates between images. This work presents a novel method for recognizing remote sensing objects using a synergistic combination of a Markov Random Field (MRF) for accurate labeling and the AlexNet model for robust scene recognition. During the labeling step, the MRF provides precise spatial contextual modeling, improving comprehension of the intricate interactions between nearby aerial objects.
The subsequent incorporation of the AlexNet deep learning model in the classification phase enhances the model's capacity to identify complex patterns in aerial images and to adapt to a variety of object attributes. Results: Experiments show that the method outperforms alternatives in classification accuracy and generalization, demonstrating its efficacy on benchmark datasets such as UC Merced Land Use and AID. Discussion: Several performance measures were calculated to assess the efficacy of the proposed technique, including accuracy, precision, recall, error, and F1-score. The assessment shows recognition rates of approximately 97.90% and 98.90% on the AID and UC Merced Land Use datasets, respectively.
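The MRF labeling step described in the abstract can be illustrated with a minimal sketch. The paper does not publish its inference routine, so the code below uses Iterated Conditional Modes (ICM) with a Potts smoothness prior, one standard inference scheme for pixel-labeling MRFs; the function name, parameters, and toy input are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def icm_mrf_labeling(unary, beta=1.0, iters=5):
    """Illustrative MRF labeling via Iterated Conditional Modes (ICM).

    unary : (H, W, K) array of per-pixel, per-label costs (lower = better),
            e.g. negative log-probabilities from a classifier.
    beta  : Potts smoothness weight; each 4-neighbour with a different
            label adds `beta` to the local energy.
    """
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)  # initialise from unary costs alone
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                best_k, best_e = labels[i, j], float("inf")
                for k in range(K):
                    e = unary[i, j, k]
                    # Potts pairwise term: penalise disagreeing neighbours
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W and labels[ni, nj] != k:
                            e += beta
                    if e < best_e:
                        best_k, best_e = k, e
                labels[i, j] = best_k
    return labels

# Toy example: a 3x3 grid where one noisy centre pixel weakly prefers the
# wrong label; the spatial prior overrides it and yields a uniform labeling.
unary = np.zeros((3, 3, 2))
unary[..., 1] = 2.0                      # every pixel strongly prefers label 0
unary[1, 1, 0], unary[1, 1, 1] = 2.0, 0.0  # ...except the centre
smoothed = icm_mrf_labeling(unary, beta=1.0)
```

In this toy case the centre pixel's unary preference for label 1 (margin 2.0) is outweighed by four disagreeing-neighbour penalties (4 × beta = 4.0), so ICM flips it to label 0, mirroring how the MRF's spatial context corrects locally ambiguous aerial labels before classification.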