AUTHOR=Ray Lala Shakti Swarup, Xia Qingxin, Rey Vitor Fortes, Wu Kaishun, Lukowicz Paul
TITLE=Improving IMU based human activity recognition using simulated multimodal representations and a MoE classifier
JOURNAL=Frontiers in Computer Science
VOLUME=7
YEAR=2025
URL=https://www.frontiersin.org/journals/computer-science/articles/10.3389/fcomp.2025.1569205
DOI=10.3389/fcomp.2025.1569205
ISSN=2624-9898
ABSTRACT=The scarcity of labeled sensor data for Human Activity Recognition (HAR) has driven researchers to synthesize Inertial Measurement Unit (IMU) data from video, exploiting the rich activity annotations available in video datasets. However, current synthetic IMU data often fails to capture subtle, fine-grained motions, limiting its effectiveness in real-world HAR applications. To address these limitations, we introduce Multi3Net+, an advanced framework leveraging cross-modal, multitask representations of text, pose, and IMU data. Building on its predecessor, Multi3Net, it uses improved pre-training strategies and a mixture-of-experts classifier to learn robust joint representations. By applying refined contrastive learning across modalities, Multi3Net+ bridges the gap between video and wearable sensor data, enhancing HAR performance on complex, fine-grained activities. Our experiments validate the superiority of Multi3Net+, showing significant improvements in generating high-quality synthetic IMU data and achieving state-of-the-art performance on wearable HAR tasks. These results demonstrate the efficacy of the proposed approach in advancing real-world HAR by combining cross-modal learning with multi-task optimization.