
ORIGINAL RESEARCH article

Front. Radiol.

Sec. Artificial Intelligence in Radiology

This article is part of the Research Topic: Agentic AI and Large Language Models for Radiology

Multimodal Deep Learning Model for Enhanced Early Detection of Aortic Stenosis Integrating ECG and Chest X-ray with Cooperative Learning

Provisionally accepted
Shun Nagai1, Makoto Nishimori2*, Masakazu Shinohara2, Hidekazu Tanaka1, Hiromasa Otake1
  • 1Division of Cardiovascular Medicine, Department of Internal Medicine, Kobe University Graduate School of Medicine, Kobe, Japan
  • 2Division of Molecular Epidemiology, Kobe University Graduate School of Medicine, Kobe, Japan

The final, formatted version of the article will be published soon.

Background: Aortic stenosis (AS) is diagnosed by echocardiography, the current gold standard, but examinations are often performed only after symptoms emerge, highlighting the need for earlier detection. Recently, artificial intelligence (AI)–based screening using non-invasive and widely available modalities such as electrocardiography (ECG) and chest X-ray (CXR) has gained increasing attention for valvular heart disease. However, single-modality approaches have inherent limitations, and in clinical practice, multimodal assessment is common. In this study, we developed a multimodal AI model integrating ECG and CXR within a cooperative learning framework and evaluated its utility for earlier detection of AS.

Methods: We retrospectively analyzed 23,886 records from 7,483 patients who underwent ECG, CXR, and echocardiography. A multimodal model was developed by combining a 1D ResNet50–Transformer architecture for ECG data with an EfficientNet-based architecture for CXR. Cooperative learning was implemented using a loss function that allowed the ECG and CXR models to refine each other's predictions. We split the dataset into training, validation, and test sets, and performed 1,000 bootstrap iterations to assess model stability. AS was defined echocardiographically as peak velocity ≥2.5 m/s, mean pressure gradient ≥20 mmHg, or aortic valve area ≤1.5 cm².

Results: Among the 7,483 patients, 608 (8.1%) were diagnosed with AS. The multimodal model achieved a test AUROC of 0.812 (95% CI: 0.792–0.832), outperforming the ECG-only model (0.775, 95% CI: 0.753–0.796) and the CXR-only model (0.755, 95% CI: 0.732–0.777). Visualization techniques (Grad-CAM, Transformer attention) highlighted distinct yet complementary features in AS patients.

Conclusions: The multimodal AI model trained via cooperative learning outperformed single-modality approaches in AS detection and may support earlier diagnosis while reducing clinical burden.
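The abstract does not give the exact form of the cooperative learning loss. As a minimal illustrative sketch only (the function names, the squared-difference consistency term, and the weighting parameter `alpha` are assumptions, not the authors' published formulation), a loss in which the ECG and CXR branches refine each other's predictions might combine per-branch supervised terms with a mutual-consistency penalty:

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy of predicted probabilities p against labels y."""
    p = np.clip(p, eps, 1 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def cooperative_loss(p_ecg, p_cxr, y, alpha=0.5):
    """Illustrative cooperative loss (assumed form, not the paper's exact one):
    supervised BCE for each branch plus a consistency term that nudges the
    ECG and CXR predictions toward each other."""
    supervised = bce(p_ecg, y) + bce(p_cxr, y)
    consistency = float(np.mean((p_ecg - p_cxr) ** 2))
    return supervised + alpha * consistency

# Toy example: four patients (1 = AS), with hypothetical branch outputs.
y = np.array([1.0, 0.0, 1.0, 0.0])
p_ecg = np.array([0.8, 0.2, 0.6, 0.3])
p_cxr = np.array([0.7, 0.1, 0.7, 0.4])
loss = cooperative_loss(p_ecg, p_cxr, y)
```

With such a term, gradient updates on one branch are pulled toward agreement with the other, which is one simple way the two modalities can "refine each other's predictions" during joint training.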

Keywords: multimodal AI, cooperative learning, aortic stenosis, deep learning, ECG, chest X-ray (CXR)

Received: 06 Sep 2025; Accepted: 11 Nov 2025.

Copyright: © 2025 Nagai, Nishimori, Shinohara, Tanaka and Otake. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Makoto Nishimori, mnishimail@gmail.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.