ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Medicine and Public Health

Volume 8 - 2025 | doi: 10.3389/frai.2025.1662984

This article is part of the Research Topic: Digital Medicine and Artificial Intelligence

Search-Optimized Quantization in Biomedical Ontology Alignment

Provisionally accepted
  • Université de Lille, Lille, France

The final, formatted version of the article will be published soon.

As organizations and researchers develop increasingly advanced AI models, the size and computational demands of those models become obstacles in their own right. Deploying them on edge devices or in resource-constrained environments adds further constraints on energy consumption, memory usage and latency. Efficient model optimization techniques are emerging to address these issues. Building on this premise, this research introduces a systematic method for ontology alignment that uses supervised, state-of-the-art transformer-based models and is grounded in cosine-based semantic similarity between a biomedical layman vocabulary and the Unified Medical Language System (UMLS) Metathesaurus. The method leverages Microsoft Olive to search for target optimizations across different Execution Providers (EPs) using the ONNX Runtime backend, followed by an assembled process of dynamic quantization employing Intel Neural Compressor and IPEX (Intel Extension for PyTorch). We conduct extensive assessments of the optimized models on the two tasks from the DEFT 2020 Evaluation Campaign, achieving a new state of the art in both. Performance metrics are preserved, while inference is on average 20x faster and memory usage is reduced by approximately 70%.
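To make the cosine-based alignment step concrete, the following minimal sketch maps each lay term to the concept whose embedding is closest in cosine similarity. It is an illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors, the term and concept labels, and the function names `cosine_similarity` and `align` are all hypothetical, and a real system would obtain the embeddings from a transformer encoder.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def align(lay_vectors, concept_vectors):
    """Map each lay term to its most similar concept and the similarity score."""
    alignment = {}
    for term, vec in lay_vectors.items():
        scores = {c: cosine_similarity(vec, cv) for c, cv in concept_vectors.items()}
        best = max(scores, key=scores.get)
        alignment[term] = (best, scores[best])
    return alignment

# Hypothetical toy embeddings; real ones would come from a transformer model.
lay_vectors = {
    "heart attack": [0.9, 0.1, 0.0],
    "high blood pressure": [0.1, 0.8, 0.2],
}
concept_vectors = {
    "Myocardial infarction": [1.0, 0.0, 0.1],
    "Hypertension": [0.0, 1.0, 0.1],
}

result = align(lay_vectors, concept_vectors)
for term, (concept, score) in result.items():
    print(f"{term} -> {concept} ({score:.3f})")
```

In practice a similarity threshold would likely be applied so that lay terms with no sufficiently close concept stay unaligned; the abstract does not specify one, so none is assumed here.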

Keywords: ontology alignment, UMLS Metathesaurus, semantic similarity, transformer models, model optimization, model quantization

Received: 09 Jul 2025; Accepted: 18 Aug 2025.

Copyright: © 2025 Bouaggad and Grabar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Oussama Bouaggad, Université de Lille, Lille, France

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.