ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Medicine and Public Health
Volume 8 - 2025 | doi: 10.3389/frai.2025.1612502
LMS-ViT: A Multi-Scale Vision Transformer Approach for Real-Time Smartphone-Based Skin Cancer Detection
Provisionally accepted
VIT University, Vellore, Tamil Nadu, India
Skin cancer is the abnormal growth of skin cells and occurs mostly on skin exposed to sunlight. To help prevent skin cancer, avoid exposing skin to ultraviolet radiation. Skin cancer can be very harmful if it is detected late. Traditional convolutional neural networks (CNNs) face challenges in fine-grained lesion classification because of their limited ability to extract detailed features. To overcome these limitations, we introduce a lightweight multi-scale vision transformer (LMS-ViT) for the automated detection of skin cancer from dermoscopic images, using the HAM10000 dataset. Unlike CNNs, LMS-ViT employs a multi-scale attention mechanism to capture both global lesion structures and fine-grained textural details, improving classification accuracy. This study combines skin images from the HAM10000 dataset with pictures taken using a smartphone, and it uses a compact feature-fusion method that makes the system faster and suitable for real-time use in medical apps. The proposed system enables real-time skin cancer classification via a smartphone camera, making it portable and platform-independent. Experimental results show that LMS-ViT surpasses CNN-based models across all skin lesion categories, achieving 90% accuracy, an 18% improvement over CNN, while reducing computational cost by 30%. LMS-ViT also improves precision, recall, and F1-score, particularly in complex categories like Vasc (0.96 to 1.01) and Nv (0.94 to 1.01), demonstrating stronger classification performance. With a real-time Android implementation, LMS-ViT offers accessible, mobile-friendly diagnostics for early skin cancer detection.
Keywords: vision transformer, domain adaptation, CNN, image classification, LMS-ViT
Received: 08 May 2025; Accepted: 30 Jul 2025.
Copyright: © 2025 Leema, P, G and G. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Balakrishnan P, VIT University, Vellore, 632 014, Tamil Nadu, India
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.