ORIGINAL RESEARCH article

Front. Dent. Med.

Sec. Pediatric Dentistry

Volume 6 - 2025 | doi: 10.3389/fdmed.2025.1634006

This article is part of the Research Topic: Emerging Technologies and Therapies in Orthodontics and Pediatric Dentistry

EVALUATING THE ACCURACY OF GENERATIVE ARTIFICIAL INTELLIGENCE MODELS IN DENTAL AGE ESTIMATION BASED ON THE DEMIRJIAN'S METHOD

Provisionally accepted
Allan Abuabara1, Thais Vilalba Paniagua Machado do Nascimento2, Seandra Maria Trentini1, Angela Mairane Costa Gonçalves1, Maria Angélica Hueb De Menezes Oliveira3, Isabela Madalena3, Svenja Beisel-Memmert4, Christian Kirschneck4, Livia Antunes5, Cristiano Miranda de Araujo2, Flares Baratto-Filho2, Erika Kuchler6*
  • 1Universidade da Região de Joinville, Palhoça, Brazil
  • 2Universidade Tuiuti do Paraná, Curitiba, Brazil
  • 3Universidade de Uberaba, Uberaba, Brazil
  • 4Universitätsklinikum Bonn, Bonn, Germany
  • 5Universidade Federal Fluminense, Niterói, Brazil
  • 6University of Bonn, Bonn, Germany

The final, formatted version of the article will be published soon.

Dental age estimation plays a key role in forensic identification, clinical diagnosis, treatment planning, and prognosis in fields such as pediatric dentistry and orthodontics. Large language models (LLMs) are increasingly recognized for their potential applications in dentistry. This study aimed to compare the performance of currently available generative artificial intelligence LLMs in estimating dental age from Demirjian's scores. Panoramic radiographs were analyzed using Demirjian's method (1973), with each left permanent mandibular tooth classified from stage A to H. Three untrained LLMs, ChatGPT (GPT-4 Turbo), Gemini 2.0 Flash, and DeepSeek-V3, were tasked with estimating dental age based on the patient's Demirjian score for each tooth. Because ChatGPT, Gemini, and DeepSeek are probabilistic and can produce varying responses to the same question, three responses per case were collected from each model (on three different computers) on each of three separate days. The age estimates obtained from the LLMs were compared with the individuals' chronological ages. Intra- and inter-examiner reliability was assessed using the Intraclass Correlation Coefficient (ICC). Model performance was evaluated using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Coefficient of Determination (R²), and Bias. Thirty panoramic radiographs (40% female, 60% male; mean age 10.4 ± 2.32 years) were included. Both intra- and inter-examiner ICC values exceeded 0.85. ChatGPT and DeepSeek exhibited comparable but suboptimal performance, with higher errors (MAE: 1.98-2.05 years; RMSE: 2.33-2.35 years), negative R² values (-0.069 to -0.049), and substantial overestimation biases (1.90-1.91 years), indicating poor model fit and systematic flaws. Gemini demonstrated intermediate results, with a moderate MAE (1.57 years) and RMSE (1.81 years), a positive R² (0.367), and a lower bias (1.32 years). In conclusion, this study demonstrated that, although LLMs such as ChatGPT, Gemini, and DeepSeek can estimate dental age using Demirjian's scores, their performance remains inferior to the traditional method. Among them, Gemini 2.0 Flash showed the best results, consistent with its lower errors and positive R², but all models require task-specific training and validation before clinical application.
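
The four performance measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions (error = estimated age minus chronological age, and R² = 1 − SS_res/SS_tot); the abstract does not state the exact formulas the authors used. The function name `regression_metrics` and the three age pairs are hypothetical, chosen only to show how a constant overestimate that is large relative to the spread of the true ages yields a negative R².

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, R^2 and bias for paired age values (in years).

    Assumed definitions (not taken from the article):
    error = predicted - actual, so a positive bias means overestimation.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error
    bias = sum(errors) / n                           # mean signed error

    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    r2 = 1.0 - ss_res / ss_tot  # negative when predictions do worse than the mean

    return {"mae": mae, "rmse": rmse, "r2": r2, "bias": bias}


# Hypothetical data: every estimate is exactly 2 years too high.
chronological = [8.0, 10.0, 12.0]
estimated = [10.0, 12.0, 14.0]
metrics = regression_metrics(chronological, estimated)
print(metrics)
```

In this toy case MAE, RMSE, and bias all equal 2 years, and R² is -0.5: the estimates track the ordering of the true ages perfectly, yet the constant offset makes them worse than simply predicting the sample mean, which is what a negative R² signals under this definition.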

Keywords: artificial intelligence, generative artificial intelligence, clinical decision-making, large language models, evidence-based dentistry, age determination by teeth

Received: 23 May 2025; Accepted: 14 Jul 2025.

Copyright: © 2025 Abuabara, Vilalba Paniagua Machado do Nascimento, Trentini, Costa Gonçalves, Hueb De Menezes Oliveira, Madalena, Beisel-Memmert, Kirschneck, Antunes, Miranda de Araujo, Baratto-Filho and Kuchler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Erika Kuchler, University of Bonn, Bonn, Germany

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.