Original Research

Open Access

Chatbots in clinical decision-making for dental trauma: from diagnosis to references

  • Derya Sarıoğlu1,*
  • Büşra Yücetürk1
  • Zehra Güner1
  • Zübeyde Uçar Gündoğar1

1Department of Paediatric Dentistry, Faculty of Dentistry, Gaziantep University, 27310 Gaziantep, Türkiye

DOI: 10.22514/jocpd.2026.082 Vol. 50, Issue 3, May 2026, pp. 274-281

Submitted: 29 November 2025 Accepted: 20 January 2026

Published: 03 May 2026

*Corresponding Author(s): Derya Sarıoğlu E-mail: dsarioglu@gantep.edu.tr

Abstract

Background: Traumatic dental injuries are a significant public health issue, and intelligent chatbots may offer solutions that broaden both patients' and clinicians' horizons. This study aimed to evaluate the accuracy, temporal reliability, and reference reliability of the diagnosis and treatment responses that artificial intelligence (AI) chatbots gave to constructed dental trauma scenarios.
Methods: Forty-five dental trauma scenarios based on the International Association of Dental Traumatology (IADT) guidelines were presented to four generative large language models (LLMs): ChatGPT-4o, Claude Sonnet 3.7, Gemini Advance, and DeepSeek R1. Three questions (diagnosis, treatment, and references) were asked for each scenario, and the responses were scored using a modified Global Quality Scale (mGQS) developed by the authors, allowing quantitative comparison of the results. The data were analyzed using IBM SPSS Statistics version 27.0. Differences in diagnosis and treatment scores among the LLMs were evaluated using analysis of variance (ANOVA), and the temporal reliability of the chatbots was evaluated using the intraclass correlation coefficient (ICC). The sources provided by the LLMs were cross-checked via Google Scholar and PubMed and classified as real or fake references.
Results: No significant differences were found among the LLMs in diagnosis and treatment scores (p > 0.05). In the overall evaluation, DeepSeek R1 received the highest scores, while Claude Sonnet 3.7 had the lowest average scores. ChatGPT-4o demonstrated good, clinically acceptable temporal reliability (ICC = 0.80). In contrast, Claude Sonnet 3.7 showed poor reliability, while Gemini Advance and DeepSeek R1 exhibited moderate reliability. In terms of reference reliability, the highest proportion of real references was observed for DeepSeek R1 (84.78%) and the lowest for Claude Sonnet 3.7 (52.57%).
Conclusions: Although LLMs provided partially accurate and consistent responses for simple dental trauma cases, they are not yet suitable for clinical use, particularly in terms of treatment recommendations and source reliability.
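As an illustration of the temporal reliability measure named in the Methods, the following is a minimal sketch of an intraclass correlation coefficient computed from repeated scoring sessions. The abstract does not state which ICC form the authors used in SPSS, so the two-way random-effects, absolute-agreement, single-measure form (often written ICC(2,1)) is an assumption here, and the function name `icc2_1` and the example scores are hypothetical, not data from the study.

```python
def icc2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one row per scenario, with one column per
    repeated session. This ICC form is an assumption; the abstract does not
    say which form was used.
    """
    n = len(scores)        # number of scenarios (rows)
    k = len(scores[0])     # number of sessions (columns)
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical mGQS-style scores for five scenarios asked in two sessions
example_scores = [
    [5, 5],
    [4, 3],
    [2, 2],
    [3, 4],
    [5, 4],
]
print(round(icc2_1(example_scores), 2))
```

A value near 1 means a chatbot gave nearly the same scores when the same scenario was asked again, while values near 0 mean the repeated answers agreed little more than chance would predict.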


Keywords

Artificial intelligence; Chatbot; Dental trauma; Dentistry; Large language models; Pediatric dentistry


Cite and Share

Derya Sarıoğlu, Büşra Yücetürk, Zehra Güner, Zübeyde Uçar Gündoğar. Chatbots in clinical decision-making for dental trauma: from diagnosis to references. Journal of Clinical Pediatric Dentistry. 2026; 50(3): 274-281.
