Prediction of dental caries in children through machine learning
Preventive Dental Sciences Department, Dental College and Hospital, Taibah University, 4147 Medina, Kingdom of Saudi Arabia
DOI: 10.22514/jocpd.2025.110 Vol. 49, Issue 5, September 2025, pp. 158-167
Submitted: 11 September 2024 Accepted: 07 February 2025
Published: 03 September 2025
*Corresponding Author(s): Sarah Ahmed Bahammam E-mail: sbahammam@taibahu.edu.sa
Background: Dental caries is a contagious disease that destroys tooth structure, with cavities among its most common outcomes. It is one of the most prevalent oral health problems. Because treatment is costly and painful, early prediction of dental caries has become a research focus. Conventional clinical studies in oral healthcare are constrained by the time and funding they require. Artificial intelligence (AI) has therefore recently been used to build models that predict dental caries risk.
Methods: This study used machine learning techniques to predict dental caries among children. Questionnaire data, containing both clinical and socio-demographic information, were divided into training and testing sets to train and evaluate four predictive models: Logistic Regression, Random Forest, Extreme Gradient Boosting (XGBoost) and Light Gradient-Boosting Machine (LightGBM). The aim was to assess how well each model identifies cases of dental caries. Precision, recall and specificity were used to gauge each model's effectiveness in distinguishing positive from negative cases. Receiver Operating Characteristic (ROC) curves were generated and the Area Under the Curve (AUC) was calculated for each model, and the models' predictions were compared graphically.
Results: LightGBM showed the highest overall performance, achieving an accuracy of 85% and superior sensitivity and specificity in distinguishing between children with and without dental caries. Random Forest and XGBoost also performed well, with accuracy rates of 83% and 84%, respectively.
Conclusions: All four models showed promising predictive capability. The findings support the feasibility of using machine learning algorithms for dental caries prediction and contribute to the advancement of preventive dental healthcare for children.
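The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It is not the study's code: the function names (`confusion_counts`, `classification_metrics`, `roc_auc`) and the toy labels and scores are hypothetical, chosen only to show how accuracy, precision, recall (sensitivity), specificity and AUC are computed for a binary caries/no-caries classifier. AUC is computed here with the rank-based (Mann-Whitney) formulation rather than by integrating a plotted ROC curve.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 means caries present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, precision, recall (sensitivity) and specificity from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (assumption, not from the study): 4 children with caries, 4 without.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
    print(classification_metrics(y_true, y_pred))
    print("AUC:", roc_auc(y_true, scores))
```

In practice the same quantities would usually come from a library such as scikit-learn applied to each fitted model's held-out test predictions; the hand-written version is shown only to make the definitions explicit.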
Algorithms; Artificial intelligence; Dental caries; Machine learning; ROC curve; Sensitivity and specificity
Sarah Ahmed Bahammam. Prediction of dental caries in children through machine learning. Journal of Clinical Pediatric Dentistry. 2025. 49(5);158-167.