Machine Learning for Early Non-invasive Diabetes Detection Using Electronic Health Records
(1) Department of Computer Science and Engineering, Graphic Era (Deemed to be University), Dehradun, Uttarakhand 248002, India
(2) Department of Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA
(3) Department of Electrical and Computer Engineering, School of Engineering, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
(4) Department of Industrial and Systems Engineering, Wayne State University, Detroit, MI 48202, USA
(*) Corresponding Author
DOI: https://doi.org/10.26714/jichi.v6i1.17299
Journal of Intelligent Computing and Health Informatics (JICHI)
ISSN 2715-6923 (print) | 2721-9186 (online)
Organized by
Department of Informatics
Faculty of Engineering
Universitas Muhammadiyah Semarang
W : https://jurnal.unimus.ac.id/index.php/ICHI
E : jichi.informatika@unimus.ac.id, ahmadilham@unimus.ac.id
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.