International Journal of Management, Economics and Social Sciences
Special Issue-International Conference on Medical and Health Informatics (ICMHI 2017)
2018, Vol. 7(S1), pp. 132 – 141.
ISSN 2304 – 1366


A Comparative Analysis of Data Mining Techniques for Prediction of Postprandial Blood Glucose: A Cohort Study


Huan-Cheng Chang1
Pin-Hsiang Chang2
Sung-Chin Tseng3
Chi-Chang Chang4
Yen-Chiao Lu5
1Dept. of Community Medicine, Landseed Hospital, Taoyuan, Taiwan
2Dept. of Healthcare Management, Yuanpei University of Medical Technology, Hsinchu, Taiwan
3Div. of Family Medicine, Chiayi Chang Gung Memorial Hospital, Chiayi, Taiwan
4School of Medical Informatics, Chung-Shan Medical University/Hospital, Taichung, Taiwan
5School of Nursing, Chung-Shan Medical University, Taichung, Taiwan



Advanced predictive techniques and reasoning models have greatly assisted clinicians in improving the diagnosis, prognosis, and treatment of diabetes. Although numerous studies have examined the relationship between abnormal blood glucose levels and diabetes, few have addressed risk forecasting of postprandial blood glucose levels in patients with diabetes. This work aimed to develop a model for predicting postprandial blood glucose levels to screen for undiagnosed diabetes cases in a cohort study. Models built with five data-mining techniques were compared: random forest (RF), support vector machine (SVM), C5.0, multilayer perceptron (MLP), and logistic regression (LR). Data from 1,438 patients admitted to Landseed Hospital in northern Taiwan between 2006 and 2013 were collected and used to evaluate the performance of these techniques. Compared with the C5.0, SVM, MLP, and LR models, the RF model had the best prediction capability for postprandial blood glucose levels in terms of the overall correct classification rate. The results of this study underscore the importance of identifying the preclinical symptoms of abnormal blood glucose levels. The proposed model provides precise reasoning and prediction and can help physicians improve the diagnosis, prognosis, and treatment of patients with diabetes.

Keywords: Data mining techniques, random forest, support vector machine, multilayer perceptron, logistic regression
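
The abstract ranks the models by their overall correct classification rate, that is, the share of cases whose predicted class matches the observed class. The following is a minimal Python sketch of that criterion, not the authors' implementation. The function names, the model names (`model_a`, `model_b`, `model_c`), and the labels are hypothetical and invented purely for illustration; they are not data or results from this study.

```python
from typing import Dict, List, Sequence, Tuple


def correct_classification_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of cases whose predicted class equals the observed class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one case is required")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def rank_models(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Score every model on the same cases and sort from best to worst rate."""
    scores = [
        (name, correct_classification_rate(y_true, pred))
        for name, pred in predictions.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy labels (1 = abnormal, 0 = normal); NOT data from the study.
observed = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from three placeholder models on the same cases.
toy_predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 0],
    "model_b": [1, 1, 1, 0, 0, 1, 1, 0],
    "model_c": [0, 0, 1, 1, 1, 0, 0, 1],
}

for name, rate in rank_models(observed, toy_predictions):
    print(f"{name}: {rate:.3f}")
```

In practice each model's predictions would come from held-out cases, so that the rate reflects performance on data the model was not fitted to.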