Abstract
Background and Objective: Diabetes requires early detection and early treatment; complications become likely in the late stages of the disease and can threaten patients' lives. To diagnose diabetic patients as early as possible, a model that can accurately predict diabetes is needed.
Methodology: This paper proposes KFPredict, an ensemble learning framework that combines a multi-input model fusing key features with machine learning algorithms. We first propose a multi-input neural network model (KF_NN) that fuses key features; a decision tree-based recursive feature elimination algorithm and the correlation coefficient method are used to select the key feature inputs and the secondary feature inputs of the model. We then combine KF_NN with three machine learning algorithms (Support Vector Machine, Random Forest, and K-Nearest Neighbors) through soft voting to form the classifier used for diabetes prediction.
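The soft-voting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `soft_vote`, the equal default weights, and the example probabilities are assumptions made purely for illustration. The idea is that each base model outputs class probabilities, the probabilities are averaged per class, and the class with the highest mean is predicted.

```python
from typing import List, Optional, Sequence, Tuple


def soft_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Average per-class probabilities from several models and pick the argmax.

    probabilities[m][c] is model m's predicted probability for class c.
    weights (hypothetical, optional) gives each model's voting weight.
    """
    n_models = len(probabilities)
    if n_models == 0:
        raise ValueError("at least one model is required")
    n_classes = len(probabilities[0])
    if any(len(p) != n_classes for p in probabilities):
        raise ValueError("all models must report the same number of classes")
    if weights is None:
        weights = [1.0] * n_models  # assumption: equal weights
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    total = sum(weights)
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged


# Illustrative probabilities only (class 0 = non-diabetic, class 1 = diabetic),
# listed in the order KF_NN, SVM, Random Forest, K-Nearest Neighbors.
example = [
    [0.30, 0.70],
    [0.55, 0.45],
    [0.40, 0.60],
    [0.60, 0.40],
]
label, averaged = soft_vote(example)
print(label, averaged)  # class 1 wins: its mean probability is about 0.54
```

In this made-up example, a hard (majority) vote would tie two to two, whereas soft voting resolves the case in favor of class 1 because the models that predict it are more confident.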
Results: On the test set, our framework achieves a sensitivity of 0.85, a specificity of 0.98, and an accuracy of 93.5%. Compared with the single prediction methods, KFPredict's accuracy is up to 18.18% higher. We also compared KFPredict with existing prediction methods; it still performs well, improving accuracy by up to 14.93%.
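For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from the counts of a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the paper's experiments.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: diabetic cases predicted correctly / missed.
    tn/fp: non-diabetic cases predicted correctly / flagged by mistake.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the test data")
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=8, fn=2, tn=18, fp=2)
print(metrics)
```

With these illustrative counts, sensitivity is 8/10, specificity is 18/20, and accuracy is 26/30.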
Conclusion: This paper constructs a diabetes prediction framework that combines a multi-input model fusing key features with machine learning algorithms. Using the PIMA diabetes dataset as the test data, the experiments show that the framework achieves good prediction results.
Original language | English |
---|---|
Article number | 107378 |
Number of pages | 9 |
Journal | Computer Methods and Programs in Biomedicine |
Volume | 231 |
Early online date | 26 Jan 2023 |
DOIs | |
Publication status | Published - Apr 2023 |
Keywords
- Deep learning
- Diabetes prediction
- Ensemble learning
- KFPredict
- Soft voting
ASJC Scopus subject areas
- Software
- Computer Science Applications
- Health Informatics