Japan Association for Medical Informatics

[2-G-3-02] Machine Learning-Based Cardiovascular Disease Prediction Model. Findings from STEPS Noncommunicable Disease Risk Factors Survey.

*Nguyen Anh Tuyet1 (1. Osaka University)

artificial intelligence, cardiovascular disease, risk factors, algorithm

The prevalence of cardiovascular disease (CVD) is increasing globally, with a particularly pronounced impact on low- and middle-income countries (LMICs). This study proposes a machine learning (ML) model for predicting CVD. Four ML algorithms (Random Forest, Bagging Decision Tree, XGBoost, and Support Vector Machine) were applied to improve the accuracy of CVD prediction, with features selected by a mutual information feature selection algorithm. Performance was evaluated using accuracy (ACC), precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). In addition, an explainable AI approach based on the SHAP (SHapley Additive exPlanations) framework was adopted to gain insight into how the model arrives at its predictions. The results show that age 59 and above, intoxication lapse, and adding salt at the table were important risk predictors of CVD. The findings highlight the effectiveness of the Random Forest model for CVD prediction. The optimized model achieved 0.79 accuracy, 0.91 precision, 0.86 recall, and a 0.89 F1-score, and this optimization substantially improves the model's diagnostic accuracy for CVD.
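As an illustration of two steps named in the abstract, the following minimal sketch ranks discrete features by mutual information with the outcome and computes accuracy, precision, recall, and F1-score from binary predictions. It is not the study's implementation: the function names, the toy feature names, and all data values are hypothetical and chosen only for demonstration, and the sketch uses the Python standard library rather than the ML toolkits the study may have used.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data (NOT from the survey): binary features and CVD label.
toy_features = {
    "age_59_plus": [1, 1, 1, 0, 0, 0, 1, 0],
    "adds_salt": [1, 0, 1, 0, 1, 0, 1, 0],
}
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]

# Rank features from most to least informative about the label.
ranking = sorted(
    toy_features,
    key=lambda name: mutual_information(toy_features[name], toy_labels),
    reverse=True,
)

# Hypothetical predictions from some classifier, scored against true labels.
toy_metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

In practice, a feature selector would keep the top-ranked features before model training, and the metrics would be computed on held-out data rather than on the training set.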