Keywords: Data mining, Missing value completion, Machine learning
In this paper, we propose a method for imputing missing values in blood test data. In recent years, the digitization of hospital charts has progressed, and the volume of electronic data available for use has become enormous. However, these data are used only for routine hospital operations and not for secondary purposes. Analyzing large numbers of test results that are difficult for doctors to interpret, and presenting the findings to them, is expected to help prevent oversights. Because doctors select blood test items only as needed, the values of many test items are unobserved (missing values). Multiple imputation, which assumes that missingness occurs randomly, estimates missing values with a linear model; however, the missing values in blood test data do not occur randomly. Therefore, in this paper, we attempt to impute the missing values by estimating them with a nonlinear model. In addition, we classify the blood test data using multiple machine learning methods to confirm the effect of the imputation.
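As a rough illustration of imputing with a nonlinear estimator, the sketch below fills each missing value with the mean of the k nearest records that observed that item. This is not the model proposed in the paper: nearest-neighbour averaging is substituted here only because it is a simple nonlinear estimator. The function name `knn_impute`, the parameter `k`, and the toy values are all assumptions made for the example, not blood test data.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows that observed that column.

    Hypothetical helper for illustration; nearest-neighbour averaging stands in
    for a nonlinear model and is not the method described in the abstract.
    """
    n_cols = len(rows[0])
    completed = [list(r) for r in rows]  # copy so the input stays untouched
    for i, row in enumerate(rows):
        for j in range(n_cols):
            if row[j] is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(rows):
                if other_idx == i or other[j] is None:
                    continue
                # Compare only on columns observed in both rows (excluding column j).
                shared = [c for c in range(n_cols)
                          if c != j and row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                # Root-mean-square difference, so rows with more shared
                # columns are not penalised.
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda t: t[0])
                nearest = [value for _, value in candidates[:k]]
                completed[i][j] = sum(nearest) / len(nearest)
            else:
                # Fallback: plain column mean when no comparable row exists.
                observed = [r[j] for r in rows if r[j] is not None]
                completed[i][j] = sum(observed) / len(observed) if observed else None
    return completed


# Synthetic toy records (three hypothetical test items per row); None marks an unobserved item.
toy = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 11.0],
    [5.0, 6.0, 50.0],
    [5.1, 6.1, None],
    [1.02, None, 10.2],
]
filled = knn_impute(toy, k=1)
```

Each estimate depends only on records that resemble the incomplete one, so the fill-in can follow a nonlinear relationship between items. In practice the items would need to be put on comparable scales before distances are computed, which this sketch omits.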