4:40 PM - 5:00 PM
[1H3-J-13-05] Prediction of the Onset of Lifestyle-related Diseases Using Health Insurance Claims Data
Keywords: Machine Learning, Prediction of the onset, Medical Data, Representation Learning
This paper proposes a system that predicts the onset of lifestyle-related diseases from health insurance claims data. The transportation industry has recently been rethinking health management as a countermeasure against driver overwork. Previous studies have used representation learning to predict certain diseases. Following that approach, we treat the task as a text classification problem in natural language processing and aim to build a model that supports drivers' health management. We transformed the health insurance claims into fixed-length vectors and predicted lifestyle-related diseases using undersampling and bagging. Our model achieved a recall of 0.75 on the positive class. We believe this study demonstrates the value of applying natural language processing to health insurance claims data.
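
The combination of undersampling and bagging mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the base classifier, the sampling ratio, or the voting rule, so the nearest-centroid learner, the 1:1 class balance, and the majority vote below are all assumptions chosen for brevity, and every function name is hypothetical. The inputs are assumed to be fixed-length vectors already derived from the claims, with label 1 marking onset of disease and at least one example of each class present.

```python
import random


def undersample(X, y, rng):
    """Keep every positive and an equally sized random subset of negatives.

    The 1:1 ratio is an assumption; the abstract does not state one.
    """
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    keep = pos + rng.sample(neg, min(len(pos), len(neg)))
    return [X[i] for i in keep], [y[i] for i in keep]


def fit_centroids(X, y):
    """Toy base learner (assumed stand-in): one mean vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_one(centroids, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def fit_bagged(X, y, n_models=10, seed=0):
    """Train one base learner per independently undersampled subset."""
    rng = random.Random(seed)
    return [fit_centroids(*undersample(X, y, rng)) for _ in range(n_models)]


def predict_bagged(models, x):
    """Majority vote over the ensemble; ties go to the positive class."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if 2 * votes >= len(models) else 0


def recall_of_positives(y_true, y_pred):
    """Recall = TP / (TP + FN), computed on the positive class only."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0
```

Each ensemble member sees all positive cases but a different random slice of the negatives, so the ensemble as a whole still uses much of the majority class while no single learner is dominated by it. Breaking ties toward the positive class favors recall over precision, which is a design choice of this sketch rather than something stated in the abstract.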