5:30 PM - 5:50 PM
[2S6-OS-7a-01] Towards Clinical Research DX
A System to Structure Stroke Risk Factors from Texts
Keywords: Medical Natural Language Processing, Large Language Model, Information Extraction, Electronic Health Records, Multi-Task Learning
Natural language processing (NLP) technology has been widely used in clinical research, contributing to its digital transformation (DX). NLP can extract patient histories and test results from the free text stored in electronic health records, and the extracted information can then be used in clinical research such as prognosis prediction. Previous studies have combined several NLP models, including named entity extraction and sentence classification, but maintaining such combinations sustainably has been a challenge. In this study, we developed a system that extracts stroke risk factors from text and structures them using a single NLP model. Our method, trained through multi-task learning of the large language model T5, can handle both the task of extracting test values such as blood pressure and the task of discriminating whether a person smokes or drinks alcohol. In the experiments, we compared the effects of different multi-task learning methods and evaluated the performance for each risk factor. The extraction task reached a practical level of performance, with an F1 score above 0.8, whereas performance on the discrimination task remained low. We also developed a GUI application, with the aim of putting clinical research DX into practice and further improving our model.
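As a rough illustration of how one text-to-text model can serve both tasks, the sketch below casts extraction and discrimination examples into a shared (source, target) format by prepending a task prefix. This is only a minimal sketch under stated assumptions: the abstract does not describe the actual input format, so the prefixes, the helper names, the label strings, and the sample notes are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Example:
    """One text-to-text training pair, as a T5-style model would consume it."""
    source: str
    target: str


# Hypothetical task prefixes; the abstract does not specify the real ones.
EXTRACT_PREFIX = "extract blood pressure: "
CLASSIFY_PREFIX = "classify smoking: "


def make_extraction_example(note: str, value: str) -> Example:
    # Extraction task: the target is the test value copied from the note.
    return Example(EXTRACT_PREFIX + note, value)


def make_classification_example(note: str, label: str) -> Example:
    # Discrimination task: the target is a label word, not a span of the note.
    return Example(CLASSIFY_PREFIX + note, label)


def build_multitask_dataset(
    extraction: List[Tuple[str, str]],
    classification: List[Tuple[str, str]],
) -> List[Example]:
    """Merge both tasks into one list so a single model can train on them."""
    examples = [make_extraction_example(n, v) for n, v in extraction]
    examples += [make_classification_example(n, l) for n, l in classification]
    return examples


# Invented sample notes, used only to show the data shape.
dataset = build_multitask_dataset(
    extraction=[("BP 142/88 mmHg on admission.", "142/88")],
    classification=[("Patient quit smoking ten years ago.", "former smoker")],
)
```

Because both tasks share the same string-in, string-out shape, the merged list could be tokenized and fed to one sequence-to-sequence model without task-specific output heads, which is the property that makes a single-model design easier to maintain than a chain of separate models.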