5:40 PM - 6:00 PM
[1N4-J-9-02] Adding Multiple Subword Sequences to BiLSTM-CRF Model for Compound Name Extraction
Keywords: Named Entity Recognition, Deep Learning, Subword
In this paper, we propose a BiLSTM-CRF model for extracting compound names from documents in the chemical domain. The proposed model can take multiple subword sequences as input in order to obtain sufficient features for long-span or unknown tokens. Subword LSTM units with contextual information are introduced in the input layer of the model. We conducted experiments based on the CHEMDNER challenge to investigate the effectiveness of the model. The proposed model outperformed the standard BiLSTM-CRF model in extraction accuracy, and experimental results on unknown words showed that the proposed method also performs better on them.
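To illustrate what "multiple subword sequences" for a single token might look like, the following is a minimal sketch. The abstract does not say how the subwords are produced, so the fixed-length character chunks used here are purely an assumption, and the helper names `chunk` and `subword_sequences` are hypothetical. In a model like the one described, each resulting sequence would presumably be fed to its own subword LSTM in the input layer; that part is not shown.

```python
from typing import Dict, List, Sequence


def chunk(token: str, size: int) -> List[str]:
    """Split a token into consecutive pieces of at most `size` characters.

    Assumption: fixed-length chunking stands in for whatever subword
    segmentation the paper actually uses.
    """
    return [token[i:i + size] for i in range(0, len(token), size)]


def subword_sequences(token: str, sizes: Sequence[int] = (1, 3)) -> Dict[int, List[str]]:
    """Return one subword sequence per granularity in `sizes` (hypothetical helper)."""
    return {size: chunk(token, size) for size in sizes}


# One token yields several parallel subword sequences, one per granularity.
seqs = subword_sequences("ethanol")
for size, pieces in seqs.items():
    print(size, pieces)
```

Even a token that never appeared in the training data still decomposes into pieces the model may have seen before, which is the intuition behind using subwords for long or unknown compound names.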