6:00 PM - 6:20 PM
[1N4-J-9-03] Using Subword Sequence BiLSTM-CRF Model for Compound Name Extraction
Keywords: Compound Name Extraction, Subword
In this paper, we investigate the use of subword sequences for the compound name extraction problem. Five subword sequence generators (SYMBOL, SP, BPE, BPE-DICT, and BPE-PMI) were used in the investigation. The last two, BPE-DICT and BPE-PMI, are newly proposed in this work. BPE-DICT is a variation of BPE with a dictionary-based restriction. BPE-PMI uses the PMI measure in place of the word frequency count. The experimental results showed that subword sequence information improved extraction performance. BPE-DICT achieved an F-measure of 86.74, the best score across all conditions in our experiments.
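The difference between frequency-based and PMI-based merge selection can be illustrated with a minimal sketch. It assumes that PMI denotes pointwise mutual information computed over adjacent symbol pairs; the abstract does not give the exact formulation used for BPE-PMI. The function names (`pair_counts`, `best_pair`, `merge`) and the toy vocabulary are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def pair_counts(words):
    """Count single symbols and adjacent symbol pairs, weighted by word frequency."""
    pairs, symbols = Counter(), Counter()
    for syms, freq in words.items():
        for s in syms:
            symbols[s] += freq
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return pairs, symbols


def best_pair(words, score="freq"):
    """Pick the next pair to merge, by raw count ("freq") or by PMI ("pmi").

    Assumption: PMI(a, b) = log( p(a, b) / (p(a) * p(b)) ), estimated from
    the pair and symbol counts. This is not necessarily the paper's definition.
    """
    pairs, symbols = pair_counts(words)
    if not pairs:
        return None
    n_pairs = sum(pairs.values())
    n_syms = sum(symbols.values())

    def pmi(p):
        a, b = p
        p_ab = pairs[p] / n_pairs
        return math.log(p_ab / ((symbols[a] / n_syms) * (symbols[b] / n_syms)))

    key = pmi if score == "pmi" else (lambda p: pairs[p])
    # Iterate in sorted order so that ties are broken deterministically.
    return max(sorted(pairs), key=key)


def merge(words, pair):
    """Replace every occurrence of the adjacent pair with one merged symbol."""
    merged = Counter()
    for syms, freq in words.items():
        out, i = [], 0
        while i < len(syms):
            if i + 1 < len(syms) and (syms[i], syms[i + 1]) == pair:
                out.append(syms[i] + syms[i + 1])
                i += 2
            else:
                out.append(syms[i])
                i += 1
        merged[tuple(out)] += freq
    return dict(merged)


# Toy vocabulary: symbol tuples mapped to word frequencies (illustrative only).
words = {("a", "a", "b"): 3, ("x", "y"): 1}
freq_choice = best_pair(words, score="freq")  # ("a", "a")
pmi_choice = best_pair(words, score="pmi")    # ("x", "y")
after_merge = merge(words, pmi_choice)
```

On this toy vocabulary the frequency criterion picks the common pair `("a", "a")`, whereas the PMI criterion picks `("x", "y")`, whose symbols are rare but always occur together. This shows how the two criteria can yield different subword inventories from the same data.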