J-NER: Benchmark Dataset Considering Extended Named Entity in Named Entity Recognition for Large Language Models

Yusuke Shibuya; Hiroto Shibuya

[4Xin2-06] J-NER: Benchmark Dataset Considering Extended Named Entity in Named Entity Recognition for Large Language Models

〇Yusuke Shibuya¹, Hiroto Shibuya¹ (1. ARISE analytics, Inc.)

Keywords: large language model, named entity recognition, benchmark, dataset

Ascertaining whether a language model can recognize the structure and connections of sentences is an important part of understanding that model. Named entities, such as place names and person names, are among the main components of language, so research on how language models recognize named entities is an important theme in understanding them. Named entity recognition is also important for large language models, but compared with general language models there is still room for research in areas such as the development of datasets for named entity recognition.
Therefore, in this study, we create a new benchmark dataset, "J-NER", which takes into account named entities in the training data of large language models as well as extended named entities. Using this dataset, we evaluate the large language models Gemini Pro, GPT-3.5, and ELYZA, and find variation in accuracy and F1 score. This suggests that J-NER is effective for measuring the named entity recognition ability of large language models, and we expect that using J-NER will yield deeper insights into that ability.
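The abstract reports accuracy and F1 score but does not state how they are computed. As a minimal sketch only, the following Python shows one common way such scores could be computed for named entity recognition output. It assumes that each sentence's entities are represented as a set of (surface text, entity type) pairs, that F1 is micro-averaged over exact matches, and that accuracy means the fraction of sentences whose predicted set equals the gold set. The function names, the entity type labels, and the sample data are all hypothetical and are not taken from J-NER.

```python
from typing import List, Set, Tuple

# Assumption: an entity is a (surface text, entity type) pair.
Entity = Tuple[str, str]


def micro_prf(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over exact entity matches."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # predicted entities that are also in the gold set
        fp += len(p - g)  # predicted entities missing from the gold set
        fn += len(g - p)  # gold entities the model failed to predict
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def exact_match_accuracy(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> float:
    """Fraction of sentences whose predicted entity set equals the gold set."""
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Hypothetical sample data, invented for illustration only.
gold = [
    {("東京", "Location"), ("山田太郎", "Person")},
    {("京都大学", "Organization")},
]
pred = [
    {("東京", "Location")},
    {("京都大学", "Organization"), ("京都", "Location")},
]

precision, recall, f1 = micro_prf(gold, pred)
accuracy = exact_match_accuracy(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} accuracy={accuracy:.3f}")
```

Under these assumptions, a model that finds most entities but rarely gets a whole sentence exactly right would show a high F1 and a low accuracy, which is one reason the two scores can diverge for the same model.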

Presentation information

[4Xin2] Poster session 2