3:30 PM - 3:50 PM
[2Q5-IS-1-01] Analysis of Tender Documents Using Sequence Labeling with LLM-based Improver
Keywords: Text Mining, Large Language Model, Text Visualization
Bidders often need a long time to read and understand tender documents, which are generally long and require specialized knowledge.
Functions that extract specific items (an item extractor) and that highlight words or phrases related to specific items (a word-phrase highlighter) are therefore in great demand.
To develop such functions, we need to solve two problems.
The first problem concerns the annotated dataset.
The second problem concerns BERT-based sequence labeling when the training dataset is small.
To solve the first problem, we created two types of sequence labeling datasets, for the item extractor and the word-phrase highlighter.
To solve the second problem, we propose an information extraction (IE) method that combines (1) a supervised learning approach using BERT-based sequence labeling and (2) a large language model (LLM)-based improver.
Experimental evaluation demonstrates the effectiveness of our approach.
Moreover, as an application, we developed a web application system called Tender Document Analyzer (TDDA).
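The two-stage idea in the abstract, sequence labeling followed by an improver, can be illustrated with a minimal sketch. This is not the authors' implementation: the BIO tag scheme, the label names (DEADLINE, BUDGET), the toy tokens, and the `improver` callback interface are all assumptions made for illustration, and the improver here is a plain stub standing in for an LLM call.

```python
from typing import Callable, List, Tuple

Span = Tuple[str, int, int]  # (label, start token index, end index exclusive)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags (assumed scheme) into labeled token spans."""
    spans: List[Span] = []
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if label is not None and not inside:
            spans.append((label, start, i))
            label = None
        # Open a new span on B-, or on a stray I- that has no open span.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def extract_items(
    tokens: List[str],
    tags: List[str],
    improver: Callable[[str, str, List[str]], str],
) -> List[Tuple[str, str]]:
    """Turn tagger output into (label, text) items, refined by an improver.

    `improver` is a hypothetical hook: it receives the label, the candidate
    text, and the full token context, and returns the corrected text. In a
    real system this is where an LLM could be queried.
    """
    items = []
    for label, start, end in bio_to_spans(tags):
        candidate = " ".join(tokens[start:end])
        items.append((label, improver(label, candidate, tokens)))
    return items


def stub_improver(label: str, text: str, context: List[str]) -> str:
    """Placeholder for the LLM step: only trims stray punctuation."""
    return text.strip(" ,.;")


# Toy input; the tags stand in for the output of a trained sequence labeler.
tokens = ["Bids", "due", "31", "March", "2025,", "budget", "5M"]
tags = ["O", "O", "B-DEADLINE", "I-DEADLINE", "I-DEADLINE", "O", "B-BUDGET"]
items = extract_items(tokens, tags, stub_improver)
```

Keeping the improver behind a callback separates span decoding from the correction step, so the stub can be swapped for a model-backed function without touching the decoder.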