Evolutionary Computation-based Automatic Prompt Engineering for OCR Text Analysis of Unstructured Document Images

Takashi Egami

2:00 PM - 2:20 PM

[4N3-GS-6-01] Evolutionary Computation-based Automatic Prompt Engineering for OCR Text Analysis of Unstructured Document Images

〇Takashi Egami¹, Hyakka Nakada¹, Rinka Fukuji², Marika Kubota², Masakazu Yakushiji¹ (1. Recruit Co., Ltd., 2. Beans Labo Co., Ltd.)

Keywords: OCR, LLM, Relation Extraction, Prompt Engineering, Evolutionary Computation

Optical Character Recognition (OCR) recognizes characters in images and can reduce the man-hours required to publish store information from document images on websites. This task, however, requires extracting key-value relations, not just characters. Such extraction is straightforward for tabular documents but difficult for unstructured ones because their formats vary widely. Advances in Large Language Models (LLMs) have improved text comprehension, and automated prompt engineering, which generates task-specific prompts, is reported to improve accuracy further. Applying this approach to OCR is expected to improve relation extraction accuracy. For unstructured documents in particular, however, learning the large variety of formats requires many inference runs, which makes the computational cost high. To optimize LLM prompts with fewer inferences, this study proposes applying minibatch learning to evolutionary computation-based automated prompt engineering. The optimized prompts were found to extract key-value relations from OCR data with high precision.
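
The idea of combining minibatches with an evolutionary prompt search can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the operators, so the selection rule (keep the top half), the mutation (`add_hint`), the scoring function (`toy_score`), the hint strings, and all parameter values are hypothetical stand-ins. In a real system, the scoring function would presumably run one LLM inference on the OCR text and compare the extracted key-value pairs with labels.

```python
import random

# Hypothetical hint fragments a prompt may contain (toy stand-in for prompt text).
HINTS = ("read prices as key-value pairs", "treat 'Hours:' as a key", "join wrapped lines")

def toy_score(prompt, example):
    """Stand-in for one LLM inference plus comparison with labeled relations."""
    return 1.0 if example["needs"] in prompt else 0.0

def add_hint(prompt, rng):
    """Toy mutation: append one random hint the prompt does not yet contain."""
    hint = rng.choice(HINTS)
    return prompt if hint in prompt else prompt + (hint,)

def evolve(population, dataset, score, mutate, batch_size, generations, seed=0):
    """Evolve prompts, scoring each generation on a random minibatch only."""
    rng = random.Random(seed)
    calls = 0  # number of score() calls, i.e. simulated inferences
    for _ in range(generations):
        # Draw a fresh minibatch instead of evaluating on the whole dataset.
        batch = rng.sample(dataset, min(batch_size, len(dataset)))
        ranked = []
        for prompt in population:
            fitness = sum(score(prompt, ex) for ex in batch) / len(batch)
            calls += len(batch)
            ranked.append((fitness, prompt))
        ranked.sort(key=lambda pair: pair[0], reverse=True)
        # Keep the better half, then refill the population with mutated copies.
        survivors = [p for _, p in ranked[: max(1, len(population) // 2)]]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(len(population) - len(survivors))]
        population = survivors + children
    return population, calls

# Toy dataset: each "document" is solved only by a prompt containing one specific hint.
dataset = [{"text": f"doc {i}", "needs": HINTS[i % len(HINTS)]} for i in range(30)]
population, calls = evolve([()] * 4, dataset, toy_score, add_hint,
                           batch_size=5, generations=8)
full_cost = 8 * 4 * len(dataset)  # cost if every prompt saw every document
print(f"score calls with minibatches: {calls} (full evaluation: {full_cost})")
```

With these toy settings, each generation scores 4 prompts on 5 documents, so 8 generations use 160 scoring calls instead of the 960 that full-dataset evaluation would need. The trade-off is that minibatch fitness is a noisy estimate, so a prompt that happens to score well on one batch may be kept over one that is better overall.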

Presentation information

[4N3-GS-6] Language media processing
