4:10 PM - 4:30 PM
[3M5-OS-12b-03] Approaching Cell Classification of Machine-Unreadable Tables in Annual Securities Reports
Keywords: Annual Securities Reports, Table Data, Cell Classification
This study focuses on cell classification of tables in annual securities reports, specifically tables that are difficult to read by machine.
TDE (Table Data Extraction), a subtask of NTCIR-17 UFO, excluded tables that were difficult to read by machine.
These difficult-to-read tables are classified into five categories: "tables containing subheading lines," "tables with multiple headers and attributes," "tables containing blank cells," "tables containing non-scalar cells," and "tables with special shapes."
This paper presents the extent to which cell classification can be performed on these difficult tables using common methods, and clarifies the difficulty level of the task.