9:30 AM - 11:30 AM
[17a-PB02-8] Machine extraction of material data from tables in PDF articles
Keywords: material data, table, PDF
A system for machine extraction of material data from tables in PDF articles was created. The tables in a PDF article were first extracted using existing tools. The material names and property names in the tables were then recognized, and triples of material name, property name, and value were extracted as the material data points. The recognizer for material names was created using machine learning for sentence classification. It could recognize not only general material names but also sample labels and IDs defined by the authors, and its F1 score was 0.89. The tools and methods used in creating the system, and the results, will be reported in detail.
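
As a rough illustration of the triple-extraction step described above, the following minimal Python sketch assumes a table that has already been extracted from a PDF as rows of strings, with property names in the header row and material names in the first column. This layout, the function names, and the rule-based `is_material_name` stand-in are all assumptions made for illustration; the abstract does not specify the table layout, and the actual recognizer is a machine-learned classifier, not this heuristic. The cell values are made up.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (material name, property name, value)


def is_material_name(cell: str) -> bool:
    """Hypothetical stand-in for the learned material-name recognizer.

    Accepts any non-empty cell that is not a plain number, so that
    author-defined sample labels such as "S1" also pass.
    """
    text = cell.strip()
    return bool(text) and not text.replace(".", "", 1).isdigit()


def extract_triples(
    table: List[List[str]],
    recognizer: Callable[[str], bool] = is_material_name,
) -> List[Triple]:
    """Extract (material, property, value) triples from one table.

    Assumes the header row holds property names and the first column
    holds material names (an assumed layout, not the authors' method).
    """
    if not table:
        return []
    header, *rows = table
    property_names = [h.strip() for h in header[1:]]
    triples: List[Triple] = []
    for row in rows:
        if not row:
            continue
        material = row[0].strip()
        if not recognizer(material):
            continue  # skip rows whose first cell is not a material name
        for prop, value in zip(property_names, row[1:]):
            value = value.strip()
            if value:  # ignore empty cells
                triples.append((material, prop, value))
    return triples


# Made-up example table; the values are illustrative only.
example_table = [
    ["Sample", "Band gap (eV)", "Density (g/cm3)"],
    ["TiO2", "3.2", "4.23"],
    ["S1", "1.1", ""],
]
triples = extract_triples(example_table)
for t in triples:
    print(t)
```

Passing the recognizer as a parameter keeps the table traversal separate from name recognition, so a trained classifier could be substituted for the heuristic without changing the rest of the sketch.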