The 69th JSAP Spring Meeting 2022

Presentation information

Oral presentation


[24a-E203-1~10] 23.1 Joint Session N "Informatics"

Thu. Mar 24, 2022 9:00 AM - 11:45 AM E203 (E203)

Kentaro Kutsukake(RIKEN), Motoki Shiga(Gifu Univ.)

11:30 AM - 11:45 AM

[24a-E203-10] From Automatically-Extracted Database Toward Semi-Supervised Curation

〇Luca Foppiano1, Pedro Baptista de Castro2, Tomoya Mato1, Chikako Sakai2, Kensei Terashima2, Yoshihiko Takano2, Masashi Ishii1 (1.MDBG, MaDIS, NIMS, 2.NFSMG, MANA, NIMS)

Keywords: materials informatics, superconductors, data mining

The automatic, large-scale collection of materials information from research articles is a necessary component of rapid materials discovery using materials informatics (MI). We are working to create a new, automatically extracted database of superconductor materials.
However, after manually correcting a subset of records, we found that a) the correction is time-consuming and uninspiring, b) the original PDF is not always sufficient to collect all the information (e.g., cited papers might need to be checked), c) the use of general-purpose tools such as Excel fragmented the data workflow, and d) it is challenging to reuse the corrected data to improve the underlying system.
In this work, we present our solution for improving these aspects. We propose an architecture composed of a new front-end interface and organized around two workflows (Figure 1): a) record flagging, and b) record correction.
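The two workflows can be pictured as state changes on an extracted record. The following is a minimal sketch only, not the authors' implementation: the names `Record`, `Status`, `flag`, `correct`, and `training_pairs`, the record fields, and the example values are all hypothetical and chosen purely for illustration. It assumes each record keeps its pre-correction values, so that corrected records could later be reused to improve the extraction system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    EXTRACTED = "extracted"  # produced by the automatic pipeline
    FLAGGED = "flagged"      # marked by a curator as possibly wrong
    CORRECTED = "corrected"  # fixed by a curator


@dataclass
class Record:
    # Hypothetical fields; the real database schema is not given in the abstract.
    material: str
    tc_kelvin: float
    status: Status = Status.EXTRACTED
    history: list = field(default_factory=list)  # earlier (material, tc_kelvin) values


def flag(record: Record) -> Record:
    """Workflow a): mark a record as needing review."""
    record.status = Status.FLAGGED
    return record


def correct(record: Record, **changes) -> Record:
    """Workflow b): apply curator changes, keeping the previous values."""
    for name in changes:
        if name not in ("material", "tc_kelvin"):
            raise ValueError(f"unknown field: {name}")
    record.history.append((record.material, record.tc_kelvin))
    for name, value in changes.items():
        setattr(record, name, value)
    record.status = Status.CORRECTED
    return record


def training_pairs(records):
    """Collect (before, after) pairs from corrected records for later reuse."""
    return [
        (r.history[-1], (r.material, r.tc_kelvin))
        for r in records
        if r.status is Status.CORRECTED and r.history
    ]
```

Keeping the before/after pair on the record itself is one possible way to address point d): corrections made in the interface stay in the same data flow as the extraction output, instead of living in a separate spreadsheet.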