The 69th JSAP Spring Meeting 2022

Presentation Information

General Session (Oral Presentation)

[24a-E203-1~10] 23.1 Joint Session N "Informatics Applications"

Thursday, March 24, 2022, 09:00 - 11:45, E203 (E203)

Kentaro Kutsukake (RIKEN), Motoki Shiga (Gifu Univ.)

11:30 - 11:45

[24a-E203-10] From Automatically-Extracted Database Toward Semi-Supervised Curation

〇Luca Foppiano1, Pedro Baptista de Castro2, Tomoya Mato1, Chikako Sakai2, Kensei Terashima2, Yoshihiko Takano2, Masashi Ishii1 (1. MDBG, MaDIS, NIMS; 2. NFSMG, MANA, NIMS)

Keywords: materials informatics, superconductors, data mining

The automatic collection of materials information from a large body of research articles is a necessary component of rapid materials discovery using materials informatics (MI). We are working to create a new, automatically extracted database of superconductor materials.
However, after manually correcting a subset of records, we found that a) correction is time-consuming and uninspiring, b) the original PDF is not always enough to collect all the information (e.g., cited papers may need to be checked), c) the use of general-purpose tools such as Excel fragments the data workflow, and d) it is challenging to reuse the corrected data to improve the underlying system. In this work, we present our solution for improving these aspects. We propose an architecture composed of a new front-end interface and organized around two workflows (Figure 1): a) record flagging, and b) record correction.
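
As a rough illustration of how the two workflows could fit together, the sketch below models a record that can first be flagged and then corrected, while keeping a history of changes that could later be reused to improve the extraction system. This is only a minimal sketch under stated assumptions: the abstract does not describe the data model, so the `Record` fields, the `Status` values, and the functions `flag`, `correct`, and `correction_log` are all hypothetical names chosen for this example, and the sample values are illustrative.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class Status(Enum):
    """Hypothetical curation states; not taken from the abstract."""
    EXTRACTED = "extracted"   # produced automatically, not yet reviewed
    FLAGGED = "flagged"       # marked by a curator as possibly wrong
    CORRECTED = "corrected"   # fixed by a curator


@dataclass
class Record:
    """Hypothetical record layout: one material with its critical temperature."""
    material: str
    tc_kelvin: float
    status: Status = Status.EXTRACTED
    history: List[Dict[str, Any]] = field(default_factory=list)


EDITABLE_FIELDS = ("material", "tc_kelvin")


def flag(record: Record, reason: str) -> None:
    """Workflow a) record flagging: mark a record as suspicious without editing it."""
    record.status = Status.FLAGGED
    record.history.append({"action": "flag", "reason": reason})


def correct(record: Record, **changes: Any) -> None:
    """Workflow b) record correction: apply edits and log old and new values."""
    for name, new_value in changes.items():
        if name not in EDITABLE_FIELDS:
            raise ValueError(f"unknown field: {name}")
        old_value = getattr(record, name)
        setattr(record, name, new_value)
        record.history.append(
            {"action": "correct", "field": name, "old": old_value, "new": new_value}
        )
    record.status = Status.CORRECTED


def correction_log(records: List[Record]) -> List[Dict[str, Any]]:
    """Collect every logged correction so it can be reused downstream."""
    return [
        entry
        for record in records
        for entry in record.history
        if entry["action"] == "correct"
    ]


if __name__ == "__main__":
    # Illustrative values only: a record with a wrongly extracted temperature.
    record = Record(material="MgB2", tc_kelvin=93.0)
    flag(record, reason="critical temperature looks too high")
    correct(record, tc_kelvin=39.0)
    print(record.status, correction_log([record]))
```

Keeping the flag and the correction as separate steps lets a quick review pass mark doubtful records without fixing them on the spot, and the logged old and new values are one possible way to address point d), the reuse of corrected data.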