16:45 〜 17:00
▲ [19p-Z32-12] Machine readable extraction of chemically modified materials name
Keywords: machine learning, material parser, text mining
We present a machine-learning-based material name parser for unstructured text. The parser implements a Conditional Random Field (CRF) model that segments the raw material string into seven components, including: name (metal diboride, hydrogen, etc.), chemical formula (La Fe O7, SiH4, etc.), doping ratio (Zn-doped, pure, etc.), stoichiometric variable names and values (x = 1, 2; y = 3), and shape (thin film, powder, etc.). We constructed a training set of 3000 material names from all the material entities in the SuperMat dataset.
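
As a rough illustration of the input side of such a parser, the sketch below splits a raw material string into tokens and builds one feature dictionary per token, which is the kind of representation a linear-chain CRF labeler typically consumes. Everything in it is an assumption made for illustration and is not taken from the authors' system: the tokenization rule, the feature names, the helper functions `tokenize`, `token_features`, and `featurize`, and the example string "Zn-doped MgB2 thin film".

```python
import re

# Hypothetical tokenization rule: runs of letters, numbers (optionally with a
# decimal part), or any single non-space symbol. The actual parser may differ.
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+(?:\.\d+)?|[^\sA-Za-z\d]")


def tokenize(material):
    """Split a raw material string into tokens, dropping whitespace."""
    return TOKEN_RE.findall(material)


def token_features(tokens, i):
    """Build an illustrative feature dictionary for the token at position i."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_digit": tok.replace(".", "", 1).isdigit(),
        "is_capitalized": tok[0].isupper(),
        "is_punct": not tok[0].isalnum(),
        "length": len(tok),
        # Neighboring tokens give the sequence model local context.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def featurize(material):
    """Return the tokens and their per-token feature dictionaries."""
    tokens = tokenize(material)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    # Invented example string combining component types named in the abstract.
    tokens, feats = featurize("Zn-doped MgB2 thin film")
    for tok, feat in zip(tokens, feats):
        print(tok, feat)
```

A trained CRF would then assign each token one of the component labels (for example doping for "Zn", "-", "doped" and shape for "thin", "film"), and adjacent tokens sharing a label would be merged into a segment. That labeling step is omitted here because it depends on the trained model.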