4:30 PM - 5:00 PM
[18p-B01-6] Materials Informatics: Small Data Problem and Transfer Learning
Keywords: Machine learning, Materials Informatics, Transfer learning
There is growing demand for machine learning (ML) to derive fast-to-evaluate surrogate models of materials properties. In recent years, a broad array of materials property databases has emerged as part of the digital transformation of materials science. However, recent advances in ML are not being fully exploited because the available materials data are insufficient in volume and diversity. An ML framework called "transfer learning" has considerable potential to overcome this limitation. Transfer learning relies on the concept that various property types, such as physical, chemical, electronic, thermodynamic, and mechanical properties, are physically interrelated. For a given target property to be predicted from a limited supply of training data, models of related proxy properties are pre-trained using sufficient data; these models capture common features relevant to the target task. Re-purposing such machine-acquired features for the target task yields outstanding prediction performance even with exceedingly small datasets, much as highly experienced human experts can make rational inferences even on tasks in which they have far less experience. In this study, to facilitate the widespread use of transfer learning, we develop a pre-trained model library called XenonPy.MDL. In its first release, the library comprises more than 100,000 pre-trained models for various properties of small molecules, polymers, and inorganic materials. Along with these pre-trained models, we describe some outstanding successes of transfer learning in different scenarios, such as building models from only dozens of materials data points and increasing the ability of extrapolative prediction through strategic model transfer.
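The pre-train-then-reuse workflow described above can be illustrated with a minimal, generic sketch. It does not use the XenonPy.MDL API; it assumes only NumPy, synthetic data standing in for a data-rich proxy property and a data-poor target property, and a frozen-feature-extractor form of transfer learning (one of several possible variants). All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, w_shared, head, rng):
    """Synthetic property data: a shared nonlinear representation, task-specific head."""
    X = rng.normal(size=(n, w_shared.shape[0]))
    return X, np.tanh(X @ w_shared) @ head

# Assumption: proxy and target properties depend on the same hidden representation.
d, k = 8, 4
w_shared = rng.normal(size=(d, k))
X_src, y_src = make_data(2000, w_shared, rng.normal(size=k), rng)  # plentiful proxy data
X_tgt, y_tgt = make_data(20, w_shared, rng.normal(size=k), rng)    # scarce target data

def pretrain(X, y, hidden=16, lr=0.05, epochs=300, seed=1):
    """Train a one-hidden-layer network on the proxy task; return the hidden layer."""
    r = np.random.default_rng(seed)
    W1 = r.normal(scale=0.3, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = r.normal(scale=0.3, size=hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        err = H @ w2 + b2 - y                   # residual of the squared-error loss
        dH = np.outer(err, w2) * (1.0 - H**2)   # backpropagate through tanh
        grad_w2, grad_b2 = H.T @ err / n, err.mean()
        W1 -= lr * (X.T @ dH / n)
        b1 -= lr * dH.mean(axis=0)
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
    return W1, b1

def features(X, W1, b1):
    """Frozen, machine-acquired features taken from the pre-trained model."""
    return np.tanh(X @ W1 + b1)

def fit_ridge(F, y, alpha=1.0):
    """Ridge regression on the transferred features (intercept left unpenalized)."""
    F1 = np.hstack([F, np.ones((len(F), 1))])
    penalty = alpha * np.eye(F1.shape[1])
    penalty[-1, -1] = 0.0
    return np.linalg.solve(F1.T @ F1 + penalty, F1.T @ y)

def predict(F, coef):
    return np.hstack([F, np.ones((len(F), 1))]) @ coef

W1, b1 = pretrain(X_src, y_src)          # step 1: pre-train on the proxy property
F_tgt = features(X_tgt, W1, b1)          # step 2: re-purpose the learned features
coef = fit_ridge(F_tgt, y_tgt)           # step 3: fit a small model on the target data
y_hat = predict(F_tgt, coef)
```

Only the final linear layer is fitted on the 20 target samples; the hidden layer is kept fixed after pre-training. Fine-tuning the whole network on the target data is an alternative when somewhat more target data are available.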