11:45 AM - 12:00 PM
▲ [22a-M206-11] Improving the language understanding in materials science: challenges and prospects
Keywords: materials informatics, deep learning, TDM
We built a BERT model on a corpus of 794,198 papers (142M sentences, 3.2B tokens) by continuing the pre-training of SciBERT (Mat+Sci+BERT).
We used Tensor Processing Units (TPUs) on the Google Cloud Platform (https://cloud.google.com), with support from Google through the "Google Cloud for Researchers" program.
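The continued pre-training follows the standard masked-language-model recipe. The sketch below is a minimal illustration, not our exact pipeline: it assumes the public SciBERT checkpoint on the Hugging Face Hub and a hypothetical one-sentence-per-line corpus file "materials_corpus.txt"; hyperparameters are placeholders.

    # Minimal sketch: continuing MLM pre-training from SciBERT with Hugging Face Transformers.
    # "materials_corpus.txt" and the hyperparameters below are illustrative placeholders.
    from datasets import load_dataset
    from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    model_name = "allenai/scibert_scivocab_uncased"  # public SciBERT checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)

    # One sentence per line; tokenize and truncate to BERT's 512-token limit.
    dataset = load_dataset("text", data_files={"train": "materials_corpus.txt"})
    tokenized = dataset["train"].map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True, remove_columns=["text"])

    # Dynamic token masking at 15% probability, as in the original BERT recipe.
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="mat_sci_bert",
                               per_device_train_batch_size=32,
                               num_train_epochs=1, learning_rate=1e-4),
        train_dataset=tokenized,
        data_collator=collator,
    )
    trainer.train()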
In this presentation, we discuss the details of our model and its evaluation on both domain-specific (superconductor NER) and generic (physical-quantity NER, CoLA) tasks.
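For the NER evaluations, the pre-trained model is fine-tuned for token classification and scored on held-out annotations. The snippet below is a hedged sketch of inference with such a model; the checkpoint name "mat_sci_bert-supercon-ner" is hypothetical and stands in for a fine-tuned model.

    # Minimal sketch of running a fine-tuned NER model for evaluation or inspection.
    # "mat_sci_bert-supercon-ner" is a hypothetical fine-tuned checkpoint.
    from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

    ner_checkpoint = "mat_sci_bert-supercon-ner"
    tokenizer = AutoTokenizer.from_pretrained(ner_checkpoint)
    model = AutoModelForTokenClassification.from_pretrained(ner_checkpoint)

    # Aggregate sub-word predictions into entity spans before scoring.
    ner = pipeline("token-classification", model=model, tokenizer=tokenizer,
                   aggregation_strategy="simple")
    print(ner("MgB2 becomes superconducting below 39 K."))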
In the future, we plan to pre-train from scratch using the original BERT (Mat+BERT) and RoBERTa (Mat+RoBERTa) implementations. To pre-train the RoBERTa model, we will use the Jean Zay supercomputer (500 Nvidia V100 GPUs), thanks to Inria, the French national research institute for computer science.