[MGI37-P15] Performance Evaluation of Inter-node MPI-Parallel TensorFlow for Use with Large Amounts of Training Data
Keywords: machine learning, inter-node parallelism
Machine learning (ML) has recently achieved remarkable results in recognition tasks and plays an essential role in artificial intelligence (AI). In general, the training data is one of the most critical factors in ML, since the features extracted from each training sample strongly affect the recognition accuracy. A large amount of training data is therefore required to avoid biased recognition. However, learning from such massive data takes a long time and requires a large amount of memory. In this talk, inter-node MPI-parallel TensorFlow is introduced.
TensorFlow is an ML framework developed by Google and is widely used around the world. Google has also developed an inter-node parallel version of TensorFlow, but it is not well suited to supercomputers because, for example, the compute nodes that participate in the calculation must be specified directly. To overcome this problem, Uber has released Horovod, an MPI-based TensorFlow extension that can use the MPI library tuned for the supercomputer. Through a collaboration between Cray and Kyoto University, a further optimized MPI-based TensorFlow called the CPE ML Plugin has been installed on the supercomputer of Kyoto University. In this talk, we present a performance evaluation of these MPI-based TensorFlow implementations.
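As an illustration of how such inter-node data-parallel training is typically expressed, the minimal sketch below uses the public Horovod API with tf.keras; the CPE ML Plugin interface is not shown here, and the model, dataset, and hyperparameters are placeholders rather than the configuration evaluated in this work.

    # Minimal sketch of Horovod-style data-parallel training (illustrative only).
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd

    hvd.init()  # initialize MPI communication across ranks

    # Pin each rank to one GPU, if GPUs are present on the node.
    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        tf.config.set_visible_devices(gpus[hvd.local_rank()], 'GPU')

    # Toy model and dataset; the actual evaluated network is not specified here.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])

    # Scale the learning rate by the number of ranks and wrap the optimizer
    # so that gradients are averaged across nodes with MPI allreduce.
    opt = tf.keras.optimizers.SGD(0.01 * hvd.size())
    opt = hvd.DistributedOptimizer(opt)
    model.compile(loss='sparse_categorical_crossentropy', optimizer=opt)

    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train / 255.0

    # Broadcast initial weights from rank 0 so all ranks start identically.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]

    model.fit(x_train, y_train, batch_size=64, epochs=1,
              callbacks=callbacks, verbose=1 if hvd.rank() == 0 else 0)

Such a script is launched with one rank per node (or per GPU), e.g. via mpiexec or horovodrun, so that each rank trains on a shard of the data while gradients are exchanged over the supercomputer's tuned MPI.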