Japan Geoscience Union Meeting 2014

Presentation information

Oral

Symbol M (Multidisciplinary and Interdisciplinary) » M-GI General Geosciences, Information Geosciences & Simulations

[M-GI37_29AM1] Earth and planetary informatics with huge data management

Tue. Apr 29, 2014 9:00 AM - 10:30 AM 413 (4F)

Convener: *Eizi TOYODA (Numerical Prediction Division, Japan Meteorological Agency), Yasuhiro Murayama (National Institute of Information and Communications Technology), Junya Terazono (The University of Aizu), Tomoaki Hori (Nagoya University Solar Terrestrial Environment Laboratory Geospace Research Center), Kazuo Ohtake (Japan Meteorological Agency), Mayumi Wakabayashi (Kiso-Jiban Consultants Co.,Ltd), Takeshi Horinouchi (Faculty of Environmental Earth Science, Hokkaido University), Susumu Nonogaki (Geological Survey of Japan, National Institute of Advanced Industrial Science and Technology), Chair: Kazuo Ohtake (Japan Meteorological Agency)

9:30 AM - 9:45 AM

[MGI37-03] An Examination of Data I/O Speed on a Parallel Data Storage System

*Ken T. MURATA1, Hidenobu WATANABE1, Kentaro UKAWA2, Kazuya MURANAGA2, Suzuki YUTAKA2, Osamu TATEBE3, Masahiro TANAKA3, Eizen KIMURA4 (1.National Institute of Information and Communications Technology, 2.Systems Engineering Consultants Co., LTD., 3.University of Tsukuba, 4.Ehime University)

This paper presents a cloud system for science developed at NICT (National Institute of Information and Communications Technology), Japan. The NICT Science Cloud is an open cloud system for scientists who carry out informatics studies in their own fields. It is not intended for simple uses. Many functions are expected of a science cloud, such as data standardization, data collection and crawling, large-scale distributed data storage, security and reliability, databases and meta-databases, data stewardship, long-term data preservation, data rescue and preservation, data mining, parallel processing, data publication and provision, the semantic web, 3D and 4D visualization, out-reach and in-reach, and capacity building.

In the present study, we examine the performance of parallelized I/O on the NICT Science Cloud system. We examine two types of file system: a parallel file system (GPFS) and a distributed file system (Gfarm). The latter shows very fast I/O, as fast as 23 GB/sec using only 30 servers. This I/O speed deserves attention from the viewpoint of network speed: 23 GB/sec corresponds to 184 Gbps. Since typical network speeds in a cloud system are currently 10 Gbps or 40 Gbps, 184 Gbps is fast enough that I/O need not be a bottleneck for big data processing. We also discuss how the distributed file system shows better scalability than GPFS. Parallelization efficiency in the present examination is higher than 90% in the case of the parallel file system. This suggests that, in the near future, we can expect higher I/O speed using more file servers. For instance, if the I/O speed is as fast as 100 GB/sec, it takes only 1,000 sec. (about 17 min.) to read 100 TB of data files.
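The unit conversion and read-time estimate quoted in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study; the function names are hypothetical, and decimal units are assumed (1 GB = 8 Gbit, 1 TB = 1,000 GB).

```python
def gb_per_sec_to_gbps(gb_per_sec):
    """Convert throughput from gigabytes/sec to gigabits/sec (8 bits per byte)."""
    return gb_per_sec * 8


def read_time_seconds(size_tb, gb_per_sec):
    """Time in seconds to read size_tb terabytes at gb_per_sec.

    Assumes decimal units: 1 TB = 1,000 GB.
    """
    return size_tb * 1000 / gb_per_sec


if __name__ == "__main__":
    # Measured Gfarm throughput from the abstract: 23 GB/sec
    print(gb_per_sec_to_gbps(23), "Gbps")

    # Hypothetical future throughput from the abstract: 100 GB/sec, 100 TB of data
    seconds = read_time_seconds(100, 100)
    print(seconds, "sec, about", round(seconds / 60), "min")
```

Under these assumptions the figures agree with the abstract: 23 GB/sec is 184 Gbps, and 100 TB at 100 GB/sec takes 1,000 sec, roughly 17 min.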