Japan Geoscience Union Meeting 2014

Presentation information

International Session (Poster)

Symbol U (Union) » Union

[U-01_1PO1] Forum for Global Data Sciences in Earth and Planetary Research

Thu. May 1, 2014 6:15 PM - 7:30 PM Poster (3F)

Convener:*Murayama Yasuhiro(National Institute of Information and Communications Technology), Toshio Koike(Department of Civil Engineering, The University of Tokyo), Masatoshi Ohishi(Astronomy Data Center, National Astronomical Observatory of Japan), Masaru Kitsuregawa(Institute of Industrial Science, the University of Tokyo), Ryosuke Shibasaki(Center for Spatial Information Science, the University of Tokyo), Takashi Watanabe(Solar-Terrestrial Environment Laboratory, Nagoya University)

6:15 PM - 7:30 PM

[U01-P01] Construction of spatio-temporal data mining system for time-series satellite imagery using Hadoop

Kou NISHIMAE1, Tomoya MIYOSHI1, *Rie HONDA1 (1.Kochi University)

Keywords: distributed processing, Hadoop, MapReduce, data mining, spatio-temporal, satellite imagery

Large volumes of spatio-temporal data have been accumulated in various fields of science, such as remote sensing, numerical simulation, and astronomical observation, where the data often take the form of time-series images. To extract spatio-temporal knowledge from such data, the spatio-temporal cross section relevant to the target task must first be extracted from the mass of data. Because these data are stored as a large number of files, a distributed processing framework such as Hadoop or Gfarm is a promising approach. We constructed a distributed data mining system for time-series satellite images using 53 iMac nodes (3 masters and at most 50 slaves) and Hadoop, which provides a distributed file system and distributed processing via MapReduce. We carefully evaluated the scalability and performance of the system on the task of extracting time-series data from a large number of images, and found that dividing the images into an optimal number of partitions and reducing the volume of data passed between the map phase and the reduce phase are essential. The system was then applied to two different tasks that analyze time-series data extracted from satellite imagery: statistical modeling of seasonal changes in a vegetation index, and spatio-temporal correlation analysis of weather satellite images. Both tasks were successfully implemented on the system, and the computation time decreased in inverse proportion to the number of slave nodes, demonstrating the usefulness of the distributed system for spatio-temporal data mining of time-series images.
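
The extraction task described in the abstract can be illustrated with a minimal in-process sketch in Python. This is not the authors' Hadoop implementation: the function names (`map_image`, `reduce_series`, `extract_time_series`), the nested-list image format, and the list of target pixels are all assumptions made for illustration. The map step emits only the pixels of interest from each time-stamped image, which is one simple way to keep the data passed to the reduce step small; the reduce step orders each pixel's records by time.

```python
from collections import defaultdict


def map_image(timestamp, image, targets):
    """Map step: emit ((row, col), (timestamp, value)) for target pixels only.

    Emitting only the pixels of interest keeps the intermediate data small.
    """
    for (r, c) in targets:
        yield (r, c), (timestamp, image[r][c])


def reduce_series(pixel, records):
    """Reduce step: sort one pixel's records by time to form its time series."""
    return pixel, [value for _, value in sorted(records)]


def extract_time_series(images, targets):
    """Run map, shuffle (group by key), and reduce in a single process."""
    shuffled = defaultdict(list)
    for timestamp, image in images:
        for key, record in map_image(timestamp, image, targets):
            shuffled[key].append(record)
    return dict(reduce_series(k, v) for k, v in shuffled.items())


# Hypothetical 2x2 images with integer timestamps, deliberately out of order.
images = [
    (2, [[5, 6], [7, 8]]),
    (1, [[1, 2], [3, 4]]),
    (3, [[9, 10], [11, 12]]),
]
series = extract_time_series(images, targets=[(0, 1), (1, 0)])
```

In a real Hadoop job the grouping by key would be done by the framework's shuffle across nodes rather than by the in-memory dictionary used here.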