Japan Geoscience Union Meeting 2014

Presentation Information

International Session (Poster)

Session symbol U (Union) » Union

[U-01_1PO1] Forum for Global Data Sciences in Earth and Planetary Research

Thu. May 1, 2014 18:15 〜 19:30 Poster Hall, 3rd floor (3F)

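Conveners: *Yasuhiro Murayama (National Institute of Information and Communications Technology), Toshio Koike (Department of Civil Engineering, School of Engineering, The University of Tokyo), Masatoshi Ohishi (Astronomy Data Center, National Astronomical Observatory of Japan), Masaru Kitsuregawa (Institute of Industrial Science, The University of Tokyo), Ryosuke Shibasaki (Center for Spatial Information Science, The University of Tokyo), Takashi Watanabe (Solar-Terrestrial Environment Laboratory, Nagoya University)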

18:15 〜 19:30

[U01-P01] Construction of a Spatio-temporal Data Mining System for Time-series Satellite Images Using Hadoop

Hikaru Nishimae1, Tomoya Miyoshi1, *Rie Honda1 (1. Kochi University)

Keywords: distributed processing, Hadoop, MapReduce, data mining, spatio-temporal, satellite imagery

Large volumes of spatio-temporal data have accumulated in many fields of science, such as remote sensing, numerical simulation, and astronomical observation, where the data often take the form of time-series images. To extract spatio-temporal knowledge from such data, the spatio-temporal cross section relevant to the target task must first be extracted from the mass of data. Because these data are stored as a large number of files, a distributed processing framework such as Hadoop or Gfarm is a promising approach.

We constructed a distributed data mining system for time-series satellite images using up to 53 iMac nodes (3 masters and at most 50 slaves) and Hadoop, which provides a distributed file system and distributed processing with MapReduce. We carefully evaluated the scalability and performance of the system on the task of extracting time-series data from a large number of images, and found that it is essential to partition the images into an optimal number of pieces and to reduce the amount of data passed between the map phase and the reduce phase.
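The extraction task described above can be illustrated with a minimal sketch in plain Python. It does not use the Hadoop API and is not the authors' implementation. The function names (`split_into_tiles`, `map_tile`, `reduce_series`, `extract_time_series`), the tile size, and the toy images are all assumptions made for illustration. The map step emits values only for the requested pixels, which is one simple way to keep the data passed to the reduce step small.

```python
from collections import defaultdict


def split_into_tiles(image, tile_size):
    """Partition a 2-D image (list of rows) into tiles keyed by their origin."""
    tiles = {}
    for r0 in range(0, len(image), tile_size):
        for c0 in range(0, len(image[0]), tile_size):
            tiles[(r0, c0)] = [row[c0:c0 + tile_size]
                               for row in image[r0:r0 + tile_size]]
    return tiles


def map_tile(timestamp, origin, tile, targets):
    """Map step: emit (pixel, (timestamp, value)) for target pixels only."""
    r0, c0 = origin
    for dr, row in enumerate(tile):
        for dc, value in enumerate(row):
            pixel = (r0 + dr, c0 + dc)
            if pixel in targets:
                yield pixel, (timestamp, value)


def reduce_series(pairs):
    """Reduce step: group records by pixel and sort each group by time."""
    grouped = defaultdict(list)
    for pixel, record in pairs:
        grouped[pixel].append(record)
    return {pixel: sorted(records) for pixel, records in grouped.items()}


def extract_time_series(images, targets, tile_size=2):
    """Apply the map step to every tile of every image, then reduce."""
    pairs = []
    for timestamp, image in images:
        for origin, tile in split_into_tiles(image, tile_size).items():
            pairs.extend(map_tile(timestamp, origin, tile, targets))
    return reduce_series(pairs)


def make_image(t, size=4):
    """Hypothetical toy image whose pixel value encodes time, row and column."""
    return [[100 * t + 10 * r + c for c in range(size)] for r in range(size)]


# Images arrive in arbitrary order; the reduce step restores time order.
images = [(2, make_image(2)), (0, make_image(0)), (1, make_image(1))]
targets = {(0, 0), (3, 2)}
series = extract_time_series(images, targets)
print(series[(3, 2)])
```

In a real Hadoop job the map and reduce steps would run on separate nodes, so the tile size and the filtering in the map step would determine how much data crosses the network between the two phases.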

The system was then applied to two different tasks that analyze time-series data extracted from satellite imagery: statistical modeling of seasonal changes in a vegetation index, and spatio-temporal correlation analysis of weather satellite images. Both tasks were implemented successfully on the system, and the computation time decreased in inverse proportion to the number of slave nodes, which demonstrates the usefulness of the distributed system for spatio-temporal data mining of time-series images.
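The abstract does not say which statistical model was used for the seasonal changes in the vegetation index. As one possible illustration, the sketch below fits a single annual harmonic to the time series of one pixel. The model choice, the function name `fit_annual_harmonic`, and the synthetic monthly values are all assumptions. The closed-form coefficients are valid only for evenly spaced samples that cover exactly one period.

```python
import math


def fit_annual_harmonic(values):
    """Fit y[k] = mean + A * cos(2*pi*k/n - phase) to one full period.

    Returns (mean, amplitude, peak_index), where peak_index is the
    (possibly fractional) sample index at which the fitted curve peaks.
    Assumes n >= 3 evenly spaced samples covering exactly one period.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 evenly spaced samples")
    mean = sum(values) / n
    # Project onto the cosine and sine basis functions (orthogonal for n >= 3).
    a = 2.0 / n * sum(v * math.cos(2 * math.pi * k / n)
                      for k, v in enumerate(values))
    b = 2.0 / n * sum(v * math.sin(2 * math.pi * k / n)
                      for k, v in enumerate(values))
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a) % (2 * math.pi)
    peak_index = phase * n / (2 * math.pi)
    return mean, amplitude, peak_index


# Synthetic monthly vegetation index for one pixel, peaking at index 6.
ndvi = [0.5 + 0.3 * math.cos(2 * math.pi * (k - 6) / 12) for k in range(12)]
mean, amplitude, peak_index = fit_annual_harmonic(ndvi)
print(round(mean, 3), round(amplitude, 3), round(peak_index, 3))
```

Because each pixel is fitted independently, a model of this kind could run once per extracted time series in the reduce step, which fits the near-inverse scaling with node count reported above.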