The 36th Annual Conference of the Japanese Society for Artificial Intelligence, 2022

Presentation information

International Session: ES-2 Machine learning

[1S5-IS-2a] Machine learning

Tuesday, June 14, 2022, 16:20-18:00, Room S (remote S)

Chair: Toshihiko Matsuka (Chiba University)

17:20-17:40

[1S5-IS-2a-04] Network Structure based Clustering of Multiple Heterogeneous Datasets Using Metadata

〇Takeshi Sakumoto¹, Teruaki Hayashi², Hiroki Sakaji², Hirofumi Nonaka¹ (1. Nagaoka University of Technology, 2. The University of Tokyo)

Regular

Keywords: clustering, heterogeneous data, network clustering, clustering of multiple heterogeneous datasets, metadata

Recent developments in computers and data exchange platforms have raised expectations for innovation through combining data. In machine learning in particular, researchers have focused on combining datasets for this purpose. Most previous studies assume that researchers can easily access sets of closely related datasets that share similar topics, are contextually similar, or come from the same domain. In general, however, data providers do not necessarily design and create datasets on the premise that they will be exchanged or merged. Furthermore, unified schemas are currently not sufficiently maintained, and the areas where they can be applied are limited. These problems make it difficult to search, discover, exchange, and utilize datasets on data platforms where various types of interdisciplinary data are exchanged. In this research, we propose a network-based method that uses non-human-readable metadata to detect clusters of closely related datasets within a collection of various types of datasets. Experimental results on Kaggle metadata datasets demonstrate the effectiveness of the proposed method.
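
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of the general idea of network-based clustering over metadata, and should not be read as the authors' method. It assumes each dataset is described by a set of metadata tokens (here, hypothetical variable names), links two datasets when the Jaccard similarity of their sets reaches a threshold, and reports connected components as clusters. The function names, the similarity measure, the threshold, and the example data are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_metadata(metadata, threshold=0.5):
    """Group dataset names linked by metadata similarity >= threshold.

    Assumption: `metadata` maps a dataset name to a set of metadata tokens.
    """
    # Build an undirected graph: one node per dataset,
    # with an edge when two datasets are similar enough.
    adjacency = {name: set() for name in metadata}
    for u, v in combinations(metadata, 2):
        if jaccard(metadata[u], metadata[v]) >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)

    # Clusters are the connected components, found by depth-first search.
    clusters, seen = [], set()
    for start in metadata:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical metadata: variable names attached to each dataset.
example = {
    "weather_a": {"date", "temperature", "humidity"},
    "weather_b": {"date", "temperature", "rainfall"},
    "sales": {"date", "store_id", "revenue"},
}
clusters = cluster_by_metadata(example, threshold=0.5)
```

With this toy input, the two weather datasets share half of their combined variable names and fall into one cluster, while the sales dataset stays on its own. A real system would likely replace connected components with a community detection method and tune the similarity measure to the metadata at hand.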
