M-GI General Geosciences, Information Geosciences & Simulations

Open Research Data and Interoperable Science Infrastructures for Earth & Planetary Sciences

Mon. May 23, 2016 9:00 AM - 10:30 AM

Yasuhiro Murayama (Integrated Science Data System Research Laboratory, National Institute of Information and Communications Technology), Baptiste Cecconi (LESIA, Observatoire de Paris, CNRS, PSL Research University), Yasuhisa Kondo (Research Institute for Humanity and Nature), Reiichiro Ishii (Japan Agency for Marine-Earth Science and Technology), Daniel Crichton (Jet Propulsion Laboratory, National Aeronautics and Space Administration)

10:15 AM - 10:30 AM

How to make the data sets in "dark long tail" open and preserve?

Toshihiko Iyemori (Data Analysis Center for Geomagnetism and Space Magnetism, Graduate School of Science, Kyoto University)

Keywords: open data, data preservation, small data set

In data analysis, we often run into difficulty because of a lack of data, and we try to find additional data sets by asking researchers in the same research community. Sometimes we find a data set that fills the gap, or we come across an unexpected data set that turns out to be very useful. In most cases, however, we cannot find the data. We know that a huge number of data sets, mainly obtained on a research-project basis, are not registered with active data centres and hence are 'dark' to many of us. These data sets are typically built by small research groups over a limited period, and the data are not open to the public. Although they exist only for a limited period, such data are very important and useful if the location of the observation site is highly unique or if no other observations are available.
One way to bring such data sets out of the 'dark long tail' and make them open is to register metadata that describe the observations in as much detail as possible. An example of this in practice is IUGONET (Inter-university Upper atmosphere Global Observation NETwork), which maintains a common metadata database and forms a virtual data centre of distributed databases at several institutions. This data system includes databases from the 'dark long tail' as well as large, well-known databases.
Another way is to use university repositories. In this case, however, we need a common method for finding and retrieving the data sets.
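
As a rough illustration of what a common metadata registry and a common search method could look like, the following minimal Python sketch describes each data set by where and when it was observed and by where the data are held, then filters the entries by region and period. The record fields, the find_records helper, and the two sample entries are assumptions made up for this illustration; they do not reproduce the IUGONET metadata format or the interface of any real repository.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ObservationRecord:
    """Hypothetical metadata entry; the field names are illustrative only."""
    title: str
    institution: str
    latitude: float   # degrees north
    longitude: float  # degrees east
    start: date       # first day of observation
    end: date         # last day of observation
    data_url: str     # where the data themselves are held


def find_records(registry, lat_range, lon_range, start, end):
    """Return entries inside the box whose period overlaps [start, end]."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    return [
        r for r in registry
        if lat_min <= r.latitude <= lat_max
        and lon_min <= r.longitude <= lon_max
        and r.start <= end and r.end >= start
    ]


# Invented placeholder entries: one small project data set, one large database.
registry = [
    ObservationRecord("Magnetometer campaign A", "University X", 35.0, 135.8,
                      date(2009, 4, 1), date(2011, 3, 31),
                      "https://repository.example/a"),
    ObservationRecord("Large observatory B", "Data centre Y", -69.0, 39.6,
                      date(1990, 1, 1), date(2016, 1, 1),
                      "https://datacentre.example/b"),
]

hits = find_records(registry, (30.0, 40.0), (130.0, 140.0),
                    date(2010, 1, 1), date(2010, 12, 31))
```

Because the registry holds only descriptions and links, the data can stay at the institution or university repository that produced them, while a single query covers both small project data sets and large databases.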