JSAI2018

Presentation information

Oral presentation

[4M1] [General Session] 10. Vision / Speech

Fri. Jun 8, 2018 12:00 PM - 1:40 PM Room M (2F Amethyst Hall Hoo)

Chair: Asako Kanezaki (National Institute of Advanced Industrial Science and Technology)

1:00 PM - 1:20 PM

[4M1-04] Image Modality Translation for Enriching Virtual Space

Shinta Masuda1, Takashi Machida2, 〇Takashi Matsubara1, Kuniaki Uehara1 (1. Graduate School of System Informatics, Kobe University, 2. Toyota Central R&D Labs., Inc.)

Keywords: image translation, image processing, virtual space

Following the great success of machine learning on various benchmarks, its practical use is attracting attention. A machine learning system has to be trained on a wide variety of data samples and tested under various conditions, but collecting many data samples is very costly. This creates a demand for data augmentation. In this paper, we tackle the augmentation of real images by translating them from one modality to another, for example from daytime to night-time. This data augmentation enables us to train and test a machine learning system in various modalities. We first demonstrate that two existing approaches, pix2pix and cycle-GAN, are difficult to apply to data augmentation: pix2pix either requires paired samples in both modalities or cannot overcome the difference between the modalities, and cycle-GAN sometimes fails to keep consistency between the two modalities. We then propose modifications of these methods that improve consistency in image modality translation.