JSAI2022

Presentation information

General Session

[1O4-GS-7] Vision, speech media processing: detection / data set creation

Tue. Jun 14, 2022 2:20 PM - 4:00 PM Room O (Room 510)

Chair: Kenta Ishihara (石原 賢太, NEC) [remote]

2:40 PM - 3:00 PM

[1O4-GS-7-02] An Image Data Augmentation Method Based on an Unsupervised Segmentation Model for Object Detection Tasks

〇Yuto Ichikawa1, Kennichiro Shimada1, Ryosuke Tanno1, Tomonori Izumitani1 (1. NTT Communications Corporation)

Keywords: Data Augmentation, Image Detection, Segmentation

Data augmentation by randomly pasting cropped rectangular images that contain objects is commonly used to generate training data for object detection models. With this simple method, the boundary between the original image and the pasted images tends to be unnatural, which may affect detection performance. In addition, it is costly to crop images into object shapes using masks obtained by unsupervised learning methods. We propose a method that generates images with natural boundaries using the copy-paste GAN. The method can produce augmented images without mask creation costs. To show its effectiveness, we compared it with conventional methods in terms of detection accuracy and confidence score on two image datasets: the Airbus Ship Detection Challenge dataset and the Happy-whale dataset. The results demonstrate the effectiveness of our data augmentation framework.
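For readers unfamiliar with the conventional baseline mentioned in the abstract, the following is a minimal sketch of rectangular cut-and-paste augmentation. It is not the authors' implementation and does not cover the proposed copy-paste GAN approach. The function name `paste_patch`, the bounding-box convention, and the use of plain nested lists in place of image arrays are all assumptions made for illustration.

```python
import random


def paste_patch(background, patch, rng):
    """Paste a rectangular patch at a random location (hypothetical helper).

    Images are nested lists of pixel values, indexed as image[row][col].
    Returns the augmented image and the pasted box as
    (x_min, y_min, x_max, y_max), with the max edges exclusive.
    """
    bg_h, bg_w = len(background), len(background[0])
    p_h, p_w = len(patch), len(patch[0])
    if p_h > bg_h or p_w > bg_w:
        raise ValueError("patch must fit inside the background")

    # Choose a top-left corner so the whole patch stays inside the image.
    top = rng.randint(0, bg_h - p_h)
    left = rng.randint(0, bg_w - p_w)

    # Copy the background so the caller's image is left untouched.
    out = [row[:] for row in background]

    # Overwrite the target rectangle. The hard rectangular edge this
    # produces is the kind of unnatural boundary the abstract refers to.
    for dy in range(p_h):
        out[top + dy][left:left + p_w] = patch[dy]

    return out, (left, top, left + p_w, top + p_h)


# Toy usage: a 6x8 background of zeros and a 2x3 "object" patch of nines.
rng = random.Random(0)
background = [[0] * 8 for _ in range(6)]
patch = [[9] * 3 for _ in range(2)]
augmented, bbox = paste_patch(background, patch, rng)
```

The returned box can serve directly as the detection label for the pasted object, which is why this baseline needs no mask annotation but also cannot follow the object's true outline.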
