JSAI2023

Presentation information

Organized Session

[2R4-OS-12] Machine Learning Quality Evaluation and Improvement Technologies

Wed. Jun 7, 2023 1:30 PM - 3:10 PM Room R (602)

Organizers: 磯部 祥尚, 小林 健一, 中島 震

1:50 PM - 2:10 PM

[2R4-OS-12-02] Regression Prediction of Attribute Value Using Semi-Supervised Representation Learning for Dataset Quality Assessment

〇Masatoshi Sekine, Daisuke Shimbara, Tomoyuki Myojin, Eri Imatani (Hitachi, Ltd.)

Keywords: data quality assessment, variational autoencoder, representation learning, regression, attribute

AI software differs from traditional software in that it is derived inductively from training data, so preparing high-quality training data is essential. We previously proposed a method that evaluates data quality using a variational autoencoder. With that method, however, users must manually map latent variables to attributes, which is laborious and time-consuming and makes the attribute values difficult to quantify. In this study, we propose a semi-supervised representation learning approach that extends the variational autoencoder so that attribute values can be predicted automatically and quantitatively from latent variables. The proposed method adds to the loss function a term containing the coefficient of determination, which measures the goodness of fit of a regression equation. On the labeled data, which is a portion of the dataset, attribute values are predicted from the latent variables by regression analysis, and the model is trained to increase the coefficient of determination. Applying the proposed method to a dataset of handwritten characters on forms indicated that it is effective for objectively evaluating dataset quality.
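
The loss described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so the following are assumptions made only for illustration: a linear least-squares regression from the latent means to the attribute value, a penalty of the form `lam * (1 - R^2)`, a squared-error reconstruction term, and the hypothetical names `r_squared`, `semi_supervised_loss`, `labeled_mask`, and `lam`. It shows the arithmetic only; actual training would need the same computation in an automatic-differentiation framework.

```python
import numpy as np

def r_squared(z, y):
    """Coefficient of determination of a least-squares linear fit y ~ [z, 1] @ w."""
    X = np.hstack([z, np.ones((z.shape[0], 1))])  # append an intercept column
    w, *_ = np.linalg.lstsq(X, y, rcond=None)     # ordinary least squares
    ss_res = np.sum((y - X @ w) ** 2)             # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)          # total sum of squares
    return 1.0 - ss_res / ss_tot

def semi_supervised_loss(x, x_recon, mu, log_var, y, labeled_mask, lam=1.0):
    """VAE loss plus an assumed penalty lam * (1 - R^2) on the labeled subset.

    x, x_recon   : (n, d) inputs and their reconstructions
    mu, log_var  : (n, k) encoder outputs (latent mean and log-variance)
    y            : (n,) attribute values; entries outside labeled_mask are ignored
    labeled_mask : (n,) boolean array marking the labeled portion of the data
    """
    # Squared-error reconstruction term (an assumption; other likelihoods are possible).
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # KL divergence between N(mu, exp(log_var)) and the standard normal prior.
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    # Regression fit quality, computed on labeled samples only.
    r2 = r_squared(mu[labeled_mask], y[labeled_mask])
    # Minimizing (1 - R^2) is equivalent to increasing R^2.
    return recon + kl + lam * (1.0 - r2)
```

In this sketch, unlabeled samples contribute only to the reconstruction and KL terms, while labeled samples additionally push the latent variables toward a representation from which the attribute value is linearly predictable.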
