1:50 PM - 2:10 PM
[2R4-OS-12-02] Regression Prediction of Attribute Value Using Semi-Supervised Representation Learning for Dataset Quality Assessment
Keywords: data quality assessment, variational autoencoder, representation learning, regression, attribute
AI software differs from traditional software in that it is derived inductively from training data, so preparing high-quality training data is essential. We previously proposed a method for evaluating data quality using a variational autoencoder. With that method, however, users must manually map latent variables to attributes, which is laborious and time-consuming and makes the attribute values difficult to quantify. In this study, we propose a semi-supervised representation learning approach that extends the variational autoencoder so that attribute values can be predicted automatically and quantitatively from latent variables. The proposed method adds to the loss function a term containing the coefficient of determination, which measures the goodness of fit of a regression equation. On the labeled portion of the dataset, attribute values are predicted from the latent variables by regression analysis, and the model is trained so that the coefficient of determination increases. Applying the proposed method to a dataset of handwritten characters on forms indicated that it is effective for objectively evaluating dataset quality.
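The loss term described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the term, so everything here is an assumption made for illustration: the regression is taken to be ordinary least squares with an intercept, the penalty is taken to be `weight * (1 - R^2)`, and the names `r_squared`, `semi_supervised_loss`, and `weight` are hypothetical. The reconstruction and KL terms of the variational autoencoder are passed in as plain numbers rather than computed.

```python
import numpy as np


def r_squared(z, y):
    """Coefficient of determination of a least-squares fit of y on z.

    z: (n, d) array of latent variables for the labeled samples.
    y: (n,) array of attribute values for the same samples.
    """
    # Append a column of ones so the fit includes an intercept.
    design = np.column_stack([z, np.ones(len(z))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    ss_res = float(residual @ residual)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def semi_supervised_loss(recon_loss, kl_loss, z_labeled, y_labeled, weight=1.0):
    """VAE loss plus an assumed penalty that shrinks as R^2 grows.

    The form weight * (1 - R^2) is an illustrative assumption,
    not the formula from the paper.
    """
    penalty = weight * (1.0 - r_squared(z_labeled, y_labeled))
    return recon_loss + kl_loss + penalty
```

Under these assumptions, minimizing the total loss pushes the latent variables of the labeled samples toward a configuration from which the attribute value can be regressed well, which is one way to read "trained to increase the coefficient of determination". In actual training the same computation would need to be written in an automatic differentiation framework so that gradients flow through the latent variables.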