4:30 PM - 4:50 PM
[2T5-OS-5b-04] Exploration of Evaluation Functions Correlated with Human Evaluation in Generating Descriptions of Fashion Coordination
Keywords: AI, Alignment, evaluation of texts generated by natural language generation (NLG) systems
For Large Language Models (LLMs) to be applied in the real world, the text they generate must be valuable to humans and of a quality humans find acceptable. This study seeks evaluation functions whose scores correlate with human evaluations of fashion coordination descriptions generated by LLMs. Identifying such functions would make it possible to improve the accuracy of fashion coordination description generation models in a direction aligned with human values, and could potentially automate the entire process from description generation to evaluation. In this research, skilled fashion stylists evaluated LLM-generated fashion coordination descriptions, and a dataset was built from their evaluations. Using this dataset, we searched for evaluation metrics that correlate with the human evaluations. The candidate functions were drawn from those used in the abstractive summarization task.
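As a rough illustration of the kind of analysis described above, the following sketch scores generated descriptions with one candidate function and measures how well those scores track human ratings. Everything specific here is an assumption made for illustration: the abstract names neither the candidate metrics nor the correlation coefficient, so the ROUGE-1-style unigram overlap F1 and the Spearman rank correlation are stand-ins, and the function names `unigram_f1`, `average_ranks`, and `spearman` are hypothetical.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """ROUGE-1-style unigram overlap F1 on lowercased, whitespace-split tokens.

    Assumption: a stand-in for a summarization metric, not the paper's choice.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for constant input; 0.0 is a convention chosen here
    return cov / (var_x * var_y) ** 0.5
```

In use, one would compute a score per generated description with each candidate function, collect the stylists' ratings for the same descriptions in the same order, and compare candidates by the value `spearman(metric_scores, human_ratings)` returns. A rank correlation is used in this sketch because it does not assume the metric and the human ratings share a scale.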