Exploration of Evaluation Functions Correlated with Human Evaluation in Generating Descriptions of Fashion Coordination

Yuya Fujisaki

4:30 PM - 4:50 PM

[2T5-OS-5b-04] Exploration of Evaluation Functions Correlated with Human Evaluation in Generating Descriptions of Fashion Coordination

Yuya Fujisaki¹,², 〇Masashi Kishimoto¹, Natsuyo Tazaki¹, Tomoaki Tsuzuki¹ (1. DROBE.Co, 2. Japan Advanced Institute of Science and Technology)

Keywords: AI, Alignment, evaluation of texts generated by natural language generation (NLG) systems

To apply Large Language Models (LLMs) in the real world, the text they generate must be valuable to humans and of a quality that humans find acceptable. This study aims to find evaluation functions that correlate with human evaluations of fashion coordination descriptions generated by LLMs. Identifying such functions could make it possible to improve the accuracy of fashion coordination description generation models in a direction aligned with human values, and potentially to automate the entire process from description generation to evaluation. In this research, skilled fashion stylists evaluated fashion coordination descriptions generated by LLMs, and we built a dataset from their evaluations. Using this dataset, we searched for evaluation metrics that correlate with the human evaluations. The candidate functions were those used in the abstractive summarization task.
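The general idea of checking whether an automatic metric tracks human ratings can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not name the candidate functions or the correlation statistic, so the simplified ROUGE-1 F1 score, the Spearman rank correlation, the whitespace tokenization, the availability of a reference text per description, and all function names here are assumptions made for the example.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 (a simplified ROUGE-1) on whitespace tokens.

    Assumption: whitespace tokenization. Japanese text would need a
    morphological tokenizer instead.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN when either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical toy values for illustration only, not data from the study.
    reference = "a navy blazer paired with white trousers"
    generated = [
        "a navy blazer with white trousers",
        "a red dress",
        "navy blazer paired with trousers",
    ]
    stylist_scores = [5, 1, 4]  # hypothetical human ratings
    metric_scores = [rouge1_f1(text, reference) for text in generated]
    print(spearman(metric_scores, stylist_scores))
```

A rank correlation is used in this sketch because human ratings are often ordinal, so agreement in ordering matters more than agreement in scale. Any other candidate metric could be substituted for `rouge1_f1` without changing the correlation step.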

Presentation information

[2T5-OS-5b] OS-5
