1:20 PM - 1:40 PM
[1D3-GS-7-02] Style Analysis of E-Commerce Site Images Using Multimodal Embeddings
Keywords: Large Language Model, Embeddings, Clustering
As the e-commerce market expands and technology advances, detailed analysis of consumer purchasing behavior and a clear understanding of consumer preferences have become crucial. This is particularly true where the visual appeal of product images plays a significant role in consumer engagement. In this study, we used multimodal embeddings to analyze the style and nuances of art images on e-commerce sites. Specifically, we employed CoCa (Contrastive Captioners are Image-Text Foundation Models) to extract multimodal embeddings that capture the complex patterns and stylistic elements of product images, and then clustered the images into distinct style groups. Our analysis showed that multimodal embeddings are effective at detecting subtle stylistic changes in images. It also suggested that applying such generative AI could greatly improve the understanding of the image characteristics that consumers prefer.
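The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or settings were used, so everything here is an assumption for illustration: k-means on L2-normalized vectors (so that similarity is cosine similarity), the function names `l2_normalize` and `kmeans_cosine`, the choice of three clusters, and the synthetic 16-dimensional vectors that stand in for real CoCa image embeddings.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def kmeans_cosine(embeddings, k, n_iter=50, seed=0):
    """Hypothetical helper: k-means on unit vectors using cosine similarity.

    This is an assumed stand-in for the (unspecified) clustering method
    in the abstract, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    x = l2_normalize(np.asarray(embeddings, dtype=np.float64))
    # Initialize centroids from k distinct data points.
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each embedding to the most similar centroid.
        labels = np.argmax(x @ centroids.T, axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        new_centroids = l2_normalize(new_centroids)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    labels = np.argmax(x @ centroids.T, axis=1)
    return labels, centroids


# Synthetic placeholder data: three made-up "style" directions plus noise.
# Real usage would replace this with embeddings extracted from product images.
rng = np.random.default_rng(42)
style_dirs = l2_normalize(rng.normal(size=(3, 16)))
fake_embeddings = np.vstack(
    [d + 0.05 * rng.normal(size=(20, 16)) for d in style_dirs]
)

labels, centroids = kmeans_cosine(fake_embeddings, k=3)
print(np.bincount(labels, minlength=3))  # number of images per style group
```

Normalizing before clustering is a common choice for contrastively trained embeddings, since such models are usually compared by cosine similarity; whether the study did the same is not stated.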