A Study of Music Generation focusing on Emotion-related Music Attributes with a Diffusion Model

Moyu Kawabe

2:20 PM - 2:40 PM

[3N4-GS-7-03] A Study of Music Generation focusing on Emotion-related Music Attributes with a Diffusion Model

〇Moyu Kawabe¹, Ichiro Kobayashi¹ (1. Ochanomizu University)

Keywords: Music Generation, Diffusion models, Emotion

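Models based on diffusion processes have recently attracted attention in generative AI for their high generation quality, extensibility, and stable training. However, within diffusion models, methods that generate music from emotions expressed in forms other than text, or that handle music in MIDI format, remain underdeveloped, and controlling complex attributes such as music attribute values is also difficult.
In this study, we aim to develop a method for generating MIDI-format music that takes a wide range of emotions as its control target, using three approaches: (1) adding controllability to diverse music generation by using a Diffusion Language Model, which can generate discrete sequence data; (2) representing the input emotion as coordinate values on Russell's circumplex model, which makes it possible to express subtle changes in emotion; and (3) building classifiers that control the music attributes most strongly correlated with emotion.
Using the proposed method, we evaluated, for multiple inputs, whether each emotion was reflected in the generated music. The experiments did not confirm that music was generated in accordance with the input emotion, but they showed that improving the classifier processing and the training settings makes the generated music more diverse.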
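
As a rough illustration of the second approach (emotion given as coordinates on Russell's circumplex model), the sketch below treats an emotion as a (valence, arousal) point. Everything in it is an assumption made for illustration and is not taken from the paper: the [-1, 1] coordinate range, the function names describe_emotion and interpolate, and the quadrant labels are all hypothetical. It only shows why continuous coordinates can express small emotional shifts that a fixed set of category labels cannot.

```python
import math

# Illustrative quadrant labels only; the abstract does not name specific emotions.
# Key: (valence >= 0, arousal >= 0)
QUADRANT_LABELS = {
    (True, True): "positive, high arousal (e.g. excited)",
    (False, True): "negative, high arousal (e.g. tense)",
    (False, False): "negative, low arousal (e.g. sad)",
    (True, False): "positive, low arousal (e.g. calm)",
}


def describe_emotion(valence: float, arousal: float) -> dict:
    """Summarize a (valence, arousal) point, assuming both lie in [-1, 1]."""
    if not (-1.0 <= valence <= 1.0 and -1.0 <= arousal <= 1.0):
        raise ValueError("valence and arousal are assumed to lie in [-1, 1]")
    # Angle around the circumplex, measured counterclockwise from the valence axis.
    angle_deg = math.degrees(math.atan2(arousal, valence)) % 360.0
    # Distance from the neutral origin, capped at 1.
    intensity = min(1.0, math.hypot(valence, arousal))
    return {
        "angle_deg": angle_deg,
        "intensity": intensity,
        "quadrant": QUADRANT_LABELS[(valence >= 0, arousal >= 0)],
    }


def interpolate(start: tuple, end: tuple, steps: int) -> list:
    """Return `steps` evenly spaced emotion coordinates from start to end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        tuple(s + (e - s) * i / (steps - 1) for s, e in zip(start, end))
        for i in range(steps)
    ]


if __name__ == "__main__":
    # A gradual shift from a calm point toward an excited one.
    for point in interpolate((0.6, -0.4), (0.6, 0.8), 4):
        print(point, describe_emotion(*point))
```

Each intermediate point could, in principle, serve as a separate conditioning input, which is the kind of fine-grained control that coordinate input is meant to allow.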


Presentation information

[3N4-GS-7] Vision, speech media processing:
