[4Yin2-55] A Study of Entity Linking Methods Considering Audio Similarity between User's Speech and Entity
Keywords: Speech Dialog System, Entity Linking, Speech Recognition
To improve the performance of entity linking in spoken dialogue systems, we explore similarity computation methods based on audio features. We compute the similarity between a user's speech and each entity in the knowledge base, then link the user's speech to the most similar entity. In our experiments, we compared phoneme sequence-based methods (edit distance, semi-global alignment), which use Automatic Speech Recognition (ASR) results, with methods that do not use ASR and instead operate on the speech data directly (Mel spectrogram, wav2vec 2.0). Using features extracted directly from speech data provides robustness against speech recognition errors. In experiments on log data from our spoken dialogue service, the phoneme sequence-based methods were less affected by fillers and silent intervals and outperformed the methods that use speech data directly.
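The phoneme sequence-based linking described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the romanized phoneme symbols, the unit edit costs, and the normalization by entity length are all assumptions made for illustration, and the conversion of ASR output into phonemes is not shown. The semi-global variant lets the entity match any contiguous part of the utterance at no extra cost, which is one plausible reason such alignment tolerates fillers around the entity name.

```python
from typing import Callable, Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs assumed)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def semi_global_distance(entity: Sequence[str], utterance: Sequence[str]) -> int:
    """Smallest edit distance between `entity` and any contiguous span of `utterance`.

    Utterance phonemes before and after the matched span are skipped for free.
    """
    prev = [0] * (len(utterance) + 1)  # the match may start anywhere in the utterance
    for i, pe in enumerate(entity, start=1):
        curr = [i]  # entity prefix aligned against nothing costs i deletions
        for j, pu in enumerate(utterance, start=1):
            cost = 0 if pe == pu else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return min(prev)  # the match may end anywhere in the utterance


def link_entity(
    utterance: Sequence[str],
    kb: Dict[str, List[str]],
    distance: Callable[[Sequence[str], Sequence[str]], int] = semi_global_distance,
) -> str:
    """Return the knowledge-base entity whose phonemes are closest to the utterance.

    Dividing by entity length is an assumption of this sketch; it keeps short
    entities from winning simply because they have fewer phonemes to mismatch.
    """
    return min(kb, key=lambda name: distance(kb[name], utterance) / len(kb[name]))


# Hypothetical example: a filler ("eeto") followed by "tookyoo eki".
utterance = ["e", "e", "t", "o", "t", "o", "o", "k", "y", "o", "o", "e", "k", "i"]
kb = {
    "Tokyo Station": ["t", "o", "o", "k", "y", "o", "o", "e", "k", "i"],
    "Kyoto Station": ["k", "y", "o", "o", "t", "o", "e", "k", "i"],
}
linked = link_entity(utterance, kb)
```

In this hypothetical example the plain edit distance is charged for every filler phoneme, while the semi-global distance to "Tokyo Station" is zero because the entity appears intact inside the utterance.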