JSAI2023

Presentation information

International Session


[1U3-IS-2a] Machine learning

Tue. Jun 6, 2023 1:00 PM - 2:40 PM Room U (Online)

Chair: Yuki Shibata (Tokyo Metropolitan University)

1:00 PM - 1:20 PM

[1U3-IS-2a-01] Semi-Autoregressive Transformer for Sign Language Production

〇Ehssan Wahbi1, Masayasu Atsumi1 (1. Soka University)

[[Online, Regular]]

Keywords: Sign Language Production, Semi-Autoregressive Generation, Transformers, Back-Translation Evaluation, Spatial-Temporal Graph Convolution Network

Sign language production (SLP) aims to generate sign language frame sequences from the corresponding spoken language text sentences. Existing approaches to SLP rely either on autoregressive models, which generate the target sign frames sequentially and suffer from error accumulation and high inference latency, or on non-autoregressive models, which accelerate the process by producing all frames in parallel at the cost of generation quality. To improve the trade-off between speed and quality, we propose a semi-autoregressive model for sign language production (named SATSLP), which maintains the autoregressive property on a global scale but generates sign pose frames in parallel on a local scale, thus combining the strengths of both methods. Furthermore, we reproduced the back-translation transformer model, in which a spatial-temporal skeletal graph structure is encoded and translated to text for evaluation. Results on the PHOENIX14T dataset show that SATSLP outperformed the baseline autoregressive model in terms of both speed and quality.
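The semi-autoregressive idea described in the abstract can be illustrated with a minimal sketch. This is not the SATSLP implementation: the function names, the `group_size` parameter, and the toy decoder below are all hypothetical, chosen only to show how frames might be emitted group by group (sequential across groups, in parallel within a group).

```python
from typing import Callable, List

Frame = List[float]  # hypothetical stand-in for one sign pose frame


def semi_autoregressive_decode(
    predict_group: Callable[[List[Frame], int], List[Frame]],
    total_frames: int,
    group_size: int,
) -> List[Frame]:
    """Emit frames group by group.

    Each call to predict_group sees all previously generated frames
    (autoregressive across groups) and returns up to group_size new
    frames at once (parallel within a group).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    frames: List[Frame] = []
    while len(frames) < total_frames:
        k = min(group_size, total_frames - len(frames))
        frames.extend(predict_group(frames, k))
    return frames


def toy_predict_group(prefix: List[Frame], k: int) -> List[Frame]:
    """Toy decoder (assumption): each frame is a 1-D value equal to its index."""
    start = len(prefix)
    return [[float(start + i)] for i in range(k)]


def count_decoder_calls(total_frames: int, group_size: int) -> int:
    """Number of sequential decoder calls needed: ceil(total_frames / group_size)."""
    return -(-total_frames // group_size)


if __name__ == "__main__":
    frames = semi_autoregressive_decode(toy_predict_group, total_frames=10, group_size=4)
    print(len(frames), count_decoder_calls(10, 4))
```

In this sketch, `group_size=1` reduces to fully autoregressive decoding (one call per frame), while `group_size=total_frames` corresponds to a single fully parallel pass; intermediate values trade the number of sequential calls against how much context each frame can condition on.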
