JSAI2025

Presentation information

Poster Session

[2Win5] Poster session 2

Wed. May 28, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[2Win5-38] A Study of Bias Mitigation Methods using Task Arithmetic in Decoder-based Language Models

〇Daiki Shirafuji1, Koji Tanaka1, Makoto Takenaka1, Tatsuhiko Saito1, Kimura Yasutomo2 (1.Mitsubishi Electric Corporation, 2.Otaru University of Commerce)

Keywords: Large Language Model, Social Bias

In recent years, social bias in Language Models (LMs) has been increasingly recognized as a serious problem. Most existing research on mitigating such bias has focused on encoder-based LMs, while few works have examined decoder-based LMs. In this study, we evaluate the effectiveness of an existing mitigation method on a total of eight models spanning two decoder-based architectures: GPT-2 and Llama-3. Specifically, we adopt a method based on Task Arithmetic, which achieves bias mitigation by editing model weights. The magnitude of the subtracted weights is controlled by a scaling factor λ. To assess social bias in the generated text, we use the HONEST dataset to measure the social bias scores of the debiased models. Additionally, we conduct evaluation experiments on the GLUE benchmark to examine how this method affects downstream task performance. Our results show that the method reduces the social bias score of the LMs from 0.1456 to 0.0537, while incurring only minor performance degradation on the GLUE benchmark. We also observed that for λ=1, the LMs occasionally produce a string of symbols, and that for λ=10, almost all outputs become symbol sequences. In future work, we plan to extend our evaluation to LLMs with 10B or more parameters to investigate how model size influences social bias.
Warning: This paper includes examples that could be considered discriminatory.
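
The weight edit mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common Task Arithmetic formulation, in which a "bias vector" is the difference between a pre-trained model and a copy fine-tuned on biased text, and the debiased weights are the pre-trained weights minus λ times that vector. The abstract does not state how the bias vector is obtained, so that step is an assumption, as are the function names, the parameter name `layer.weight`, and the toy values; plain Python lists stand in for real model tensors.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """Per-parameter difference (fine-tuned minus pre-trained)."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def subtract_task_vector(pretrained: Weights, vector: Weights, lam: float) -> Weights:
    """Subtract the task vector from the pre-trained weights, scaled by lam."""
    return {
        name: [p - lam * v for p, v in zip(pretrained[name], vector[name])]
        for name in pretrained
    }


# Hypothetical toy weights; a real model would have many named tensors.
pretrained = {"layer.weight": [0.5, -1.0, 2.0]}
# Assumption: a copy of the model fine-tuned on biased text.
biased = {"layer.weight": [0.75, -1.5, 2.0]}

bias_vector = task_vector(pretrained, biased)
debiased = subtract_task_vector(pretrained, bias_vector, lam=1.0)
print(debiased)
```

With λ=0 the edit leaves the pre-trained weights unchanged, while larger values of λ move the weights further from the pre-trained model, which is in line with the degraded outputs reported for λ=10.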
