JSAI2025

Presentation information

Poster Session

[2Win5] Poster session 2

Wed. May 28, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[2Win5-06] Evaluation of an Independent Masked Prediction Strategy for Labels and Values in Tabular Transformers

〇Nanae Aratake1, Taisei Tosaki1,2, Yuji Okamoto1, Eiichiro Uchino1, Ryosuke Kojima1,3, Yasushi Okuno1,2 (1. Kyoto University, 2. RIKEN Center for Computational Science, 3. RIKEN Biosystems Dynamics Research)

Keywords: Masked Prediction, Tabular Transformer, Representation Learning

Tabular data is widely used in fields such as healthcare and finance and contains multiple data types, including numerical, categorical, and textual values. Proper tokenization and embedding methods are essential, especially when both labels and values coexist. This study focuses on masked prediction techniques for applying Transformers to variable-length tabular data and compares two masking strategies: masking labels and values together versus masking them independently. We conducted evaluation experiments on the Adult dataset from the UC Irvine Machine Learning Repository, performing transfer learning and fine-tuning after pretraining. In transfer learning, masking labels and values together achieved a higher AUROC, whereas independent masking led to lower accuracy. In fine-tuning, the two methods performed similarly, with no significant difference. These findings suggest that independent masking is not advantageous for transfer learning. Future work should explore other datasets and different masking probabilities for a more comprehensive evaluation.
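
The two masking strategies compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each table row is tokenized as a list of (label, value) pairs, where "label" is taken to mean the column name, and the function names, the `[MASK]` token, and the masking probability of 0.3 are all hypothetical choices made for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def mask_together(pairs, p, rng):
    """Mask each (label, value) pair as a unit with probability p."""
    out = []
    for label, value in pairs:
        if rng.random() < p:
            # Label and value are hidden together.
            out.append((MASK, MASK))
        else:
            out.append((label, value))
    return out


def mask_independently(pairs, p, rng):
    """Mask labels and values separately, each with probability p."""
    out = []
    for label, value in pairs:
        # Two independent draws: one for the label, one for the value.
        new_label = MASK if rng.random() < p else label
        new_value = MASK if rng.random() < p else value
        out.append((new_label, new_value))
    return out


# Illustrative row in the style of the Adult dataset (assumed tokenization).
row = [("age", "39"), ("workclass", "State-gov"), ("education", "Bachelors")]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
joint = mask_together(row, 0.3, rng)
indep = mask_independently(row, 0.3, rng)
print(joint)
print(indep)
```

Under joint masking, a label and its value are always hidden or visible together, so the model never sees one without the other. Under independent masking, the model may see a label with a hidden value or a value with a hidden label, which changes what the pretraining task asks it to predict.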
