[2Xin5-01] Peptide Binding Prediction and Residue Pair Visualization Using BERT Based on a Large Scale Protein Database
Keywords: peptide binding prediction, amino acid sequence, BERT, pre-training
Predicting B-cell epitopes and predicting peptide binding to MHC class II (MHC II) molecules are both important tasks in vaccine development. B-cell epitope prediction supports the design and development of vaccines that induce antigen-specific antibody production. Peptide-MHC II binding prediction, in turn, is needed for research on vaccines that activate T cells to reduce the severity of infection. Conventional machine learning methods for these tasks have two problems. First, they do not capture the complex dependencies between distant residues. Second, their accuracy is low when training data are insufficient. To address these challenges, we propose a method based on a BERT model with a self-attention mechanism, pre-trained on a large-scale protein database. Experimental results show that the proposed method outperforms previous methods in predicting both B-cell epitopes and peptide binding to MHC II. We also visualize the learned self-attention and analyze it from a biological viewpoint, focusing on protein structure and function.
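As an illustration of how self-attention can relate distant residues and how residue pairs can be read off an attention matrix, the following is a minimal sketch of a single attention head over an amino acid sequence. It is not the model described above: the embeddings and projection matrices are random rather than pre-trained, positional information is omitted, and the peptide string, the function names `self_attention_weights` and `top_residue_pairs`, and the `min_separation` threshold are all hypothetical choices made for this example.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def self_attention_weights(sequence, dim=16, seed=0):
    """Return an (L, L) attention matrix for one toy attention head.

    Row i holds how strongly residue i attends to every residue j.
    All parameters are random (assumption); a real model learns them.
    """
    rng = np.random.default_rng(seed)
    embed = rng.normal(size=(len(AMINO_ACIDS), dim))
    w_q = rng.normal(size=(dim, dim)) / np.sqrt(dim)
    w_k = rng.normal(size=(dim, dim)) / np.sqrt(dim)

    idx = [AMINO_ACIDS.index(aa) for aa in sequence]
    x = embed[idx]                       # (L, dim) residue embeddings
    q, k = x @ w_q, x @ w_k              # queries and keys
    scores = q @ k.T / np.sqrt(dim)      # scaled dot-product scores
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax


def top_residue_pairs(sequence, weights, n=3, min_separation=4):
    """List the n strongest attention links between residues that are
    at least min_separation positions apart in the sequence."""
    length = len(sequence)
    pairs = [
        (float(weights[i, j]), i, j)
        for i in range(length)
        for j in range(length)
        if abs(i - j) >= min_separation
    ]
    pairs.sort(reverse=True)
    return [(i, sequence[i], j, sequence[j], w) for w, i, j in pairs[:n]]


peptide = "ACDKLMNPQRSTVWY"  # arbitrary example sequence, not a known epitope
attn = self_attention_weights(peptide)
for i, aa_i, j, aa_j, w in top_residue_pairs(peptide, attn):
    print(f"{aa_i}{i + 1} -> {aa_j}{j + 1}: attention {w:.3f}")
```

Because every residue attends to every other residue in one step, the weight between two positions does not depend on how far apart they are in the sequence, which is the property that lets attention-based models represent long-range residue dependencies. With a trained model, the same pair-listing step would be applied to the attention matrices extracted from the network instead of to random ones.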