The 21st Annual Meeting of the Protein Science Society of Japan

Presentation information

Poster Session

[2P-1] Poster 2 (2P-01 - 2P-37)

Thu. Jun 17, 2021 2:45 PM - 4:45 PM Poster 1

[2P-34] Protein-small ligand interaction prediction method by supervised machine learning based on PDBBind database

Yuhang Chen1, Keiichiro Sato1, Kota Kasahara2, Yuxiang Huang1, Takuya Takahashi2 (1. Grad. Sch. Life Sci., Ritsumeikan Univ.; 2. Coll. Life Sci., Ritsumeikan Univ.)

With improvements in computational power, machine learning has come to be used throughout drug discovery. As large public databases such as the Protein Data Bank have grown rapidly in recent years, molecular docking has also become popular in fields such as computer-aided drug design. Our group previously reported a comprehensive classification of protein-small ligand interactions using an unsupervised parametric pattern recognition technique based on the Gaussian mixture model (Kasahara et al., 2013). Here, we applied this technique to the development of a new knowledge-based docking method. For statistical analyses, 4,565 protein-ligand complexes were extracted from the "refined set" of the PDBBind database (released in Dec 2019). This dataset was clustered at a 70% protein sequence identity threshold, and one representative was taken from each cluster, giving a non-redundant dataset of 1,155 entries. In this study, supervised classifiers based on neural networks, support vector machines, random forests, and XGBoost were developed to distinguish native structures from decoy structures.
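The redundancy-reduction step can be illustrated with a minimal sketch. The abstract does not state which clustering tool was used or how each representative was chosen, so everything below is an assumption for illustration: the function names `identity` and `greedy_cluster` are hypothetical, sequences are assumed to be pre-aligned to equal length, and the first member seen in each cluster is taken as its representative. Real pipelines typically rely on dedicated tools such as CD-HIT or MMseqs2 rather than code like this.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions in two pre-aligned, equal-length sequences.

    Assumption: gaps are written as '-' and never count as a match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def greedy_cluster(seqs: dict[str, str], threshold: float = 0.70) -> dict[str, list[str]]:
    """Greedy clustering: each entry joins the first representative it matches
    at or above `threshold`; otherwise it founds a new cluster.

    Returns a mapping from representative ID to the IDs of its cluster members.
    """
    clusters: dict[str, list[str]] = {}
    for entry_id, seq in seqs.items():
        for rep_id in clusters:
            if identity(seq, seqs[rep_id]) >= threshold:
                clusters[rep_id].append(entry_id)
                break
        else:
            # No existing representative is similar enough.
            clusters[entry_id] = [entry_id]
    return clusters


# Toy sequences (made up for illustration, not taken from PDBBind).
toy = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQK",  # 9 of 10 positions match A
    "C": "MKTAYIGGGG",  # 6 of 10 positions match A
    "D": "GGGGGGGGGG",
}

clusters = greedy_cluster(toy, threshold=0.70)
representatives = list(clusters)  # one entry per cluster
```

With the toy input, "B" joins the cluster founded by "A", while "C" and "D" each found their own cluster, so three representatives remain. The result of a greedy scheme depends on input order, which is one reason a study might instead pick representatives by a quality criterion; the abstract does not say which was used.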