Keywords: Reinforcement learning, Q-learning, Value standard
Individuals have their own values, but their actual behavior does not simply reflect those values. Specifically, individuals also consider the actual gains from their behavior. Evaluation by the group to which an individual belongs is an important part of the gain obtained from that individual's actions. For this reason, there is a close relationship between individual value standards and community formation. In this study, we conducted a simulation experiment using a model in which individuals make one-on-one contact and evaluate each other to optimize their behavior, and in which the value standard itself changes through repeated behavior. When all agents had an equal number of contacts, we confirmed that the agents' behavioral preferences gradually became closer and that agents with a low learning rate had a strong influence on others. In contrast, when agents chose their contact targets based on their value standards, communities with similar preferences formed.
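The abstract does not give the update rules, so the following is only a minimal sketch of the kind of model described, under stated assumptions. It assumes stateless Q-learning with softmax action selection, a reward equal to the partner's value for the chosen action, and a value standard that drifts toward actions the agent repeats. The function name `simulate` and the parameters `beta` and `temperature` are hypothetical and do not come from the paper. Uniform random pairing stands in for the equal-contact condition.

```python
import numpy as np

def simulate(n_agents=6, n_actions=4, steps=2000, alphas=None,
             beta=0.01, temperature=0.2, seed=0):
    """Illustrative pairwise-contact simulation (assumed rules, not the paper's)."""
    rng = np.random.default_rng(seed)
    # Per-agent learning rates; a uniform default is an assumption.
    alphas = np.full(n_agents, 0.1) if alphas is None else np.asarray(alphas, dtype=float)
    # Value standards: how highly each agent rates each action, in [0, 1].
    values = rng.random((n_agents, n_actions))
    # Stateless Q-values: each agent's learned estimate of the gain per action.
    q = np.zeros((n_agents, n_actions))

    for _ in range(steps):
        # Uniform random pairing approximates "equal number of contacts".
        i, j = rng.choice(n_agents, size=2, replace=False)
        for actor, judge in ((i, j), (j, i)):
            # Softmax action selection over the actor's Q-values.
            logits = q[actor] / temperature
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            a = rng.choice(n_actions, p=probs)
            # Assumed reward: the partner's evaluation of the chosen action.
            reward = values[judge, a]
            # Q-learning update without a next-state term (single-state setting).
            q[actor, a] += alphas[actor] * (reward - q[actor, a])
            # Assumed drift: repeating an action raises the actor's own value for it.
            values[actor, a] += beta * (1.0 - values[actor, a])
    return q, values

if __name__ == "__main__":
    q, values = simulate()
    print(np.round(values, 2))
```

Passing a different `alphas` array, for example one agent with a much smaller learning rate, is one way to explore how learning rate affects influence in this sketch. Preference-based partner selection would require replacing the uniform pairing step.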