Journal Neurocomputers, № 1, 2022
Article in issue:
Method for solving social dilemmas based on multi-agent learning and reputation
DOI: https://doi.org/10.18127/j19998554-202201-01
UDC: 681.142
Authors:

A.N. Alfimtsev1, S.A. Sakulin2, V.E. Bolshakov3, N.V. Bykov4, M.S. Tovarnov5, N.S. Vlasova6

1–6 Bauman Moscow State Technical University (Moscow, Russia)

Abstract:

Many real-world problems contain social dilemmas that express contradictions between individual and public interests. Such tasks can be solved using deep multi-agent learning. This article focuses on reputation-based social dilemma solving and multi-agent reinforcement learning.

The aim of this work is to develop a method for solving social dilemmas based on deep multi-agent reinforcement learning.

This work proposes a method for solving social dilemmas based on deep multi-agent reinforcement learning. Unlike known analogs, the method includes the agent's reputation in the reward function. Reputation, in turn, depends on the degree to which the agent cooperates with other agents. The method is implemented in Python using the open-source PyTorch library. Its effectiveness was assessed by comparison with methods for solving social dilemmas that do not take cooperation between agents into account.
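
As an illustration of the idea of adding reputation to the reward, a minimal Python sketch follows. It is not the authors' implementation: the abstract does not give the reputation formula, so the exponential moving average, the neutral starting value, the `decay` and `weight` parameters, and the names `ReputationTracker` and `shaped_reward` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ReputationTracker:
    """Hypothetical running estimate of how cooperative an agent has been."""

    decay: float = 0.9       # assumed smoothing factor, not taken from the article
    reputation: float = 0.5  # assumed neutral starting value in [0, 1]

    def update(self, cooperated: bool) -> float:
        # Exponential moving average of a 0/1 cooperation indicator.
        signal = 1.0 if cooperated else 0.0
        self.reputation = self.decay * self.reputation + (1.0 - self.decay) * signal
        return self.reputation


def shaped_reward(env_reward: float, reputation: float, weight: float = 1.0) -> float:
    """Environment reward plus a reputation bonus (the additive form is an assumption)."""
    return env_reward + weight * reputation


# Usage: two cooperative steps followed by one defection.
tracker = ReputationTracker()
for cooperated in (True, True, False):
    rep = tracker.update(cooperated)

total = shaped_reward(env_reward=1.0, reputation=rep, weight=0.5)
```

In a setup like this, an agent that defects keeps its immediate environment reward but loses part of the reputation bonus on later steps, which is one simple way to tie individual payoff to cooperative behavior.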

The results can be used to develop software for unmanned vehicles in an urban transport system. Applying the proposed method to social dilemmas in a “smart city” can help balance collective and individual interests.

Pages: 5-15
For citation

Alfimtsev A.N., Sakulin S.A., Bolshakov V.E., Bykov N.V., Tovarnov M.S., Vlasova N.S. Method for solving social dilemmas based on multi-agent learning and reputation. Neurocomputers. 2022. V. 24. № 1. P. 5–15. DOI: https://doi.org/10.18127/j19998554-202201-01 (In Russian).

References
  1. Badjatiya P. et al. Status-quo policy gradient in Multi-Agent Reinforcement Learning. arXiv preprint arXiv:2111.11692. 2021.
  2. Leibo J.Z., Zambaldi V., Lanctot M., Marecki J., Graepel T. Multi-agent reinforcement learning in sequential social dilemmas. arXiv preprint arXiv:1702.03037. 2017.
  3. Pathak D. et al. Curiosity-driven exploration by self-supervised prediction. International conference on machine learning. PMLR. 2017. P. 2778–2787.
  4. Klyubin A.S., Polani D., Nehaniv C.L. Empowerment: A universal agent-centric measure of control. IEEE congress on evolutionary computation. 2005. V. 1. P. 128–135.
  5. Sequeira P. et al. Emerging social awareness: Exploring intrinsic motivation in multiagent learning. IEEE international conference on development and learning (ICDL). 2011. V. 2. P. 1–6.
  6. Suh Y.H. et al. A sim2real framework enabling decentralized agents to execute MADDPG tasks. Proceedings of the Workshop on Distributed Infrastructures for Deep Learning. 2019. P. 1–6.
  7. Pretorius A. et al. A game-theoretic analysis of networked system control for common-pool resource management using multi-agent reinforcement learning. arXiv preprint arXiv:2010.07777. 2020.
  8. Heuillet A., Couthouis F., Díaz-Rodríguez N. Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values. arXiv preprint arXiv:2110.01307. 2021.
  9. Schmid K. et al. Distributed Emergent Agreements with Deep Reinforcement Learning. International Joint Conference on Neural Networks (IJCNN). IEEE. 2021. P. 1–8.
  10. Chelarescu P. Deception in Social Learning: A Multi-Agent Reinforcement Learning Perspective. arXiv preprint arXiv:2106.05402. 2021.
  11. Wong A. et al. Multiagent deep reinforcement learning: Challenges and directions towards human-like approaches. arXiv preprint arXiv:2106.15691. 2021.
  12. Pan W. et al. A Dynamically Adaptive Approach to Reducing Strategic Interference for Multi-agent Systems. IEEE Transactions on Cognitive and Developmental Systems. 2021.
  13. Fan D., Shen H., Dong L. Multi-Agent Distributed Deep Deterministic Policy Gradient for Partially Observable Tracking. Actuators. Multidisciplinary Digital Publishing Institute. 2021. V. 10. № 10. P. 268.
  14. Wang T., Zhang L. Coordinated scheduling of integrated energy microgrid with multi-energy hubs based on MADDPG and two-layer game. Journal of Renewable and Sustainable Energy. 2021. V. 13. № 6. P. 065502.
  15. Samvelyan M. et al. The starcraft multi-agent challenge. arXiv preprint arXiv:1902.04043. 2019.
  16. Liu I.J. et al. Cooperative exploration for multi-agent deep reinforcement learning. International Conference on Machine Learning. PMLR. 2021. P. 6826–6836.
  17. Kuba J. et al. Settling the variance of multi-agent policy gradients. Advances in Neural Information Processing Systems. 2021. V. 34.
  18. Vinyals O. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature. 2019. V. 575. № 7782. P. 350–354.
Date of receipt: 14.12.2021
Approved after review: 27.12.2021
Accepted for publication: 12.01.2022