Modeling of adaptive behavior of autonomous agents

Journal Neurocomputers, No. 3, 2010

Keywords: formation of adaptive behavior; autonomous agents; reinforcement learning; evolutionary optimization

Authors:

V. G. Red'ko, G. A. Beskhlebnova

Abstract:

A computer model of the formation of adaptive behavior in autonomous agents with natural needs (feeding, reproduction, safety) is constructed and investigated. The agent's control system is a set of logical rules of the form "If situation S holds, then execute action A", and each rule has its own weight. Every agent has a resource that increases when the agent eats food and decreases when it executes actions and when it is in a dangerous situation. If an agent's resource falls below a certain threshold, the agent dies. The control system is formed both by reinforcement learning of the agents and by evolutionary optimization.

The agents' external environment consists of two cells: one is dangerous for agents and the other is safe. The cells swap status with a period of TD time steps: the dangerous cell becomes safe and the safe cell becomes dangerous. While an agent is in the dangerous cell, its resource decreases by rD at each time step t. The environment contains food for the agents, which is replenished with a certain probability at each time step. At each time step the agent executes one of four actions: dividing, eating food, moving to the other cell, or resting.

By an appropriate choice of parameters, three cases were analyzed: A) case L (pure learning); B) case E (pure evolution), in which the learning intensity of the agents was zero; C) case LE (learning + evolution), i.e. the full model. Computer simulation showed that in all three cases the agents moved from the dangerous cell to the safe one at the proper moments. Agent self-learning and evolutionary optimization of agent control systems are compared. Self-learning agents (case L) mainly carry out actions that serve the needs of feeding and safety, whereas evolutionary optimization (case E) additionally produces frequent actions that serve the need of reproduction. In case LE the behavior of the agents was similar to that in case L. The simulation thus demonstrated the formation of fairly natural agent behavior. Notably, reproduction plays an important role in evolutionary optimization.
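
The environment and agent dynamics described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: all numeric values, the epsilon-greedy rule selection, and the weight update are assumptions made for illustration, and the abstract does not specify any of them. Division is only charged as an action here and does not spawn a child, so the sketch corresponds loosely to a single agent in case L.

```python
import random
from dataclasses import dataclass, field

ACTIONS = ("divide", "eat", "move", "rest")

# All numeric values are illustrative assumptions, not the paper's parameters.
T_D = 10               # period TD: the cells swap danger status every T_D steps
R_D = 0.05             # rD: resource lost per step spent in the dangerous cell
ACTION_COST = 0.01     # resource spent on executing any action
FOOD_GAIN = 0.2        # resource gained by eating
FOOD_PROB = 0.5        # chance that food reappears in a cell at each step
DEATH_THRESHOLD = 0.0  # the agent dies when its resource falls below this


def dangerous_cell(t: int) -> int:
    """Index (0 or 1) of the dangerous cell at time step t."""
    return (t // T_D) % 2


@dataclass
class Agent:
    cell: int = 0
    resource: float = 1.0
    # Rule weights keyed by (situation, action),
    # where situation = (agent is in danger, food is present).
    weights: dict = field(default_factory=dict)

    def choose(self, situation, rng, epsilon=0.1):
        """Epsilon-greedy choice among the rules for this situation (assumed scheme)."""
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.weights.get((situation, a), 0.0))

    def alive(self) -> bool:
        return self.resource >= DEATH_THRESHOLD


def step(agent, food, t, rng, learning_rate=0.1, epsilon=0.1):
    """Advance one agent by one time step and return the action it took."""
    situation = (agent.cell == dangerous_cell(t), food[agent.cell])
    action = agent.choose(situation, rng, epsilon)
    before = agent.resource

    agent.resource -= ACTION_COST
    if action == "eat" and food[agent.cell]:
        agent.resource += FOOD_GAIN
        food[agent.cell] = False
    elif action == "move":
        agent.cell = 1 - agent.cell
    # "divide" and "rest" only pay the action cost in this single-agent sketch.

    if agent.cell == dangerous_cell(t):
        agent.resource -= R_D

    # Assumed learning rule: move the rule weight toward the resource change.
    key = (situation, action)
    reward = agent.resource - before
    old = agent.weights.get(key, 0.0)
    agent.weights[key] = old + learning_rate * (reward - old)

    # Food is replenished independently in each cell.
    for c in (0, 1):
        if rng.random() < FOOD_PROB:
            food[c] = True
    return action


rng = random.Random(0)
agent, food = Agent(), [True, True]
for t in range(200):
    if not agent.alive():
        break
    step(agent, food, t, rng)
print(agent.resource, agent.cell)
```

Setting `learning_rate` to zero and selecting agents by resource across generations would be one way to approximate case E, but that extension is left out to keep the sketch short.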

Pages: 33-38
