Journal Information-measuring and Control Systems, No. 11, 2015
Article in this issue:
Classification of the dangerous situations in the decision support systems to control the technical objects
Authors:
A.V. Savchenko - Ph.D. (Eng.), Associate Professor, National Research University Higher School of Economics; Doctoral Candidate, Nizhniy Novgorod State Technical University n.a. R.E. Alekseev. E-mail: avsavchenko@hse.ru
V.R. Milov - Dr.Sc. (Eng.), Professor, Head of Department, Nizhniy Novgorod State Technical University n.a. R.E. Alekseev. E-mail: vladimir.milov@gmail.com
A.A. Sevryukov - Assistant, Nizhniy Novgorod State Technical University n.a. R.E. Alekseev. E-mail: ansev@mail.ru
D.V. Milov - Post-Graduate Student, Nizhniy Novgorod State Technical University n.a. R.E. Alekseev. E-mail: milovdv@mail.ru
Abstract:
In this paper we focus on the problem of decision support in eliminating dangerous situations detected while monitoring the state of a technical object. The situation classification task is reduced to the contextual multi-armed bandit problem, which is solved using the principle of maximum expected utility. The output of the discussed decision support system is an ordered list of potential actions, from which the decision maker can choose the most appropriate one. The expected reward of each potential action is predicted with the nonparametric Nadaraya-Watson kernel regression. We discuss randomized action-selection strategies, such as estimation of the regression confidence interval (the upper confidence bound method) and simulated annealing with the Boltzmann exploration rule, in which the temperature decrease schedule depends on the specific domain. Experimental results are presented for a simple Bayesian network that generates the features. The discussed methods are compared with the ideal Bayesian classifier, in which the distribution of the feature vector for each state is assumed to be known. We show experimentally that the highest accuracy and the lowest standard deviation of accuracy are achieved with Boltzmann exploration. The superiority of the Boltzmann rule is especially noticeable for small training samples. Finally, we discuss potential enhancements of the described decision-making scheme, e.g. investigation of the special case of a binary reward and processing of states reflected in complex features (photos or videos of the observed technical object, etc.).
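The ranking scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a Gaussian kernel with a fixed bandwidth, a fixed temperature instead of a domain-specific cooling schedule, and a neutral reward of 0 for actions with no history. The function names and the action labels ("shutdown", "inspect") are hypothetical.

```python
import numpy as np

def nw_expected_reward(x, contexts, rewards, bandwidth=1.0):
    """Nadaraya-Watson estimate of the reward at feature vector x.

    contexts: past feature vectors for one action, shape (n, d).
    rewards: rewards observed for those vectors, shape (n,).
    The Gaussian kernel and the bandwidth value are assumptions of this sketch.
    """
    contexts = np.asarray(contexts, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    if contexts.size == 0:
        return 0.0  # assumption: neutral estimate when the action has no history
    sq_dist = np.sum((contexts - np.asarray(x, dtype=float)) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    total = weights.sum()
    if total == 0.0:
        return float(rewards.mean())  # all kernel weights underflowed
    return float(weights @ rewards / total)

def boltzmann_probabilities(expected_rewards, temperature):
    """Boltzmann (softmax) selection probabilities for the given estimates."""
    z = np.asarray(expected_rewards, dtype=float) / temperature
    z -= z.max()  # shift for numerical stability; probabilities are unchanged
    p = np.exp(z)
    return p / p.sum()

def rank_actions(x, history, temperature=0.5, bandwidth=1.0):
    """Return (action, estimated reward, selection probability), best first.

    history maps each action to a (contexts, rewards) pair.
    """
    actions = list(history)
    estimates = [
        nw_expected_reward(x, *history[a], bandwidth=bandwidth) for a in actions
    ]
    probs = boltzmann_probabilities(estimates, temperature)
    order = np.argsort(-probs)
    return [(actions[i], estimates[i], float(probs[i])) for i in order]

# Hypothetical history: two actions observed on 2-D feature vectors.
history = {
    "shutdown": ([[0.0, 0.0], [1.0, 1.0]], [1.0, 0.0]),
    "inspect": ([[0.0, 0.0]], [0.2]),
}
ranked = rank_actions([0.0, 0.0], history)
```

Lowering `temperature` concentrates the probabilities on the action with the highest estimate, while raising it moves the choice toward uniform exploration. An upper confidence bound variant would instead rank actions by the estimate plus a confidence-interval width.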
Pages: 52-58