Journal Neurocomputers, No. 11, 2012
Article in issue:
Construction of forecasting dynamic ANNs on the basis of algorithmic information theory
Authors:
A.S. Potapov
Abstract:
Artificial neural networks (ANNs) are a widely used tool for solving forecasting tasks. To improve them further, however, their advantages need to be characterized rigorously. To this end, the problem of ANN learning is considered as a task of inductive inference within the framework of algorithmic information theory. The efficiency of ANNs as model representations is determined by how compactly they describe the regularities whose presence is expected in the data to be extrapolated, because the description length of a model corresponds to the amount of training data necessary for its reconstruction.

Solving forecasting tasks requires the automatic construction of models of dynamic systems, so such models should be representable as ANNs and have low complexity (description length). Since classical ANNs with nonlinear activation functions do not possess these properties (the basic elementary functions cannot be represented by them), the ANN formalism was modified. Dynamic (continuous-time recurrent) ANNs with a linear activation function were taken as the basis, because combinations of harmonic, polynomial, and exponential functions are representable by them. Although such ANNs can in theory approximate other functions with arbitrary precision, the corresponding extrapolation precision is always limited in practice, so the expressive power of the underlying representation has to be improved to extend the capabilities of ANNs. To preserve the representability of the elementary functions mentioned above, instead of introducing nonlinear activation functions, linear dynamic ANNs are extended with "connections on connections" that act nonlinearly on the signals propagating through ordinary connections.

A general algorithm for training such networks and selecting their architecture is proposed on the basis of the minimum description length criterion, which helps to avoid overfitting and provides the best forecasting precision reachable within the selected representation. Experimental validation on model data showed a low extrapolation error (about 1% over the doubled time interval) for time series built from elementary functions. More complex regularities, in particular chaotic ones, are also representable. However, real time series, which are non-stationary and chaotic, require further development of both the neural representation of the models and the algorithms for their optimization in order to improve forecasting efficiency.
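Two standard formulas make the abstract's claims concrete (background restated here, not quoted from the paper). A continuous-time recurrent network with a linear activation function evolves as the linear system

    \dot{x}(t) = W\, x(t),

whose general solution is a combination of terms t^k e^{\lambda t}, where \lambda runs over the eigenvalues of the weight matrix W: real eigenvalues give exponentials, complex-conjugate pairs give harmonics, and repeated eigenvalues contribute polynomial factors. This is exactly why combinations of harmonic, polynomial, and exponential functions are representable. The minimum description length criterion used for training and architecture selection is, in its standard two-part form,

    \min_M \left[ L(M) + L(D \mid M) \right],

where L(M) is the description length of the model (architecture and weights) and L(D \mid M) is the length of the data encoded with the model's help (the residual prediction errors). Overfitting is penalized automatically, because adding parameters pays off only while the reduction in L(D \mid M) exceeds the growth of L(M).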
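The abstract does not define "connections on connections" formally; one plausible reading (an assumption made here for illustration, not the paper's definition) is that the signal w_{ij} x_j passing through an ordinary connection from neuron j to neuron i is modulated nonlinearly by the signal u_{ij} arriving on a connection that terminates on that connection itself, e.g.

    \dot{x}_i(t) = \sum_j f(u_{ij}(t))\, w_{ij}\, x_j(t),

with f nonlinear. When all u_{ij} are constant, this reduces to a purely linear system, so the representability of harmonic, polynomial, and exponential functions is preserved, matching the requirement stated above.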
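The representability claim is easy to check numerically. Below is a minimal sketch of such a linear dynamic network in Python (an illustration under assumed names and a simple Euler integrator, not the authors' implementation): a two-neuron network whose weight matrix has purely imaginary eigenvalues reproduces a harmonic signal.

    import numpy as np

    def simulate_linear_dann(W, x0, dt=0.01, steps=2000):
        """Integrate the linear dynamic ANN x'(t) = W x(t) with forward Euler."""
        x = np.asarray(x0, dtype=float)
        trajectory = [x.copy()]
        for _ in range(steps):
            x = x + dt * (W @ x)  # linear activation: the state change is a weighted sum
            trajectory.append(x.copy())
        return np.array(trajectory)

    # Weight matrix with eigenvalues +/- i*omega: the network oscillates harmonically.
    omega = 2.0
    W = np.array([[0.0, -omega],
                  [omega, 0.0]])
    traj = simulate_linear_dann(W, x0=[1.0, 0.0])
    # traj[:, 0] tracks cos(omega*t) and traj[:, 1] tracks sin(omega*t);
    # a triangular block with a repeated eigenvalue would add polynomial factors,
    # and a real eigenvalue would add a pure exponential.

Extrapolation with such a network then amounts to integrating the fitted system past the training interval.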
Pages: 60-68