Multi-step algorithm based on neural network committee for analysis of multi-dimensional time series

Journal: Neurocomputers, No. 3, 2010

Keywords: neural networks, prediction, time series, precursors, feature selection, neural network ensemble (committee)

Authors:

A. G. Guzhva, S. A. Dolenko, I. G. Persiantsev, Yu. S. Shugai

Abstract:

The paper considers the problem of predicting a time series (TS), either a binary TS (a series of events) or a continuous one, from previous values of several other time series (a multi-dimensional TS). Prediction should take into account the values of the physical features not only at a single point in time but over some interval in the past. Therefore, delay embedding is performed for each TS: each physical feature gives rise to a set of input features (input variables of the problem), namely the values of that physical feature at adjacent moments in the past. This significantly increases the total number of input features of the problem. Consequently, besides prediction itself, another very important task is finding precursors, i.e. determining the set of the most significant input features in the coordinates "initial physical feature (time series) - lag". The four-stage algorithm based on a neural network (NN) committee, which is considered in this study, is intended to solve both problems: prediction and the search for precursors in multi-dimensional time series. The paper presents the key points and principles on which the algorithm is based and describes all of its stages:
1. Forming the initial feature set (as a multi-dimensional TS) from the preliminary feature set.
2. Finding, within the search interval, the most probable phenomenon causing the event (or initiating the value of the sought-for quantity), determining the duration of the phenomenon (the initiation interval), and determining the delay between the phenomenon (event initiation) and the event itself.
3. Extracting a precursor of the event, i.e. a combination of the most significant input features of the problem, within the time interval and the set of TS that describe the phenomenon.
4. Making the prediction again, using only the features that make up the precursor as input variables.
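The delay embedding described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `delay_embed`, the `depth` parameter, and the toy series are assumptions made here for illustration only. Each input feature is labelled by the pair (series name, lag), which matches the coordinates in which precursors are sought.

```python
from typing import Dict, List, Sequence, Tuple


def delay_embed(
    series: Dict[str, Sequence[float]], depth: int
) -> Tuple[List[Tuple[str, int]], List[List[float]]]:
    """Turn several aligned time series into a table of lagged input features.

    Hypothetical helper: for every time step t >= depth, the row holds the
    values series[name][t - lag] for lag = 1..depth and for every series.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    lengths = {len(values) for values in series.values()}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length")
    n_steps = lengths.pop()

    # One input feature per (physical feature, lag) pair.
    names = [(name, lag) for name in series for lag in range(1, depth + 1)]
    # One row per time step that has a full history of `depth` past values.
    rows = [
        [series[name][t - lag] for name, lag in names]
        for t in range(depth, n_steps)
    ]
    return names, rows


# Toy example (assumed data): two physical features, embedding depth 2.
names, rows = delay_embed({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]}, depth=2)
```

With two series and a depth of 2, four input features are produced; in general the number of input features is the number of series multiplied by the embedding depth, which is why the embedding inflates the input dimension so quickly.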
The algorithm was tested on a model problem in two variants (continuous TS prediction and event prediction) and on a real-world problem from solar physics with continuous output. The model dependence with continuous output was constructed as a linear combination of 5 features corresponding to 5 different TS with various fixed delays. The model dependence with binary output was obtained from the model with continuous output by introducing a condition. For the model with continuous output, the first three stages of the algorithm correctly chose 4 of the 5 input features that had been used, and a highly accurate neural network model predicting the output was obtained. For the model with binary output (event prediction), it was shown that a necessary condition for solving the problem successfully is to equalize the balance between "events" and "non-events" in the dataset. When the balance was corrected, the four stages of the algorithm yielded a neural network model with a rather small rate of type I errors (about 3.4%) and nearly free of type II errors. Using express methods for selecting significant input features at the third stage of the algorithm reduced the number of input features used, although the reduction was not as significant as for the model problem with continuous output (where the method used was sequential forward selection). Applying the algorithm to the real-world problem (prediction of a geomagnetic index from solar wind parameters) allowed extracting the precursor, a set of the 12 most significant features. The prediction based on these 12 features had a smaller error than the prediction based on the set of all 960 available input features. The selected features agree with the physical notions of the studied problem.
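A rough sketch of how a model problem of this kind could be generated is given below. The abstract does not state the delays, the coefficients, the condition used to obtain the binary output, or the balancing method, so everything in the sketch is an assumption chosen for illustration: the values in `DELAYS` and `WEIGHTS`, the threshold condition, and random oversampling of the rarer class.

```python
import random
from typing import List, Sequence, Tuple


def make_model_problem(
    n_steps: int,
    delays: Sequence[int],
    weights: Sequence[float],
    threshold: float,
    seed: int = 0,
) -> Tuple[List[List[float]], List[float], List[int]]:
    """Generate random input series plus continuous and binary targets.

    The continuous target at time t is a linear combination of one value
    from each series, taken at that series' own fixed delay. The binary
    target marks an "event" when the continuous target exceeds `threshold`
    (an assumed condition; the source does not specify it).
    """
    rng = random.Random(seed)
    series = [[rng.uniform(-1.0, 1.0) for _ in range(n_steps)] for _ in delays]
    start = max(delays)  # first time step with all delayed values available
    continuous = [
        sum(w * s[t - d] for w, s, d in zip(weights, series, delays))
        for t in range(start, n_steps)
    ]
    binary = [1 if y > threshold else 0 for y in continuous]
    return series, continuous, binary


def balance_by_oversampling(labels: Sequence[int], seed: int = 0) -> List[int]:
    """Return sample indices in which both classes are equally represented.

    Assumed balancing scheme: the rarer class is resampled with replacement.
    """
    rng = random.Random(seed)
    events = [i for i, y in enumerate(labels) if y == 1]
    non_events = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((events, non_events), key=len)
    if not minority:
        raise ValueError("one of the classes is empty; nothing to balance")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return sorted(majority + minority + extra)


# Illustrative values only (not taken from the paper).
DELAYS = [3, 7, 12, 20, 31]
WEIGHTS = [0.5, -0.3, 0.8, 0.2, -0.6]
series, y_cont, y_bin = make_model_problem(2000, DELAYS, WEIGHTS, threshold=0.8)
balanced_idx = balance_by_oversampling(y_bin)
```

Because the events come from a threshold on the continuous output, they are naturally rarer than non-events, which is the kind of imbalance the abstract reports had to be corrected before the event-prediction model could be trained successfully.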
Thus, the proposed and implemented four-stage algorithm based on a neural network (NN) committee for prediction and the search for precursors in multi-dimensional time series showed high efficiency on model data with continuous and binary outputs, as well as on a real-world problem with continuous output.

Pages: 4-13
