Distributed on-line learning
Neural network classification
The paper describes an original model for on-line classification of continuous data. The model first builds a selective subset intended to properly describe all of the available data. An interrelation analysis between each attribute in the selective subset and the classification attribute then yields the set of most valuable attributes, on which the data are clustered. The initial cluster centers are used to route instances to separate neural networks, which are trained independently on non-overlapping data sets.
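The pipeline described above can be sketched as follows. This is a minimal illustrative version, not the paper's implementation: the data are synthetic, attribute ranking uses plain Pearson correlation, clustering is a bare-bones k-means, and a logistic unit stands in for each of the paper's neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the continuous data: 300 instances, 5 attributes,
# binary class attribute (sizes and the labeling rule are assumptions).
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)

# 1. Selective subset: a sample meant to describe the full data set.
sel = rng.choice(len(X), size=120, replace=False)
Xs, ys = X[sel], y[sel]

# 2. Interrelation analysis: keep the attributes most correlated
#    with the classification attribute.
corr = np.array([abs(np.corrcoef(Xs[:, j], ys)[0, 1]) for j in range(X.shape[1])])
keep = np.argsort(corr)[-2:]            # two most valuable attributes

# 3. Cluster the reduced subset (plain k-means sketch).
def kmeans(Z, k, iters=25):
    centers = Z[:k].copy()
    for _ in range(iters):
        lab = np.argmin(((Z[:, None, :] - centers) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (lab == c).any():
                centers[c] = Z[lab == c].mean(axis=0)
    return centers, lab

centers, lab = kmeans(Xs[:, keep], k=3)

# 4. One independent learner per cluster, trained only on that cluster's
#    non-overlapping instances (logistic unit in place of a neural net).
def train_logistic(Z, t, epochs=200, lr=0.1):
    A = np.hstack([Z, np.ones((len(Z), 1))])
    w = np.zeros(A.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-A @ w))
        w -= lr * A.T @ (p - t) / len(t)
    return w

experts = {c: train_logistic(Xs[lab == c][:, keep], ys[lab == c])
           for c in range(3) if (lab == c).any()}

# Inference: route each instance to the expert of its nearest center.
def predict(x):
    cs = list(experts)
    c = cs[int(np.argmin(((x[keep] - centers[cs]) ** 2).sum(-1)))]
    a = np.append(x[keep], 1.0)
    return int(1.0 / (1.0 + np.exp(-a @ experts[c])) > 0.5)

acc = np.mean([predict(x) == t for x, t in zip(X, y)])
```

Routing by nearest cluster center is what lets each network stay small and train only on its own slice of the data.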
The classification of MAGIC Telescope images served as the test bed. The open-access experimental database is sufficiently large and noticeably imbalanced with respect to the target classes.
A comparative analysis was carried out between the proposed model and a traditional neural-network architecture, a naive Bayes network, decision tables, decision trees, linear regression, and a k-nearest-neighbor classifier. The results showed that the proposed model is slightly less accurate than the best-known solution (the neural network), but this is offset by roughly five-times-faster operation and lower memory requirements for storing training instances.
The optimal values of the internal parameters were found by analyzing how exogenous and endogenous factors influence the model's quality metrics. The analysis also revealed the major role of the packet-processing algorithm in the model's on-line learning phase: although this approach is the main cause of the decrease in classification accuracy, it also reduces the memory needed to store training instances by a factor of two to three. Replacing the current non-robust clustering method is expected to change the effects of packet filtering: with better-distributed clusters, the neural networks would become better specialized.
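The abstract does not detail the packet-processing algorithm, but its memory-capping effect can be illustrated with a hedged sketch. Everything here is an assumption for illustration: the packet size, the per-cluster storage cap, and the policy of routing each packet's instances to the nearest cluster and dropping the oldest stored instances once the cap is reached.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(1)

# Cluster centers obtained in the batch phase (assumed, 2 attributes).
centers = rng.normal(size=(3, 2))

PACKET = 32                                    # instances per packet (assumed)
CAP = 20                                       # stored instances per cluster (assumed)
stores = [deque(maxlen=CAP) for _ in centers]  # bounded memory per cluster

# Stand-in for the on-line stream of continuous instances.
stream = rng.normal(size=(10 * PACKET, 2))

for start in range(0, len(stream), PACKET):
    packet = stream[start:start + PACKET]
    # Route each instance in the packet to its nearest cluster; the
    # bounded deque evicts the oldest instances, capping memory.
    lab = np.argmin(((packet[:, None, :] - centers) ** 2).sum(-1), axis=1)
    for x, c in zip(packet, lab):
        stores[c].append(x)

stored = sum(len(s) for s in stores)  # never exceeds len(centers) * CAP
```

Under such a scheme, memory stays bounded regardless of stream length, at the cost of discarding instances that could have improved accuracy, which is consistent with the trade-off reported above.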