An algebraic approach to data preparation for the derivation of associative rules


V.I. Munerman – Ph. D. (Eng.), Associate Professor, Department of Informatics, Smolensk State University

This article considers two methods of increasing the efficiency of data processing in problems of associative rule derivation. Unlike most works in this field, which offer methods that serve the needs of end users involved in data analysis, it offers methods aimed at programmers who develop analytical information systems.
The derivation of associative rules consists of two stages:
1. Computing, for each subset of properties, the number of objects that have all the properties of this subset, and only these properties.
2. Deriving associative rules from the resulting statistics and the requirements of the user-analyst.
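The first stage can be illustrated with a minimal sketch. It assumes that each object is given as a set of property labels held in memory; the names `count_exact_property_sets` and `objects` are hypothetical and are not taken from the article, and the sketch does not reproduce the algebraic method the article proposes.

```python
from collections import Counter


def count_exact_property_sets(objects):
    """Count, for each property subset, the objects that have exactly that subset.

    `objects` maps an object identifier to the set of its properties
    (an illustrative input format, assumed for this sketch).
    """
    return Counter(frozenset(props) for props in objects.values())


# Illustrative data only.
objects = {
    "o1": {"a", "b"},
    "o2": {"a", "b"},
    "o3": {"a"},
    "o4": {"b", "c"},
}
counts = count_exact_property_sets(objects)
```

Here `counts[frozenset({"a", "b"})]` is 2, because objects `o1` and `o2` have exactly the properties `a` and `b`, while `o3` is counted separately under the subset `{a}`.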
We consider the first stage, which prepares the data and has high computational complexity. It can be accelerated by applying a symmetrical horizontal distribution of the original data and a pipelined method of executing the chain of JOIN operations. This is possible because the data and operations are represented by means of the file model. We also show that the data can be represented by multidimensional matrices over which a sequence of multiplication operations is performed.
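One possible reading of the distribution idea is sketched below. It assumes that "symmetrical horizontal distribution" means splitting both operands of a JOIN into fragments with the same key-based function, so that each pair of fragments can be joined independently of the others. This interpretation, the `(key, value)` row format, and the names `join`, `partition`, and `partitioned_join` are assumptions made for illustration. The pipelining of a chain of JOINs is not shown.

```python
from collections import defaultdict


def join(left, right):
    """Equi-join two lists of (key, value) rows into (key, left_value, right_value)."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, ())]


def partition(rows, n_parts):
    """Split rows into n_parts fragments according to the hash of the key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[0]) % n_parts].append(row)
    return parts


def partitioned_join(left, right, n_parts):
    """Join fragment i of `left` only with fragment i of `right`.

    Both operands are partitioned by the same function, so rows with equal
    keys land in fragments with the same index. The per-fragment joins share
    no data and could run in parallel; here they run one after another.
    """
    result = []
    for lp, rp in zip(partition(left, n_parts), partition(right, n_parts)):
        result.extend(join(lp, rp))
    return result


# Illustrative data only.
left = [(1, "x"), (2, "y"), (3, "z"), (1, "w")]
right = [(1, "p"), (3, "q"), (4, "r")]
distributed = sorted(partitioned_join(left, right, 3))
single = sorted(join(left, right))
```

For this data `distributed` equals `single`, so the fragment-wise join yields the same rows as the join over the undivided operands.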
The results of a computational experiment are presented. They show that the methods proposed in the article make it possible to develop parallel software that significantly accelerates the preparation of data for the derivation of associative rules.

