P.V. Dudarin – Post-graduate Student, Department «Information Systems», Ulyanovsk State Technical University. E-mail: pavel.dudarin@gmail.com
N.G. Yarushkina – Dr. Sc. (Eng.), Professor, Head of Department «Information Systems», Ulyanovsk State Technical University. E-mail: jng@ulstu.ru
This paper presents an algorithm for constructing a hierarchical classifier of short text fragments. The algorithm is based on hierarchical clustering of a fuzzy graph. Clustering and classifying short text fragments that are fully or partly context-free is a common problem today; examples of such objects include SMS messages, Twitter messages, and the headlines of papers and news items. The paper focuses on constructing a classifier for the key process indicators of the strategic planning system of the Russian Federation. The classifier is built from the result of the clustering process. A fuzzy graph is chosen as the model for clustering because it represents relations between words naturally. The method allows clustering to be performed recursively, which yields a hierarchical classifier.
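The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of one way recursive clustering of a fuzzy graph could produce a hierarchy. It assumes that edge membership degrees are word-similarity scores in [0, 1] and that each level splits a cluster into the connected components of an alpha-cut (edges with membership below the threshold are dropped). The function names `components` and `build_hierarchy`, the thresholds, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def components(nodes, edges, alpha):
    """Connected components of the alpha-cut restricted to `nodes`.

    Assumption: an edge survives the cut when its membership degree >= alpha.
    """
    adj = defaultdict(set)
    for (u, v), mu in edges.items():
        if mu >= alpha and u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    seen, result = set(), []
    for start in sorted(nodes):  # sorted for a deterministic cluster order
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # iterative depth-first search
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        result.append(comp)
    return result


def build_hierarchy(nodes, edges, alphas):
    """Recursively split `nodes` with increasing thresholds; returns a nested tree."""
    node = {"members": sorted(nodes), "children": []}
    if not alphas or len(nodes) <= 1:
        return node
    parts = components(nodes, edges, alphas[0])
    if len(parts) == 1:
        # No split at this threshold: try the next, stricter one.
        return build_hierarchy(nodes, edges, alphas[1:])
    node["children"] = [build_hierarchy(p, edges, alphas[1:]) for p in parts]
    return node


# Toy fuzzy graph: hypothetical similarity degrees between indicator words.
edges = {
    ("budget", "revenue"): 0.9,
    ("revenue", "tax"): 0.8,
    ("tax", "school"): 0.3,
    ("school", "teacher"): 0.85,
}
nodes = {"budget", "revenue", "tax", "school", "teacher"}
tree = build_hierarchy(nodes, edges, alphas=[0.5, 0.85])
```

In this toy run the first threshold separates the finance-related words from the education-related ones, and the second threshold splits only the finance cluster further. Each level of the resulting tree can be read as one level of a classifier, which is the sense in which recursion yields a hierarchy here.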