Journal Neurocomputers, No. 5, 2020
Article in issue:
Machine learning for metagraph knowledge base trained on results of text mining
Type of article: scientific article
DOI: 10.18127/j19998554-202005-01
UDC: 004.912
Authors:

A. I. Kanev – Post-graduate Student, Bauman Moscow State Technical University (Moscow, Russia) E-mail: kanevai@student.bmstu.ru

Abstract:

With traditional information retrieval, the user often has to formulate several queries, moving from general to specific ones, and analyze intermediate data. One way to solve this problem is semantic search. One of the main tasks of semantic search is text mining, which is also used in many other tasks. Approaches to text mining are based either on machine learning or on rule-based text processing, and each has its own advantages and disadvantages. The paper considers the problem of representing knowledge and rules in text mining. It proposes to combine soft computing and rule processing methods by using metagraphs in a hybrid intelligent system, and to use machine learning to calculate the weights in the metagraph knowledge base.

The goal of this paper is to develop a text mining technique that extracts knowledge and fills the metagraph knowledge base. The technique consists of morphological, syntactic, and context modules for text mining and information extraction, and uses a metagraph knowledge base for knowledge representation. Using different rules, the context module selects the most appropriate meaning for each word. The emergence property of the metagraph makes it possible to represent complex relations between concepts. The paper describes the main data structures of the proposed text mining method and the algorithms that process them. The application of the described technique is illustrated by the problem of finding hyponyms for a given concept. The paper also reports the machine learning execution time for the knowledge base and the number of concepts and relations extracted from texts of the Open Corpora text set. Training time depends linearly on the number of concepts in the knowledge base and on the number of training examples.
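The hyponym search mentioned above can be illustrated with a minimal sketch. It is not the author's implementation: the class names `Edge` and `Metagraph`, the relation label `"is_a"`, the `min_weight` threshold, and the example concepts and weights are all assumptions made for illustration. The sketch stores concepts as vertices, groups of concepts as metavertices, and weighted relations as edges, then collects the hyponyms of a concept by following `"is_a"` edges transitively.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    source: str          # more specific concept
    target: str          # more general concept
    relation: str        # hypothetical relation label, e.g. "is_a"
    weight: float = 1.0  # weight that a learning step could adjust


@dataclass
class Metagraph:
    vertices: set = field(default_factory=set)
    # Metavertex name -> names of the vertices it contains (assumed layout).
    metavertices: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_edge(self, source, target, relation, weight=1.0):
        self.vertices.update((source, target))
        self.edges.append(Edge(source, target, relation, weight))

    def add_metavertex(self, name, members):
        self.vertices.update(members)
        self.metavertices[name] = set(members)

    def hyponyms(self, concept, min_weight=0.0):
        """Collect concepts reachable through "is_a" edges that point at `concept`."""
        found, seen, frontier = [], {concept}, [concept]
        while frontier:
            current = frontier.pop()
            for edge in self.edges:
                if (edge.relation == "is_a" and edge.target == current
                        and edge.weight >= min_weight
                        and edge.source not in seen):
                    seen.add(edge.source)
                    found.append(edge.source)
                    frontier.append(edge.source)
        return sorted(found)


# Illustrative data only; the concepts and weights are invented.
kb = Metagraph()
kb.add_edge("dog", "animal", "is_a", 0.9)
kb.add_edge("poodle", "dog", "is_a", 0.8)
kb.add_edge("cat", "animal", "is_a", 0.3)
kb.add_metavertex("pets", {"dog", "cat"})

all_hyponyms = kb.hyponyms("animal")
confident_hyponyms = kb.hyponyms("animal", min_weight=0.5)
```

In this sketch the edge weights act as a confidence filter: raising `min_weight` drops weakly supported relations from the result. How the paper actually computes and applies the weights is not stated in the abstract.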

The proposed text mining technique is part of a semantic search system. Concepts and relations extracted from the knowledge base are used for document indexing. Language-specific processing rules combined with a common semantic module make it possible to combine knowledge mined from texts in different languages. The technique can also be used for other text mining tasks, such as text classification and machine translation.

Pages: 5-13
For citation

Kanev A.I. Machine learning for metagraph knowledge base trained on results of text mining. Neurocomputers. 2020. Vol. 22. No. 5. P. 5–13. DOI: 10.18127/j19998554-202005-01. (in Russian)

References
  1. Zhang Y., Chen M., Liu L. A review on text mining. 2015 6th IEEE International Conference on Software Engineering and Service Science. September 2015. P. 681–685. DOI: 10.1109/ICSESS.2015.7339149.
  2. Sukanya M., Biruntha S. Techniques on text mining. 2012 IEEE International Conference on Advanced Communication Control and Computing Technologies. August 2012. P. 269–271. DOI: 10.1109/ICACCCT.2012.6320784.
  3. Mousavi H., Kerr D., Iseli M., Zaniolo C. Mining semantic structures from syntactic structures in free text documents. Semantic Computing International Conference. June 2014. P. 84–91. DOI: 10.1109/ICSC.2014.31.
  4. Verma V., Ranjan M., Mishra P. Text mining and information professionals: Role, issues and challenges. 2015 4th International Symposium on Emerging Trends and Technologies in Libraries and Information Services. 2015. P. 133–137. DOI: 10.1109/ETTLIS.2015.7048186.
  5. Akilan A., Phil M. Text mining: Challenges and future directions. 2015 2nd International Conference on Electronics and Communication Systems. 2015. P. 1679–1684. DOI: 10.1109/ECS.2015.7124872.
  6. Wang K.Q., Wu Q.K., Mao H.Y., Zhou M.B., Jiang K., Zhu X.P., Yang L., Wang T., Wang H.Q. Intelligent text mining based financial risk early warning system. 2015 2nd International Conference on Information Science and Control Engineering. April 2015. P. 279–281. DOI: 10.1109/ICISCE.2015.68.
  7. Pinho I.C., Epstein D., Berni E., Corrêa R., Corrêa Y. The use of text mining to build a pedagogical agent capable of mediating synchronous online discussions in the context of foreign language learning. 2013 IEEE Frontiers in Education Conference. 2013. P. 393–399.
  8. Champion H., Pizzi N., Krishnamoorthy R. Tactical clinical text mining for improved patient characterization. 2014 IEEE International Congress on Big Data. 2014. P. 683–690.
  9. Chernen'kij V.M., Gapanyuk Yu.E., Revunkov G.I., Terekhov V.I., Kaganov Yu.T. Metagrafovyj podkhod dlya opisaniya gibridnykh intellektual'nykh informatsionnykh sistem. Prikladnaya informatika. 2017. T. 12. № 3 (69). S. 57–79. (in Russian)
  10. Revunkov G.I., Gapanyuk Yu.E., Fedorenko Yu.S. Opisanie nejronnoj seti s ispol'zovaniem metagrafovogo podkhoda. Estestvennye i tekhnicheskie nauki. 2016. № 12. S. 278–281. (in Russian)
  11. Kanev A., Cunningham S., Terekhov V. Application of formal grammar in text mining and construction of an ontology. Proceedings of the 7th International Conference of Internet Technologies and Applications. September 2017. P. 53–57. DOI: 10.1109/ITECHA.2017.8101910.
Date of receipt: August 25, 2020