Radiotekhnika

"Publishing house Radiotekhnika":
scientific and technical literature.
Books and journals of publishing houses: IPRZHR, RS-PRESS, SCIENCE-PRESS


Tel.: +7 (495) 625-9241

 

Method of automated recovery of significant facts from natural-language documents


E.D. Voytko – Ph.D. (Eng.), Associate Professor, 18 CRI of DoD RF (Moscow)
E-mail: voltron2000@mail.ru
S.V. Klimenko – Employee, 18 CRI of DoD RF (Moscow)
E-mail: s-klim@yandex.ru
V.M. Klimenko – Associate Professor, Cherepovets Higher Military Engineering School of Radio Electronics
E-mail: klimenko-0776340@yandex.ru


This article describes a method for making the analytical processing of large arrays of unstructured documents more effective by automatically retrieving information about the phenomena being monitored.
A distinctive feature of the proposed method is its two-stage processing of text data. At the first stage, the method identifies text fragments that, with a given probability, contain factual information about events of interest. At the second stage, it constructs structured descriptions of the detected events by analyzing the selected fragments.
To select fragments of text in a document of interest, the method applies an extended search for keywords belonging to a fixed set of entities used to describe events.
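The first stage can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not list the entity categories, the keywords, or how the "given probability" is estimated. Here the hypothetical `ENTITY_KEYWORDS` table stands in for the fixed set of entities, and a plain coverage score with a threshold stands in for the probability test.

```python
import re

# Hypothetical entity categories and keywords; the abstract does not list them.
ENTITY_KEYWORDS = {
    "person": {"minister", "president", "general"},
    "organization": {"ministry", "company", "agency"},
    "time": {"monday", "yesterday", "january"},
    "place": {"moscow", "city", "region"},
}


def split_fragments(text):
    """Split text into sentence-like fragments after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def fragment_score(fragment):
    """Fraction of entity categories with at least one keyword in the fragment.

    This is a simple stand-in for the probability mentioned in the abstract,
    not the authors' estimate.
    """
    words = set(re.findall(r"\w+", fragment.lower()))
    hits = sum(1 for keywords in ENTITY_KEYWORDS.values() if words & keywords)
    return hits / len(ENTITY_KEYWORDS)


def select_fragments(text, threshold=0.5):
    """Keep the fragments whose score reaches the (assumed) threshold."""
    return [f for f in split_fragments(text) if fragment_score(f) >= threshold]


sample = (
    "The minister visited Moscow yesterday. "
    "Weather was mild. "
    "The agency signed a contract."
)
selected = select_fragments(sample)
print(selected)
```

With the assumed threshold of 0.5, only the first sentence is kept, because it mentions three of the four entity categories; the last sentence mentions one and the middle sentence none.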
The information structuring problem is solved using the theory of Markov random fields. The solution includes linking the detected event to a time and a geographic location, and identifying the persons and organizations mentioned in the context of the event.
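The second stage can be sketched in Python under strong simplifying assumptions. The abstract does not specify the graph structure, the features, or how the model is fitted, so the sketch uses the simplest Markov random field, a chain over tokens, with hand-set potentials and a hypothetical `LEXICON`. The labeling with the highest total score is found with the Viterbi algorithm and then grouped into an event record with persons, organizations, place, and time.

```python
LABELS = ["O", "PER", "ORG", "LOC", "TIME"]

# Hypothetical lexicons; a real system would use far richer evidence.
LEXICON = {
    "PER": {"ivan"},
    "ORG": {"gazprom"},
    "LOC": {"moscow"},
    "TIME": {"monday"},
}


def unary(token, label):
    """Node potential (log scale): how well one token fits one label."""
    if label == "O":
        return 1.5 if token.islower() else 0.0
    return 2.0 if token.lower() in LEXICON[label] else 0.0


def pairwise(prev_label, label):
    """Edge potential (log scale): reward continuing the same entity."""
    return 1.0 if prev_label == label and label != "O" else 0.0


def viterbi(tokens):
    """Return the highest-scoring label sequence for the chain model."""
    if not tokens:
        return []
    # Each column maps label -> (best score ending in label, previous label).
    columns = [{lab: (unary(tokens[0], lab), None) for lab in LABELS}]
    for token in tokens[1:]:
        prev_col = columns[-1]
        col = {}
        for cur in LABELS:
            prev = max(LABELS, key=lambda p: prev_col[p][0] + pairwise(p, cur))
            score = prev_col[prev][0] + pairwise(prev, cur) + unary(token, cur)
            col[cur] = (score, prev)
        columns.append(col)
    # Backtrack from the best final label.
    label = max(LABELS, key=lambda lab: columns[-1][lab][0])
    path = [label]
    for col in reversed(columns[1:]):
        label = col[label][1]
        path.append(label)
    path.reverse()
    return path


def extract_event(tokens):
    """Group consecutive tokens with the same entity label into an event record."""
    event = {lab: [] for lab in LABELS if lab != "O"}
    prev = "O"
    for token, label in zip(tokens, viterbi(tokens)):
        if label != "O":
            if label == prev:
                event[label][-1] += " " + token
            else:
                event[label].append(token)
        prev = label
    return event


tokens = "Ivan Petrov arrived in Moscow on Monday".split()
labels = viterbi(tokens)
event = extract_event(tokens)
print(labels)
print(event)
```

In this example "Petrov" is absent from every lexicon, yet it is labeled as part of a person name because the edge potential favors continuing the label of "Ivan". Using the context of neighboring tokens in this way is what distinguishes a random-field model from labeling each token independently.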

