D.O. Briukhov – Ph.D.(Eng.), Senior Research Scientist, Institute of Informatics Problems of FRC CSC RAS (Moscow) E-mail: dbriukhov@ipiran.ru
N.A. Skvortsov – Research Scientist, Institute of Informatics Problems of FRC CSC RAS (Moscow) E-mail: nskv@ipi.ac.ru
S.A. Stupnikov – Ph.D.(Eng.), Senior Research Scientist, Institute of Informatics Problems of FRC CSC RAS (Moscow) E-mail: sstupnikov@ipiran.ru
Data sources on the Arctic region that can be used for planning search and rescue operations are highly diverse, which makes the development of data integration methods for this domain an urgent problem. This paper presents an approach to implementing methods for integrating such data into a unified warehouse. In particular, the following issues are considered: extraction of structured data from text documents, transformation of data into the warehouse schema, fusion of data from various sources to create integrated entities, verification of data integration programs, warehouse implementation, and loading of integrated data into the warehouse. Examples of analytical queries over the warehouse schema that can be used for planning search and rescue operations are presented.