M.M. Gershkovich – Senior Research Scientist,
Institute of Informatics Problems of FRC CSC RAS (Moscow) E-mail: makmg@mail.ru
This paper formulates the tasks of identifying image-containing informational objects (IOs) in geographically distributed data systems that process events involving IOs of different origin. An informational object is an array of data compiled according to certain rules; the system treats an IO as a single entity. The method for solving these tasks is based on selecting criteria for merging and clustering IOs whose parameters are imprecisely defined a priori. In addition, problem statements and solution methods are described for monitoring data in a geographically distributed automatic data system (ADS) for IOs that contain photos.
The paper considers ADSs that accumulate data from geographically distributed sources of different types. The data contain information about events involving certain subjects, for example persons. An event record includes attributes of the event (its date, etc.) and attributes of the subjects of the event, where each subject is an IO.
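The event record described above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names `InformationalObject` and `EventRecord`, the `source` field, and the helper `subjects_of` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class InformationalObject:
    """A subject of an event, treated by the system as a single entity."""
    source: str                    # hypothetical: which data source supplied the IO
    attributes: Dict[str, str]     # e.g. surname, birth date
    photo: Optional[bytes] = None  # image payload, present only for some IOs


@dataclass
class EventRecord:
    """One event: its own attributes plus the IOs that took part in it."""
    event_date: date
    event_attributes: Dict[str, str] = field(default_factory=dict)
    subjects: List[InformationalObject] = field(default_factory=list)


def subjects_of(events: List[EventRecord]) -> List[InformationalObject]:
    """Flatten a stream of event records into the IOs they mention."""
    return [io for event in events for io in event.subjects]
```

A record built this way keeps event attributes and subject attributes apart, which is what later lets the same subject be matched across events.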
An ADS with a data analysis function faces the IO identification problem: selecting a set of IOs that, with a certain probability, are the same. Solving this problem makes it possible to search for events involving a given IO whose attributes may be randomly distorted from one event to another. For example, a person's surname may be distorted by accident when the data are entered into the system, or changed legally through marriage. Hence biometric data that change little over a person's lifetime may be critically important for identifying a person. A photo is a valid example and is usually the only kind of biometric data available.
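A minimal sketch of why a photo helps when a surname is distorted or changed follows. It assumes IOs are plain dictionaries with a `surname` key, that photo similarity is a number in [0, 1] computed elsewhere by some face-matching component, and that the two are combined with fixed weights. The function names, the weights, and the use of Python's `difflib.SequenceMatcher` for text similarity are all assumptions made for illustration, not the paper's criteria.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity of two strings, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def identity_score(io_a, io_b, photo_similarity=None, weights=(0.4, 0.6)):
    """Score how likely two IOs are the same person.

    photo_similarity is assumed to come from an external face matcher;
    when it is missing, only the surname is compared.
    """
    name_sim = text_similarity(io_a["surname"], io_b["surname"])
    if photo_similarity is None:
        return name_sim
    w_name, w_photo = weights  # hypothetical weights
    return w_name * name_sim + w_photo * photo_similarity


# A typing error leaves the surname score high on its own.
typo = identity_score({"surname": "Ivanova"}, {"surname": "Ivanowa"})

# A surname changed through marriage scores low unless a photo is available.
married_no_photo = identity_score({"surname": "Ivanova"}, {"surname": "Petrova"})
married_with_photo = identity_score(
    {"surname": "Ivanova"}, {"surname": "Petrova"}, photo_similarity=0.95
)
```

In this sketch a typing error is absorbed by the text comparison alone, whereas a legal name change is recovered only when the photo term contributes.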
To analyze data efficiently, an ADS should automatically and correctly solve the following particular identification tasks:
- the task of forming IOs from a stream of data on events involving IOs. This task consists of two subtasks: extracting IOs from data streams, and merging IOs (formed from data of different sources and origin) that are the same with a probability close to 100%;
- the task of clustering: forming a «Complex IO», an aggregated IO consisting of IOs that are similar with respect to certain criteria.
The IO merging procedure eliminates duplication of data in the system database. Automatic clustering of IO data in an information system significantly expands the range of available analytical solutions, particularly for the task of establishing links between individuals and the events in which they have participated.
To perform the merging procedure, IOs are selected from the system database and/or the data stream using a criterion that indicates that the selected IOs are identical. Merging the selected IOs (those that satisfy the merging criterion) produces a single IO whose attributes originate from all of them.
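The selection and merging steps can be sketched as follows. The sketch assumes IOs are dictionaries of attributes and that the merging criterion is supplied as a function of two IOs; the names `merge_ios` and `select_and_merge`, and the `doc_id` attribute used as the criterion, are hypothetical. Keeping every distinct value of each attribute is one possible reading of "attributes originating from all selected IOs", not necessarily the paper's rule.

```python
from collections import defaultdict


def merge_ios(ios):
    """Merge IOs judged identical into one IO.

    Each attribute of the result holds the list of distinct values
    seen for it across the merged IOs, in order of first appearance.
    """
    merged = defaultdict(list)
    for io in ios:
        for key, value in io.items():
            if value not in merged[key]:
                merged[key].append(value)
    return dict(merged)


def select_and_merge(reference, candidates, criterion):
    """Select the candidates that satisfy the criterion and merge them with the reference."""
    selected = [io for io in candidates if criterion(reference, io)]
    return merge_ios([reference] + selected)


# Hypothetical criterion: the same identity document number.
same_document = lambda a, b: a.get("doc_id") == b.get("doc_id")

reference = {"surname": "Ivanova", "doc_id": "123"}
candidates = [
    {"surname": "Petrova", "doc_id": "123", "photo": "p1.jpg"},
    {"surname": "Sidorov", "doc_id": "999"},
]
merged = select_and_merge(reference, candidates, same_document)
```

Here the merged IO keeps both surnames and the photo, so no information from the matching records is lost, while the non-matching record is left out.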
A «Complex IO» is in fact a cluster of IOs: a subset of the set of all IOs in the database whose members are similar according to certain criteria.
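One simple way to form such clusters is to link every pair of IOs that passes a similarity test and take the connected groups. The sketch below does this with a union-find structure. It is an assumption made for illustration, not the clustering method of the paper; the function names and the 0.8 threshold are hypothetical, and IOs are reduced to surname strings for brevity.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical similarity criterion on a single text attribute."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def cluster(items, is_similar):
    """Group items into clusters by linking every pair that is_similar accepts."""
    parent = list(range(len(items)))

    def find(i):
        # Follow parent links to the cluster representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if is_similar(items[i], items[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())


clusters = cluster(["ivanova", "ivanowa", "petrov", "petrow"], similar)
```

The pairwise comparison is quadratic in the number of IOs, so a real database would need some form of candidate pre-selection before this step.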
This paper presents different criteria for merging and clustering IOs that take into account the specific characteristic parameters of the IOs.
Automatic clustering of IO data, particularly data on a person, makes it possible to search the entire system using all available information on an IO (including all images).
The presented IO identification technologies can be applied to designing extended toolkits for searching for IOs (including image-containing IOs), establishing links between particular IOs, and monitoring data in large-scale data systems distributed across territories and departments. Systems that provide a wide range of tools for analyzing imprecisely defined data may be used to computerize the operations of both state institutions and private commercial companies, particularly those with facilities distributed over a large area. Areas of application may include public safety, banking, and school and university educational programs. These technologies may be introduced into the work processes of the Federal Migration Service, the Pension Fund of the Russian Federation, civil registry offices, and networks of social service offices. The presented algorithms may be used for data processing in a wide range of events, for example large-scale sports competitions and federal and municipal elections. The developed technologies make it possible to design interdepartmental systems.