Journal Neurocomputers, No. 11, 2014
Article in issue:
Neural-network methods of piecewise-regular object recognition
Authors:
A. V. Savchenko - Ph.D. (Eng.), Associate Professor, National Research University Higher School of Economics (N. Novgorod); Doctoral Candidate, Nizhniy Novgorod State Technical University n.a. R.E. Alekseev. E-mail: avsavchenko@hse.ru
V. R. Milov - Dr.Sc. (Eng.), Professor, Head of Department «Electronics and Computer Networks», Nizhniy Novgorod State Technical University n.a. R.E. Alekseev. E-mail: vladimir.milov@gmail.com
Abstract:
Classification is one of the most significant tasks in pattern recognition: a query object must be assigned to one of C classes defined by a database (DB) that contains R ≥ C models. Depending on how the analyzed object is described, three research directions can be distinguished: 1) conventional pointwise classification (the object is specified by a single feature vector); 2) group-choice classification (the object is a group or sequence of feature vectors); and 3) classification of piecewise-regular objects (images, speech signals), which contain several independent homogeneous parts (segments). In the third case, each segment is recognized with a pointwise or group classifier, and the decision is made in favor of the model that is closest to the query over all of its segments. The best-studied tasks involve a small number of classes and a large model DB (C << R), e.g., optical character recognition and the classification of traffic signs and phonemes. A review of recent papers shows that the current trend in composite object recognition is to abandon feature extraction algorithms in favor of classifiers with a complex structure applied to primitive features (e.g., the raw pixel matrix of an image or the spectrum of a speech signal). The task becomes more complicated if only a small number of models is available for each class. Practically all methods for recognizing piecewise-regular objects are nearest-neighbor methods. Of special interest here is the probabilistic neural network with homogeneity testing (PNNH), in which the problem is reduced to statistical testing of a complex hypothesis, with the decision based on the maximum likelihood method. Implementing brute-force nearest-neighbor search in real-time applications is difficult for medium-sized DBs (thousands of classes) and especially for large DBs (tens and hundreds of thousands of classes).
In the latter case, the accuracy of modern classifiers is usually so low that they are integrated into automated decision-support systems for content-based object retrieval: the algorithms return several potential candidates, and the decision maker is responsible for choosing the correct one. Unfortunately, most known fast approximate nearest-neighbor methods, developed for very large DBs, do not significantly improve computing efficiency over brute-force exact nearest-neighbor search if the number of classes C does not exceed several thousand. It is shown that in this case the directed enumeration method, based on the asymptotic properties of the PNNH decision statistic, can be applied to implement real-time composite object recognition.
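The brute-force nearest-neighbor rule for piecewise-regular objects described in the abstract can be illustrated with a minimal sketch. It assumes that every object has already been split into the same number of segments and that each segment is a plain feature vector. The squared Euclidean distance and all names (`segment_distance`, `object_distance`, `classify`) are illustrative assumptions, not the article's method. In particular, the PNNH decision statistic and the directed enumeration method are not reproduced here.

```python
from typing import List, Sequence, Tuple

Segment = Sequence[float]        # feature vector of one homogeneous part
PiecewiseObject = List[Segment]  # an object is an ordered list of segments


def segment_distance(a: Segment, b: Segment) -> float:
    """Squared Euclidean distance between two segments (assumed, illustrative)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def object_distance(query: PiecewiseObject, model: PiecewiseObject) -> float:
    """Closeness of a query to a model, accumulated over all segments."""
    if len(query) != len(model):
        raise ValueError("query and model must have the same number of segments")
    return sum(segment_distance(q, m) for q, m in zip(query, model))


def classify(query: PiecewiseObject,
             database: List[Tuple[str, PiecewiseObject]]) -> str:
    """Brute-force search: scan all R models, return the class of the closest one."""
    best_label, _ = min(database, key=lambda entry: object_distance(query, entry[1]))
    return best_label


# Toy database: R = 3 models, C = 2 classes, 2 segments per object.
database = [
    ("A", [[0.0, 0.0], [1.0, 1.0]]),
    ("A", [[0.1, 0.0], [0.9, 1.0]]),
    ("B", [[5.0, 5.0], [4.0, 4.0]]),
]
query = [[4.8, 5.1], [4.2, 3.9]]
print(classify(query, database))  # prints B
```

Each query costs one full pass over the R models, which is the cost that becomes prohibitive for real-time use as the DB grows.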
Pages: 10-20