Journal «Neurocomputers», № 1, 2017
Article in issue:
Convolutional neural network architecture using computations in residue number system with specific moduli set
Authors:
N.I. Chervyakov - Dr.Sc. (Eng.), Professor, Head of the Department of Applied Mathematics and Mathematical Modeling, Institute of Mathematics and Natural Sciences, North Caucasus Federal University (Stavropol). E-mail: k-fmf-primath@stavsu.ru
P.A. Lyakhov - Ph.D. (Phys.-Math.), Assistant Professor, Department of Applied Mathematics and Mathematical Modeling, Institute of Mathematics and Natural Sciences, North Caucasus Federal University (Stavropol). E-mail: ljahov@mail.ru
D.I. Kalita - Post-graduate Student, Institute of Mathematics and Natural Sciences, North Caucasus Federal University (Stavropol). E-mail: diana.kalita@mail.ru
M.V. Valueva - Master's Student in «Applied Mathematics and Informatics», Institute of Mathematics and Natural Sciences, North Caucasus Federal University (Stavropol). E-mail: mriya.valueva@mail.ru
Abstract:
The Residue Number System (RNS), owing to its inherently parallel arithmetic, can be used effectively in the architecture of a Convolutional Neural Network (CNN), which itself has a parallel structure. Combining CNN and RNS raises the practical problem of implementing the forward (binary-to-RNS) and reverse (RNS-to-binary) conversion operations. Simulation was performed on an Artix-7 XC7A200T FPGA in Xilinx ISE Design Suite 14.7 in order to compare the known CNN architecture from [12] with the proposed architecture. For forward conversion, the proposed architecture shows 21 times lower delay and 56 times lower hardware cost than the known architecture. For reverse conversion, it shows 25% lower delay and 32.5% lower hardware cost. Analysis of modular adder performance shows that the worst modulus in the proposed specific moduli set increases the delay of the modular adders by 22%. Summarizing these results, the proposed CNN hardware architecture significantly reduces the time and hardware cost of the most problematic operations in RNS, the forward and reverse conversions. This advantage is obtained at the expense of a slight reduction in the efficiency of the modular adders in the CNN.
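The forward and reverse conversions discussed in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's specific moduli set, so the set {2^n - 1, 2^n, 2^n + 1} used below is an assumption chosen only because it is a common special moduli set in the RNS literature; the reverse conversion here uses the classical Chinese Remainder Theorem rather than the paper's own method.

```python
from math import prod

# Assumed illustrative moduli set {2^n - 1, 2^n, 2^n + 1}; the paper's
# actual "specific moduli set" is not given in this abstract.
n = 4
MODULI = [2**n - 1, 2**n, 2**n + 1]  # 15, 16, 17: pairwise coprime

def to_rns(x, moduli):
    """Forward conversion: a binary integer -> a tuple of residues,
    one independent (parallel) channel per modulus."""
    return [x % m for m in moduli]

def from_rns(residues, moduli):
    """Reverse conversion via the Chinese Remainder Theorem."""
    M = prod(moduli)  # dynamic range of the RNS
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m
        x += r * Mi * pow(Mi, -1, m)
    return x % M

x = 1000
residues = to_rns(x, MODULI)        # [10, 8, 14]
assert from_rns(residues, MODULI) == x
```

Because each residue channel is independent, additions and multiplications inside the CNN can proceed in parallel per modulus; only the conversions at the boundaries require the full-range arithmetic sketched in `from_rns`, which is why their cost dominates and is the target of the proposed architecture.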
Pages: 3-15
References
  1. Chakradhar S., Sankaradas M., Jakkula V., Cadambi S. A dynamically configurable coprocessor for convolutional neural networks // 37th Annual Int-l Symp. on Computer architecture (ISCA2010). 2010. P. 247-257.
  2. Lecun Y., Bottou L., Bengio Y., Haffner P. Gradient-based learning applied to document recognition // Proc. of the IEEE. 1998. V. 86. № 11. P. 2278-2324.
  3. Sankaradas M., Jakkula V., Gadami S., Chakradhar S., Durdanovic I., Cosatto E., Graf H.P. A massively parallel coprocessor for convolutional neural networks // 20th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP2009). 2009. P. 53-60.
  4. Peemen M., Setio A.A.A., Mesman B., Corporaal H. Memorycentric accelerator design for convolutional neural networks // 31st International Conference on Computer Design (ICCD2013). 2013. P. 13-19.
  5. Farabet C., Martini B., Akselrod P., Talay S., LeCun Y., Culurciello E. Hardware accelerated convolutional neural networks for synthetic vision systems // Int-l Symp. on Circuits and Systems (ISCAS2010). 2010. P. 257-260.
  6. Gribachev V.P. Nastojashhee i budushhee nejjronnykh setejj [The present and future of neural networks] // Komponenty i tekhnologii. 2006. № 5. (In Russian).
  7. Barskijj A.B. Logicheskie nejjronnye seti: ucheb. posobie [Logical neural networks: a textbook]. M.: Internet-Universitet Informacionnykh Tekhnologijj; BINOM. Laboratorija znanijj. 2011. 352 p. (In Russian).
  8. Gorban A.N., Rossiev D.A. Nejjronnye seti na personalnom kompjutere [Neural networks on a personal computer]. Novosibirsk: Nauka. Sibirskaja izdatelskaja firma RAN. 1996. 276 p. (In Russian).
  9. Sovremennye problemy nejjroinformatiki [Modern problems of neuroinformatics]. Pt. 3. 2007. P. 30-33. (In Russian).
  10. Haykin S. Nejjronnye seti: polnyjj kurs [Neural networks: a complete course]. Transl. from English. 2nd ed. M.: Izdatelskijj dom «Viljams». 2008. 1104 p. (In Russian).
  11. Kozin N.E., Fursov V.A. Poehtapnoe obuchenie radialnykh nejjronnykh setejj [Stage-by-stage training of radial neural networks] // Kompjuternaja optika. 2004. № 26. P. 138-141. (In Russian).
  12. Nakahara H., Sasao T. A deep convolutional neural network based on nested residue number system, 2015 // 25th International Conference on Field Programmable Logic and Applications (FPL). London. 2015. P. 1-6.
  13. Balukhto A.N., Nazarov L.E. Nejjrosetevaja filtracija i segmentacija cifrovykh izobrazhenijj [Neural network filtering and segmentation of digital images] // Nejjrokompjutery v prikladnykh zadachakh obrabotki izobrazhenijj. Kn. 25. 2007. P. 7-24. (In Russian).
  14. Galushkin A.I., Tomashevich N.S., Rjabcev E.I. Nejjrokompjutery dlja obrabotki izobrazhenijj [Neurocomputers for image processing] // Nejjrokompjutery v prikladnykh zadachakh obrabotki izobrazhenijj. Kn. 25. 2007. P. 74-109. (In Russian).
  15. Zhang C., Li P., Sun G., Guan Y., Xiao B., Cong J. Optimizing FPGA-based accelerator design for deep convolutional neural networks // ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA2015). 2015. P. 161-170.
  16. Farabet C., Poulet P., Han J.Y., LeCun Y. CNP: An FPGA-based processor for convolutional networks // FPL2009. 2009. P. 32-37.
  17. Chervyakov N.I., Sakhnyuk P.A., Shaposhnikov A.V., Makokha A.N. Nejjrokompjutery v ostatochnykh klassakh [Neurocomputers in residue classes]. M.: Radiotekhnika. 2003. 272 p. (In Russian).
  18. Omondi A., Premkumar B. Residue Number Systems: Theory and Implementation. Imperial College Press. 2007. P. 296.
  19. Cardarilli G.C., Nannarelli A., Re M. Residue number system for low-power DSP applications // Proc. 41st Asilomar Conf. Signals, Syst., Comput. 2007. P. 1412-1416.
  20. Chervyakov N.I., Lyakhov P.A. Realizatsiya KIKh-fil'trov v sisteme ostatochnykh klassov [Implementation of FIR filters in Residue Number System] // Neirokomp'yutery: razrabotka, primenenie. 2012. № 5. P. 15-24. (In Russian).
  21. Hung C.Y., Parhami B. An approximate sign detection method for residue numbers and its application to RNS division // Computers & Mathematics with Applications. 1994. № 27(4). P. 23-35.
  22. Chervyakov N.I., Molahosseini A.S., Lyakhov P.A., Babenko M.G., Deryabin M.A. Residue-to-Binary Conversion for General Moduli Sets Based on Approximate Chinese Remainder Theorem // International Journal of Computer Mathematics. 2016. P. 1-17.
  23. Parhami B. Computer Arithmetic: Algorithms and Hardware Designs. Oxford University Press, Inc. 2000. 492 p.
  24. Deschamps J.P., Bioul G.J.A., Sutter G.D. Synthesis of arithmetic circuits: FPGA, ASIC and embedded systems. John Wiley & Sons, Inc. 2006. 556 p.
  25. Lynch T.W. Binary adders. The University of Texas at Austin. 1996. 135 p.
  26. Vergos H.T., Dimitrakopoulos G. On Modulo 2^n+1 Adder Design // IEEE Transactions on Computers. 2012. V. 61. № 2. P. 173-186.