Journal: Achievements of Modern Radioelectronics, No. 7, 2012
Article in this issue:
Implementation possibilities of barrier synchronization on high-speed interconnection networks with multi-dimensional torus topology
Authors:
D.V. Makagon, E.L. Syromyatnikov
Abstract:
This article examines several approaches to hardware support for barrier synchronization in high-speed interconnection networks with multi-dimensional torus topology. Barrier synchronization is a key element of a wide range of parallel algorithms, and the efficiency of its implementation often limits the scalability of applications running on a large number of compute nodes. The first section gives basic definitions of a process, a task, a communication operation, a synchronization operation, and synchronization guarantees. The next two sections specify the basic types of synchronization guarantees and operations. The following sections describe barrier and half-barrier synchronization for one-sided communications and non-blocking operations, as well as a multiphase barrier algorithm designed to overcome the constraints and shortcomings of other methods. Implementation possibilities, analytical computations, and a performance evaluation are presented, based on a generalized model of an interconnection network with multi-dimensional torus topology and compute nodes with x86-based processors and PCI Express network interface cards. In summary, the article surveys the problem of implementing barrier synchronization for high-speed interconnection networks with multi-dimensional torus topology and proposes an algorithm that provides topology-aware traffic minimization and is suitable for both one-sided and two-sided communication models.
Pages: 21-28