D.V. Makagon, E.L. Syromyatnikov
This article examines several approaches to hardware support for barrier synchronization in high-speed interconnection networks with a multi-dimensional torus topology.
Barrier synchronization is a key element of a wide range of parallel algorithms, and the efficiency of its implementation often limits the scalability of applications running on a large number of compute nodes.
The basic definitions of a process, a task, a communication operation, a synchronization operation and synchronization guarantees are given in the first section of the article.
In the next two sections the basic types of synchronization guarantees and operations are specified.
The following sections describe barrier and half-barrier synchronization for one-sided communications and non-blocking operations, as well as a multiphase barrier algorithm designed to overcome the constraints and shortcomings of the other methods.
Implementation options, analytical calculations and a performance evaluation are presented, based on a generalized model of an interconnection network with a multi-dimensional torus topology and compute nodes with x86-based processors and PCI Express network interface cards.
In summary, the article gives a comprehensive view of the problem of implementing barrier synchronization for high-speed interconnection networks with a multi-dimensional torus topology and proposes an algorithm that provides topology-aware traffic minimization and is suitable for both one-sided and two-sided communication models.