NVIDIA CUDA communications

NVIDIA creates a friendly solution to this interconnect issue by providing higher bandwidth that calls NVIDIA Collective Communications Library (NCCL). This library provides routines such as all-gather, all-reduce, broadcast, reduce, reduce-scatter, optimized to achieve high bandwidth over PCIe and NVLINK high-speed interconnect and implements multi-GPU and multi-node collective communication primitives that are performance-optimized for NVIDIA GPUs on NVLINK technology to interconnects. NCCL is a library of multi-GPU collective communication primitives that are topology-aware and easily integrated into your application. Initially developed as an open-source research project, NCCL is lightweight, depending only on the usual C++ and CUDA libraries.


Multi GPU systems