..
.. Copyright (C) Mellanox Technologies Ltd. 2019. ALL RIGHTS RESERVED.
..
.. See file LICENSE for terms.
..

.. _ucx_features:

*****************
UCX main features
*****************
High-level API features
***********************
- Choose between client/server connection establishment (similar to TCP) and
  connecting directly by passing a remote address blob.
- Share resources between threads, or allocate dedicated resources
  per thread.
- Event-driven or polling-driven progress.
- Java and Python bindings.
- Seamless handling of GPU memory.
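
The difference between polling-driven and event-driven progress can be illustrated with a small, self-contained Python sketch. It does not use the UCX API or its Python bindings: ``ToyWorker``, ``post_completion``, ``wait_polling`` and ``wait_event_driven`` are hypothetical names invented for this example, and a pipe stands in for whatever file descriptor a real worker would expose.

```python
import os
import select


class ToyWorker:
    """Hypothetical stand-in for a communication worker; not the UCX API."""

    def __init__(self):
        self._completions = []
        # The pipe's read end becomes readable whenever a completion is posted.
        self._read_fd, self._write_fd = os.pipe()

    def post_completion(self, name):
        """Simulate the network finishing an operation."""
        self._completions.append(name)
        os.write(self._write_fd, b"\x01")  # wake up anyone waiting on the fd

    def progress(self):
        """Drain all pending completions and return them as a list."""
        done = []
        while self._completions:
            done.append(self._completions.pop(0))
            os.read(self._read_fd, 1)  # consume the matching wake-up byte
        return done

    def event_fd(self):
        return self._read_fd

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)


def wait_polling(worker, max_spins=1_000_000):
    """Polling-driven: call progress() in a loop until something completes."""
    for _ in range(max_spins):
        done = worker.progress()
        if done:
            return done
    return []


def wait_event_driven(worker, timeout=1.0):
    """Event-driven: block in select() until the fd is readable, then progress."""
    readable, _, _ = select.select([worker.event_fd()], [], [], timeout)
    return worker.progress() if readable else []
```

Polling keeps latency low at the cost of a busy CPU core, while the event-driven variant lets the thread sleep until there is work to do.
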
Main APIs
---------
- Stream-oriented send/receive operations.
- Tag-matched send/receive.
- Remote memory access.
- Remote atomic operations.
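
As an illustration of tag-matched send/receive, the following Python sketch models the matching rule in isolation: a receive is posted with a tag and a bit mask, and an incoming message matches when the masked bits of both tags are equal. This is a toy model written under that assumption, not the UCX implementation; ``TagMatcher``, ``post_recv`` and ``on_message`` are hypothetical names.

```python
class TagMatcher:
    """Toy model of tag matching; names are hypothetical, not the UCX API."""

    FULL_MASK = 0xFFFFFFFFFFFFFFFF  # compare all 64 tag bits

    def __init__(self):
        self._posted = []      # (tag, mask) receives waiting for a message
        self._unexpected = []  # (tag, payload) messages with no receive yet

    @staticmethod
    def _matches(msg_tag, recv_tag, mask):
        # Only the bits selected by the mask take part in the comparison.
        return (msg_tag & mask) == (recv_tag & mask)

    def post_recv(self, tag, mask=FULL_MASK):
        """Post a receive; return the payload if a matching message is queued."""
        for i, (msg_tag, payload) in enumerate(self._unexpected):
            if self._matches(msg_tag, tag, mask):
                del self._unexpected[i]
                return payload
        self._posted.append((tag, mask))
        return None

    def on_message(self, tag, payload):
        """Deliver a message; return the tag of the receive that consumed it."""
        for i, (recv_tag, mask) in enumerate(self._posted):
            if self._matches(tag, recv_tag, mask):
                del self._posted[i]
                return recv_tag
        # No receive is waiting, so keep the message on the unexpected queue.
        self._unexpected.append((tag, payload))
        return None
```

For example, a receive posted with tag ``0xAB00`` and mask ``0xFF00`` accepts a message tagged ``0xAB42``, because only the upper byte is compared.
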
Fabrics support
***************
- RoCE
- InfiniBand
- TCP sockets
- Shared memory (CMA, knem, xpmem, SysV, mmap)
- Cray Gemini / Aries (ugni)

Platforms support
*****************
- Supported architectures: x86_64, Arm v8, Power.
- Runs on virtual machines (using SR-IOV) and in containers (Docker, Singularity).
- Can use either MLNX_OFED or inbox RDMA drivers.
- Tested on major Linux distributions (Red Hat, Ubuntu, SLES).

GPU support
***********
- CUDA (for NVIDIA GPUs)
- ROCm (for AMD GPUs)

Protocols, Optimizations and Advanced Features
**********************************************
- Automatic selection of best transports and devices.
- Zero-copy with registration cache.
- Scalable flow control algorithms.
- Optimized memory pools.
- Accelerated direct-verbs transport for Mellanox devices.
- Pipeline protocols for GPU memory.
- QoS and traffic isolation for RDMA transports.
- Platform-specific (micro-architecture) optimizations, such as memcpy and memory barriers.
- Multi-rail and RoCE link aggregation group support.
- Support for bare-metal, container, and cloud environments.
- Advanced protocols for transferring messages of different sizes.
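
To make the idea of size-dependent protocols concrete, the Python sketch below picks a transfer strategy from the message length. It is only an illustration: the protocol labels follow common messaging-library usage (eager with an inline copy, eager through a bounce buffer, eager zero-copy, rendezvous), and the thresholds ``SHORT_MAX``, ``BCOPY_MAX`` and ``RNDV_MIN`` are invented values, not UCX defaults. In practice such limits depend on the transport and device.

```python
# Hypothetical thresholds in bytes, chosen only for illustration.
SHORT_MAX = 128          # small enough to send inline with the header
BCOPY_MAX = 8 * 1024     # copying into a bounce buffer is still cheap
RNDV_MIN = 64 * 1024     # large enough to justify a rendezvous handshake


def select_protocol(length, short_max=SHORT_MAX, bcopy_max=BCOPY_MAX,
                    rndv_min=RNDV_MIN):
    """Return the name of the protocol a toy sender would use for `length` bytes."""
    if length < 0:
        raise ValueError("message length must be non-negative")
    if length <= short_max:
        return "eager-short"
    if length <= bcopy_max:
        return "eager-bcopy"
    if length < rndv_min:
        # Zero-copy pays for memory registration but avoids the data copy.
        return "eager-zcopy"
    # The receiver pulls the data after a control-message exchange.
    return "rendezvous"
```

Small messages favor low latency, so they avoid registration and handshakes; large messages favor bandwidth, so the cost of a handshake is amortized over the payload.
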