File: binary_semaphore.md

package info (click to toggle)
libcudacxx 1.8.1-2
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 66,464 kB
  • sloc: cpp: 517,767; ansic: 9,474; python: 6,108; sh: 2,225; asm: 2,154; makefile: 7
file content (77 lines) | stat: -rw-r--r-- 2,677 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
---
grand_parent: Extended API
parent: Synchronization Primitives
nav_order: 4
---

# `cuda::binary_semaphore`

Defined in header `<cuda/semaphore>`:

```cuda
namespace cuda {

template <cuda::thread_scope Scope>
using binary_semaphore = cuda::std::counting_semaphore<Scope, 1>;

}
```

The class template `cuda::binary_semaphore` is an extended form of
  [`cuda::std::binary_semaphore`] that takes an additional
  [`cuda::thread_scope`] argument.
`cuda::binary_semaphore` has the same interface and semantics as
  [`cuda::std::binary_semaphore`], but `cuda::binary_semaphore` is a class
  template.

## Concurrency Restrictions

An object of type `cuda::binary_semaphore` or `cuda::std::binary_semaphore`,
  shall not be accessed concurrently by CPU and GPU threads unless:
- it is in unified memory and the [`concurrentManagedAccess` property] is 1, or
- it is in CPU memory and the [`hostNativeAtomicSupported` property] is 1.

Note, for objects of scopes other than `cuda::thread_scope_system` this is a
  data-race, and thefore also prohibited regardless of memory characteristics.

Under CUDA Compute Capability 6 (Pascal) or prior, an object of type
  `cuda::binary_semaphore` or `cuda::std::binary_semaphore` may not be used.

## Implementation-Defined Behavior

For each [`cuda::thread_scope`] `S`, `cuda::binary_semaphore<S>::max()` is as
  follows:

| [`cuda::thread_scope`] `S` | `cuda::binary_semaphore<S>::max()` |
|----------------------------|------------------------------------|
| Any                        | `1`                                |

## Example

```cuda
#include <cuda/semaphore>

__global__ void example_kernel() {
  // This semaphore is suitable for all threads in the system.
  cuda::binary_semaphore<cuda::thread_scope_system> a;

  // This semaphore has the same type as the previous one (`a`).
  cuda::std::binary_semaphore<> b;

  // This semaphore is suitable for all threads on the current processor (e.g. GPU).
  cuda::binary_semaphore<cuda::thread_scope_device> c;

  // This semaphore is suitable for all threads in the same thread block.
  cuda::binary_semaphore<cuda::thread_scope_block> d;
}
```

[See it on Godbolt](https://godbolt.org/z/eKfjYYz58){: .btn }


[`cuda::thread_scope`]: ../thread_scopes.md

[`cuda::std::binary_semaphore`]: https://en.cppreference.com/w/cpp/thread/binary_semaphore

[`concurrentManagedAccess` property]: https://docs.nvidia.com/cuda/cuda-runtime-api/structcudaDeviceProp.html#structcudaDeviceProp_116f9619ccc85e93bc456b8c69c80e78b
[`hostNativeAtomicSupported` property]: https://docs.nvidia.com/cuda/cuda-runtime-api/structcudaDeviceProp.html#structcudaDeviceProp_1ef82fd7d1d0413c7d6f33287e5b6306f