<!-- markdownlint-disable MD041 -->
<!-- Copyright 2015-2025 LunarG, Inc. -->
[![Khronos Vulkan][1]][2]

[1]: https://vulkan.lunarg.com/img/Vulkan_100px_Dec16.png "https://www.khronos.org/vulkan/"
[2]: https://www.khronos.org/vulkan/

# Synchronization Validation Design Documentation (phase1 release)

Authors: Tobin Ehlis, John Zulauf


[TOC]



## Overview

The Vulkan API features a number of unique synchronization primitives to express execution and memory dependencies, and validation currently has only limited support for tracking fence and semaphore primitives.



Execution and memory dependencies are used to solve data hazards, i.e. to ensure that read and write operations occur in a well-defined order. Write-after-read hazards can be solved with just an execution dependency, but read-after-write and write-after-write hazards need appropriate memory dependencies to be included between them. If an application does not include dependencies to solve these hazards, the results and execution orders of memory accesses are undefined.

This document is intended to outline a design that will unify the tracking of all Vulkan memory accesses and synchronization primitives within Vulkan Validation. Such a model will enable flagging error cases where synchronization primitives are missing, but also allow for visualization of such primitives and provide for recommendations on optimal use of synchronization for any given combination of Vulkan commands and resources.

The pipelined and multi-threaded nature of Vulkan makes it particularly important for applications to correctly insert needed synchronization primitives, and for validation to diagnose unprotected hazards.  These hazards are:


<table>
  <tr>
   <td>RAW
   </td>
   <td>Read-after-write
   </td>
   <td>Occurs when a subsequent operation uses the result of a previous operation without waiting for the result to be completed.
   </td>
  </tr>
  <tr>
   <td>WAR
   </td>
   <td>Write-after-read
   </td>
   <td>Occurs when a subsequent operation overwrites a memory location read by a previous operation before that operation is complete. (requires only execution dependency)
   </td>
  </tr>
  <tr>
   <td>WAW
   </td>
   <td>Write-after-write
   </td>
   <td>Occurs when a subsequent operation writes to the same set of memory locations (in whole or in part) being written by a previous operation.
   </td>
  </tr>
  <tr>
   <td>WRW
   </td>
   <td>Write-racing-write
   </td>
   <td>Occurs when unsynchronized subpasses/queues perform writes to the same set of memory locations.
   </td>
  </tr>
  <tr>
   <td>RRW
   </td>
   <td>Read-racing-write
   </td>
   <td>Occurs when unsynchronized subpasses/queues perform read and write operations on the same set of memory locations.
   </td>
  </tr>
</table>


## Algorithmic Overview

### Approach

In order to validate synchronization, the effect of action and synchronization commands must be tracked.  The full validation can *only* be done at queue submission time, when the full context of the command batches is known. However, partial validation of hazards knowable within the context of a single recorded command buffer can be done at record time. This partial validation includes record-time detection of hazards present within subpasses, and can be extended to include secondary command buffer execution. To support this usage, the validation and update components must be designed to operate at record time, at secondary command buffer execute (vkCmdExecuteCommands record) time, and at queue submit time.

Synchronization validation centers around examining the previous access and synchronization operations for a range of memory addresses against a current access or operation on that same range. The _state_ (Resource Access State) tracks the most recent read, write, and synchronization operations for a given range of memory addresses. These are captured in contexts (Access Context) which reflect the Queue, Command Buffer, or Subpass instance in which the memory accesses were made. This approach is sufficient for single command buffer validation. Note that only the most recent state (relative to *write* operations) is retained. Additionally, the “dependency chain” logic can be represented by a cumulative state reflecting the impact of all synchronization operations on the most recent read/write state. Between the _state_ and _context_, enough information is present to do record time validation of single command buffers.

### Most Recent Access

When detecting memory access hazards, synchronization validation considers only the most recent access (MRA) for comparison. All prior hazards are assumed to have been reported.

For read operations the most recent access rules apply to prior reads with execution barriers (or ordering) relative to the current read. The prior write access is only considered the most recent access if no intervening prior read has occurred that *happens-before* the current read. Consider the following sequence of access and barriers (listed in submission order) acting on the same memory address:



| Operation | Description                                                  |
| --------- | ------------------------------------------------------------ |
| W         | write operation                                              |
| M         | memory barrier guarding access at R0                         |
| R0        | first read operation                                         |
| R1        | second read operation                                        |
| E         | execution barrier s.t. R2 *happens-after R1*                 |
| R2        | third read operation                                         |
| R3        | fourth read operation with stage not in second execution scope of E |

![sync_wrr](images/sync_wrr.png)

For write hazard checks in a given range of memory addresses, if there are intervening read operations between the current write and the most recent previous write, these intervening read operations are considered the most recent access. In that case, write-after-write checks are not done.

Consider the following sequence of operations on the same memory address:

| Operation | Description                             |
| --------- | --------------------------------------- |
| W0        | first write operation                   |
| M         | memory barrier guarding access at R     |
| R         | read access                             |
| E         | execution barrier guarding access at W1 |
| W1        | second write operation                  |

![sync_wrw](images/sync_wrw.png)

In this case, a read-after-write check is done for R based on W0 and M, and a write-after-read check is performed on W1 based on R and E. W1 is not checked against W0 for write-after-write. If W0, M, R **is** not a hazard, this guarantees W0 is available and visible to R, and thus to any operations that *happen-after*. As such, the correctness of R, E, W1 depends solely on those operations. The correctness of the entire sequence can be assured by pairwise hazard checks.
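The pairwise checks described above can be sketched in code. The following is a minimal illustration with assumed field and function names (not the actual `ResourceAccessState` implementation), using toy stage bitmask values:

```cpp
#include <cassert>
#include <cstdint>

// Toy stage bitmask values, for illustration only.
constexpr uint32_t kStageFragment = 0x4;
constexpr uint32_t kStageTransfer = 0x8;

// Assumed (simplified) per-range access state; not the actual
// ResourceAccessState fields.
struct AccessState {
    bool has_write = false;       // a prior write is recorded for this range
    uint32_t write_barriers = 0;  // dst stages to which the write is visible
    uint32_t read_stages = 0;     // stages that have read since the write
    uint32_t read_barriers = 0;   // read stages ordered by an execution barrier
};

// Read-after-write: hazardous unless a memory barrier made the prior write
// visible to the reading stage (the W0, M, R check above).
bool ReadHazard(const AccessState& s, uint32_t stage) {
    return s.has_write && (s.write_barriers & stage) == 0;
}

// Write check: intervening reads are the most recent access, so only
// write-after-read is checked (no write-after-write against the older
// write); with no intervening reads, write-after-write is checked.
bool WriteHazard(const AccessState& s, uint32_t stage) {
    if (s.read_stages != 0) {
        // WAR: hazardous if any read lacks an execution barrier (R, E, W1).
        return (s.read_stages & ~s.read_barriers) != 0;
    }
    return s.has_write && (s.write_barriers & stage) == 0;  // WAW
}
```

Note how `WriteHazard` never compares against the older write once reads have intervened; correctness of the whole sequence follows from the pairwise checks.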

### Tracking Across Command Buffers

For multiple command buffers/secondary buffers, a partial recording of memory accesses and synchronization operations is required.  The current design targets storing the “first” access information (first reads, first write), along with either a recording of all synchronization operations *or* some "barrier state" that can be stored with each “first access”.  This approach should be familiar to those who have seen the “image layout” queue-submit and execute-commands validation, which validates against a similar “first use” record.

Overall, the operations to perform include:

*   Access/operation validation against current state.  This involves lookup of the affected state, potentially traversing a context tree or barrier record, and then comparing the access relative to the state -- i.e. the previous read and/or write operations in light of the synchronization operations recorded in the state
*   Reporting detected hazard
*   Updating the current state to reflect the access or operation.

Additionally

*   Traversal of context trees/barrier replay (or reconstruction)
*   Initialization (import) or Resolution (export) of access contexts to parent or subsequent contexts


## Memory Access Storage

Memory access is characterized by the _memory extent_ and _memory usage_.  The memory extent describes the memory affected, including the `VkMemory` handle, offset, size, and (for images) subresource range. The memory usage describes the pipeline stage (`VkPipelineStageFlagBits`) and memory access (`VkAccessFlagBits`) -- for example “vertex shader stage / shader read”.  A memory access only describes a single usage for a given extent.


### Memory Address Space

To simplify tracking and updating the state of memory extents as listed below, state is stored in a consistent way for all resource types. Also, to eliminate the need to define separate address spaces for each device memory allocation, `VkMemory` allocations are placed at “fake” base addresses (FBA) within a global address space using a simple “next available” allocation scheme. (This is robust for all rational Vulkan usage models for months of runtime.) Additionally, because the actual memory organization of certain resources is opaque, an "opaque range" is allocated with an independent FBA to the extent needed to hold the Synchronization Validation internal representation of these opaque resources.
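A minimal sketch of the “next available” FBA scheme, assuming a simple monotonically increasing 64-bit cursor (class and method names are illustrative, not the actual implementation):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the "next available" fake-base-address scheme.
// Each VkMemory allocation (or opaque-resource representation) receives a
// non-overlapping range in one global 64-bit address space; a 64-bit
// cursor will not wrap under any rational allocation rate.
class FakeBaseAddressAllocator {
  public:
    // Reserve `size` bytes of fake address space; returns the base address.
    uint64_t Allocate(uint64_t size) {
        const uint64_t base = next_;
        next_ += size;  // "next available": never reused, never overlapping
        return base;
    }

  private:
    uint64_t next_ = 0;
};
```

Because fake addresses are never reused, ranges from distinct allocations can never falsely overlap, which keeps the overlap comparisons below simple.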


### Memory Extent Definitions and Comparisons

Memory extent can be specified in a variety of ways:

<table>
  <tr>
   <td>Memory range
   </td>
   <td>A range of addresses within either a <code>VkDeviceMemory</code> allocation, offset within the "fake" address space and referencing a specific device memory range (i.e. a "linear range"), or a range of "fake" address space representing a portion of the Synchronization Validation representation of a tiled image or other opaquely represented device resource (i.e. an "opaque range").  All accesses to a given resource are contained within the linear range(s) of the `VkDeviceMemory`(s) to which it is bound, or within the opaque range representing the resource.
   </td>
  </tr>
  <tr>
   <td>Linear (transparent) image subresource range
   </td>
   <td>A <code>VkImage</code> handle for a transparent (linear tiling) image with a <code>VkSubresourceRange</code> is mapped to the (potentially non-contiguous) memory ranges of the <code>VkMemory</code> bound to the <code>VkImage</code> for each subresource within the range.
   </td>
  </tr>
  <tr>
   <td>Tiled (opaque) image subresource range
   </td>
   <td>A <code>VkImage</code> handle to an opaque (optimal tiling) image with a <code>VkSubresourceRange</code>, which cannot be mapped to a specific subset of the <code>VkMemory</code> bound to the <code>VkImage</code>.  The subresources are encoded to the opaque range reserved for the resource, specific to the image creation parameters, and can only be compared relative to “compatible” aliases.  Otherwise all subresources within the range are assumed to alias to *every* device memory address bound to the image.
   </td>
  </tr>
  <tr>
   <td>Sparsely bound Extents
   </td>
   <td>TBD
   </td>
  </tr>
</table>


These distinctions affect the ability of memory references to be compared w.r.t. “overlap” --

<table>
  <tr>
   <td>For same memory handle
   </td>
   <td>Memory range
   </td>
   <td>Linear Image
   </td>
   <td>Tiled Image
   </td>
  </tr>
  <tr>
   <td>Memory range
   </td>
   <td>Compare range
   </td>
   <td>Encode subresource range to memory range and compare.
   </td>
   <td>Compare memory range to all accesses within image binding memory range.
   </td>
  </tr>
  <tr>
   <td>Linear Image
   </td>
   <td>
   </td>
   <td>Encode subresource range to memory range and compare.
   </td>
   <td>Compare memory range to all accesses within image binding memory range.
   </td>
  </tr>
  <tr>
   <td>Tiled image (Opaque Resources)
   </td>
   <td>
   </td>
   <td>
   </td>
   <td>For “compatible*” tiled images (or other opaque resources), compare encoded opaque ranges.  For incompatible images, every subresource is assumed to alias to all subresources of the other image.
   </td>
  </tr>
</table>




* Compatible tiled images are those with the same image resource handle, or with the same image create info with `VK_IMAGE_CREATE_ALIAS_BIT` set and the same `VkDeviceMemory` and binding offset.

For incompatible memory extent types, any overlap of the bound range must be assumed to be an overlap of the entire bound range (and thus all subresource ranges).  Note: looking for binding range overlap is likely an important first step before doing more detailed overlap analysis.  Also, as a lower precision option, any comparison of non-similar resources (offset or create info) could revert to the opaque “tiled” case for linear images.


### Algorithmic Aside: Interval Trees

Several of the lookups below require tracking and comparing non-unique collections of intervals (memory binding ranges, mip levels or slices in a subresource).  To compress storage and improve traversal of the access state, these intervals are stored in non-overlapping interval trees -- effectively a run-length compression for ranges of memory addresses over which the state is constant.

The implementation used is a wrapper around std::map (which is typically implemented as a red-black tree) that implements a non-overlapping one-dimensional range map.  Conversion to and from subresource address space is performed by “encoder”/“generator” classes that create a set of single-dimensional ranges from subresource range definitions.  For buffers and linear images, the index for the interval tree represents device memory in a simplified memory allocation scheme for VkDeviceMemory allocations.  For tiled/opaque images, the index represents an opaque memory range, not corresponding to device memory in any way.
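The non-overlapping range map idea can be sketched as follows. This is a simplified illustration over std::map with assumed names (the real wrapper has a richer interface), showing how setting state over [begin, end) splits existing runs so that state changes only affect the addressed portion of the space:

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Minimal sketch of a non-overlapping 1-D range map: key = range begin,
// value = {range end, state}. State is constant over each stored range
// (run-length compression of the address space).
struct Entry {
    uint64_t end;
    int state;
};

class RangeMap {
  public:
    // Set `state` over [begin, end), splitting any overlapping ranges so
    // only the addressed portion of the space changes.
    void Set(uint64_t begin, uint64_t end, int state) {
        auto it = map_.lower_bound(begin);
        // Back up one entry if the preceding range overlaps `begin`.
        if (it != map_.begin()) {
            auto prev = std::prev(it);
            if (prev->second.end > begin) it = prev;
        }
        while (it != map_.end() && it->first < end) {
            const uint64_t range_begin = it->first;
            const uint64_t range_end = it->second.end;
            const int range_state = it->second.state;
            it = map_.erase(it);
            if (range_begin < begin) {  // keep the left remainder
                map_[range_begin] = {begin, range_state};
            }
            if (range_end > end) {  // keep the right remainder
                it = map_.emplace(end, Entry{range_end, range_state}).first;
            }
        }
        map_[begin] = {end, state};
    }

    // Return the state at `addr`, or -1 if no range covers it.
    int At(uint64_t addr) const {
        auto it = map_.upper_bound(addr);
        if (it == map_.begin()) return -1;
        --it;
        return (addr < it->second.end) ? it->second.state : -1;
    }

  private:
    std::map<uint64_t, Entry> map_;
};
```

For example, setting state 1 over [0, 10) and then state 2 over [4, 6) leaves three runs: [0, 4) and [6, 10) with state 1, and [4, 6) with state 2.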

### VK_KHR_synchronization2

Synchronization validation uses the pipeline stages and access masks defined in this extension, even when the extension is not enabled.  This is mostly transparent to the user, except that a few pipeline stages and access masks are split to give finer-grained visibility into the pipeline. This lets the validation code give better output in some cases. For example, a hazard caused by a vkCmdCopy*() command will record its access as COPY_TRANSFER_READ or COPY_TRANSFER_WRITE, rather than TRANSFER_TRANSFER_READ or TRANSFER_TRANSFER_WRITE. A pipeline barrier using VK_PIPELINE_STAGE_TRANSFER_BIT should still fix this hazard.

For convenience,  the tables below show the pipeline stages and access masks affected:

| Vulkan 1.2 Pipeline Stage          | Synchronization2 Pipeline Stages                             |
| ---------------------------------- | ------------------------------------------------------------ |
| VK_PIPELINE_STAGE_TRANSFER_BIT     | VK_PIPELINE_STAGE_2_COPY_BIT<br>VK_PIPELINE_STAGE_2_RESOLVE_BIT<br>VK_PIPELINE_STAGE_2_BLIT_BIT<br>VK_PIPELINE_STAGE_2_CLEAR_BIT |
| VK_PIPELINE_STAGE_VERTEX_INPUT_BIT | VK_PIPELINE_STAGE_2_INDEX_INPUT_BIT<br>VK_PIPELINE_STAGE_2_VERTEX_ATTRIBUTE_INPUT_BIT |



| Vulkan 1.2 Access Mask     | Synchronization2 Access Masks                                |
| -------------------------- | ------------------------------------------------------------ |
| VK_ACCESS_SHADER_READ_BIT  | VK_ACCESS_2_SHADER_SAMPLED_READ_BIT VK_ACCESS_2_SHADER_STORAGE_READ_BIT |
| VK_ACCESS_SHADER_WRITE_BIT | VK_ACCESS_2_SHADER_STORAGE_WRITE_BIT                     |

These stages also affect the Stage / Access map described in the next section.



### Memory Usage Specification

Vulkan classifies memory access by pipeline stage and access. However, for purposes of tracking specific access, availability, and visibility state changes, the two separate masks would allow aliasing that could hide hazards.  To avoid that aliasing, the valid combinations of stage and access flags are combined into memory usages -- each represented by a unique index (currently 79 valid combinations) and tracked with a single bit within a mask (listed in roughly pipeline order below).
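As an illustration of the unique-index scheme, a handful of hypothetical entries might look like the following (the real ~79-entry table is code-generated from the specification; these names are illustrative only):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the stage/access index scheme: each valid
// (pipeline stage, access) combination gets one unique index, so a set of
// usages is a bitmask and distinct combinations can never alias.
enum SyncStageAccessIndex : uint32_t {  // roughly pipeline order
    SYNC_DRAW_INDIRECT_INDIRECT_COMMAND_READ = 0,
    SYNC_VERTEX_SHADER_UNIFORM_READ,
    SYNC_FRAGMENT_SHADER_SHADER_SAMPLED_READ,
    SYNC_COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_WRITE,
    SYNC_COPY_TRANSFER_READ,
    SYNC_COPY_TRANSFER_WRITE,
};

using SyncStageAccessFlags = uint64_t;  // one bit per valid combination

// Map a unique index to its bit in the usage mask.
constexpr SyncStageAccessFlags Bit(SyncStageAccessIndex index) {
    return SyncStageAccessFlags(1) << index;
}
```

Because each stage/access pair owns its own bit, a mask union of usages can never conflate, say, a transfer read at the copy stage with a shader read at the fragment stage.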

<table>
  <tr>
   <td>Stage
   </td>
   <td>Stage/Access
   </td>
  </tr>
  <tr>
   <td>Draw Indirect
   </td>
   <td> DRAW_INDIRECT_INDIRECT_COMMAND_READ<br>
        DRAW_INDIRECT_TRANSFORM_FEEDBACK_COUNTER_READ_EXT
   </td>
  </tr>
  <tr>
   <td>Index Input
   </td>
   <td> INDEX_INPUT_INDEX_READ
   </td>
  </tr>
  <tr>
   <td>Vertex Attribute Input
   </td>
   <td>
        VERTEX_ATTRIBUTE_INPUT_VERTEX_ATTRIBUTE_READ
   </td>
  </tr>
  <tr>
   <td>Vertex Shader
   </td>
   <td> VERTEX_SHADER_SHADER_READ<br>
        VERTEX_SHADER_ACCELERATION_STRUCTURE_READ<br>
       VERTEX_SHADER_SHADER_SAMPLED_READ<br>
       VERTEX_SHADER_SHADER_STORAGE_READ<br>
       VERTEX_SHADER_SHADER_STORAGE_WRITE<br>
       VERTEX_SHADER_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Tessellation Control
   </td>
   <td> TESSELLATION_CONTROL_SHADER_ACCELERATION_STRUCTURE_READ<br>
        TESSELLATION_CONTROL_SHADER_SHADER_SAMPLED_READ<br>
        TESSELLATION_CONTROL_SHADER_SHADER_STORAGE_READ<br>
        TESSELLATION_CONTROL_SHADER_SHADER_STORAGE_WRITE<br>
        TESSELLATION_CONTROL_SHADER_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Tessellation Evaluation
   </td>
   <td> TESSELLATION_EVALUATION_SHADER_ACCELERATION_STRUCTURE_READ<br>
        TESSELLATION_EVALUATION_SHADER_SHADER_SAMPLED_READ<br>
        TESSELLATION_EVALUATION_SHADER_SHADER_STORAGE_READ<br>
        TESSELLATION_EVALUATION_SHADER_SHADER_STORAGE_WRITE<br>
        TESSELLATION_EVALUATION_SHADER_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Geometry Shader
   </td>
   <td> GEOMETRY_SHADER_ACCELERATION_STRUCTURE_READ<br>
        GEOMETRY_SHADER_SHADER_SAMPLED_READ<br>
        GEOMETRY_SHADER_SHADER_STORAGE_READ<br>
        GEOMETRY_SHADER_SHADER_STORAGE_WRITE<br>
        GEOMETRY_SHADER_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Transform Feedback
   </td>
   <td> TRANSFORM_FEEDBACK_BIT_EXT_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT<br>
        TRANSFORM_FEEDBACK_BIT_EXT_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT<br>
        TRANSFORM_FEEDBACK_BIT_EXT_TRANSFORM_FEEDBACK_WRITE_BIT_EXT
   </td>
  </tr>
  <tr>
   <td>Fragment Density
   </td>
   <td>FRAGMENT_DENSITY_PROCESS_EXT_FRAGMENT_DENSITY_MAP_READ_EXT
   </td>
  </tr>
  <tr>
   <td>Task Shader
   </td>
   <td> TASK_SHADER_NV_ACCELERATION_STRUCTURE_READ<br>
        TASK_SHADER_NV_SHADER_SAMPLED_READ<br>
        TASK_SHADER_NV_SHADER_STORAGE_READ<br>
        TASK_SHADER_NV_SHADER_STORAGE_WRITE<br>
        TASK_SHADER_NV_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Mesh Shader
   </td>
   <td> MESH_SHADER_NV_ACCELERATION_STRUCTURE_READ<br>
        MESH_SHADER_NV_SHADER_SAMPLED_READ<br>
        MESH_SHADER_NV_SHADER_STORAGE_READ<br>
        MESH_SHADER_NV_SHADER_STORAGE_WRITE<br>
        MESH_SHADER_NV_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Shading Rate
   </td>
   <td>
   FRAGMENT_SHADING_RATE_ATTACHMENT_FRAGMENT_SHADING_RATE_ATTACHMENT_READ
   </td>
  </tr>
  <tr>
   <td>Early Fragment Tests
   </td>
   <td> EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ<br>
        EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_WRITE
   </td>
  </tr>
  <tr>
   <td>Fragment Shader
   </td>
   <td> FRAGMENT_SHADER_ACCELERATION_STRUCTURE_READ <br>
        FRAGMENT_SHADER_INPUT_ATTACHMENT_READ<br>
        FRAGMENT_SHADER_SHADER_SAMPLED_READ<br>
        FRAGMENT_SHADER_SHADER_STORAGE_READ<br>
        FRAGMENT_SHADER_SHADER_STORAGE_WRITE<br>
        FRAGMENT_SHADER_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Late Fragment Tests
   </td>
   <td>
        LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ<br>
        LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_WRITE
   </td>
  </tr>
  <tr>
   <td>Color Attachment Output
   </td>
   <td>
        COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_READ<br>
        COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_READ_NONCOHERENT_EXT<br>
        COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_WRITE
   </td>
  </tr>
  <tr>
   <td>Compute Shader
   </td>
   <td>
        COMPUTE_SHADER_ACCELERATION_STRUCTURE_READ<br>
        COMPUTE_SHADER_SHADER_SAMPLED_READ<br>
        COMPUTE_SHADER_SHADER_STORAGE_READ<br>
        COMPUTE_SHADER_SHADER_STORAGE_WRITE<br>
        COMPUTE_SHADER_UNIFORM_READ
   </td>
  </tr>
  <tr>
   <td>Copy
   </td>
   <td>
        COPY_TRANSFER_READ<br>
        COPY_TRANSFER_WRITE
   </td>
  </tr>
  <tr>
   <td>Resolve
   </td>
   <td>
        RESOLVE_TRANSFER_READ<br>
        RESOLVE_TRANSFER_WRITE
   </td>
  </tr>
  <tr>
   <td>Blit
   </td>
   <td>
        BLIT_TRANSFER_READ<br>
        BLIT_TRANSFER_WRITE
   </td>
  </tr>
  <tr>
   <td>Clear
   </td>
   <td>
        CLEAR_TRANSFER_WRITE
   </td>
  </tr>
  <tr>
   <td>Command Preprocess
   </td>
   <td>
        COMMAND_PREPROCESS_NV_COMMAND_PREPROCESS_READ_NV<br>
        COMMAND_PREPROCESS_NV_COMMAND_PREPROCESS_WRITE_NV
   </td>
  </tr>
  <tr>
   <td>Host Memory Access
   </td>
   <td>
        HOST_HOST_READ<br>
        HOST_HOST_WRITE
   </td>
  </tr>
  <tr>
   <td>Barrier operations
   </td>
   <td>
        IMAGE_LAYOUT_TRANSITION<br>
        QUEUE_FAMILY_OWNERSHIP_TRANSFER
   </td>
  </tr>
</table>

Also of interest is the “pipeline type” (compute, graphics, or transfer) of the action causing a memory access. It is needed when determining logically “earlier” and “later” stages of the pipeline for applying synchronization scope updates. This information is code-generated by parsing the Vulkan specification, with appropriate snippets *manually* updated within the code generation scripts.


### Access Scopes

Within the specification the scope of effect for synchronization operations is characterized differently for execution and memory access.  Execution scopes are _expanded_ to include logically earlier stages for “source” (or “first”) execution scopes, and logically later stages for “second” (or “destination”) execution scopes. Validation and update operations perform meta-stage substitution of `VK_PIPELINE_STAGE_ALL_COMMANDS_BIT` and `VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT` prior to stage expansion.

The scope of memory access is instead _restricted_ to the set of accesses both legal for the execution scope as _provided_ to the synchronization operation and within the mask of access types to be included.  Prior to any validation or update operations as described in subsequent sections, this _expansion_ or _restriction_ is performed, yielding the _source execution scope, source access scope, destination execution scope, and destination access scope_, which are encoded as bitmasks (encoding stage and stage/access appropriately).
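The execution-scope expansion step can be sketched with toy stage bits. This assumes a simple linear stage ordering for illustration; the real earlier/later relationships are code-generated per pipeline type, and meta-stages such as `VK_PIPELINE_STAGE_ALL_COMMANDS_BIT` are substituted before expansion:

```cpp
#include <cassert>
#include <cstdint>

// Toy stage bits in logical pipeline order (illustrative assumption).
enum Stage : uint32_t {
    STAGE_TOP = 1u << 0,
    STAGE_VERTEX = 1u << 1,
    STAGE_FRAGMENT = 1u << 2,
    STAGE_BOTTOM = 1u << 3,
};
constexpr uint32_t kAllStages = (STAGE_BOTTOM << 1) - 1;

// Source ("first") execution scope: expand to all logically earlier
// stages, i.e. every bit at or below the highest set bit.
uint32_t ExpandEarlier(uint32_t stages) {
    if (stages == 0) return 0;
    stages |= stages >> 1;
    stages |= stages >> 2;  // enough shifts for four toy stages
    return stages;
}

// Destination ("second") execution scope: expand to all logically later
// stages, i.e. every bit at or above the lowest set bit.
uint32_t ExpandLater(uint32_t stages) {
    if (stages == 0) return 0;
    const uint32_t lowest = stages & (~stages + 1);  // isolate lowest bit
    return kAllStages & ~(lowest - 1);
}
```

Access scope restriction is then a mask intersection: the provided access mask ANDed with the set of accesses valid for the (unexpanded) execution scope.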


## State Tracking


### Data Structures

#### ResourceUsageTag

A marker representing a specific access for error reporting or sequence specific operations.

#### ResourceAccessState

The specific access/barrier state for a given extent of memory (or opaque range) addresses. The `ResourceAccessState` reflects the common state of a single contiguous range of "fake" addresses.

Stores:

*   Write usage -- information about most recent write usage for the memory range
    *   State/Access of usage. Updated on write
    *   Write barriers -- the destination stage/access pairs which are _non-hazardous_ to the last write.  Cleared on write.
    *   Memory access synchronization chain mask.  Tracks the stages whose barriers apply to the write usage. Cleared on write.
    *   Aggregated Read Execution Barriers -- all stages in the second synchronization scope of all barriers applied to read accesses since the last write  (used for read-after-write testing). Cleared on write.
    *   Usage tag. Updated on write
    *   Queue Id: The queue on which the access was made. Set at queue submit time.
    *   None of these fields are cleared by read operations
*   Read usage -- per stage; all cleared on write operations
    * Stage/Access of usage.  Updated on stage read
    * Execution (read) barriers -- the union of destination execution scopes for synchronization operations affecting this read at this stage. Cleared at read operation for this stage.
    * Sync stages -- all read stages that have synchronously executed after this stage. Cleared at read operation for this stage.
    * Queue Id: The queue on which the access was made. Set at queue submit time.
    * Usage Tag. Updated on stage read

Implements:

*   Hazard detection functions
*   State update/resolution operations
*   Wait operations

Queue Id and Sync stages are used for semaphore, queue, and fence wait operations, and for synchronization operation replay at queue submit time.

#### ResourceAccessRangeMap

Interval tree of ResourceAccessState records within the unified "fake" address space, comprising VkDeviceMemory linear ranges and opaque ranges.

#### AccessContext

ResourceAccessState and dependency information for a given “context” (for example within a given command buffer, but outside a renderpass, or within a given subpass of a renderpass instance).

Stores:

*   "Fake" address space ResourceAccessRangeMap
*   Subpass dependency and barrier information (for traversal within renderpass instances) if needed.

Implements:

*   Hazard detection traversal over (potentially discontiguous) ranges for images and buffers
*   VkMemory FBA offset
*   State update traversal over (potentially discontiguous) ranges for images and buffers
*   Subpass dependency graph traversal for state lookup and resolution operations (including support for RenderPass-specific and hazard detection operations)
*   Range map operations to split (and potentially combine) ResourceAccessState records, to ensure state changes only impact the correct portion of the address space.

#### CommandBufferAccessContext

Container for all access contexts for a specific command buffer, and state reflecting the current context.

Stores:

*   Command buffer state pointer (saves StateTracker lookup)
*   AccessContext for all outside-renderpass-instance operations
*   RenderPassAccessContext

Implements:

*   RenderPass instance operations top level begin/next/end operations (calling RenderPassAccessContext operations)

#### RenderPassAccessContext

Manages the current access context and context transitions for render pass instance begin/next/end operations.

Stores:

*   Render pass state pointer (saves StateTracker lookup)
*   Per subpass AccessContext’s
*   Attachment view state pointers
*   Parent (command buffer) access context pointer

Implements:

Begin/next/end operations

RenderPass specific hazard check and state update operations

#### QueueBatchContext

Contains the access and event context information for a given queue submission batch.  This includes a full copy of all accesses from earlier queue batches.  "Earlier" includes the previous batches in submission order on the same queue and batches corresponding to semaphore wait operations.  To minimize memory footprint, only two groups of contexts are retained. The first contains the access and event context for a given queue as of its most recent submission operation. The second group contains the QueueBatchContext corresponding to the state of a queue with a signaled semaphore or fence operation which has not yet been awaited.  Batches that neither have unresolved semaphore or fence operations associated, nor are the most recently submitted, may have QueueBatchContext with lifespans of only the submit entry point call.

Accesses within a QueueBatchContext are tagged with "global" (at device scope) ResourceUsageTag IDs, which are atomically allocated when the batch context is created. These tags are unique and monotonically increasing across all queues for a given VkDevice. While CommandBufferAccessContext contains the access usage records for all commands and synchronization operations, the QueueBatchContext tagging information references a global usage record store.  *Note: Clean up of the global usage record store will require traversing the access contexts of the queue batch contexts to find and eliminate unreferenced usage records.  This will need to be assessed as to priority relative to other apparent memory leaks in the maintenance of unresolved/unknown signal and event states.*


### Range Based Operations

In the discussion below, validation or update of the state of resource is discussed as if it is a uniform, singular operation.  However, over the input range of a memory access or synchronization operation the resource access state can vary.  For images, the input range may even be discontinuous, covering a series of locally contiguous ranges (mapping to `VkDeviceMemory` or opaque range addresses).

All resource access state operations described below happen over ranges over which the input and current access state values are constant.  As such, top-level operations (such as hazard detection for the use of a memory range in a copy operation) are implemented using range traversal functions to hide the complexity of the underlying variability in current state. Return values from certain query operations (like those needed for “resolve” operations below) are not single valued but are ResourceAccessRangeMap objects reflecting the set of range/value pairs in the query range.  State update operations also must deal with the potential of a non-constant current state over the range, range boundary misalignment, or gaps in current state, requiring record splitting or infilling of initial data.

The interposition of the traversal functions between the Access Context level operations and the Resource Access State operations adds an unavoidable complexity to the implementation of each operation. To the extent possible, common traversal frameworks should be used to prevent the proliferation of range (and/or graph) walkers.  (To the extent these are truly general, they should be added to the algorithms implemented with the underlying range_map.) Maintainers should become familiar and comfortable with the traversal tools for the range maps and their use in Synchronization Validation.

Fragmentation of the access range maps is a distinct possibility, with the map implementation supporting record coalescence, should this prove a performance or memory usage issue. Optimizations such as using alternative update traversal algorithms are possible, for example, in situations where it is known that the update will fully overwrite the existing records.


### ResourceAccessState Operations

The ResourceAccessState is the leaf level structure at which the synchronization validation and state tracking is performed.  The operations on the access state implement the logic described in the Vulkan specification regarding the effects of access and synchronization operations from a resource (memory address range) point of view.  The caller _must_ ensure that both the input and the current state are constant and correctly bounded over the range of the operation.


#### State Update

*   Update: Record the state change based on an “ordinary” memory access
    *   Read usages: add or update the current usage to the read usage list, clearing the stage barriers and updating the stage read tag.  Updates the sync stages information of read stages with barriers to current read.
    *   Write usages: erase read usages, update write usage and clear write barriers and update the write tag

*   Resolve: Combine access states from different contexts
    *   If one write tag is strictly after another, the later state is preserved
    *   If the write tags match, the pre-stage read information is resolved as a union of stages accessed, with most recent usage retained if stage present in both contexts.

*   Execution Barrier
    *   Effect on read usage (per stage with usage)
        *   If the _source execution scope_ includes the stage, or intersects with the mask of barriers, the _destination execution scope_ is OR’d with the barrier mask.
    *   Effect on write usage
        *   If the _source execution scope_ is present in the dependency chain mask for writes, the _destination execution scope_ is OR’d with the dependency chain mask. Note that if no prior _memory_ barrier has occurred since the last write, no dependency chaining will occur, and in no case is the write barrier information updated.
*   Memory Barrier (assumes the corresponding execution barrier operation will also be applied)
    *   Effect on read usage (per stage with usage)
        *   No effect, except from corresponding execution barrier
    *   Effect on write usage
        *   If the write usage is within the _source access scope_, or if _source execution scope_ intersects the write dependency chain, the write barrier is updated to include the _destination access scope_, and the write dependency chain is updated to include the _destination execution scope_..

* Semaphore Wait

  *   Per Access (read stage/write)
      *   For accesses *not* on waiting queue
          *   Replace barrier/chain information with Signal/Wait effective barrier
      *   For accesses on same queue, treat Signal/Wait barrier as a Set/Wait *event* barrier

* Queue/Fence Wait

  *   For read stages
      *   Identify reads matching queue/tag of wait operation (tag for queue wait is *max* tag)
      *   Identify all reads known to precede waited stages
      *   Remove synchronized stages (both direct and indirect) from access state list
  *   For write stages
      *   Remove write access from access state if:
          *   Write access matching queue/tag of wait operation
          *   -OR- if there are any read stages matching the wait criteria (MRA)
  *   If the resulting ResourceAccessState contains no accesses, delete it from the AccessContext.

  > Note: Queue/Fence Wait operations require inspecting and updating all QueueBatchContext access contexts, and are likely heavyweight operations, though far less common than other state update operations.


#### Validation / Hazard Checking

*   Memory stage/access hazard check

    This test compares a stage/access against a resource access state.

    *   For read access
        *   If there is a write recorded, test the stage/access flag vs. write barriers. If there is no write barrier for the stage/access, report RAW hazard
    *   For write access
        *    If there have been read accesses since the last write, test each per-stage read record; if any read does not have a barrier for the write access stage, report WAR hazard.
        *    If there is a write recorded, and no read accesses since, test the stage/access flag vs. write barriers. If there is no write barrier for the stage/access, report WAW hazard
        > Note: Excluding WAW checks when reads follow the last write reflects the MRA principle discussed above.  If those reads were not hazards, no WAW hazard would be possible unless a WAR hazard exists.
    *   For accesses with Ordering Guarantees
        *   Certain operations (rasterization, load, store, clear, resolve) have guaranteed ordering at the sample level.
        *   Hazard checks performed for these operations supply ordering execution and access scope masks, indicating the stages and accesses that have ordering guarantees.  If the last write (for read accesses) or the last write and all last reads (for write accesses) are all within the access and execution scopes, there is no hazard.
        *   For Rasterization (draw) operations the scopes are (by attachment type)

<table>
  <tr>
   <td>
Attachment Type
   </td>
   <td>Ordering Stage Mask
   </td>
   <td>Ordering Access Mask
   </td>
  </tr>
  <tr>
   <td>Color
   </td>
   <td>STAGE_COLOR_ATTACHMENT_OUTPUT
   </td>
   <td>ACCESS_COLOR_ATTACHMENT_READ | ACCESS_COLOR_ATTACHMENT_WRITE | ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT
   </td>
  </tr>
  <tr>
   <td>Depth/Stencil
   </td>
   <td>STAGE_EARLY_FRAGMENT_TESTS | STAGE_LATE_FRAGMENT_TESTS
   </td>
   <td>ACCESS_DEPTH_STENCIL_ATTACHMENT_READ | ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE
   </td>
  </tr>
</table>


> Note: this special case is handled in validation to avoid the potential of polluting the “barrier chain” information by storing the effect of an implicit barrier in the access state. (Unclear to the designer what all the side-effects might be.)
>
> Note: Also the ordering guarantees only apply to accesses on the same queue.



*   Layout transition hazard check

    The layout transition is treated as a subresource range read/write operation. Given the implied source barrier, this check is distinct from the normal memory stage/access hazard check described above.

    *   For resource accesses with write operations
        *   If the previous write is in the src access scope **or** the source execution scope is in the dependency chain **and** there are _any_ memory barriers set, then the layout transition is safe; otherwise a WAW hazard is reported.
    *   For resources with read operations
        *   If the source execution scope intersects with either the read stage flag or the per-stage read barriers, the access is safe; otherwise a WAR hazard is reported.
* Asynchronous access hazard check

  Compares a stage/access against the current state of a resource access, assuming asynchronous operation (such as unsynchronized sub-passes or queues)

  *   For resource access with write operations
      *   If input access is read report RRW (read racing write)
      *   If input access is write report WRW (write racing write)
  *   For resource access with only read accesses
      *   If input access is write, report RRW
      *   If input access is read, report ‘no hazard’


##  Vulkan Commands and Synchronization
Vulkan commands of interest to synchronization are:



*   Command buffer render pass/subpass operations affecting the synchronization and access context within a command buffer
*   Command buffer synchronization commands -- set/wait events, pipeline barrier.
*   Command buffer action commands -- draw, dispatch, clear, copy, query/timestamp operations, begin/end subpass operations that record device side actions that are to be taken when a command buffer is submitted.
*   Queue submission commands -- vkQueueSubmit, vkQueueBindSparse which specify work to be done on the device as well as synchronization operations before and after the submitted work
*   Host synchronization commands -- commands that set, query, or wait for device-side synchronization operations

Several of the validation and update operations combine one or more of the Resource Access actions above to implement the complex behaviors of the Vulkan synchronization system.

Command buffer state commands also affect the execution of the action commands, but validation and recording of the effects of these commands is generally deferred until an action command affected is either recorded or enqueued.


### Image Layout Transitions

As noted above, image layout transitions are typically implemented in the context of a barrier operation; as such, they have special support at the Resource Access level.  In the implementation of synchronization validation for these barriers, validation checks are only required for barriers with a layout transition.  State update for barriers with layout transitions first updates the state to reflect the write access, then updates the destination barriers using the layout transition as the source access scope. Otherwise no state update operation is performed, and the source and destination scopes are passed to the ResourceAccessState to update the barrier state.


### Renderpass Operations

Within a renderpass instance, subpasses may execute dependently or independently, and include implicit image layout transitions, barriers, load, store, and resolve operations.  These operations use the Access Context objects connected in the render pass dependency graph.  All validation operations are performed on a backward-looking basis, with operations in higher-index subpasses being validated against previous accesses, without deferral or replay of earlier accesses.


#### Resource Access State Resolution

When a resource is accessed and must be tested for hazard, has a barrier applied, or has some state update, these operations must be performed against the current state of the resource. Within a render pass instance, Access Contexts for subpasses are lazily populated, and thus the state may not be known within the subpass Access Context. If so, the Access Context supports graph traversal of earlier Access Contexts.  Resource access resolution traverses the graph through previous subpasses (including the Access Context for the command buffer prior to BeginRenderPass) until it finds non-empty state (or the end-of-graph), and returns a copy of that state, applying inter-subpass barriers to that copy as it is returned.  If at any point in the graph traversal no state is found and more than one previous subpass exists, results from all previous subpass Access Contexts are resolved as described in the Resource Access State Update “Resolve” operation above. For purposes of access state resolution, subpasses outside of the “previous” subpasses within the DAG are ignored.

Note:  there is a bit of a complication arising from the following:

> For non-attachment resources, the memory dependency expressed by subpass dependency is nearly identical to that of a <code>VkMemoryBarrier </code>(with matching <code>srcAccessMask/dstAccessMask</code> parameters) submitted as a part of a vkCmdPipelineBarrier (with matching srcStageMask/dstStageMask parameters). The only difference being that its scopes are limited to the identified subpasses rather than potentially affecting everything before and after.

**TODO/KNOWN LIMITATION**: Ensure that Access State Resolution checks against tag sequencing vs. the previous NextSubpass to ensure barrier application is correctly restricted


#### Resource Stage/Access Validation and Update

For simple stage/access references, hazard detection operates on the state found in the Access Context, or failing that on the state returned by state resolution above. The detail is encapsulated within the Access Context, and for any given stage/access and address range, may be divided between state stored currently within Access Context and state resolved.

State update likewise operates on state either from the Access Context of the subpass, or resolved from the DAG, with the updated state stored into the subpass Access Context. This detail is also encapsulated in the Access Context, and Access Context objects _outside_ of a renderpass instance simply have an empty “previous” context graph.


#### RenderPass Begin/Next Operations

For renderpass begin/next entry points, two types of accesses occur -- image layout transitions and load operations. As these operations and tests occur on the boundary between subpasses (or between “external” and a subpass), they are handled within the Begin/Next validate and record functions. If there are no implicit accesses (layout transition or load operation), no state update occurs, leaving the subpass barrier application to the lazy resolve operations described above.

**TODO/KNOWN LIMITATION:** First phase does not include implementation of multi-view renderpass support.

#### Subpass Image Layout Transitions

Image layout transitions are validated _independently_ as the equivalent barrier for each source subpass, without resolution across multiple previous subpasses. State update for an image layout transition updates the state to reflect the transition, then applies the destination scopes of the source subpass entries for that transition.

As layout transitions affect entire image subresource ranges, the validation and update are _not_ restricted to the `VkRenderPassBeginInfo::renderArea`, but are applied to the entire attachment view.


#### Load Operations

Load operations are also validated and applied at Begin/Next time, and must be validated/updated relative to current state with any image layout transition effects applied.  The validation and update operations are restricted to the `VkRenderPassBeginInfo::renderArea`, as is the load operation. As load operations occur _only_ in the first use of an attachment, the only previous context that needs to be considered is the external (command buffer) access context.  Validation is then performed on the external Resource Access state (copied and modified by the image layout transition as needed). Additionally the load operation must be validated against any image layout transition if present.  This can be done by comparing the destination access scope of the barrier from external with the usage(s) implied in the loadOp and/or stencilLoadOp.  If those usages lie within the destination access scope, then no hazard with the layout transition occurs.  If there is a layout transition no check versus the external context is needed as the transition is the “most recent access” (though at validation time not yet recorded).

The stage/access pair used for validation is based on the format of the attachment (color or depth/stencil) and the load operation specified.


<table>
  <tr>
   <td>Format
   </td>
   <td>Load Stage / Load Access Prefix
   </td>
   <td>Op Load
   </td>
   <td>Op Clear / Don’t care
   </td>
  </tr>
  <tr>
   <td>Color
   </td>
   <td>STAGE_COLOR_ATTACHMENT_OUTPUT / ACCESS_COLOR_ATTACHMENT
   </td>
   <td>_READ
   </td>
   <td>_WRITE
   </td>
  </tr>
  <tr>
   <td>Depth Stencil
   </td>
   <td>STAGE_EARLY_FRAGMENT_TESTS / ACCESS_DEPTH_STENCIL_ATTACHMENT
   </td>
   <td>_READ
   </td>
   <td>_WRITE
   </td>
  </tr>
</table>


Load operations have memory access ordering guarantees:

> The load operation for each sample in an attachment happens-before any recorded command which accesses the sample

To support these guarantees, load operations update resource access state to reflect the effective stage/access of the load operation, and then apply an effective barrier with the source execution and access scopes reflecting the load operation stage/access and the destination execution and access scopes based on the valid stages and accesses listed. _(Note: some consideration was given to simply zeroing out the access state such that no hazard relative to the access could occur; however, given the presence of asynchronous access hazards, conserving the record of the operation while providing a correct effective barrier was preferred, here and elsewhere in the design.)_

**_PHASE1 TODO: review both in document and code (in light of the DetectHazard time Ordering Guarantee support) whether this effective barrier is still required, as there are good reasons (such as synchronization operation chaining) not to pollute the barrier state._**

The valid stages and accesses for attachments are:


```
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

VK_ACCESS_INPUT_ATTACHMENT_READ_BIT
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT
VK_ACCESS_COLOR_ATTACHMENT_READ_NONCOHERENT_BIT_EXT
```



#### Renderpass Next/End Operations

As next/end calls end the current subpass, the subpass attachments can be tested for violations of the “preserve attachment” no-access rule. This validation inspects the subpass access context to ensure that no access state change is found for the address range covered by the preserve attachment view. Given the current _lazy_ evaluation of access context contents, this is done by testing that no resource access state records are found within the attachment range. The range tested _is_ restricted to the `VkRenderPassBeginInfo::renderArea`.

Automated resolve attachment operations occur at the end of each subpass with valid resolve attachments.  They are automatically synchronized with the matching _input_ color/depth/stencil attachments. The reads of the matching color or depth/stencil attachment are validated as attachment read operations, with the raster order guarantees embedded in state update correctly accounting for that synchronization.  The resolve attachments are validated as attachment write operations relative to the resolve target attachment, with the state update consistent with the attachment write stage/access `COLOR_ATTACHMENT_OUTPUT`/`COLOR_ATTACHMENT_WRITE`.


#### Renderpass End Operations

As renderpass end completes a renderpass instance several pieces of cleanup work are required. Primarily these are attachment store and resolve operations.

Store operations also have an implicit ordering guarantee: the store “happens-after any recorded command which accesses the sample in the last subpass where the attachment is used”.  Similar to the load operation, there is an implicit barrier, but as validation precedes state update, the implicit barrier is applied to a copy of the resource access state during validation.  The state update reflects the store op:

<table>
  <tr>
   <td>Format
   </td>
   <td>Store Stage
   </td>
   <td>All Store Ops
   </td>
  </tr>
  <tr>
   <td>Color
   </td>
   <td>STAGE_COLOR_ATTACHMENT_OUTPUT
   </td>
   <td>ACCESS_COLOR_ATTACHMENT_WRITE
   </td>
  </tr>
  <tr>
   <td>Depth Stencil
   </td>
   <td>STAGE_LATE_FRAGMENT_TESTS
   </td>
   <td>ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE
   </td>
  </tr>
</table>
Resolve operations export the state of the subpass access contexts to the access context for the command buffer. For each access state, the dstSubpass=VK_SUBPASS_EXTERNAL barrier is applied, and the result is merged with the command buffer access context. The resulting state is the resolve of all subpass access with their final barrier applied.  The resolve is performed after the store operation state update has completed.


### Synchronization operations


#### CmdPipelineBarrier

For each memory, buffer, and image barrier defined, use the range traversal functions to update all accesses in the source scope. For images, prior to update (i.e. in PreValidate…), test for image layout transition hazards as described above.

Apply global execution barriers from the src/dstStageMask.


#### CmdSet/WaitEvent

**TODO/KNOWN LIMITATION:** host set event not supported through at least phase 2.

vkCmdSetEvent defines the source execution scope for the matching vkCmdWaitEvent but does not cause any update to the resource access state.  While a simplistic model would allow simple recording of the resource usage tag to define the set of resource accesses affected at wait time, the access state’s storage of dependency chain information is not time-stamped.  Thus, to record the accesses within the scope of the set event, a range map (containing potentially simply a boolean) is constructed to record the address ranges containing “in scope” accesses.  This map should likely have a “merge operation” run, combining adjacent ranges with the same value to minimize the memory footprint of the map.  Additionally, vkCmdSetEvent must record the usage tag (or equivalent sequence/timestamp).

CmdWaitEvent uses both the source execution scope map and the usage tag to determine the _effective_ source scopes.  Resource access state barrier information is _only_ updated if the access lies within a range of the source scope map, has not (by means of testing usage tags) been updated since the CmdSetEvent call, and (as appropriate) lies within the access scopes defined by the wait event command.


#### Semaphore Operations

Semaphore Operations synchronize accesses between submitted command buffers at submission "batch" granularity. Semaphores (unlike barriers and events) can synchronize both accesses within command buffers submitted to a single queue, and (more typically) between accesses within command buffers on different queues.  Semaphores, like events, are two part synchronization operations.  The first scope of the combined operation is defined relative to the signaling queue, and includes the usual chaining of the first scopes with prior accesses and synchronization operations.  The second scope of the wait operation is applied to all accesses within the first scope of the signal operation.  For semaphores synchronizing different queues, the second scope of the combined operation *replaces* the barrier state for access imported into the QueueBatchContext.

There are additional impacts to the application of barriers/events and hazard detection for accesses which are from different queues -- i.e. that are *not* in submission order.  Since barriers and events are defined only to include operations that are earlier in submission order, barrier (and event) first scopes do not include accesses performed on other queues. For those accesses, only the chaining effects of earlier synchronization operations, such as semaphores, include a given access within the first scope of a barrier. Hazard detection is also affected in that implicit ordering guarantees are not applied to accesses not in submission order.


### Command Buffer Action Commands

Recorded action commands describe compute, graphics, or transfer actions to be executed on a device when the containing command buffer is submitted to a queue of the appropriate capabilities. Validation of these consists (in the first phase of implementation) of hazard detection for all known &lt;range, stage/access> pairs against the current access context. The access context is then updated for the same set of &lt;range, stage/access> pairs.  Each update is tagged with the usage tag as described above.

#### Commands

<table>
  <tr>
   <td>Pipeline type
   </td>
   <td>Command
   </td>
   <td>Notes
   </td>
  </tr>
  <tr>
   <td>compute
   </td>
   <td><code>vkCmdDispatch</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>compute
   </td>
   <td><code>vkCmdDispatchIndirect</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdClearAttachments</code>
   </td>
   <td>TobinE: No memory barriers are needed between vkCmdClearAttachments and preceding or subsequent draw or attachment clear commands in the same subpass. I believe color/depth bits are used depending on aspectMask for attachment to be cleared  (TODO: Verify this)
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDraw</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDrawIndexed</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDrawIndexedIndirect</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDrawIndexedIndirectCountAMD</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDrawIndexedIndirectCountKHR</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDrawIndirect</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDrawIndirectCountAMD</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>graphics
   </td>
   <td><code>vkCmdDrawIndirectCountKHR</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdBlitImage</code>
   </td>
   <td>
       </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdClearColorImage</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdClearDepthStencilImage</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdCopyBuffer</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdCopyBufferToImage</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdCopyImage</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdCopyImageToBuffer</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdCopyQueryPoolResults</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdFillBuffer</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdResolveImage</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdUpdateBuffer</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdWriteBufferMarkerAMD</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>transfer
   </td>
   <td><code>vkCmdWriteTimestamp</code>
   </td>
   <td>
   </td>
  </tr>
  <tr>
   <td>N/A
   </td>
   <td><code>vkCmdExecuteCommands</code>
   </td>
   <td>Not supported in initial release.
   </td>
  </tr>
</table>



**TODO/KNOWN LIMITATION:** ExecuteCommands and QueueSubmit not supported in phase 1

Recording and validating `vkCmdExecuteCommands` is more similar in scope to the various `vkQueue…` commands.  The “first” access state of the executed command buffers must be added to the access context of the calling parent command buffer. Record time validation of `vkCmdExecuteCommands` may require a replay of the secondary command buffer synchronization operations up to the point of the first access within the secondary command buffer. TODO: Finish in phase 2 design, with as much code reuse w.r.t. queue submit as possible.


#### Command Buffer State Bindings

##### Vertex Input Fixed-function bindings

These are read only, but as the actual indices and vertices referenced are only known at shader execution time, the entire bound resource is treated as being read by draw commands that reference these bound resources:

*   Vertex
*   Index

##### Bound Descriptor Set Access

The stage/access for each bound descriptor referenced by a draw or dispatch call is determined by the shader stage, access control decoration, and descriptor type.  For those descriptors that _can_ be stored to, access of read vs. write is determined by the GLSL `readonly` or SPIR-V `NonWritable` decoration.  A binding not marked as readonly, and legal for store or atomic operations, will be treated as a write access.

<table>
  <tr>
   <td>Descriptor Type
   </td>
   <td><code>VK_DESCRIPTOR_TYPE_</code>
   </td>
   <td>Resource
   </td>
   <td>Operations
   </td>
   <td><code>VK_ACCESS_2_..._BIT_KHR</code>
   </td>
  </tr>
  <tr>
   <td>storage image
   </td>
   <td><code>STORAGE_IMAGE</code>
   </td>
   <td>Image view
   </td>
   <td>load, store, and atomic
   </td>
   <td><code>SHADER_STORAGE_READ(WRITE)</code>
   </td>
  </tr>
  <tr>
   <td>sampled image
   </td>
   <td><code>SAMPLED_IMAGE</code>
   </td>
   <td>Image view
   </td>
   <td>sampling
   </td>
   <td><code>SHADER_SAMPLED_READ</code>
   </td>
  </tr>
  <tr>
   <td>combined image sampler
   </td>
   <td><code>COMBINED_IMAGE_SAMPLER</code>
   </td>
   <td>Sampler and Image view
   </td>
   <td>sampling
   </td>
   <td><code>SHADER_SAMPLED_READ</code>
   </td>
  </tr>
  <tr>
   <td>uniform texel buffer
   </td>
   <td><code>UNIFORM_TEXEL_BUFFER</code>
   </td>
   <td>Buffer view
   </td>
   <td>load
   </td>
   <td><code>SHADER_STORAGE_READ</code>
   </td>
  </tr>
  <tr>
   <td>storage texel buffer
   </td>
   <td><code>STORAGE_TEXEL_BUFFER</code>
   </td>
   <td>Buffer view
   </td>
   <td>load, store, and atomic
   </td>
   <td><code>SHADER_STORAGE_READ(WRITE)</code>
   </td>
  </tr>
  <tr>
   <td>storage buffer
   </td>
   <td><code>STORAGE_BUFFER</code>
   </td>
   <td>Buffer
   </td>
   <td>load, store, and atomic
   </td>
   <td><code>SHADER_STORAGE_READ(WRITE)</code>
   </td>
  </tr>
  <tr>
   <td>uniform buffer
   </td>
   <td><code>UNIFORM_BUFFER</code>
   </td>
   <td>Buffer
   </td>
   <td>load
   </td>
   <td><code>SHADER_STORAGE_READ</code>
   </td>
  </tr>
  <tr>
   <td>dynamic uniform buffer
   </td>
   <td><code>UNIFORM_BUFFER_DYNAMIC</code>
   </td>
   <td>Buffer
   </td>
   <td>load
   </td>
   <td><code>SHADER_STORAGE_READ</code>
   </td>
  </tr>
  <tr>
   <td>dynamic storage buffer
   </td>
   <td><code>STORAGE_BUFFER_DYNAMIC</code>
   </td>
   <td>Buffer
   </td>
   <td>load, store, and atomic
   </td>
   <td><code>SHADER_STORAGE_READ(WRITE)</code>
   </td>
  </tr>
  <tr>
   <td>inline uniform block
   </td>
   <td><code>INLINE_UNIFORM_BLOCK_EXT</code>
   </td>
   <td>Storage in the encompassing descriptor set
   </td>
   <td>
   </td>
   <td><code>SHADER_STORAGE_READ</code>
   </td>
  </tr>
  <tr>
   <td>input attachment
   </td>
   <td><code>INPUT_ATTACHMENT</code>
   </td>
   <td> Image view
   </td>
   <td>Framebuffer local load
   </td>
   <td><code>INPUT_ATTACHMENT_READ</code>
   </td>
  </tr>
  <tr>
   <td>acceleration structure
   </td>
   <td><code>ACCELERATION_STRUCTURE_KHR</code>
   </td>
   <td>Acceleration structure
   </td>
   <td>Shader read-only access
   </td>
   <td><code>SHADER_STORAGE_READ</code>
   </td>
  </tr>
</table>


** Attachment read, based on image format: `VK_ACCESS_COLOR_ATTACHMENT_READ_BIT` for color-format attachments and `VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT` for depth/stencil formats.

#### Attachments -- Framebuffer local images

##### Color Attachments

Color attachment use is controlled by the fragment shader output declarations and `VkPipelineRasterizationStateCreateInfo::rasterizerDiscardEnable`. Fragment shader output variables with `Location` decorations correspond to the `pColorAttachments` index for the current subpass. The renderArea portion of these attachments is treated as stage/access `SYNC_COLOR_ATTACHMENT_OUTPUT_COLOR_ATTACHMENT_WRITE`, unless `rasterizerDiscardEnable` is true -- in which case operations on color attachments result in no memory accesses.

**TODO/KNOWN LIMITATION:** Memory access checks are not suppressed for `VK_CULL_MODE_FRONT_AND_BACK` -- we would need sufficient information about the primitive TOPOLOGY being emitted to the rasterization fixed-function stage.

**TODO/KNOWN LIMITATION:** First phase does not include component-granularity access tracking.

> Designer’s note.  Given the “inmost loop” nature of component read/write enable, performance and memory use will likely be unacceptable unless the opaque range encoding changes the apparent memory organization of components to sit at least just above the x, y, and z dimensions, s.t. the effective “run-length compression” of the range maps is not lost. Storing access state byte-by-byte is likely not going to be acceptable.  Whether this encoding should be adaptive (present only for images with component-level information) or globally present in the Synchronization Validation specific opaque range encoding is a design detail TBD. For broadest application this may mean that non-aliased linear-tiling images should also be given a validation-specific opaque encoding w.r.t. component granularity information.


##### Depth/Stencil Attachments

Depth/Stencil attachment usage is controlled by `VkPipelineRasterizationStateCreateInfo::rasterizerDiscardEnable` and various parameters of `VkPipelineDepthStencilStateCreateInfo`. If `rasterizerDiscardEnable` is set, depth/stencil attachments are ignored for hazard detection and state update.  Depth aspect detection and update are controlled by `depthTestEnable`, `depthWriteEnable`, and the `EarlyFragmentTests` execution mode. If `depthTestEnable` is false, the depth aspect is ignored. Otherwise the following stage/access are used for the renderArea portion of the depth aspect.


<table>
  <tr>
   <td><code>depthWriteEnable</code>
   </td>
   <td><code>EarlyFragmentTests</code>
   </td>
   <td>Stage/Access
   </td>
  </tr>
  <tr>
   <td>FALSE
   </td>
   <td>Specified
   </td>
   <td><code>SYNC_EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ</code>
   </td>
  </tr>
  <tr>
   <td>TRUE
   </td>
   <td>Specified
   </td>
   <td><code>SYNC_EARLY_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_WRITE</code>
   </td>
  </tr>
  <tr>
   <td>FALSE
   </td>
   <td>Not set
   </td>
   <td><code>SYNC_LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_READ</code>
   </td>
  </tr>
  <tr>
   <td>TRUE
   </td>
   <td>Not set
   </td>
   <td><code>SYNC_LATE_FRAGMENT_TESTS_DEPTH_STENCIL_ATTACHMENT_WRITE</code>
   </td>
  </tr>
</table>


Stencil attachment use is controlled by `stencilTestEnable`, with additional state determining the type of access.  The type of access is controlled by the `VkStencilOpState`. Stencil aspect access is read-only if `writeMask` is zero, or if one of the following combinations of stencil parameters is set for all non-culled (see `VkPipelineRasterizationStateCreateInfo::cullMode`) front or back faces.


<table>
  <tr>
   <td>compareOp
   </td>
   <td>failOp
   </td>
   <td>passOp
   </td>
   <td>depthFailOp
   </td>
  </tr>
  <tr>
   <td>VK_COMPARE_OP_
   </td>
   <td colspan="3" >VK_STENCIL_OP_
   </td>
  </tr>
  <tr>
   <td>NEVER
   </td>
   <td>KEEP
   </td>
   <td>any
   </td>
   <td>any
   </td>
  </tr>
  <tr>
   <td>ALWAYS
   </td>
   <td>any
   </td>
   <td>KEEP
   </td>
   <td>KEEP*
   </td>
  </tr>
  <tr>
   <td>other
   </td>
   <td>KEEP
   </td>
   <td>KEEP
   </td>
   <td>KEEP*
   </td>
  </tr>
</table>


\* `KEEP` is required only if `depthTestEnable` is TRUE; otherwise any op is allowed, since with depth testing disabled the depth test cannot fail.

### Host Synchronization commands

**TODO/KNOWN LIMITATION:** Host synchronization not supported in phases 1 and 2.