File: NEWS.md

Libfabric release notes
=======================

This file contains the main features as well as overviews of specific
bug fixes (and other actions) for each version of Libfabric since
version 1.0.  New major releases include all fixes from minor
releases with earlier release dates.

v1.20.1, Mon Jan 22, 2024
=========================

## Core

- hmem/ze: Change the library name passed to dlopen
- hmem/ze: map device id to physical device
- hmem/ze: skip duplicate initialization
- hmem/ze: dynamically allocate device resources based on number of devices
- hmem/ze: fix hmem_ze_copy_engine variable look up
- hmem/ze: Increase ZE_MAX_DEVICES to 32
- man: Fix typo in fi_getinfo man page
- Fix compiler warning when compiling with ICX
- man: Fix fi_rxm.7 and fi_collective.3 man pages
- man: Update EFA docs for FI_EFA_INTER_MIN_READ_WRITE_SIZE

## EFA

- efa_rdm_ep_record_tx_op_submitted() rm peer lookup
- Remove peer lookup from efa_rdm_pke_sendv()
- Make handshake response use txe
- test: Only close SHM if SHM peer is Created
- Handshake code allocs txe via efa util
- Initialize txe.rma_iov_count to 0
- Switch fi_addr to efa_rdm_peer in trigger_handshake
- Downgrade EFA Endpoint Creation WARN to INFO
- Init srx_ctx before use
- Clean up generic_send path
- Pass in efa_rdm_ep to efa_rdm_msg_generic_recv()
- Make recv path slightly more efficient
- re-org rma write to avoid duplicate checks
- Add missing sync_memops call to writedata
- use peer pointer from txe in read, write and send
- Pass in peer pointer to txe
- Get rid of noop instruction from empty #define
- Remove noop memset
- Fix the ibv cq error handling.
- Don't do handshake for local read
- Fix a typo in configure.m4
- Make runt_size aligned

## NetDir

- Add missing unlock in error path of nd_send_ack()

## OPX

- Initialize cq error data size

## RXM

- Fix data error with FI_OFI_RXM_USE_RNDV_WRITE=1

## SHM

- Fix coverity issue about resource leak
- Adjust the order of smr_region fields.
- Allocate peer device fds dynamically

## Util

- Fix coverity issue about missing lock
- Implement timeout in util_wait_yield_run()
- Fix bug in util_cq startup error case
- util_mem_hooks: add missing parentheses

## Verbs

- Windows: Resolve regression in user data retrieval

## Fabtests

- efa: Close ibv device after use
- efa: Get device MR limit from ibv_query_device
- efa: Add simple unexpected test to MR exhaustion test
- pytest: add a new ssh connection error pattern

v1.20.0, Fri Nov 17, 2023
=========================

## Core

- General bug fixes and code clean-up
- configure.ac: add extra check for 128 bit atomic support
- hmem/synapseai: Refine the error handling and warning
- Introduce FI_ENOMR
- hmem/cuda: fix a bug when calculating aligned size.
- Handle dmabuf for ofi_mr_cache* functions.
- Handle dmabuf flag in ofi_mr_attr_update
- Handle dmabuf for mr_map insert.
- man: Fix the description of virtual address when FI_MR_DMABUF is set
- man: Clarify the definition of FI_OPT_MIN_MULTI_RECV
- hmem/cuda: Add dmabuf fd ops functions
- include/ofi_atomic_queue: Properly align atomic values
- Define fi_av_set_user_id
- Support multiple auth keys per EP
- Simplify restricted-dl feature
- hmem: Only initialize synapseai if device exists
- Add "--enable-profile" option
- windows: Updated config.h
- Add environment variable for selective HMEM initialization
- Add restricted dlopen flag to configure options
- hmem: generalize the use of OFI_HMEM_DATA to non-cuda iface
- hmem: fail cuda_dev_register if gdrcopy is not enabled
- Add 1.7 ABI compat
- Define fi_domain_attr::max_ep_auth_key
- hmem: Add new op to hmem_ops for getting dmabuf fd
- hmem/cuda: Update cuda_gdrcopy_dev_register's signature
- mr_cache: Define ofi_mr_info::flags
- Add ABI compat for fi_cq_err_entry::src_addr
- Define fi_cq_err_entry::src_addr
- Add base_addr to fi_mr_dmabuf
- hmem: Set FI_HMEM_HOST_ALLOC for ze addr valid
- hmem: Support dev reg with FI_HMEM_ZE
- tostr: Added fi_tostr() for data type struct fi_cq_err_entry.
- hmem_ze: fix incorrect device id in copy function
- Introduce new profiling interface for low-level statistics
- hmem: Support dev reg with FI_HMEM_CUDA
- hmem: Support dev reg with FI_HMEM_ROCR
- hmem: Support dev reg with FI_HMEM_SYSTEM
- hmem: Define optimized HMEM memcpy APIs
- Implement memhooks atfork child handler
- hmem: Support ofi_hmem_get_base_addr with sys mem
- hmem: Add length field to ofi_hmem_get_base_addr
- mr_cache: Improve cache hit rate
- mr_cache: Purge dead regions in find
- mr_cache: Update find to remove invalid MR entries
- mr_cache: Update find with MM valid check
- Add direct support for dma-buf memory registration
- man/fi_tagged: Remove the peek for data ability
- indexer: Add byte idx abstraction
- Add missing FI_REMOTE_CQ_DATA for fi_inject_writedata
- Add configure flags for more sanitizers
- Fix fi_peer man page inconsistency
- include/fi_peer: Add cq_data to rx_entry, allow peer to modify on unexp
- Add XPMEM support

## EFA

- General bug fixes and code clean-up
- Do not abort on all deprecated env vars
- Onboard fi_mr_dmabuf API in mem reg ops.
- Try registering cuda memory via dmabuf when checking p2p
- Introduce HAVE_EFA_DMABUF_MR macro in configure
- Add read nack protocol docs
- Receiver send NACK if runt read fails with ENOMR
- Sender switch to long CTS protocol if runt read fails with ENOMR
- Receiver send NACK if long read fails with ENOMR
- Update efa_rdm_rxe_map_remove to accept msg_id and addr
- Sender switch to long CTS protocol if long read fails with ENOMR
- Introduce new READ_NACK feature
- Use SHM's full inject size
- Add testing for small messages without inject
- Enable inject rdma write
- Use bounce buffer for 0 byte writes
- Onboard ofi_hmem_dev_register API
- Update cuda_gdrcopy_dev_register's signature
- Allocate pke_vec, recv_wr_vec, sge_vec from heap
- Close shm resource when it is disabled in ep
- Disable RUNTING for Neuron
- Move cuda-sync-memops from MR to EP
- Do not insert shm av inside efa progress engine
- Enable shm when FI_HMEM and FI_ATOMIC are requested
- Adjust posted receive size to pkt_size
- Do not create SHM peer when SHM is disabled
- Use correct threading model for shm
- Restrict RDMA read to compatible EFA devices
- Add EFA device version to handshake
- Add missing locks in efa_cntr_wait.
- Add writedata RNR fabtest
- Handle RNRs from RDMA writedata
- Check opt_len in efa_rdm_ep_getopt
- Use correct tx/rx op_flags for shm

## Hooks

- dmabuf: Initialize fd to suppress compiler warning
- trace: Add log on FI_VAR_UNEXP_MSG_CNT when enabled.
- trace: Fixed trace log format on some attributes.

## OPX

- Fix compiler warnings

## PSM3

- Fix compiler warnings
- Update provider to sync with IEFS 11.5.1.1.1

## RXM

- Remove unused function
- Use gdrcopy in rma when emulating injection
- Use gdrcopy in eager send/recv
- Add hmem gdrcopy functions
- Remove unused dynamic rbuf support

## SHM

- General bug fixes and cleanup
- Add ofi_buf_alloc error handling
- Only copy header + msg on unexpected path
- Add FI_HMEM atomic support
- Add memory barrier before updating resp for atomic
- Add more error output
- Reduce atomic locking with ofi_mr_map_verify
- Only increment tx cntr when inject rma succeeded.
- Use peer cntr inc ops in smr_progress_cmd
- Allow for inject protocol to buffer more unexpected messages
- Change pending fs to bufpool to allow it to grow
- Add unexpected SAR buffering
- Use generic acronym for shm cap
- Move CMA to use the p2p infrastructure
- Add p2p abstraction
- Load DSA dependency dynamically
- Replace tx_lock with ep_lock
- Calculate comp vars when writing completion
- Move progress_sar above progress_cmd
- Rename SAR status enum to be more clear
- Make SAR protocol handle 0 byte transfer.
- Move selection logic to smr_select_proto()

## Sockets

- Fix compiler warnings
- Fix provider name and api version in returned fi_info struct

## TCP

- Add profiling interface support
- Pass through rdm_ep flags to msg eps
- Derive cq flags from op and msg flags
- Do not progress ep that is disconnected
- Set FI_MULTI_RECV for last completed RX slice
- Return an error if invalid sequence number received
- xnet_progress_rx() must only be called when connected
- Reset ep->rx_avail to 0 after RX queue is flushed
- Disable the EP if an error is detected for zero-copy
- Add debug tracking of transfer entries
- Negotiate support for rendezvous
- Add rendezvous protocol option
- Generalize xnet_send_ack
- Flatten protocol header definitions
- Remove unused dynamic rbuf support
- Define tcp specific protocol ops
- Remove unneeded and incorrect rx_entry init code

## UCX

- Add FI_HMEM support
- Initialize ep_flush to 1

## Util

- General bug fixes
- memhooks: Fix a bug when calculating mprotect region
- Check the return value of ofi_genlock_init()
- Update checks for FI_AV_AUTH_KEY
- Define domain primary and secondary caps
- Add profiling util functions
- Update util_cq to support err_data
- Update ofi_cq_readerr to use new memcpy
- Update ofi_cq_err_memcpy to handle err_data
- Zero util cancel err entry
- Move FI_REMOTE/LOCAL_COMM to secondary caps
- Alter domain max_ep_auth_key
- Add domain checks for max_ep_auth_key
- Revert util_cntr->ep_list_lock to ofi_mutex
- Add NIC FID functions to ofi.h
- Add EP and domain auth key checking
- Add bounds checks to ibuf get
- Define dlist_first_entry_or_null
- Update util_getinfo to dup auth_key
- Revert util_av, util_cq and util_cntr to mutex
- Add missing calls to (de)initialize monitor's mutexes
- Avoid attempting to cleanup an uninitialized MR cache
- Rename ofi_mr_info fields
- Add rv64g support to memory hooks

## Verbs

- Windows: Check error code from GetPrivateData
- Add missing lock to protect SRX
- Add synapseai dmabuf mr support
- Bug fix for matching domain name with device name
- Windows: Fetch rejected connection data
- Add support for DMA-buf memory registration
- Windows: Fix use-after-free in case of failure in fi_listen
- Windows: Map ND request type to ibverbs opcode
- Fix memory leak when creating EQ with unsupported wait object
- Track ep state to prevent duplicate shutdown events

## Fabtests

- Update man page
- pytests/efa: onboard dmabuf argument for test_mr
- pytest: make do_dmabuf_reg_for_hmem a cmdline argument
- Bump Libfabric API version.
- mr_test: Add dmabuf support
- Introduce ft_get_dmabuf_from_iov
- unexpected_msg: Use ft_reg_mr to register memory
- pytest: Allow registering mr with dmabuf
- Add dmabuf support to ft_reg_mr
- Add dmabuf ops for cuda.
- Test max inject size
- Add FI_HMEM support to fi_rdm_rma_event and fi_rdm tests
- memcopy-xe: Fix data verification error for device buffer
- dmabuf-rdma: Increase the number of NICs that can be tested
- dmabuf-rdma: Remove redundant libze_ops definition
- fi-mr-reg-xe: Skip native dmabuf reg test for system memory
- Check if fi_info is returned correctly in case of FI_CONNREQ
- cq_data: relax CQ data validation to cq_data_size
- Add ZE host alloc function
- Use common device host buffer for check_buf
- hmem_ze: allocate one cq and cl on init
- fi-mr-reg-xe: Add testing for dmabuf registration
- scripts: use yaml safe_load
- macos: Fix build error with clang
- multinode: Use FI_DELIVERY_COMPLETE for 'barrier'
- Handle partial read scenario for fi_xe_rdmabw test in cross-node tests
- pytest/efa: add cuda memory marker
- pytest/efa: Skip some configuration for unexp msg test on neuron.
- runfabtests.py: ignore error when no tests are collected.
- pytest/efa: extend unexpected msg test range
- pytest/shm: extend unexpected msg test range
- pytest: Allow running shm fabtests in parallel
- unexpected_msg.c: Allow running the test with FI_DELIVERY_COMPLETE
- runfabtests.sh: run fi_unexpected_msg with data validation
- pytest/shm: Extend test_unexpected_message
- unexpected_msg: Make tx/rx_size large enough
- pytest/shm: Extend shm's rma bw test
- Update shm.exclude

v1.19.0, Fri Sep 1, 2023
========================

## Core

- General code cleanup and restructuring
- Add ofi_hmem_any_ipc_enabled()
- ofi_consume_iov allows 0-byte consume
- ofi_consume_iov consistency
- ofi_indexer: return error code when iterating
- getinfo: Add post filters for domain and fabric names
- Filter loopback device if iface is specified
- bsock: Fix error checking for -EAGAIN
- windows/osd: Remove unneeded check to silence coverity
- windows/osd: Move variable declaration to silence coverity
- Introduce gdrcopy awareness to hmem copy
- mr/cache: Fix fi_mr_info initialization
- hmem_cuda: remove gdrcopy from cuda hmem copy path
- iouring: Fix wrong indent in ofi_sockapi_accept_uring()
- Implement ofi_sockctx_uring_poll_add()
- hmem: introduce gdrcopy from/to cuda iov functions
- hmem: Deprecate `FI_HMEM_CUDA_ENABLE_XFER`
- hmem_cuda: Restrict CUDA IPC based on peer accessibility
- hmem_cuda: Log number of CUDA devices detected
- hmem_cuda: Refactor global variables
- tostr: Remove the extra dir "shared/" from "include/" and "src/".
- hmem_ze: fix ZE is valid check
- hmem_rocr: fix offset calculation
- hmem_rocr: use ofi spinlock functions
- hmem_rocr: minor fixes
- hmem_neuron: convert warn to info for nrt_get_dmabuf_fd not found
- hmem_neuron: check existence of neuron devices during initialization
- tostr: Moved Windows functions in shared/ofi_str.c to windows/osd.h
- tostr: Add helper functions ofi_tostr_size() and ofi_tostr_count().

## EFA

- Onboard Peer API, use shm provider as a peer provider
- Use util SRX framework in shared receive procedures.
- Register shm MR with hmem_data, allow shm to use gdrcopy for cuda data movement
- Finish the refactor for rxr squash.
- Use rdma-core WR API for send requests
- Check optlen in getopt call
- Fix the rdma-read support check in RMA and MSG operations
- Optimize ep lock usage
- Use an internal fi_mr_attr for memory registration

## Hooks

- Init field in mr_attr to silence coverity
- Add profiling hook provider
- Rename cq hooking functions' names
- Added trace for resource creation operations

## OPX

- Initialize ofi_mr_info
- Fix dput credit check
- Only allocate replay buffer if psn is valid
- Support SHM Intra-node communication between single server HFI devices
- Fix incorrect packet size in packet header when sending CTS packet
- Added check to address Coverity scan defect
- Add multi-entry caching to TID rendezvous
- Fall back to default domain name for TID fabric
- Properly handle multiple IOVs in fi_opx_tsendmsg
- Fix OPX Rzv RTS receive operation SHM error (DAOS-related)
- Fix non-tagged sends incorrectly setting FI_TAGGED in send completions
- Add more info to reliability IOV buffer validation check
- Move dput packet build functions to new inline include
- Use fi_mr_attr in fi_opx_mr
- Disable Pre-NAKing by default, throttle until all outstanding replays ACK'd
- Fix reliability bug when NAKing the last PSN
- Update HeaderQ Register more frequently
- No rbuf_wrap needed for expected receive (TID)
- Fixes for Coverity scan issues
- Enhanced tag matching
- Tune expected recv for unaligned buffers
- Observability: Add finer logging granularity
- Reduce RTS immediate data and fix packet estimate for odd TID lengths
- Add additional sources for FI_OPX_UUID

## Peer

- Add cq_data to rx_entry, allow peer to modify on unexp
- Introduce peer cntr API
- Add foreach_unspec_addr API
- Add size as an input of the get_tag op

## PSM3

- Sync with IEFS 11.5.0.0.172

## SHM

- Only poll IPC list when ROCR IPC is enabled
- Allow for SAR and inject protocol to buffer more unexpected messages
- Remove unused sar fields
- Make SAR protocol handle 0 byte transfer
- Load DSA dependency dynamically
- Change recv entry freestack into bufpool
- Remove shm signal
- Use util peer cntr implementation
- Make SHM default to domain level threading level
- Replace internal shared receive implementation with util_srx
- Lock entire progress loop
- Fix ROCR data coherency
- Add FI_LOCAL_COMM to shm attrs
- Handle empty freestack
- Fix bug in configure.m4 in atomics_happy assignment
- Add memory barrier before update resp->status for SAR
- Do not use inline/inject for read op
- Allow shm to use gdrcopy
- Refactor protocol selection code
- Init map fi addrs to FI_ADDR_NOTAVAIL

## TCP

- General code cleanups
- Restrict which EPs can be opened per domain
- Increase CM error debug output
- Avoid calling close() on an invalid socket after accept error
- Mark the EP as disconnected before flushing the queues
- Add assertion failures for xnet_{monitor,halt}_sock
- Disable ofi_dynpoll_wait() for non-blocking progress
- Move PEP pollin operations to io_uring
- Move EP poll operations to io_uring
- Early exit if ofi_bsock_flush() has operation in progress
- Implement pollin sockctx in bsock
- Add missing call to xnet_submit_uring()
- Add return error to xnet_update_pollflag()
- Remove the cancel sockctx from the EP structure
- Move io_uring cqe from the stack to progress struct
- Reduce stack size for epoll event array
- handle NULL av in xnet_freeall_conns()

## UCX

- Publish FI_LOCAL_COMM and FI_REMOTE_COMM capabilities
- Fix configure error with newer MOFED
- Fix segfault in unsignalled completions

## Util

- Add FI_PEER support to util counter
- Refactor the usage of cntrs
- Change util_ep to be a genlock
- Add util shared receive implementation
- Update log message for invalid AV type message
- Fix fi_mr_info initialization
- Add peer ID to MR cache
- Store hmem_data in ofi_mr_map
- Split the cq progress and reading entries in ofi_cq_readfrom

## Verbs

- Add event lock to EQ to serialize closing ep
- Remove saved_wc_list and use CQ directly
- Consolidate peer_mem and dmabuf support check
- Fix vrb_add_credits signature
- Introduce new progress engine structure
- Simplify (and correct) locking around progress operations
- General code restructuring

## Fabtests

- Fix reading addressing options
- Allow changing only the OOB address
- Allow using FI_ADDR_STR with -F
- Fix bw buffer utilization
- Separate RX and RMA counters
- Fix tx counter with RMA
- Add FI_CONTEXT mode to rdm_cntr_pingpong
- Add HMEM support to fi_unexpected_msg test
- Fix array OOB during fabtest list parsing
- Enable shm tagged_peek test
- Fix windows build warnings
- Make tx_buf and rx_buf aligned to 64 bytes by default
- Fix windows build warnings for sscanf
- Use dummy ft_pin_core on macOS
- Fix some header includes
- sock_test: Do not use epoll if not available
- recv_cancel: initialize error entry
- Fix wrong size used to allocate tx_msg_buf
- unexpected: change defaults to support tcp
- unexpected: add unknown unexpected peer test
- Enable a list of arbitrary message sizes
- Enabled data validation for rma read & write
- bw_rma operates on distinct buffer offsets
- ft_post_rma issues reads from remote's tx_buf
- General code cleanup and restructuring
- rdm_tagged_peek: fix race condition synchronization
- Add FI_LOCAL_COMM/FI_REMOTE_COMM presence check to fi_getinfo_test
- Correct ft_exchange_keys in prefix-mode
- Make rdm_tagged_peek test more general
- Add unit test for fi_setopt

v1.18.2, Fri Sep 1, 2023
========================

## Core

- Check for CUDA devices with nvmlDeviceGetCount_v2() first
- Try libnvidia-ml.so.1 if .so symlink missing
- Fix ssize_t format specifiers

## EFA

- Remove rxr_rm_tx/rx_cq_check()
- Report cntr completion for shm inject write

## SHM

- Change recv entry freestack into bufpool
- Load DSA dependency dynamically

## TCP

- Fix missing iov truncation on saved message path
- Add locking to trywait path for potential data race
- Fix incorrect locking around MR operations

## UCX

- Updated ucx.exclude and Makefile.am

## Verbs

- Add additional checks to vrb_shutdown_qp_in_err
- Prevent duplicate FI_SHUTDOWN events
- Fix memory leak when creating EQ with unsupported wait object

## Fabtests

- Extend the test_unexpected_msg
- Rename dmabuf-rdma tests to prefix with xe

v1.18.1, Fri Jun 30, 2023
=========================

## Core

- Fix build warning for ofi_dynpoll_get_fd

## EFA

- Handle 0-byte writes
- Apply byte_in_order_128_byte for all memory types
- Increase default shm_av_size to 256
- Force handshake before selecting rtm for non-system ifaces.
- Only select readbase_rtm when both sides support rdma-read
- Bugfix for initializing SHM offload
- Correct CPPFLAGS during configure
- Make setopt support sendrecv aligned 128 bytes
- Make data size a multiple of 128 bytes for in-order aligned send/recv
- Prepare local read pkt entry for in-order aligned send/recv.
- Disable gdrcopy and cudamemcpy for in-order aligned recv.
- Increase the pad size in rxr_pkt_entry
- Make readcopy pkt pool 128 byte aligned
- Introduce alignment to support in order aligned ops
- Fix a bug when calling ibv_query_qp_data_in_order
- RMA operations will ensure FI_ATOMIC cap
- RMA operations will ensure FI_RMA cap
- Unittest atomics without FI_ATOMIC cap.
- Unittest RMA without FI_RMA cap.
- Refactor pkt_entry assignment in poll_ibv loop
- Fixes for RDMA Write and Writedata

## RXM

- Revert rxm util peer CQ support
- Fix credit size parameter for flow ctrl

## SHM

- Fix DSA enable
- Assert read op and inject proto are mutually exclusive
- Fix ROCR data coherency
- Add FI_LOCAL_COMM to shm attrs
- Signal peer when peer is out of resources
- Handle empty freestack
- Fix bug in configure.m4 in atomics_happy assignment
- Add memory barrier before update resp->status for SAR
- Fix resource leak reported by coverity
- Switch cmd_ctx pool from freestack to bufpool
- Add iface parameter to smr_select_proto

## TCP

- Fix spinning on fi_trywait()
- Handle truncation of active message
- Handle prefetched data after reporting ETRUNC error
- Progress all ep's on unexp_msg_list when posting recv
- Removed unused saved_msg::ep field to fix assert
- Continue receiving after truncation error
- Create function to allocate internal msg buffer
- Add runtime setting for max saved message size
- Increase default max_saved value
- Dynamically allocate large saved Rx buffers
- Separate the max inject and recv buf size
- Remove 1-line xnet_cq_add_progress function
- Changed default wait object to epoll
- Handle case where epoll isn't natively supported
- Hold domain lock while deregistering memory
- Rename DL package from libnet to libtcp

## UCX

- Align the provider version with the libfabric version

## Verbs

- Delay device initialization to when fi_getinfo is called
- Consolidate peer_mem and dmabuf support check
- verbs_nd: Init len to 0 for WCSGetProviderPath call
- verbs_nd: Verify CQs are valid in rdma_create_qp
- verbs_nd: Initialize ibv_wc fields
- verbs_nd: Release lock in network direct error paths
- Fix vrb_add_credits signature
- Fix credit size parameter for flow ctrl
- Recover RXM connection from verbs QP in error state

## Fabtests

- Add ze-dlopen functions to component tests
- Call cudaSetDevice() for selected device
- pytest/efa: Adjust get_efa_devices()
- pytest/common: Support parallel neuron test
- pytest/common: Use different cuda device for parallel cuda set
- efa: Test_flood_peer.py increase timeout
- pytest/efa: Test to flood peer during startup
- fi-rdmabw-xe: Add option to set maximum message size
- fi-rdmabw-xe: Add option to set batch size

v1.18.0, Fri Apr 7, 2023
========================

## Core

- rocr: fix offset calculation
- rocr: use ofi spinlock functions
- rocr: minor fixes
- neuron: convert warn to info for nrt_get_dmabuf_fd not found
- neuron: check existence of neuron devices during initialization
- neuron: Add support for neuron dma-buf
- ze: update ZE to support new driver index specification
- List variables read from config file
- Add switch to prefer system-config over environment
- Add basic system-config support for setting library variables
- Move peer provider defines into new header
- rocr: Support asynchronous memory copies
- rocr: Add support for ROCR IPC
- rocr: rename rocr data-structures
- synapseai: return 0 for host_register and host_deregister
- fabric: Improve log level of provider mismatch
- cuda: Allow CUDA IPC when P2P disabled
- ze: add ZE command list pool to reuse command lists
- cuda: implement cuda_get_xfer_setting for non cuda build
- cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
- cuda.c: Add const to param to remove warning
- Add IFF_RUNNING check to indicate iface is up and running
- io_uring support enhancements

## EFA

- Implement CUDA support on instance types that do not support GPUDirect RDMA
- Implement fi_write using device's RDMA write capability
- Enrich error messages with debug and connection info
- Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt
- Implement support for FI_OPT_CUDA_API_PERMITTED in fi_setopt
- Add support for neuron dma-buf
- Use gdrcopy to improve the intra-node CUDA communication performance for small messages
- Use shm provider's FI_AV_USER_ID support
- Fix bugs in efa provider’s shm info initialization procedure

## Hooks

- dmabuf_peer_mem: Handle IPC handle caching in L0
- trace: Add trace log for CM operation APIs
- trace: Change tag in trace log to hex format
- trace: Enhance trace log for data transfer API calls
- trace: Add trace log for API fi_cq_readerr()
- trace: Add trace log for CQ operation APIs
- Add tracing hook provider

## Net

- Net provider optimizations have been integrated into the tcp provider.
- Net provider has been removed as a reported provider.

## OPX

- Fixes for Coverity scan issues
- Enhanced tag matching
- Tune expected recv for unaligned buffers
- Add finer logging granularity
- Reduce RTS immediate data and fix packet estimate for odd TID lengths
- Add additional sources for FI_OPX_UUID
- Exclude opx from build if missing needed defines
- Move some logs to optimized builds
- Fix build warnings for unused return code from posix_memalign
- Add reliability sanity check to detect when send buffer is illegally altered
- SDMA Completion workaround for driver cache invalidation race condition
- Fix replay payload pointer increment
- Handle completion counter across multiple writes in SDMA
- Cleanup pointers after free()
- Modify domain creation to handle soft cache errors
- Two biband performance improvements
- Fixes based on Coverity Scan related to auto progress patch
- Changed poll many argument to rx_caps instead of caps
- Resync with server configured for Multi-Engines (DAOS CART Self Tests)
- Remove import_monitor as ENOSYS case
- Address memory leaks reported on OFIWG issues page
- General code cleanup
- Add replays over SDMA
- Implement basic TID Cache
- Revert work_pending check change
- Fix use_immediate_blocks
- Restore state after replay packet is NULL
- Fix memory leak from early arrival packets
- Fix segfault in SHM operations from uninitialized value in atomic path
- Prevent SDMA work entries from being reused with outstanding replays
- Set runtime as default for OPX_AV
- Fix RTS replay immediate data
- Fix errors caught by the upstream libfabric Coverity Scan
- fi_getinfo: Support multiple HFI devices
- Support OFI_PORT and Contiguous endpoint addresses for CART & Mercury
- Add fi_opx_tid.h to Makefile.include
- Fix progress checks and default domain
- Revert is_intranode simplification.
- Don't inline handle_ud_ping function
- Allow atomic fetch ops to use SDMA for sufficiently large counts
- Cleaned up FI_LOG_LEVEL=warn output
- Cleaned up unused macros for FI_REMOTE_COMM and FI_LOCAL_COMM
- Reset default progress to FI_PROGRESS_MANUAL
- Fixed GCC 10 build error with Auto Progress
- Add support for FI_PROGRESS_AUTO
- Use max allowed packet size in SDMA path when expected TID is off
- Expected receive (TID) rendezvous
- RMA Read/Write operations over SDMA
- Remove origin_rs from cts and dput packet header
- Fix for hang in DAOS CART tests
- Use single IOV for bounce buffer in SDMA requests.
- Check for FI_MULTI_RECV with bitwise OR instead of AND
- Fix for intermittent intra-node deadlock hang (DAOS CART tests)
- Fix to RPC transport error failure (DAOS CART tests)
- Fix for context->buf set to NULL
- Fix bad asserts
- Ensure atomicity of atomic ops
- fi_opx_cq_poll_inline count and head check fix
- Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests)

## PSM3

- Update provider to sync with IEFS 11.4.1.1.2
- Fix warnings from build
- Add oneapi ZE support to OFI configure

## RXD

- Ignore error path in av_close return

## RXM

- Handle NULL av in rxm_freeall_conns()
- Implement the FI_OPT_CUDA_API_PERMITTED option
- Write "len" field for remote write
- Ignore error path domain_close return
- Free coll_pool on ep close
- Update rxm to use util_cq FI_PEER support functions
- Fix incorrect CQ completion field
- Rename srx to msg_srx
- Disable FI_SOURCE if not requested
- Memory leaks removed
- Set offload_coll_mask based on actual configuration
- Report on coll offload capabilities with OFI_OFFLOAD_PROV_ONLY
- Fabric sets up collective offload fabric
- Create eq for collective offload provider
- Close collective providers ep when rxm_ep is closed
- Fix incorrect use of OFI_UNUSED()
- Rework collective support to use collective provider(s)

## SHM

- Fix potential deadlock in smr_generic_rma()
- smr_generic_rma() write error completion with positive errno
- Update SHM to use ROCR
- Fix incorrect discard call when cleaning up unexpected queues
- Separate smr_generic_msg into msg and tagged recv
- Fix start_msg call
- Implement the FI_OPT_CUDA_API_PERMITTED option
- Assert not valid atomic op
- Fix a bug in smr_av_insert
- Optimize locking on the SAR path
- Remove unneeded sar_cnt
- Optimize locking
- Enable multiple GPU/interface support
- Remove HMEM specific calls from atomic path
- Use util_cq FI_PEER support
- Import shm as device host memory
- Add HMEM flag to smr region
- Fix user_id support
- Write tx err comp to correct cq
- Fix index when setting FI_ADDR_USER_ID

## TCP

- Provider source has been replaced by net provider source
- Removed incorrect reporting of support for FI_ATOMIC
- Do not save unmatched messages until we have the peer's fi_addr
- Use internal flag for FI_CLAIM messages, versus a reserved tag bit
- Fix updating error counter when discarding saved messages
- Allow saved messages to be received after the underlying ep has been closed
- Enhanced debug logging in connection path
- Force CM progress on unconnected ep's when posting data transfers
- Support connect and accept calls with io_uring
- Fix segfault accessing an invalid fi_addr
- Add io_uring support for CM message exchange
- Move CM progress from fabric to EQ to improve multi-threaded performance
- Fix small memory leak destroying an EQ
- Fix race where same rx entry could be freed twice
- Handle NULL av in rdm ep cleanup
- Reduce stack use for epoll event array

## UCX

- New provider targeting Nvidia fabrics that layers over libucp

## Util

- Fix the behavior of cq_read for FI_PEER
- rocr: Fix compilation issue
- cuda: Use correct debug string calls
- Free cq->peer_cq on close
- Remove extra new line from av insert log
- Check for count = 0 in ofi_ip_av_insert
- rocr: Add support for ROCR IPC
- Add FI_PEER support to util_cq
- Disable FI_SOURCE if not requested
- Remove FID events from the EQ when closing endpoint
- Rework collective support to use peer collective provider(s)
- Allow FI_PEER to pass CQ, EQ and AV attr checking
- Remove annoying WARNING message for FI_AFFINITY
- Add utility collective provider

## Verbs

- Implement the FI_OPT_CUDA_API_PERMITTED option
- Add support for ROCR IPC

## Fabtests

- Add fi_setopt_test unit test
- Update ze device registration calls
- fi-rdmabw-xe: Always use host buffer for synchronization
- Fix bug in posting RMA operation
- fi_cq_data: Extend test to fi_writedata
- fi_cq_data: Extend validation of completion data
- Rename fi_msg_inject tests to fi_inject_test to reflect its use
- fi_rdm_stress: Add count option to json key/pair options
- Add and fix OOB option handling in several tests
- fi_eq_test: Fix incorrect return value
- fi_rdm_multi_client: Increase the size of ep name buffer
- Add FI_MR_RAW to default mr_mode
- Support larger control messages needed by newer providers
- fi-rdmabw-xe: Update to work with the ucx provider
- fi_ubertest: Cleanup allocations in failure cases
- Change ft_reg_mr to not assume hmem iface & device
- fi_multinode: Bugfix multinode test for ze + verbs
- fi_multinode: Remove unused validation print
- fi_multinode: Skip tests for unsupported collective operations
- fi_ubertest: Fix data validation with device memory
- fi_peek_tagged: Restructure and expand test

v1.17.1, Fri Mar 3, 2023
========================

## Core

- Fix spinlocks for macOS
- hmem_cuda: Add const to param to remove warning
- Fix typos in fi_ext.h
- ofi_epoll: Remove unused hot_index struct member

## EFA

- Print local/peer addresses for RX write errors
- Unit test to verify no copy with shm for small host message
- Avoid unnecessary copy when sending data from shm
- Compare pci bus id in hints
- Fix double free in rxr endpoint init
- Initialize efawin library before EFA device on Windows

## Hooks

- dmabuf_peer_mem: Handle IPC handle caching in L0

## OPX

- Exclude from build if missing needed defines
- Move some logs to optimized builds
- Fix build warnings for unused return code from posix_memalign
- Add reliability sanity check to detect when send buffer is illegally altered
- SDMA Completion workaround for driver cache invalidation race condition
- Fix replay payload pointer increment
- Handle completion counter across multiple writes in SDMA
- Cleanup pointers after free()
- Modify domain creation to handle soft cache errors
- Two biband performance improvements
- Fixes based on Coverity Scan related to auto progress patch
- Changed poll many argument to rx_caps instead of caps
- Resync with server configured for Multi-Engines (DAOS CART Self Tests)
- Remove import_monitor as ENOSYS case
- Address memory leaks reported on OFIWG issues page
- Remove unused fields
- Fix unwanted print statement case
- Add replays over SDMA
- Implement basic TID Cache
- Revert work_pending check change
- Fix use_immediate_blocks
- Restore state after replay packet is NULL
- Fix memory leak from early arrival packets.
- Fix segfault in SHM operations from uninitialized value in atomic path.
- Prevent SDMA work entries from being reused with outstanding
  replays pointing to bounce buf.
- Set runtime as default for OPX_AV
- Fix RTS replay immediate data
- Fix errors caught by the upstream libfabric Coverity Scan
- Support multiple HFI devices
- Support OFI_PORT and Contiguous endpoint addresses
- Update man pages

## Util

- util_cq: Remove annoying WARNING message for FI_AFFINITY

v1.17.0, Fri Dec 16, 2022
=========================

## Core

- Add IFF_RUNNING check to indicate iface is up and running
- General code cleanups
- Add abstraction for common io_uring operations
- Support ROCR get_base_addr
- Add a 'flags' parameter to fi_barrier()
- Introduce new calls for opening domain and endpoint with flags
- Add ability to re-sort the fi_info list
- Allow layering of rxm over net provider
- General cleanup of provider filtering functions
- Add io_uring operations to be used by sockapi
- Modify internal handling of async socket operations
- Sockets operations are moved to a common sockapi abstraction
- Add support for Ze host register/unregister
- Add new offload provider type
- Rename fi_prov_context and simplify its use
- Convert interface prefix string checks to exact checks

## EFA

- Code cleanups and various bug fixes
- Improved debug logging, warnings, and assertions
- Do not ignore hints->domain_attr->name
- Fix the calculation of REQ header size for a packet entry
- Fix default value for host memory's max_medium_msg_size
- Add tracepoints to send/recv/read ops
- Simplified emulated read protocol
- Set use_device_rdma according to efa device id
- Fix shm initialization path on error
- Fix implementation of FI_EFA_INTER_MIN_READ_MESSAGE_SIZE
- Do not enable rdma_read if rxr_env.use_device_rdma is false
- Remove de-allocated CUDA memory region during registration
- Fix the error handling path of efa_mr_reg_impl()
- Fix rxr_ep unit tests involving ibv_cq_ex
- Add check of rdma-read capability for synapseai
- Report correct default for runt_size parameter
- Toggle cuda sync memops via environment variable.

## Net

- Continued fork of tcp provider, will eventually merge changes back
- Fix inject support
- Fix memory leak in peek/claim path
- General code cleanups and bug fixes from initial fork
- Allow looking ahead in tcp stream to handle out-of-order messages
- Add message tracing ability
- Fetch correct ep when posting to a loopback connection
- Release lock in case of error in rdm_close
- Fix error path in xnet_enable_rdm
- Add missing progress lock in srx cleanup
- Code restructuring and enhancements with longer term goal of supporting io_uring
- Disable the progress thread in most situations
- Rename DL from libxnet-fi to libnet-fi
- Add missing initialization calls for DL provider
- Add support for FI_PEEK, FI_CLAIM, and FI_DISCARD
- Include source address with CQ entry
- Fix support for FI_MULTI_RECV

## OPX

- Bug fixes and general code cleanup
- Fix progress checks and default domain
- Allow atomic fetch ops to use SDMA for sufficiently large counts
- Cleaned up FI_LOG_LEVEL=warn output
- Reset default progress to FI_PROGRESS_MANUAL
- Fixed GCC 10 build error with Auto Progress
- Add support for FI_PROGRESS_AUTO
- Use max allowed packet size in SDMA path when expected TID is turned off
- Expected receive (TID) rendezvous
- RMA Read/Write operations over SDMA
- Remove origin_rs from cts and dput packet header.
- Fix for hang - unable to match inbound packets with receive
  context->src_addr (DAOS CART tests)
- Use single IOV for bounce buffer in SDMA requests.
- Check for FI_MULTI_RECV with bitwise OR instead of AND
- Fix for intermittent intra-node deadlock hang (DAOS CART tests)
- Fix to RPC transport error failure (DAOS CART tests)
- Fix for context->buf set to NULL
- Fix bad asserts
- Ensure atomicity of atomic ops
- fi_opx_cq_poll_inline count and head check fix
- Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests)
- Temporarily reduce SDMA queue ring size for possible driver bug workaround
- Fix alignment issue and asserts
- Enable more parallel SDMA operations

## PSM3

- Synced to IEFS 11.4.0.0.198
- Tech Preview Ubuntu 22.04 Support
- Tech Preview Intel DSA Support
- Improved Intel GPU Support
- Various performance improvements
- Various bug fixes

## RxM

- Always use rendezvous protocol for ZE device memory send
- Code cleanup
- Add option to free resources on AV removal

## SHM

- Fix user_id support
- Write tx err comp to correct cq
- Fix index when setting FI_ADDR_USER_ID
- Remove extraneous ofi_cirque_next() call
- Add support for FI_AV_USER_ID
- Fix multi_recv messaging
- General code restructuring for maintainability
- Implement shared completion queues
- Decouple error processing from cq completion path to avoid switch
- Fix incorrect op passed into recv cancel operation
- Enhanced SHM implementation with DSA offload
- Use multiple SAR buffers per copy operation
- Fix ZE IPC race condition on startup

## TCP

- Minor updates in preparation for io_uring support (via net provider)

## Util

- Add option to free resources on AV removal
- Add 'flags' parameter to new fi_barrier2() call
- Add debugging in ofi_mr_map_verify
- Rename internal bitmask struct to include ofi prefix

## Verbs

- Add option to disable dmabuf support
- FI_SOCKADDR includes support of FI_SOCKADDR_IB

## Fabtests

- shared: Expand hmem support
- fi_loopback: Add support for tagged messages
- fi_mr_test: add support of hmem
- fi_rdm_atomic: Fix hmem support
- fi_rdm_tagged_peek: Read messages in order, code cleanup and fixes
- fi_multinode: Add performance and runtime control options, cleanups
- benchmarks: Add data verification to some bw tests
- fi_multi_recv: Fix possible crash in cleanup

v1.16.1, Fri Oct 7, 2022
========================

## EFA

- Flush MR cache when fork() is called

## RxM

- Disable 128-bit atomics

## SHM

- Add safeguards around peer mapping initialization
- Fix Ze IPC race condition on startup

## Verbs

- Add missing header file to release package

## Fabtests

- Add net provider test config files to release package

v1.16.0, Fri Sep 30, 2022
=========================

## Core

- Added HMEM IPC cache
- Use exact string comparison checks for network interfaces
- Restructuring of poll/epoll abstraction
- Add ability to disable locks completely in debug builds
- Serialize access to modifying the logging calls
- Minor fixes to fi_tostr text formatting
- Fix Windows build warnings
- Add hmem interface checks to memory registration

## EFA

- Added support of Synapse AI memory.
- Introduced Runting read message protocol for CUDA memory and Neuron memory
- Mix use of both local read and gdrcopy for copying of CUDA memory
- Use SHM provider's CUDA IPC support to implement intra-node
  communication for CUDA memory
- Improved error message

## Net

- Temporarily forked, optimized version of tcp provider
- Focused on improved performance and scalability over tcp sockets
- Fork ensures tcp provider stability while net provider is developed
- Shares the tcp provider protocol and base implementation for msg endpoints
- Integrates direct support for rdm endpoints, using a derivative from rxm
- Implements own protocol for rdm endpoints, separate from rxm;tcp

## OPX

- Added initial support for SDMA
- General performance enhancements
- Performance improvements to reliability protocol
- Improved deferred work pending complete
- Added support for OPX_AV=runtime
- Support iov memory registration ops
- Added DAOS RPC support
- Atomic ops enhancements
- Improved documentation
- Debug build enhancements
- Fixed compiler warnings
- Reduced time to compile prov/opx code
- General bug fixes
- Fixed PSN wrapping scaling
- Added intranode fence
- Addressed bugs discovered by coverity scan

## PSM2

- Fix sending CQ data in some instances of fi_tsendmsg

## PSM3

- Updated to match Intel Ethernet Fabric Suite (IEFS) 11.3 release

## RxM

- Update to read multiple completions at once from msg provider
- Move RxM AV implementation to util code to share with net provider
- Minor code cleanups

## SHM

- Implement and use ipc_cache
- Add log messages for debugging and error tracking
- Fix check for FI_MR_HMEM mr_mode
- Move shm signal handlers initialization to EP
- Added log messages for errors detected

## TCP

- Fix incorrect signaling of the CQ
- Increase max number of poll events to retrieve
- Acquire ep lock prior to flushing socket in shutdown
- Verify ep state prior to progressing socket data
- Read cm error data when receiving connreq response
- Log error on connect failure
- Fix assertion failure in CQ progress function

## Util

- Fix text in log of UFFD ioctl failure
- Introduce cuda ipc monitor
- Fix CQ memory leak handling overflow
- Fix MR mode bit check for ver 1.5 and greater
- Add max_array_size to track/check array overflow
- Always progress transfers when reading from a CQ
- Handle NULL address insertion
- Try IPv4 before IPv6 addresses when starting name server
- Fix IP util av default address length
- Fix util IP getinfo path to read hints->addr_format
- Fix debug print mismatch
- Fix return code when memory allocation fails.
- Fix build sign warning in ofi_bufpool_region_alloc
- Minor code cleanups
- Print warning if an addr is inserted into an AV again
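
The "Try IPv4 before IPv6 addresses" item above amounts to reordering candidate addresses by family. A minimal sketch of that idea follows, assuming a stable sort over `getaddrinfo`-style tuples; the helper name `ipv4_first` and the sample addresses and port are hypothetical and are not taken from the libfabric sources.

```python
import socket

def ipv4_first(addrinfos):
    """Return getaddrinfo-style tuples with AF_INET entries ahead of all others.

    sorted() is stable, so the relative order within each family is preserved.
    """
    return sorted(addrinfos, key=lambda ai: ai[0] != socket.AF_INET)

# Hypothetical candidate list in the shape returned by socket.getaddrinfo().
candidates = [
    (socket.AF_INET6, socket.SOCK_STREAM, 0, "", ("::1", 9228, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 0, "", ("127.0.0.1", 9228)),
    (socket.AF_INET6, socket.SOCK_STREAM, 0, "", ("fe80::1", 9228, 0, 0)),
]
ordered = ipv4_first(candidates)
```

Because the sort is stable, IPv6 entries keep their original relative order and are only tried after every IPv4 entry.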

## Verbs

- Fix support of FI_SOCKADDR_IB when requested by the application
- Ensure all posted receives are flushed to the application
- Update ofi_mr_cache_search API for hmem IPC support
- Reduce logging verbosity for "no active ports"
- Fix incorrect length used in memory registration
- Various minor bug fixes for test failures
- Fix a memory leak getting IB address
- Implement verbs provider on Windows over NetworkDirect API
- Set and check address format correctly
- Only close qp if it was initialized
- Portable detection of loopback device

## Fabtests

- multi_ep: Separate EP resources and fix MR registration
- multi_recv: Fix possible crash and check for valid buffer
- unexpected_msg: Fix printf compiler warning
- dgram_pingpong.c: Use out-of-band sync
- multinode: Make multinode tests platform agnostic, fix formatting
- ubertest: Fix string comparison to include length, fix writedata completion check
- av_test: add support for -e <ep_type>

### New tests

- dmabuf-rdma: Component level test for dma-buf RDMA
- sock_test: Component level performance test of poll, epoll, and select
- rdm_stress: Multi-threaded, multi-process stress test for RDM endpoints
- sighandler_test: Regression test for signal handler restoration

### Common

- Pass in correct remote_fi_addr instead of 0 on fi_recv
- Ensure that first option is processed in getopt
- Save and restore errno in log messages
- Windows: Free hints memory in module that allocated it,
  allow building verbs tests on windows

### pytest/efa

- Run fi_getinfo_test with GID as address
- Add function efa_retrieve_gid()
- Skip runt_read test for single node
- Increase number of sends for read_rnr_cq_entry
- Verify the prov error message in rnr_read_cq_error()
- Add test case for runt read protocol
- Return None if HW counter does not exist
- Extend rma_bw test to multiple memory types
- Test multi_recv with 8k message size
- Increase timeout limit for cuda tests when testing all msg sizes.
- Adjust dgram test

### pytest/common

- Introduce shm test suite
- Skip cuda tests if provider does not support hmem hints
- Add pyyaml to requirements.txt
- Fix a bug in processing return code
- Add warmup_iteration_type to ClientServerTest
- Adjust default behavior of junit_xml
- Increase ssh ConnectTimeout

### HMEM testing options

- ZE: Increase the number of supported ZE devices
- CUDA: Use device allocated host buffer to fill device buffer
- CUDA: Ensure data consistency, add cuda_memory marker in pytest
- Run check_hmem correctly
- Fix issues in check_hmem.c

### EFA provider specific tests

- Add more message range tests
- Add fork-related test
- Fix the command in efa_retrieve_hw_counter_value()
- Do not run efa rnr test with strict mode
- Add efa fabric id test check

### Scripts

- runfabtests.py: Support an argument to specify junit report verbosity
- runfabtests.py: Remove unnecessary good_address argument
- runfabtests.cmd: Rewrite to be more like runfabtests.sh
- fabtests/scripts: Add runmultinode.sh
- runfabtests.sh: Print timestamp of each test, fix -e option

v1.15.2, Mon Aug 22, 2022
=========================

## Core

- Fix incorrect cleanup on gdrcopy initialization success
- Change the neuron library file name
- Use neuron's memcpy API to copy from host to device.
- Fix signaling race in pollfds abstraction
- Check correct number of events in pollfds abstraction
- Fix locking in pollfds reading event contexts
- Reserve CXI provider constants to avoid future conflicts
- Initialize genlock lock_type to fix always picking mutex
- Prioritize psm2 over opx provider

## EFA

- Release tx_entry on error in rxr_atomic_generic_efa()
- Avoid iteration of iov array to address coverity report
- Add locks around MR map
- Fix RNR error reporting and handling

## Hooks

- Close dmabuf fd when no longer in use

## RxM

- Read multiple completions at once to limit progress looping
- Fix windows compiler errors
- Use sparse logging for common EQ errors
- Fix a memory leak in AV remove path
- Reject simultaneous connections with correct error code

## SHM

- Fix incorrect use of peer_id with id in SAR path

## TCP

- Verify endpoint state prior to progressing socket data
- Acquire ep lock prior to flushing socket in shutdown
- Fix incorrect signaling of the CQ for threads waiting in sread

## OPX

- Disable OPX provider if not supported by platform

## PSM3

- Add missing reference to neuron_init
- Remove bashisms from configure script

## Util

- Fix non-error CQ auxiliary queue entry memory leak
- Check for duplicate address insertion in util AV code
- Fix locking around removing an address from an AV

## Verbs

- Fix incorrect length for Ze HMEM memory registration
- Fix memory leak when closing a device

v1.15.1, Fri May 13, 2022
=========================

## Core

- Fix windows implementation to remove fd from poll set

## PSM3

- Add missing files to release tarball

## Util

- Handle NULL address insertion to fi_av_insert

v1.15.0, Fri Apr 29, 2022
=========================

## Core

- Fix fi_info indentation error in fi_tostr
- hmem_ze: Add runtime option to choose specific copy engine
- Cleanup of configure HMEM checks
- Fixed stringop-truncation in ofi_ifaddr_get_speed
- Add utility provider log suffix to make logs easier to read
- Fix truncation of ipv6 addressing
- hmem: add support for AWS Trainium devices
- Fix potential sscanf overflows
- hmem: pass through device and flags when querying memory interface
- Rework locking in several areas to convert spinlocks to mutexes
- Add new locking abstractions to select lock types at runtime
- Add new FI_PROTO_RXM_TCP for optimized rxm over tcp path

## EFA

- Added windows support through efawin (https://github.com/aws/efawin)
- Added support of AWS neuron.
- Added support of using gdrcopy to copy data from host to device.
- Fixed a bug that caused 0 byte reads to fail.
- Fixed a memory corruption issue that could cause a forked process to crash.
- Extended testing coverage through new pytest based testing framework.

## Hooks

- Add new hooking provider dmabuf_peer_mem
- Enable DL build of hooking providers
- Add HMEM memory registration hook

## OPX

- New provider supporting Cornelis Networks Omni-path hardware

## PSM3

- Updated psm3 to match IEFS 11.2.0.0 release
- Added support for sockets (TCP/UDP) via a runtime selectable Hardware
  Abstraction Layer (HAL)
- Added support for IPv6 addressing in RoCE and sockets
- Added various NIC selection filtering options (wildcarded NIC name,
  address format, wildcarded IP subnet, link speed)
- Performance tuning in conjunction with OneAPI and OneCCL
- Tested with aws-nccl-plugin using NVIDIA NCCL over OFI over PSM3
- Improved PSM3_IDENTIFY output
- Rename most internal symbols to psm3_
- Corrected vulnerabilities found during Coverity scans
- configure options refined and help text improved
- PSM3_MULTI_EP has been deprecated (recommend always enabled, default
  is enabled [same default as previous releases])
- Various bug fixes

## RxM

- Add check that atomic size is valid
- Add support for passing calls through to the tcp provider in specific cases

## TCP

- Add assert to verify RMA source/target msg sizes match
- Wake-up threads blocked on CQ to update their poll events
- Fix use of incorrect events in progress handler
- Fixes for various compile warnings, mostly on Windows
- Add support for FI_RMA_EVENT capability
- Add support for completion counters
- Fix check for CQ data in tagged messages
- Add cancel support to shared rx context
- Add src_addr receive buffer matching
- Add provider control to assign a src_addr with an ep
- Handle trecv with FI_PEEK flag
- Allow binding a CQ with an SRX
- Restructuring of code in source files
- Handle EWOULDBLOCK returned by send call
- Add hot (active) pollfd list

## SHM

- Properly chain the original signal handlers
- Avoid uninitialized variable with invalid atomic parameters
- Fix 0 byte SAR read
- Initialize len parameter to accept
- Refactor and simplify protocol code
- Remove broken support for 128-bit atomics
- Fix FI_INJECT flag support
- Add assert to verify RMA source/target msg sizes match
- Set domain threading to thread safe
- Fix possible use of uninitialized variable in av_insert

## Util

- Fix sign warning in ofi_bufpool_region_alloc
- Remove unused variable from ofi_bufpool_destroy
- Fix check for valid datatype in ofi_atomic_valid
- Return with error if util_coll_sched_copy fails
- Fix use of uninitialized variable in ofi_ep_allreduce
- Fix memory access in ip_av_insertsym
- Track ep per collective operation not with multicast
- Restructure collective av set creation/destruction
- Change most locks from spin locks to mutexes
- Allow selection of spinlocks for CQ and domain objects
- Fix AV default addrlen
- Update fi_getinfo checks to include hints->addr_format

## Verbs

- Initial changes for compiling on Windows (via NetworkDirect)
- Add a failover path to dma-buf based memory registration
- Replace use of spin locks with mutexes
- Check for valid qp prior to cleanup
- Set and check the address format correctly in fi_getinfo

## Fabtests

- hmem_cuda: Use device-allocated host buffer to fill device buffer
- Add python scripts to control test execution
- test_configs: include util provider in core config file
- Add option "--pin-core"
- Only call nrt_init once
- Fix a bug in ft_neuron_cleanup
- Correct help for unit test programs
- Remove duplicate help prints from fi_mcast
- configure.ac: fix --enable-debug=no not properly detected
- msg_inject: handle the case where ft_tsendmsg returns -FI_EAGAIN
- Add AWS Trainium device support
- fi_inj_complete: Add FI_INJECT to fabtests
- inj_complete.c: Make arguments align with the other tests
- dgram_pingpong: handle the error return of fi_recv
- recv_cancel: Remove requirement for unexpected msg handling
- poll: Fix crash if unable to allocate pollset
- ubertest: Add GPU testing and validation support
- Add HMEM options parsing support
- Update and re-enable fi_multi_ep test

v1.14.1, Fri Apr 15, 2022
=========================

## Core

- Use non-shared memory allocations to use MADV_DONTFORK safely
- Various fixes for compiler warnings
- Fix incorrect use of gdr_copy_from_mapping
- Ensure proper timeout time for pollfds to avoid early exit

## EFA

- Use non-shared buffer pool allocations to use MADV_DONTFORK safely
- Handle read completion properly for multi_recv
- Use shm's inject write when possible
- Support 0 byte read

## RxD

- Verify valid atomic size

## RxM

- Ensure signaling the CQ fd after writing completion
- Fix inject path for sending tagged messages with cq data
- Negotiate credit based flow control support over CM
- Add PID to CM messages to detect stale vs duplicate connections
- Fix race handling unexpected messages from unknown peers
- Fix possible leak of stack data in cm_accept
- Restrict reported caps based on core provider
- Delay starting listen until endpoint fully initialized
- Verify valid atomic size

## Sockets

- Fix coverity reports on uninitialized data
- Check for NULL pointers passed to memcpy
- Minor cleanups
- Add missing error return code from sock_ep_enable

## TCP

- Fix performance regression resulting from sparse pollfd sets
- Fix assertion failure in CQ progress function
- Do not generate error completions for inject msgs
- Fix use of incorrect event names in progress handler
- Fix check for CQ data in tagged messages
- Make start_op array a static to reduce memory
- Wake-up threads blocked on CQ to update their poll events

## Verbs

- Generate error completions for all failed transmits
- Set all fields in the fi_fabric_attr for FI_CONNREQ events
- Set proper completion flags for all failed transfers
- Minor updates to silence coverity warnings on NULL pointers
- Ensure that all attributes are provided when opening an endpoint
- Fix error handling in vrb_eq_read
- Fix memory leak in error case in vrb_get_sib
- Work around bug in verbs HW not reporting correct send opcodes
- Only call ibv_reg_dmabuf_mr when kernel support exists
- Add a failover path to dma-buf based memory registration
- Negotiate credit based flow control support over CM
- Add OS portable detection of loopback devices

## Fabtests

- Disable inject when FI_HMEM is enabled
- Increase the number of supported ZE devices
- Change cq format if remote cq data is received
- Fix ubertest config exclude file check
- Fix ubertest checks for expected completions

v1.14.0, Fri Nov 19, 2021
=========================

## Core

- Add time stamps to log messages
- Fix gdrcopy calculation of memory region size when aligned
- Allow user to disable use of p2p transfers
- Update fi_tostr to print FI_SHARED_CONTEXT text instead of value
- Update fi_tostr to output field names matching header file names
- Fix narrow race condition in ofi_init
- Minor optimization to pollfds to handle timeout of 0
- Add new fi_log_sparse API to rate limit repeated log output
- Define memory registration for buffers used for collective operations
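
The `fi_log_sparse` item above describes rate limiting of repeated log output. The sketch below illustrates one possible policy, assuming a message is emitted only on its 1st, 2nd, 4th, 8th, ... occurrence. This policy and the `SparseLogger` class are assumptions for illustration and do not describe how libfabric decides when to print.

```python
class SparseLogger:
    """Emit a repeated message only on its 1st, 2nd, 4th, 8th, ... occurrence."""

    def __init__(self):
        self.counts = {}    # key -> number of times the message was requested
        self.emitted = []   # messages that were actually written

    def log(self, key, message):
        n = self.counts.get(key, 0) + 1
        self.counts[key] = n
        # n & (n - 1) == 0 is true exactly when n is a power of two.
        if n & (n - 1) == 0:
            self.emitted.append(f"{message} (seen {n}x)")
            return True
        return False

logger = SparseLogger()
results = [logger.log("eq_err", "EQ error") for _ in range(10)]
```

With this policy, ten identical requests produce four log lines, so a hot error path cannot flood the log.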

## EFA
- Provide better support for long lived applications utilizing the RDM
  endpoint, that may reuse an EFA queue pair after an application restarts.
- Fixes for RNR support (enabled in v1.13.1), to allow Libfabric to manage
  backoff when a receiver's queue is exhausted. A setopt parameter was added to
  allow applications to set the number of retransmissions the device performs
  before Libfabric queues the packet or, if Libfabric is configured not to
  handle resource errors, writes an error entry to the application.
- Potentially reduce memory utilization by waiting until first CQ read to
  allocate pools
- Deprecate the FI_EFA_SHM_MAX_MEDIUM_SIZE environment variable
- Fix a bug in the send path which caused a performance regression for large
  messages
- Fix issue in MR registration path when cache is used with CUDA buffers
- Print a clearer warning message when the reorder buffer is too small
- Various bugfixes in send path causing unneeded copies
- Various bugfixes caught by inspection and coverity
- Add documentation describing version 4 of the RDM protocol
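
The RNR item above mentions Libfabric managing backoff when a receiver's queue is exhausted. As a rough illustration of what a backoff schedule can look like, the sketch below doubles the wait after each retry up to a cap. The function name, the microsecond units, and the specific values are hypothetical; the provider's real timing, and any randomization it applies, may differ.

```python
def rnr_backoff_delays(initial_us, max_us, attempts):
    """Return the wait (in microseconds) before each retry, doubling up to a cap."""
    delays = []
    delay = initial_us
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * 2, max_us)
    return delays

# Hypothetical parameters chosen only to show the shape of the schedule.
delays = rnr_backoff_delays(initial_us=100, max_us=1000, attempts=6)
```

The cap keeps a long-stalled receiver from pushing the sender's retry interval arbitrarily high.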

## SHM

- Separate HMEM caps and disable FI_ATOMIC when requested
- Fix casting ints to pointers of different sizes
- Add error checking in smr_setname
- Distinguish between max shm name and max path name
- Move allocation of sar_msg into smr_format_sar()

## TCP

- Use IP_BIND_ADDRESS_NO_PORT socket option to improve scaling
- Fix situation where we can leave socket in blocking mode
- Add specific fi_info output to fi_getinfo for srx case
- Code restructuring and renames to improve maintenance
- Initial implementation to support tagged messages at tcp layer
- Optimize RMA handling at receiver
- Remove non-defined CQ flags when reporting completions

## RxM

- Reset connection state if we receive a new connection request
- Increase and update debug log messages to be more consistent
- Force CM progress if msg ep's are actively connecting
- Optimize handling for cm_progress_interval = 0

## Util

- Fix fi_getinfo check if provider requires the use of shared contexts
- Replace deprecated pthread_yield with sched_yield
- Fix compiler warning mixing u64 with size_t fields
- Fix memory leak in util_av_set_close
- Fix ofi_av_set to use passed in start_addr and end_addr values
- Add logic to detect if another library is intercepting memory calls
- Update 128-bit atomic support
- Fix possible deadlock if multiple memory monitors are enabled for the
  same memory type

## Verbs

- Fix setting MR access to handle read-only buffers
- Expand debug output
- Fail FI_HMEM support if p2p is disabled
- Handle FI_HMEM_HOST_ALLOC flag for FI_HMEM_ZE

## Fabtests

- Fix rdm_rma_trigger support for hmem
- Add key exchanges to common code to support device memory
- Remove need for OOB address exchange when hmem is enabled
- Always use command line provided inject size when given
- Add ability to test tagged messages over msg ep's
- Add support for shared rx contexts to common code
- Update scripts to allow provider specific fabtests
- Add an EFA RDM RNR fabtest

v1.13.2, Fri Oct 15, 2021
=========================

## Core

- Provide work-around for segfault in Ze destructor using DL provider
- Minor code fixes supporting Ze
- Use copy only engine when accessing GPUs through Ze
- Sort DL providers to ensure consistent load ordering
- Update hooking providers to handle fi_open_ops calls to avoid crashes
- Replace cassert with assert.h to avoid C++ headers in C code
- Enhance serialization for memory monitors to handle external monitors

## EFA

- Limit memcpy in packet processing to only copy valid data
- Removed maximum wait time sending packet to avoid silent drops
- Fix unconditionally growing buffer pools that should not grow
- Handle possible large backlog of unexpected messages via SHM
- Update Tx counter for inject operations
- Allow in flight sends to finish when closing endpoint
- Fix handling of prefix size when receiving data
- Removed unnecessary data copy

## SHM

- Fix possible sigbus error
- Handle errors if peer is not yet initialized

## TCP

- Fix reporting RMA write CQ data
- Fix RMA read request error completion handling
- Avoid possible use after free in reject path
- Remove restriction where EQs and CQs may not share wait sets
- Increase max supported rx size
- Fix possible memory leak of CM context structure in error cases
- Set source address for active EPs to ensure correct address is used
- Fix memory leak of dest address in CM requests

## RxM

- Improve connection handling responsiveness to fix application stalls
- Add missing locks around AV data structures
- Add missing hmem initialization for DL builds
- Do not ignore user specified rx/tx sizes
- Fix source address reported to peer
- Fix possible use of uninitialized memory handling CQ errors
- Fix address comparison to remove duplicate connections
- Reworked CM code to fix several possible crash scenarios
- Fix setting the conn_id when generating 'fake' tagged headers

## Util

- Fix AV set to use non-zero starting address
- Fix setting of CQ completion flags

## Verbs

- Work-around compilation error with Intel compiler 2018.3.222
- Avoid possible use after free issue accessing rdma cm id in error cases

## Fabtests

- Add missing prints to fi_av_xfer to report failures
- Fix memory leak in fi_multinode test
- Add device validation for hmem tests
- Update fi_info hints mode field based on user options
- Fix use of incorrect message prefix size in fi_pingpong test

v1.13.1, Tue Aug 24, 2021
=========================

## Core

- Fix ZE check in configure
- Enable loading ZE library with dlopen()
- Add IPv6 support to fi_pingpong
- Fix the call to fi_recv in fi_pingpong

## EFA

- Split ep->rx_entry_queued_list into two lists
- Split ep->tx_entry_queued_list into two lists
- Only set FI_HMEM hint for SHM getinfo when requested
- Include qkey in smr name
- Do not ignore send completion for a local read operation
- Convert pkt_entry->state to pkt_entry->flags
- Detect recvwin overflow and print an error message
- Add function ofi_recvwin_id_processed()
- Let efa_av_remove() remove peer with resources
- Ignore received packets from a removed address
- Check for and handle empty util_av->ep_list in efa_av
- Invalidate peer's outstanding TX packets' address when removing peer
- Extend the scope of deep cleaning resources in rxr_ep_free_res()
- Refactor error handling functions for x_entry
- Only write RNR error completion for send operation
- Ignore TX completion to a removed peer.
- Release peer's tx_entry and rx_entry when removing peer
- Make efa_conn->ep_addr a pointer and use it to identify removed peer
- Fix the use of released packet in rxr_cq_handler_error()
- Refactor tx ops counter updating
- Make rxr_release_tx_entry() release queued pkts
- Rename rxr_pkt_entry->type to rxr_pkt_entry->alloc_type
- Initialize rxr_pkt_entry->x_entry to NULL
- Fix ep->pkt_sendv_pool size
- Add rnr_backoff prefix to variables related to RNR backoff
- Refactor rxr_cq_queue_pkt()
- Eliminate rnr_timeout_exp in rdm_peer
- Eliminate the flag RXR_PEER_BACKED_OFF
- Adjust unexpected packet pool chunk size
- Defer memory allocation to 1st call to progress engine
- Enable RNR support
- Remove peer from backoff peer list in efa_rdm_peer_reset()
- Make rxr_pkt_req_max_header_size use RXR_REQ_OPT_RAW_ADDR_HDR_SIZE
- Use ibv_is_fork_initialized in EFA fork support

## PSM3

- Update Versions
- Clean ref's to split cuda hostbufs when no longer needed
- Fix issue when running gpudirect on gpu with small bar size
- Fix issues with debug statistics
- Fix issue with unreleased MR in cache

## SHM

- Fix unsigned comparison introduced in #6948
- Use hmem iov copies in mmap progression
- Correct return values in smr_progress.c
- Fix smr_progress_ipc error handling

## Util

- Do not override default monitor if already set
- Do not set impmon.impfid to NULL on monitor init
- Initialize the import monitor
- Add memory monitor for ZE

## Fabtests

- Use dlopen to load ZE library
- Bug fixes related to IPv6 address format
- Do not immediately kill server process

v1.13.0, Thu Jul 1, 2021
========================

## Core

- Fix behavior of fi_param_get parsing an invalid boolean value
- Add new APIs to open, export, and import specialized fid's
- Define ability to import a monitor into the registration cache
- Add API support for INT128/UINT128 atomics
- Fix incorrect check for provider name in getinfo filtering path
- Allow core providers to return default attributes which are lower than the
  maximum supported attributes in getinfo call
- Add option to prefer external providers (in order discovered) over internal
  providers, regardless of provider version
- Separate Ze (level-0) and DRM dependencies
- Always maintain a list of all discovered providers
- Fix incorrect CUDA warnings
- Fix bug in cuda init/cleanup checking for gdrcopy support
- Shift order providers are called from in fi_getinfo, move psm2 ahead of
  psm3 and efa ahead of psmX

## EFA

- Minor code optimizations and bug fixes
- Add support for fi_inject for RDM messages
- Improve handling of RNR NACKs from NIC
- Improve handling of zero copy receive case, especially when sender does not
  post receive buffer
- Numerous RMA read bug fixes
- Add unexpected receive queue for each peer
- Fixed issue releasing rx entries
- Decrease the initial allocation size of the out-of-order packet pool
  to reduce the common-case memory footprint
- Handle FI_ADDR_NOTAVAIL in rxr_ep_get_peer
- Identify and handle QP reuse
- Use the memory monitor specified by the user
- Replace provider code with common code in select places
- Update efa_av_lookup to return correct address
- Update rdm endpoint to directly poll cq from ibv_cq
- Avoid possible duplicate completions
- Add reference counting for peer tracking
- Fix EFA usage of util AV causing incorrect refcounting
- Do not allow endpoints to share address vectors
- Improve fork support; users can set the FI_EFA_FORK_SAFE environment variable
  for applications which call fork()
- Adjust the timing of clearing deferred memory registration list
- Do not use eager protocol for cuda message and local peer
- Fixes for shm support
- Enable MR cache for CUDA
- Disable shm when application requests FI_HMEM
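
The fork support item above names the `FI_EFA_FORK_SAFE` environment variable. A minimal sketch of setting it for a launched application follows. The value `"1"` is an assumption (consult the fi_efa man page for accepted values), and `fork_safe_env` and `./my_app` are hypothetical names used only for illustration.

```python
import os

def fork_safe_env(base=None):
    """Return a copy of the environment with FI_EFA_FORK_SAFE set.

    The value "1" is an assumption; check the fi_efa man page for accepted values.
    """
    env = dict(os.environ if base is None else base)
    env["FI_EFA_FORK_SAFE"] = "1"
    return env

env = fork_safe_env({"PATH": "/usr/bin"})
# A launcher could then pass it on, e.g. subprocess.run(["./my_app"], env=env),
# where ./my_app is a placeholder for an application that calls fork().
```

Building a copy rather than mutating `os.environ` keeps the setting scoped to the child process that needs it.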

## PSM3

- Added CUDA Support, GPU Direct through RV kernel module
- Changed PSM3 Provider Version to match IEFS version
- Expanded Multi-Rail support
- Enhanced debug logging
- Removed internal copy of libuuid, added as linked lib
- Various Bug Fixes

## RxD

- Fix peer connection and address cleanup
- Maintain peer connection after AV removal to send ACKs

## RxM

- Fix rx buffer leak in error case
- Dynamically allocate buffer space for large unexpected messages
- Separate the eager protocol size from allocated receive buffers
  to reduce memory footprint
- Make eager limit a per ep value, rather than global for all peers
- Separate definitions and use of buffer, eager, and packet sizes
- Fix calling fi_getinfo to the msg provider with FI_SOURCE set but
  null parameters
- General code cleanups, simplifications, and optimizations
- Fix retrieving tag from dynamic receive buffer path
- Enable dynamic receive buffer path over tcp by default
- Use correct check to select between tagged and untagged rx queues
- Repost rx buffers immediately to fix situation where applications can hang
- Update help text for several environment variables
- Fix use_srx check to enable srx by default layering over tcp provider
- Reduce default tx/rx sizes to shrink memory footprint
- Fix leaving stale peer entries in the AV
- Handle error completions from the msg provider properly, and avoid passing
  internal transfers up to the application
- Reduce memory footprint by combining inject packets into one
- Reduce inject copy overhead by using memcpy instead of hmem copy routines
- Restrict the number of outstanding user transfers to prevent memory
  overflow
- Enable direct send feature by default for the tcp provider
- Fix initialization of atomic headers
- Only ignore interrupts in wait calls (e.g. poll) in debug builds, otherwise
  return control to the caller
- Combine and simplify internal buffer pools to reduce memory footprint
- Remove request for huge pages for internal buffer pools
- Add optimized tagged message path over tcp provider, removing need for
  rxm header overhead
- Several optimizations around supporting rxm over tcp provider

## SHM

- Use signal to reduce lock contention between processes
- Fix communication with a peer that was restarted
- Code cleanup to handle issues reported by coverity
- Add check that IPC protocol is accessing device only memory
- Fix interface selection used for IPC transfers
- Change address to use a global ep index to support apps that open
  multiple fabrics
- Add environment variable to disable CMA transfers, to handle environments
  where CMA checks may succeed, but CMA may not be usable
- Add missing lock in ofi_av_insert_addr
- Add support for GPU memory in inject operations.

## Sockets

- Fix possible ring buffer overflow calculating atomic lengths
- Use correct address length (IPv6 vs 4) walking through address array

## TCP

- Add send side coalescing buffer to improve small message handling
- Add receive side prefetch buffer to reduce kernel transitions
- Fix initializing the mr_iov_limit domain attribute
- Add support for zero copy transfers, with configurable threshold settings.
  Disable zero copy by default due to negative impact on overall performance
- Add environment variable overrides for default tx/rx sizes
- Simplify and optimize handling of protocol headers
- Add a priority transmit queue for internally generated messages (e.g. ACKs)
- Check that the endpoint state is valid before attempting to drive progress
  on the underlying socket
- Limit the number of outstanding transmit and receive operations that a
  user may post to an ep
- Remove limitations on allocating internally generated messages to prevent
  application hangs
- Combine multiple internal buffer pools to one to reduce memory footprint
- Optimize socket progress based on signaled events
- Optimize pollfd abstraction to replace linear searches with direct indexing
- Update both rx and tx cq's socket poll list to prevent application hangs
- Optimize reading in extra headers to reduce loop overhead
- Continue progressing transmit data until socket is full to reduce progress
  overhead
- Add msg id field to protocol headers (debug only) for protocol debugging
- Drive rx progress when there's an unmatched 0-byte received message to
  avoid application hangs
- Avoid kernel transitions that are likely to do no work (return EAGAIN)
- Fail try_wait call if there's data already queued in user space prefetch
  buffers to avoid possible hangs
- Fix possible access to freed tx entry
- Optimize socket receive calls in progress function to skip progress loop
  and immediately handle a received header.  This also fixes an application
  hang handling 0-byte messages
- Broad code cleanups, rework, and simplifications aimed at reducing
  overhead and improving code stability
- Improve handling of socket disconnect or fatal protocol errors
- Fix reporting failures of internal messages to the user
- Disable endpoints on fatal protocol errors
- Validate response messages are what is expected
- Simplify and align transmit, receive, and response handling to improve code
  maintainability and simplify related data structures
- Copy small messages through a coalescing buffer to avoid passing SGL to
  the kernel
- Fix race handling a disconnected event during the CM handshake
- Report default attributes that are lower than the supported maximums
- Remove use of huge pages, which aren't needed by tcp, to reserve them for
  the user
- Increase default inject size to be larger than the rxm header
- Add tagged message protocol header for sending tagged messages using the
  tcp headers only
- Separate definition of maximum header size from maximum inject size
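
One item above replaces linear searches in the pollfd abstraction with direct indexing. The sketch below shows the general technique, assuming a dense entry list plus an fd-to-position map so that lookup and removal avoid scanning. The `PollFdSet` class and its layout are hypothetical and are not the provider's actual data structure.

```python
class PollFdSet:
    """Track (fd, events) entries with O(1) lookup by fd instead of a linear scan."""

    def __init__(self):
        self.entries = []   # dense list of [fd, events]
        self.index = {}     # fd -> position in self.entries

    def add(self, fd, events):
        if fd in self.index:
            self.entries[self.index[fd]][1] = events
        else:
            self.index[fd] = len(self.entries)
            self.entries.append([fd, events])

    def remove(self, fd):
        pos = self.index.pop(fd)
        last = self.entries.pop()
        if pos < len(self.entries):
            # Move the former last entry into the vacated slot to keep the list dense.
            self.entries[pos] = last
            self.index[last[0]] = pos

    def events(self, fd):
        return self.entries[self.index[fd]][1]

fds = PollFdSet()
fds.add(3, 1)
fds.add(7, 4)
fds.add(9, 1)
fds.remove(3)
```

Keeping the entry list dense means it can still be handed to a poll-style call as a contiguous array, while the map removes the per-operation search.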

## Util

- Added lock validation checks to debug builds
- Fix MR cache flush LRU behavior
- Always remove dead memory regions from the MR cache immediately
- Update buffer pools to handle an alignment of 0
- Fail memory registration calls for HMEM if the interface isn't available
- Pass through failures when a requested memory monitor fails to start
- Always process deferred work list from pollfd wait abstraction

## Verbs

- Fixed checks setting CQ signaling vector
- Internal code cleanups and clarifications
- Fixed XRC MOFED 5.2 incompatibility
- Add dmabuf MR support for GPU P2P transfers

v1.12.1, Thu Apr 1, 2021
========================

## Core

- Fix initialization checks for CUDA HMEM support
- Fail if a memory monitor is requested but not available
- Adjust priority of psm3 provider to prefer HW specific providers,
  such as efa and psm2

## EFA
- Adjust timing clearing the deferred MR list to fix memory leak
- Repost handshake packets on EAGAIN failure
- Enable mr cache for CUDA memory
- Support FI_HMEM and FI_LOCAL_COMM when used together
- Skip using shm provider when FI_HMEM is requested

## PSM3
- Fix AVX2 configure check
- Fix conflict with with-psm2-src build option to prevent duplicate
  symbols
- Fix checksum generation to support different builddir
- Remove dependency on librdmacm header files
- Use AR variable instead of calling ar directly in automake tools
- Add missing PACK_SUFFIX to header

v1.12.0, Mon Mar 8, 2021
========================

## Core

- Added re-entrant version of fi_tostr
- Added fi_control commands for accessing fid-specific attributes
- Added Ze (level-0) HMEM API support
- Fixed RoCR memory checks
- Minor code cleanups, restructuring, and fixes
- Fix possible stack buffer overflow with address string conversion
- Handle macOS socket API size limitations
- Verify and improve support for CUDA devices
- Update internal string functions to protect against buffer overflow
- Support gdrcopy in addition to cudaMemcpy to avoid deadlocks
- Properly mark if addresses support only local communication
- Prevent providers from layering over each other non-optimally
- Fix pollfds abstraction to fix possible use after free

## EFA
- Added support for FI_DELIVERY_COMPLETE via an acknowledgment packet in the
  provider. Applications that request FI_DELIVERY_COMPLETE will see a
  performance impact from this release onward. The default delivery semantic
  for EFA is still FI_TRANSMIT_COMPLETE and acknowledgment packets will not be
  sent in this mode.
- Added ability for the provider to notify device that it can correctly handle
  receiver not ready (RNR) errors. There are still known issues so this is
  currently turned off by default; the device is still configured to retry
  indefinitely.
- Disable FI_HMEM when FI_LOCAL_COMM is requested due to problems in the
  provider with loopback support for FI_HMEM buffers.
- Use a loopback read to copy from host memory to FI_HMEM buffers in the
  receive path. This has a performance impact, but using the native copy API
  for CUDA can cause a deadlock when the EFA provider is used with NCCL.
- Only allow fork support when the cache is disabled, i.e. the application
  handles registrations (FI_MR_LOCAL) to prevent potential data corruption.
  General fork support will be addressed in a future release.
- Moved EFA fork handler check to only trigger when an EFA device is present
  and EFA is selected by an application.
- Changed default memory registration cache monitor back to userfaultfd due to
  a conflict with the memory hooks installed by Open MPI.
- Fixed an issue where packets were incorrectly queued which caused message
  ordering issues for messages the EFA provider sent via SHM provider.
- Fixed a bug where bounce buffers were used instead of application provided
  memory registration descriptors.
- Various fixes for AV and FI_HMEM capability checks in the getinfo path.
- Fix bug in the GPUDirect support detection path.
- Various fixes and refactoring to the protocol implementation to resolve some
  memory leaks and hangs.

## PSM3

- New core provider for psm3.x protocol over verbs UD interfaces, with
  additional features over Intel E810 RoCEv2 capable NICs
- See fi_psm3.7 man page for more details

## RxD

- Added missing cleanup to free peer endpoint data with AV
- Add support for FI_SYNC_ERR flag

## RxM

- Cleanup atomic buffer pool lock resources
- Fix unexpected message handling when using multi-recv buffers
- Handle SAR and rendezvous messages received into multi-recv buffers
- Give application entire size of eager buffer region
- Minor code cleanups based on static code analysis
- Simplify rendezvous message code paths
- Avoid passing internal errors handling progress directly to applications
- Limit fi_cancel to canceling at most 1 receive operation
- Remove incorrect handling if errors occur writing to a CQ
- Only write 1 CQ entry if a SAR message fails
- Continue processing if the receive buffer pool is full and reposting is delayed
- Add support for dynamic receive buffering when layering over tcp
- Add support for direct send to avoid send bounce buffers in certain cases
- Prioritize credit messages to avoid deadlock
- Fix conversion to message provider's mr access flags
- Reduce inject size by the minimum packet header needed by rxm
- Fix checks to enable shared rx when creating an endpoint
- Minor code restructuring
- Fix trying to access freed memory in error handling case
- Use optimized inject limits to avoid bounce buffer copies
- Fix possible invalid pointer access handling rx errors
- Add support for HMEM if supported by msg provider
- Add missing locks around progress to silence thread-sanitizer
- Support re-connecting to peers if peer disconnects (client-server model)
- Cleanup rendezvous protocol handling
- Add support for RMA write rendezvous protocol

## SHM

- Add support for Ze IPC protocol
- Only perform IPC protocol related cleanup when using IPC
- Disable cross-memory attach protocol when HMEM is enabled
- Fix cross-memory attach support when running in containers
- Always call SAR protocol's progress function
- Enable cross-memory attach protocol when sending to self
- Minor code cleanups and restructuring for maintenance

## Sockets

- Verify CM data size is less than supported value
- Handle FI_SYNC_ERR flag on AV insert
- Improve destination IP address checks
- Minor coding cleanups based on static code analysis
- Fix possible use after free access in Rx progress handling

## TCP

- Fix hangs on Windows during connection setup
- Relax CQ checks when enabling EP to handle send/recv only EPs
- Fix possible use of unset return value in EP enable
- Minor coding cleanups based on static code analysis
- Handle EAGAIN during CM message exchanges
- Set sockets to nonblocking on creation to avoid possible hangs at scale
- Improve CM state tracking and optimize CM message flows
- Make passive endpoints nonblocking to avoid hangs
- Allow reading buffered data from disconnected endpoints
- Implement fi_cancel for receive queues
- Flush outstanding operations to user when an EP is disabled
- Support dynamic receive buffering - removes need for bounce buffers
- Add direct send feature - removes need for bounce buffers
- Minor code cleanups and restructuring to improve maintenance
- Add support for fi_domain_bind

## Util

- Improve checks that EPs are bound to necessary CQs
- Fix confusion between the AV's total size and its current count so the AV is sized properly
- Fix CQ buffer overrun protection mechanisms to avoid lost events

## Verbs

- Add SW credit flow control to improve performance over Ethernet
- Skip verbs devices that report faulty information
- Limit inline messages to iov = 1 to support more devices
- Minor code improvements and restructuring to improve maintenance
- Enable caching of device memory (RoCR, CUDA, Ze) registrations
- Add HMEM support, including proprietary verbs support for P2P
- Add support for registering device memory
- Support GIDs at any GID index, not just 0
- Fix macro definitions to cleanup build warnings
- Support GID based connection establishment, removes ipoib requirement
- Reduce per peer memory footprint for large scale fabrics

v1.11.2, Tue Dec 15, 2020
=========================

## Core

- Handle data transfers > 4GB on OS X over tcp sockets
- Fixed spelling and syntax in man pages
- Fix pmem instruction checks

## EFA

- Use memory registration for emulated read protocol
- Update send paths to use app memory descriptor if available
- Remove unneeded check for local memory registration
- Do not install fork handler if EFA is not used
- Fix medium message RTM protocol
- Fix memory registration leak in error path
- Fix posting of REQ packets when using shm provider

## RxM

- Fix provider initialization when built as a dynamic library

## SHM

- Reverts SAR buffer locking patch
- Include correct header file for process_vm_readv/writev syscalls
- Skip atomic fetch processing for non-fetch operations

## TCP

- Fix swapping of address and CQ data in RMA inject path

## Util

- Fix error code returned for invalid AV flags
- Fix a bug finding the end of a page when the address is aligned

## Verbs

- Fix build warning in XRC CM log messages
- Fix build warnings in debug macros

v1.11.1, Fri Oct 9, 2020
========================

## Core

- Remove calls to cuInit to prevent indirect call to fork
- Ignore case when comparing provider names
- Prevent layering util providers over EFA
- Fix segfault if passed a NULL address to print
- Fail build if CUDA is requested but not available

## EFA

- Switch to memhooks monitor
- Avoid potential deadlock copying data to GPU buffers
- Allow creating packet pools with non-huge pages
- Check return value when processing data packets
- Minor code restructuring and bug fixes
- Check if outstanding TX limit has been reached prior to sending
- Move RDMA read registration to post time
- Do not overwrite a packet's associated MR when copying packets
- Pass in correct packet when determining the header size
- Do not release rx_entry in EAGAIN case
- Disable MR cache if fork support is requested
- Turn off MR cache if user supports FI_MR_LOCAL
- Add FI_REMOTE_READ to shm registrations
- Remove use_cnt assert closing domain to allow driver cleanup
- Fix off by 1 returned AV address when using AV map
- Ensure setting FI_HMEM capability is backwards compatible

## RxD

- Fix bug that prevents sending timely ACKs for segmented messages
- Remove calls that recursively try to acquire the EP lock

## RxM

- Allow re-connecting to peers

## SHM

- Create duplicate fi_info's when reporting FI_HMEM support
- Handle transfers larger than 2GB
- Register for signal using SA_ONSTACK
- Fix segfault if peer has not been inserted into local AV
- Fix command/buffer tracking for sending connection requests
- Return proper errno on AV lookup failures
- Remove duplicate call to ofi_hmem_init
- Fix using incorrect peer id for mid-sized message transfers
- Fix addressing race conditions
- Fix mixing of shm AV index values with fi_addr_t values
- Fix initialization synchronization
- Ensure progress is invoked for mid-sized message transfers
- Always use CMA when sending data to self
- Fix hang using SAR protocol

## Sockets

- Retry address lookup for messages received during CM setup

## TCP

- Fix possible deadlock during EP shutdown due to lock inversion
- Rework CM state machine to fix lock inversion handling disconnect

## Util

- Correctly mark if addresses support local/remote communication
- Check madvise memhook advice
- Update mmap intercept hook function
- Replace memhooks implementation to intercept syscalls
- Fix shmat intercept hook handling
- Fix error handling obtaining page sizes
- Fix incorrect locking in MR cache
- Fix memory leak in rbtree cleanup

## Verbs

- Fix XRC transport shared INI QP locking
- Account for off-by-one flow control credit issue
- Fix disabling of receive queue flow control
- Reduce overall memory footprint on fully connected apps
- Skip reporting native IB addresses when network interface is requested

v1.11.0, Fri Aug 14, 2020
=========================

## Core

- Add generalized hmem_ops interface for device ops
- Add FI_HMEM_CUDA, FI_HMEM_ROCR, and FI_HMEM_ZE interfaces and device support
- Add CUDA and ROCR memory monitors and support for multiple monitors
- Add fi_tostr for FI_HMEM_* interfaces
- Add utility interface and device support
- Add documentation for hmem override ops
- Save mr_map mem_desc as ofi_mr
- Rework and reorganize memory monitor code
- Add mr_cache argument flush_lru to ofi_mr_cache_flush
- Fix 1.1 ABI domain, EP, and tx attributes
- Add loading of DL providers by name
- Add CMA wrappers and define CMA for OSX
- Fix util getinfo: use base fi_info caps, altering mr_mode properly,
  FI_MR_HMEM support, NULL hints, set CQ FI_MSG flag, query FI_COLLECTIVE,
  list FI_MATCH_COMPLETE, select and request specific core provider
- Add rbmap interface to get root node
- Add support of AF_IB to addr manipulation functions
- Windows: Map strtok_r() to strtok_s()
- Define OFI_IB_IP_{PORT,PS}_MASK
- Make fi_addr_format() public
- Remove mr_cache entry subscribed field
- Update memhooks brk and implement sbrk intercepts
- Fix vrb_speed units
- Fix possible null dereference in ofi_create_filter
- Add ofi_idx_ordered_remove
- Add functions ofi_generate_seed() and ofi_xorshift_random_r()
- Call correct close fd call in util_wait_fd_close
- Set a libfabric default universe size
- Add compatibility with SUSE packaging
- Windows: Handle socket API size limitations
- Fix UBSAN warnings
- Save and restore the errno in FI_LOG
- Ensure that access to atomic handlers is in range
- Ensure ifa_name is null terminated in ofi_get_list_of_addr
- Buffer pools fall back to normal allocations when hugepage allocations fail
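
The `ofi_xorshift_random_r()` entry above refers to a xorshift generator whose state is owned by the caller. A minimal Python sketch of a 32-bit xorshift step follows. It assumes Marsaglia's common 13/17/5 shift triple; whether libfabric uses exactly these constants is an assumption, and the function names here are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep all arithmetic within 32 bits


def xorshift32(state):
    """Advance a 32-bit xorshift state by one step (13/17/5 shifts, assumed)."""
    state &= MASK32
    if state == 0:
        # An all-zero state is a fixed point of xorshift and never changes.
        raise ValueError("xorshift state must be non-zero")
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def xorshift32_r(state_box):
    """Re-entrant style: the caller owns the state (a one-element list)."""
    state_box[0] = xorshift32(state_box[0])
    return state_box[0]


seed = [1]
first = xorshift32_r(seed)
second = xorshift32_r(seed)
```

Because the state lives with the caller, independent threads or endpoints can each keep their own sequence without sharing a global.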

## EFA

- Add support to use user posted receive buffers with RDM EP when requested
- Various fixes to FI_HMEM support
- Added fork handler and abort if rdma-core is incorrectly configured
- Fix bandwidth regression due to increased structure size
- Reuse verbs protection domain when in same process address space
- Periodically flush MR cache to reduce MR usage
- Properly handle setting/unsetting RDMAV_HUGEPAGES_SAFE
- Fix provider_version reported by EFA
- Populate additional fields in fid_nic
- Fix various bugs in the completion, info, and domain paths
- Fix various memory leaks

## PSM2

- Treat dynamic connection errors as fatal
- Add missing return status checking for PSM2 AM calls

## RxD

- Updated AV design to be dynamically extensible using indexer and index map
- Replaced static allocation of peers with runtime allocation during RTS
- Added wrapper to fetch pointer to a peer from the peers data structure
- Updated to show correct msg_ordering
- Check datatype size when handling atomic ops
- Verify atomic opcode is in range to fix Klocwork issue
- Corrected use of addr in rxd_atomic_inject for retrieving rxd_addr

## RxM

- Align reporting of FI_COLLECTIVE with man pages
- Show correct ordering of atomic operations
- Fix error handling inserting IP addresses into an AV
- Minor code cleanups and bug fixes
- Select different optimizations based on running over tcp vs verbs
- Use SRX by default when using tcp to improve scaling
- Correct CQ size calculation when using SRX
- Fix MR registration error path when handling iov's
- Allow selecting tcp wait objects separate from verbs
- Only repost Rx buffers if necessary

## SHM

- Fix a CMA check bug
- Fix shm provider signal handler calling the original handler
- Add initial framework for IPC device copies
- Add FI_HMEM support and integrate hmem_ops
- Fix error handling path in smr_create
- Fix AV insertion error handling
- Verify atomic op value
- Redefine shm addrlen to not use NAME_MAX
- Fix snprintf to exclude byte for null terminator
- Mark smr_region as volatile
- Fix memory leaks

## Sockets

- Fix backwards compatibility accessing struct fi_mr_attr
- Fix use after free error in CM threads
- Free unclaimed messages during endpoint cleanup to avoid memory leaks
- Improve handling of socket disconnection
- Limit time spent in progress when expected list is long
- Avoid thread starvation by converting spinlocks to mutex

## TCP

- Minor bug fixes
- Verify received opcode values are valid
- Avoid possible receive buffer overflow from malformed packets
- Fix fi_cq_sread failing with ECANCELED
- Optimize receive progress handling
- Do not alter pseudo random sequence numbers
- Increase default listen backlog size to improve scaling
- Handle processing of NACK packets during connection setup
- Fix wrong error handling during passive endpoint creation
- Add logging messages during shutdown handling
- Improve logging and error handling
- Fix possible use after free issues during CM setup
- Minor code restructuring

## Util

- Use internal flags in place of epoll flags for portability
- Support HMEM with the mr_cache
- Verify application requested FI_HMEM prior to accessing fi_mr_attr fields
- Fix memory leak when using POLLFD wait sets
- Ensure AV data is aligned even if address length is not
- Fix handling of mr mode bits for API < 1.5
- Allow user to force use of userfaultfd memory monitor

## Verbs

- Add support for AF_IB and native IB addressing
- Minor code cleanups
- Avoid possible string overrun parsing interface names
- Fix memory leak handling duplicate interface names
- Add XRC shared Rx CQ credit reservation
- Fix possible segfault when closing an XRC SRQ
- Fix verbs speed units to MBps
- Add flow control support to avoid RQ overruns
- Fix memory leak of address data when creating endpoints

v1.10.1, Fri May 8, 2020
========================

## Core

- Fixed library version

## EFA

- Allow endpoint to choose shm usage
- Fix handling of REQ packets
- Fix logic writing a Tx completion entry
- Use correct Tx operation flags for msg sends

## Fabtests

- Use pax tar format when creating source packages

## RxD

- Use correct peer address for atomic_inject calls

## SHM

- Fix BSD build failure

## TCP

- Add locking around signaling a wait fd

v1.10.0, Fri Apr 24, 2020
=========================

## Core

- Added new pollfd wait object to API
- Added ability to query for a fid's wait object type
- Updated most providers to a new provider versioning format
- Support using multiple fds for blocking calls, in place of epoll
- Fix memory leak when destroying rbtrees
- Record interface names and network names for IP addressable providers
- Improved performance of timing calculations
- Improvements to MR caching mechanism

## EFA

- Replaces custom admin commands with native use of rdma-core APIs
- Added support for FI_RMA using RDMA Reads
- Added rendezvous protocol for long messages using RDMA Reads
- Added support for CUDA buffers (FI_HMEM)
- Added medium-message protocol
- Added support for atomic operations
- Added randomized Queue Key assignment to endpoints
- Improved support for client-server applications
- Disables use of shared-memory if FI_LOCAL_COMM is not required
- Updated protocol to v4
- Refactor packet handling functions and headers for better extensibility
- Added handshake protocol to negotiate protocol features with peers
- Refactor send/recv paths for improved memory descriptor handling
- Use inlined device sends for FI_INJECT
- Removes fork() to detect CMA support from the init path
- Better reuse of MR keys across EFA and SHM control path
- Squashes the MR functions in the RxR and EFA layers
- Squashes the AV functions in the RxR and EFA layers
- Use 0-based offset if FI_MR_VIRT_ADDR not set
- Retries memory registration in MR cache error paths
- Fixes to addr_format handling in the RDM endpoint
- Fixes memory leaks
- Fixes AV error handling paths
- Fixes shm error handling paths
- Fixes compiler warnings

## PSM2

- Improve source address translation for scalable endpoints

## RxM

- Add support for pollfd wait objects
- Fix double free in error path
- Report CQ errors for failed RMA transfers
- Fix locking in tagged receive path
- Remove incorrect rx_attr capability bits
- Handle unexpected messages when posting multi-recv buffers
- Repost multi-recv buffers to the receive queue head
- Fix unexpected message handling
- Fix stall in collective progress caused by lost receive buffers
- Add support for collective operations

## RxD

- Replace rxd_ep_wait_fd_add with direct call to ofi_wait_fd_add
- Reorganize attr caps
- Add rxd to fi_provider man page

## SHM

- Fix ofi_cq_init progress pointer
- Add CQ wait object support with new FI_WAIT_YIELD wait type
- Include string terminator in addrlen
- Fix av_insert address cast
- Fix unexpected messaging processing on empty receive queue
- Fix unexpected messaging locking
- Progress unexpected queue for non-tagged receives
- Move ep_name_list initialization/cleanup and fix signal handling
- Reorganize attr caps
- Warn once on peer mapping failures
- Add FI_DELIVERY_COMPLETE support
- Fix FI_MULTI_RECV reporting and allow writing to overflow CQ for unexpected MULTI_RECV
- Refactor and simplify msg processing, formatting, and recv posting
- Rename ep_entry to rx_entry and add tx_entry for pending outgoing messages
- Properly align cmd data
- Return correct addrlen on av lookup
- Fix id passed into rma fast path
- Fix typo
- Fix potential data ordering issue in atomic fetch path
- Add proper RMA read protocol without CMA
- Add runtime CMA check during mapping
- Add mmap-based fallback protocol for large messages without CMA
- Add large message segmentation fallback protocol for large messages without CMA and
  add FI_SHM_SAR_THRESHOLD to control switching between segmentation and mmap
- Define macros for address translation
- Allow building of shm provider on older kernels with x86 arch
- Rename peer_addr to peer_data
- Change locking when progressing response entries
- Fix cmd_cnt increment on RMA ops
- Add error handling when inserting more than SMR_MAX_PEERS
- Add shm size space check
- Fix locking when processing response from self
- Add locking around the ep_name_list
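
The large-message fallback entries above can be read as a small decision rule. The Python sketch below illustrates it under stated assumptions: the threshold is treated as the largest message size still sent by segmentation (SAR), an unset `FI_SHM_SAR_THRESHOLD` is treated as "no limit", and the function and label names are hypothetical rather than taken from the shm provider.

```python
import os

SIZE_MAX = 2**64 - 1  # assumed default: segmentation is never abandoned


def sar_threshold(environ=os.environ):
    """Read FI_SHM_SAR_THRESHOLD, falling back to the assumed default."""
    raw = environ.get("FI_SHM_SAR_THRESHOLD")
    return int(raw) if raw is not None else SIZE_MAX


def pick_large_msg_protocol(msg_size, cma_available, threshold):
    """Choose a transfer path for a large message (illustrative labels only)."""
    if cma_available:
        return "cma"   # cross-memory attach is used when it is available
    if msg_size <= threshold:
        return "sar"   # segmentation fallback up to the threshold (assumed <=)
    return "mmap"      # mmap-based fallback beyond the threshold


threshold = sar_threshold({"FI_SHM_SAR_THRESHOLD": "65536"})
choice_small = pick_large_msg_protocol(32768, False, threshold)
choice_large = pick_large_msg_protocol(1 << 20, False, threshold)
```

With a 64 KiB threshold and no CMA, the 32 KiB message takes the segmentation path and the 1 MiB message takes the mmap path.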

## TCP

- Fix incorrect signaling of fd waking up thread in fi_cq_sread
- Switch to using pollfd wait object instead of epoll as default
- Add missing ep lock to fix possible ep list corruption
- Remove incorrectly reported CQ events posted to EQ
- Update domain name to IP network name
- Improved socket processing to improve scalability and performance
- Remove incorrect implementation of FI_MULTI_RECV support
- Report error completions even if successful completion is suppressed
- Report correct EQ event for aborted connections

## Verbs

- Fix XRC request identification
- Fix small memory leak for XRC connections
- Add retry logic for XRC connections
- Fix mapping of domains to NICs when multiple NICs are present
- Allow filtering of device names via environment variable
- Fix compilation with -fno-common option
- Code restructuring to improve maintenance

v1.9.1, Fri Mar 6, 2020
=======================

## Core

- Fix gcc 9.2 warnings
- Fix thread hangs in MR cache when using userfaultfd monitor
- Add missing header for FreeBSD build
- Allow a core provider to discover and use filtered providers

## EFA

- Change MR cache count and size limits
- Fixes to 32-bit msg_id wraparound handling
- Adds address map to look up EFA address from shm address
- Remove unnecessary EFA device name check
- Detect availability of CMA directly from EFA provider
- Use OFI_GETINFO_HIDDEN flag when querying for shm
- Allow use of EFA when shm is unavailable
- Fixes info and domain capabilities for RDM endpoint
- Fixes to dest_addr returned with info objects
- Fixes segfault in efa_mr_cache_entry_dereg()
- Fixes compilation warning in DSO build of the provider
- Fixes compilation errors with -fno-common
- Fixes to send-side control path

## PSM2

- Clean up AV entries that have been removed

## RxM

- Fix multi-recv buffer handling to use entire buffer
- Consume entire multi-recv buffer before using buffer
- Continue execution after handling transfer errors
- Properly cleanup CM progress thread
- Minor code cleanups and restructuring

## SHM

- Properly restore captured signals
- Track ptrace_scope globally, and allow disabling
- Properly initialize endpoint name list
- Fix potential deadlock resulting from missed handling of unexpected messages
- Fix multi-threading issue accessing unexpected messages
- Handle multiple addresses passed to fi_av_insert
- NULL terminate address strings
- Pass correct pointer to ofi_cq_init

## TCP

- Removed incorrect implementation for multi-recv buffer support
- Always report error completions
- Report correct EQ event for aborted connection requests
- Improve connection data corner cases

## Verbs

- Fix segfault handling error completions
- Avoid null dereference handling EQ events
- Remove possible deadlock in XRC error path
- Enable credit tracking to avoid SQ, RQ, and CQ overruns
- Verify that CQ space is available for bound EPs
- Minor code cleanups and restructuring


v1.9.0, Fri Nov 22, 2019
========================

## Core

- Add generic implementation for collective operations
- Add support for traffic class selection
- Fixes and enhancements to memory registration cache
- Add support for older kernels to the MR cache (hook malloc related calls)
- Fix setting loopback address byte ordering
- Fix MR cache locking from spinlock to a mutex to avoid starvation
- Add API enhancements for heterogeneous memory (e.g. GPUs)
- Limit default size of MR cache to avoid out of memory errors
- Fix g++ compile error
- Enhanced the hooking provider infrastructure
- Enhanced Windows support for IPv6 and NIC selection
- Fix timeout calculation in wait operations
- Add simple spell checker for FI_PROVIDER
- Fix red-black tree possible use after free issue
- Fix segfault running libfabric within a Linux container
- Minor cleanups and bug fixes
- Work-around possible long delay in getaddrinfo()

## EFA

- Introduce support for shared-memory communication using shm provider
- Enable Memory Registration caching by default
- Refactor TX and CQ handling functions to reduce branching
- Use application-provided MR descriptors when available
- Optimize progress engine polling loop for shm and EFA completions
- Enable inline registration for emulated RMA reads
- Inherit FI_UNIVERSE_SIZE for AV sizing
- Increase default min AV size to 16K
- Fix uninitialized objects with DSO build of the provider
- Fix handling of FI_AV_UNSPEC
- Fix crash and resource leak with fi_cancel() implementation
- Fix issues with EFA's registration cache under efa;ofi_rxd
- Fix MR allocation handlers to use correct pointer and size
- Fix error handling in multi-recv completion code
- Fix compilation errors when built with valgrind annotations
- Fix compilation errors when packet poisoning was enabled
- Fix incorrect parameter definitions
- Fix leaks of internal resources
- Miscellaneous cleanups and bug fixes

## MRail

- Renamed address control environment variable
- Implement large message striping using rendezvous
- Properly set tx/rx op flags

## PSM2

- Fix memory leaks
- Add fi_nic support
- Report correct value for max_order_raw_size
- Report max_msg_size as a page aligned value
- Fix potential multi-threaded race condition
- Avoid potential deadlock in disconnect protocol

## RxD

- Fix default AV count
- Minor cleanups and optimizations
- Handle errors unpacking packets
- Report all failures when inserting addresses into AV
- Remove unneeded posted buffer tracking

## RxM

- Fix inject completion semantics
- Fix MR key handling when mismatched with core provider
- Add basic support for some collective operations
- Fix senddata desc parameter mismatch
- Serialize EQ processing to avoid use after free issue
- Minor cleanup and optimizations
- Remove atomic buffer limitations
- Provide mechanism to force auto-progress for poorly designed apps
- Fix high memory usage when using RMA
- Fix segfault handling memory deregistration
- Discard canceled receive buffers when closing msg ep
- Fix memory leaks in connection management

## SHM

- Cleanup tmpfs after unclean shutdown
- Increase the size of endpoint names
- Align endpoint count attribute with maximum supported peer count
- Add user ID to shared memory name
- Only support small transfers if ptrace is restricted
- Fix incorrect reporting of completion buffer
- Return correct addrlen on fi_getname
- Round tx/rx sizes up in case sizes are not already a power of two
- Skip utility providers for shm provider
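
The tx/rx size rounding mentioned above is the usual round-up-to-a-power-of-two operation. A minimal Python sketch, with a hypothetical helper name and not taken from the shm provider source:

```python
def roundup_power_of_two(n):
    """Return the smallest power of two that is >= n (1 for n <= 1)."""
    if n <= 1:
        return 1
    # (n - 1).bit_length() is the number of bits needed to represent n - 1,
    # so shifting 1 by that amount yields the next power of two at or above n.
    return 1 << (n - 1).bit_length()


requested = 1000
effective = roundup_power_of_two(requested)
```

Sizes that are already a power of two are returned unchanged, so the rounding only affects requests such as 1000, which becomes 1024.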

## TCP

- Report aborted requests as canceled
- Fixed support for 0-length transfers
- Return positive error code for CQ entries
- Bind ports using SO_REUSEADDR
- Properly check for correct recv completion length
- Fix potential deadlock due to lock ordering issue

## Verbs

- Enable on-demand paging memory registration option
- Enable send queue overflow optimization for mlx devices
- Cleanup EQ when closing an associated endpoint
- Minor optimizations and code restructuring
- Avoid potential deadlock accessing EQ and EP
- Speedup XRC connection setup
- Handle IPv6 link local address scope id
- Updates to support new versions of rdma-core libraries
- XRC connection optimizations, cleanups, and error handling improvements
- Fix possible segfault in error handling path
- Remove support for vendor specific and experimental verbs
- Handle 0-length memory registrations
- Fix EQ trywait behavior to check for software events


v1.8.1, Mon Sep 30, 2019
========================

## Core

- Limit default size of memory registration cache
- Verify that correct entry is removed from MR cache

## EFA

- Fixes to fi_cancel() when used with multi-recv buffers
- Fixes to registered memory handling after a fork()
- Fixes to the long message flow-control protocol
- Use FI_AV_TABLE as the preferred AV type
- Fixes to the bufpool allocation handlers
- Fixes to RTS handler
- Fix to use correct arch detection preprocessor macro
- Expose fid_nic information
- Fix memory leaks

## PSM2

- Fix incorrect value of max_order_raw_size
- Report page aligned max_msg_size
- Always enable the lock accessed by the disconnection thread
- Fix race condition with progress thread and FI_THREAD_DOMAIN
- Avoid a potential deadlock in disconnection protocol

## RxD
- Fix default AV count with environment variable FI_OFI_RXD_MAX_PEERS

## RxM

- Fix connection handle shutdown/CQ processing race
- Fix RMA ordering bits for FI_ATOMIC

## SHM
- Add correct reporting of FI_MR_BASIC
- Add correct reporting and proper support of FI_DIRECTED_RECV

## Verbs

- Allow zero length memory registrations
- Improve connection scale up by removing synchronous calls in fi_getinfo
- Fix missing serialization to event channel during CM ID migration
- Protect XRC EQ processing from EP API connect/accept calls
- Fix XRC connection tag to EP return value in error case
- Return EAGAIN to user if an unhandled rdmacm event is received
- Handle IPv6 link local addresses correctly


v1.8.0, Fri Jun 28, 2019
========================

## Core

- Reworked memory registration cache to use userfaultfd
- Allow disabling atomic support as build option
- Updated default provider priority
- Define new data ordering bits to separate atomic from RMA ordering
- Improved support for huge page allocations
- Convert all python scripts to version 3
- Add logging to report atomic implementation in use
- Fix timeout calculation in util wait abstraction
- Fix hang when multiple threads wait on util counter

## EFA

- New core provider for Amazon EC2 Elastic Fabric Adapter (EFA)

## GNI

- Fix handling of incorrect fi_addr_t value
- Fix several problems when using FI_ADDR_STR format
- Fix several problems when using multi-receive buffers
- Fix problem with possible receive truncation
- Fix possible overrunning of receive buffers
- Implement fi_getopt/fi_setopt for scalable endpoints
- Only generate FI_EADDRNOTAVAIL if FI_SOURCE_ERR enabled

## MRAIL

- Several performance enhancements and optimizations

## PSM2

- Disable some optional psm2 features due to instabilities
- Work around possibly long delays in getaddrinfo()

## RxD

- Ensure that peers are initialized before sending data packets
- Maintain a full posting of receive buffers to the DGRAM EP
- Minor bug fixes and cleanups
- Verify protocol versions and reject incompatible versions
- Update and simplify protocol headers
- Optimize packet initialization
- Various improvements handling protocol messages
- Remove pending_cnt tracking (moved to verbs provider)
- Fix setting of mr_mode on getinfo
- Handle error unpacking packet headers properly

## RxM

- Improved responsiveness to CM events with manual progress mode
- Add proper versioning to CM data exchange protocol
- Refactor how CQ and EQ errors are handled and reported
- Minor code cleanups and bug fixes
- Support atomic operations in auto progress mode
- Fix high memory usage when a large number of RMA ops are posted without
  FI_COMPLETION

## SHM

- Fix possible segfault
- Fix smr_freestack_pop to properly remove entry from stack

## TCP

- Enable multi-recv support
- Allow restricting tcp to specific port range for firewall purposes

## Verbs

- Add NIC info (fi_nic) in fi_info
- Improved support for UD QPs over RoCE
- Added CQ resource tracking to avoid CQ overrun on HFI1
- Enable memory registration cache by default
- Reduce unnecessary log message noise
- Bug fixes for EQ trywait and QP creation attributes

## Fabtests

- Add new test to test memory registration caching correctness
- Fix segfault with fi_pingpong
- Allow specifying flags as command line arguments
- Add threading level option to ubertest
- Enable support for all endpoint types with multi_recv test
- Add new AV insertion unit test
- Allow for out-of-band address exchange with in-band test synchronization
- Check for and use exclusion files by default if found
- Use regular expressions for test exclusion files
- Have test scripts use full test names
- Replace socket provider with tcp and udp for default testing

v1.7.2, Fri Jun 14, 2019
========================

## Core

- Rename variables that shadow global symbols
- Set slist tail to NULL to handle iterators correctly
- Add new locking to AV EP list to avoid potential deadlock
- Add threadsafe AV implementation

## GNI

- Fix possible overrunning of receive buffers
- Fix compile issue on CLE 7.0.UP01
- Implement fi_getopt/fi_setopt for scalable endpoints
- Only generate FI_EADDRNOTAVAIL if FI_SOURCE_ERR enabled

## RxD

- Align packet type declarations with debug prints
- Track current unexpected messages per peer, rather than globally
- Remove unneeded RXD_CANCELED flag
- Remove unnecessary check of unexpected list
- Support FI_CLAIM, FI_PEEK, and FI_DISCARD flags
- Avoid double free on CQ error destruction path
- Fix message windowing
- Limit number of transfer entries that can be active
- Use utility CQ calls to handle CQ overflow
- Set correct opcode when completing read completions
- Preset and fix tx and rx transfer flags
- Fix segfault

## RxM

- Add missing serialization for RMA and atomics
- Reject connection requests in shutdown state
- Rework cmap and ep lock synchronization
- Add CM events to improve debugging
- Remove incorrect assertion
- Progress EQ events from app thread to drive progress
- Handle message segment ordering when buffering receives
- Generate completions for claimed buffered messages
- Minor other fixes and cleanups

## TCP

- Support FI_SELECTIVE_COMPLETION correctly
- Fix transmit and delivery complete semantics
- Verify fi_info when creating passive EP

## Verbs

- Remove XRC target QP from RDMA CM control
- Fix XRC QP allocation failure return code
- Fix EQ readerr locking
- Add FI_ADDR_IB_UD to known address print format
- Fix addressing returned by fi_getinfo for native IB addresses
- Serialize access to EQ when destroying connections
- Add serialization to XRC EQ/CM handling
- Fix serialization between AV and cmap
- Fix XRC connection tags
- Remove racy fork support from provider
- Minor other fixes

## Fabtests

- Exclude tests from OS X that require epoll support

v1.7.1, Mon Apr 8, 2019
========================

## Core

- Support layered provider names with FI_PROVIDER filter
- Minor cleanups to man pages
- Add missing header for FreeBSD support
- Fix built-in atomic tests
- Do not overwrite CFLAGS during build
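
As a rough illustration of the layered provider filter noted above, the sketch
below builds a child-process environment with `FI_PROVIDER` set to a layered
name. The `core;utility` spelling (for example `verbs;ofi_rxm`) and the helper
name `env_with_provider` are assumptions made for illustration; check the
provider names that `fi_info` reports on your system before relying on them.

```python
import os


def env_with_provider(core, utility=None, base_env=None):
    """Return a copy of an environment with FI_PROVIDER set.

    Hypothetical helper: joins a core provider and an optional utility
    provider with ';', which is assumed to be the layered-name syntax.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["FI_PROVIDER"] = core if utility is None else f"{core};{utility}"
    return env


# Build an environment for a child process without touching our own.
env = env_with_provider("verbs", "ofi_rxm", base_env={})
print(env["FI_PROVIDER"])
```

The returned dictionary can be passed as the `env` argument of
`subprocess.run` when launching an application that uses libfabric.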

## Fabtests

- Fix memory leaks in fi_getinfo_test
- Fix memory leak in fi_av_xfer error paths
- Fix memory leaks in fi_cm_data
- Fix memory leak in fi_poll
- Add test configuration and exclude files for shm

## GNI

- Fix setting supported tag mask
- Fall back to normal allocations when huge page support is unavailable

## PSM2

- Inline address translation function for performance
- Create tagged ops specialization for FI_AV_MAP
- Bring back true FI_AV_MAP support under certain conditions
- Use psm2_epaddr_to_epid() for epaddr to epid conversion
- Add runtime parameter for connection timeout

## MXM

- Disable provider by default as it is not actively maintained

## RxD

- Fix completion generation
- Use correct flags for *msg APIs
- Return correct error code if Tx is unable to be accepted
- Add missing EP attributes
- Fix use after free error opening an endpoint

## RxM

- Avoid unnecessary fi_getinfo calls to avoid possible failures
- Discard Rx buffers for closed EPs to avoid segfaults
- Repost receive buffers to avoid possible fabric deadlock
- Fix crash accessing invalid Tx CQ
- Use correct flags for *msg APIs
- Report available inject size, rather than reducing value based on user hints
- Fix endpoint configuration checks when enabling endpoint
- Keep eager protocol limit separate from inject size
- Increment correct atomic counter
- Discard CQ entries generated by canceled receives
- Initialize maximum atomic payload size
- Add connection event progress
- Fix SAR protocol truncation error
- Fix setting FI_MSG and FI_RMA caps

## TCP

- Fix synchronization verifying MRs
- Remove duplicate Rx queue removal
- Return internal buffers to correct buffer pools to avoid data corruption
- Free internal buffer pools and fix memory leaks
- Handle peer socket disconnects properly
- Pass signals through to application threads
- Cleanup pending events when closing endpoints
- Only adjust endianness when peers' endianness mismatches
- Optimize header sizes based on message types to improve performance

## Sockets

- Fix acquiring the same lock twice
- Fix accessing an uninitialized pointer handling CM events

## SHM

- Add support for selective completions
- Fix addressing
- Set MR key size
- Fix memory corruption and memory leak

## Verbs

- Cleanup from use of memory registration cache
- Do not update minimum RNR timer for XRC initiator QPs
- Fix possible CQ overrun issues with hfi1 and qib devices
- Fix synchronization issue accessing MR cache from multiple threads
- Fix double free in XRC accept path
- Add missing include file
- Make fi_getinfo call thread safe
- Fix CQ busy issue in MPI finalize when using XRC
- Return correct attribute values when XRC is enabled
- Fix sending atomic response protocol message
- Report flushed receive operations (reverts to behavior in 1.4 and earlier)
- Fix segfault reading CQ error entries
- Fix double free of XRC connection request data
- Set FI_RX_CQ_DATA mode bit correctly

v1.7.0, Mon Jan 7, 2019
=======================

The 1.7 release provides a few enhancements to the libfabric API.
Notably, it extends the fi_info structure in order to report NIC
attributes for domains that have a direct association with network
hardware.  The NIC attributes include details about the device, the
system bus it's attached to, and link state.  NIC attributes are
automatically reported by the fi_info utility application.  See the
fi_nic.3 man page for additional details.

An experimental capability bit is added to optimize receive side
processing.  This is known as variable messages, and targets applications
that do not know what size message a peer will send prior to the
message arriving.  Variable messages can be used to avoid receive
side data copies and eliminate the need for applications to implement
their own rendezvous protocol.  See the fi_msg.3 man page for details on
variable messages and their sister feature, buffered messages.

Specific details on changes since the 1.6.2 release are outlined below.

## Core

- Add ability to report NIC details with fi_info data
- Improve MR cache notification mechanisms
- Set sockaddr address format correctly
- Avoid possible null dereference in eq_read
- Handle FI_PEEK in CQ/EQ readerr
- Add debug messages to name server
- Feature and performance enhancements added to internal buffer pool
- Add support for huge pages
- Decrease memory use for idle buffer pools
- Refactor utility AV functionality
- Generic counter support enhancements
- Optimize EP and CQ locking based on application threading level
- Enhance common support for EQ error handling
- Add free/alloc memory notification hooks for MR cache support
- Fix memory monitor unsubscribe handling
- Add CQ fd wait support
- Add CQ overflow protection
- Enhance IPv6 addressing support for AVs
- Enhancements to support for AV address lookup
- Fixes for emulated epoll support
- Allow layering of multiple utility providers
- Minor bug fixes and optimization

## Hook

- Improved hooking infrastructure
- Add support for installing multiple hooks
- Support hooks provided by external libraries.

## GNI

- Fix CQ readfrom overwriting src_addr in case of multiple events
- Signal wait set if error entry is added to CQ
- Fix state data issue with SMSG buffers
- Enhance and fix possible misuse of default authorization key
- Add cancel support for SEP
- Rework SEP setup
- Suppress huge page counting for ARM
- Fix incorrect check of FI_SYNC_ERR flag

## MRAIL

- Initial release of mrail provider. The current status is experimental: not all
  features are supported and performance is not guaranteed.
- Enables increased bandwidth for an underlying provider by utilizing multiple
  network ports (rails).

## NetDir

- Fix crash in initialization code
- Update references to the packaged NetworkDirect header

## PSM2

- Requires PSM2 library version 10.2.260 or later
- Clean up connection state in fi_av_remove
- Use psm2_info_query to read HFI device info
- Clean up CQ/counter poll list when endpoint is closed
- Support shared address vector
- Optimize CQ event conversion with psm2_mq_ipeek_dequeue_multi
- Lock optimization for FI_THREAD_DOMAIN
- Use new PSM2 fast path isend/irecv functions for large size RMA
- Support building with latest PSM2 source code (version 11.2.68)
- Support fabric direct

## RxD

- Initial release of RxD provider
- Provides reliable datagram semantics over unreliable datagram EPs
- Target is to improve scalability for very large clusters relative to RxM

## RxM

- Decrease memory use needed to maintain large number of connections
- Set correct op_context and flags on CQ error completions
- Fix file descriptor memory leaks
- Introduce new protocol optimized for medium message transfers
- Improve Rx software performance path
- Use shared receive contexts if required by underlying provider
- Handle addresses inserted multiple times into AV (for AV map)
- Performance optimizations for single-thread applications
- Rework deferred transmit processing
- Separate and optimize eager and rendezvous protocol processing.
- Fix passing incorrect addresses for AV insert/remove
- Fix CM address handling
- Fix race condition accessing connection handles
- Simplify small RMA code path
- Increment correct counter when processing FI_READ events
- Dynamically grow the number of connections that can be supported
- Fix padding in wire protocol structures
- Report correct fi_addr when FI_SOURCE is requested
- Fix truncating rendezvous messages
- Fix use after free error in Rx buffer processing
- Add support for manual progress
- Make Tx/Rx queue sizes independent of MSG EP sizes
- Decrease time needed to repost buffers to the MSG EP Rx queue.
- Miscellaneous bug fixes

## Sockets

- Enable MSG EPs when user calls fi_accept
- Fix fabric names to be underlying IP address
- Add connection timeout environment variable.
- Use size of addresses, not structures
- Add debug messages to display selected addresses
- Use loopback address in place of localhost
- Simplify listen paths
- Add support for IPv6
- Code restructuring
- Avoid unneeded address to string to address translations
- Check length of iovec entries prior to access buffers
- Fix segfault
- Avoid acquiring nested spinlocks resulting in hangs
- Fix use after free error in triggered op handling
- New connection manager for MSG EPs to reduce number of threads
- Avoid retrying recv operations if connection has been broken
- Fixes for Windows socket support

## TCP

- Initial release of optimized socket based tcp provider
- Supports MSG EPs, to be used in conjunction with RxM provider
- Targets eventual replacement of sockets provider

## Verbs

- Remove RDM EP support.  Use RxM and RxD for RDM EPs.
- Improve address handling and report in fi_getinfo
- Handle FI_PEEK when calling CQ/EQ readerr functions
- Add support for XRC QPs.
- Ignore destination address when allocating a PEP
- Add workaround for i40iw incorrect return values when posting sends
- Fix completion handling for FI_SELECTIVE_COMPLETION EP setting
- Change format of fabric name to use hex instead of decimal values
- Fix handling of err_data with EQ readerr
- Report correct size of max_err_data
- Fast path performance improvements
- Improve progress under high system load
- Optimize completion processing when handling hidden completions
- Optimize RMA and MSG transfers by pre-formatting work requests
- Remove locks based on application threading model
- Add overflow support for CQ error events
- Minor cleanups and bug fixes

v1.6.2, Fri Sep 28, 2018
========================

## Core

- Cleanup of debug messages

## GNI

- Fix problems with Scalable Endpoint creation
- Fix interoperability problem with HPC toolkit
- Improve configuration check for kdreg

## PSM

- Enforce FI_RMA_EVENT checking when updating counters
- Fix race condition in fi_cq_readerr()
- Always try to make progress when fi_cntr_read is called

## PSM2

- Revert "Avoid long delay in psm2_ep_close"
- Fix memory corruption related to sendv
- Performance tweak for bi-directional send/recv on KNL
- Fix CPU detection
- Enforce FI_RMA_EVENT checking when updating counters
- Remove stale info from address vector when disconnecting
- Fix race condition in fi_cq_readerr()
- Adjust reported context numbers for special cases
- Always try to make progress when fi_cntr_read is called
- Support control functions related to MR mode
- Unblock fi_cntr_wait on errors
- Properly update error counters
- Fix irregular performance drop for aggregated RMA operations
- Reset Tx/Rx context counter when fabric is initialized
- Fix incorrect completion event for iov send

## RXM

- Fix incorrect increments of error counters for small messages
- Increment write completion counter for small transfers
- Use FI_UNIVERSE_SIZE when defining MSG provider CQ size
- Make TX, RX queue sizes independent of MSG provider
- Make deferred requests opt-in
- Fill missing rxm_conn in rx_buf when shared context is not used
- Fix an issue where MSG endpoint recv queue got empty resulting
  in a hang
- Set FI_ORDER_NONE for tx and rx completion ordering
- Serialize access to repost_ready_list
- Reprocess unexpected messages on av update
- Fix a bug in matching directed receives
- Fix desc field when postponing RMA ops
- Fix incorrect reporting of mem_tag format
- Don't include FI_DIRECTED_RECV, FI_SOURCE caps if they're not needed
- Fix matching for RMA I/O vectors

## Sockets

- Increase maximum message size as MPICH bug work-around

## Verbs

- Detect string format of wildcard address in node argument
- Don't report unusable fi_info (no source IP address)
- Don't assert when a verbs device exposes unsupported MTU types
- Report correct rma_iov_limit
- Add new variable - FI_VERBS_MR_CACHE_MERGE_REGIONS
- eq->err.err must return a positive error code

v1.6.1, Wed May 8, 2018
===========================

## Core

- Fix compile issues with older compilers
- Check that all debug compiler flags are supported by compiler

## PSM2

- Fix occasional assertion failure in psm2_ep_close
- Avoid long delay in psm2_ep_close
- Fix potential duplication of iov send completion
- Replace some parameter checking with assertions
- Check iov limit in sendmsg
- Avoid adding FI_TRIGGER caps automatically
- Avoid unnecessary calls to psmx2_am_progress()

## RXM

- Fix reading pointer after freeing it.
- Avoid reading invalid AV entry
- Handle deleting the same address multiple times
- Fix crash in fi_av_remove if FI_SOURCE wasn't enabled

## Sockets

- Fix use after free error handling triggered ops.


v1.6.0, Wed Mar 14, 2018
========================

## Core

- Introduces support for performing RMA operations to persistent memory.
  See the FI_RMA_PMEM capability in fi_getinfo.3.
- Define additional errno values
- General code cleanups and restructuring
- Force provider ordering when using dynamically loaded providers
- Add const to fi_getinfo() hints parameter
- Improve use of epoll for better scalability
- Fixes to generic name service

## GNI

- Fix a problem with the GNI waitset implementation
- Enable use of XPMEM for intra node data transfers
- Fix a problem with usage of Cray's UDREG registration cache
- Fix a problem with an assert statement
- Fix several memory leaks

## PSM

- Move environment variable reading out from fi_getinfo()
- Shortcut obviously unsuccessful fi_getinfo() calls
- Remove excessive name server implementation
- Enable ordering of RMA operations

## PSM2

- Requires psm2 library version 10.2.235 or later
- Skip inactive units in round-robin context allocation
- Allow contexts to be shared by Tx-only and Rx-only endpoints
- Use utility functions to check provider attributes
- Turn on FI_THREAD_SAFE support
- Make address vector operations thread-safe
- Move environment variable reading out from fi_getinfo()
- Reduce noise when optimizing tagged message functions
- Shortcut obviously unsuccessful fi_getinfo() calls
- Improve how Tx/Rx context limits are handled
- Support auto selection from two different tag layout schemes
- Add provider build options to debug output
- Support remote CQ data for tagged messages, add specialization.
- Support opening multiple domains
- Put trigger implementation into a separate file
- Update makefile and configure script
- Replace allocated context with reserved space in psm2_mq_req
- Limit exported symbols for DSO provider
- Reduce HW context usage for certain TX only endpoints
- Remove unnecessary dependencies from the configure script
- Refactor the handling of op context type
- Optimize the conversion between 96-bit and 64-bit tags
- Code refactoring for completion generation
- Remove obsolete feature checking code
- Report correct source address for scalable endpoints
- Allow binding any number of endpoints to a CQ/counter
- Add shared Tx context support
- Add alternative implementation for completion polling
- Change the default value of FI_PSM2_DELAY to 0
- Add an environment variable for automatic connection cleanup
- Abstract the completion polling mechanism
- Use the new psm2_am_register_handlers_2 function when available
- Allow specialization when FI_COMPLETION op_flag is set.
- Put Tx/Rx context related functions into a separate file
- Enable PSM2 multi-ep feature by default
- Add option to build with PSM2 source included
- Simplify the code for checking endpoint capabilities
- Simplify the handling of self-targeted RMA operations
- Allow all free contexts to be used for scalable endpoints
- Enable ordering of RMA operations
- Enable multiple endpoints over PSM2 multi-ep support
- Support multiple Tx/Rx contexts in address vector
- Remove the virtual lane mechanism
- Less code duplication in tagged, add more specialization.
- Allow PSM2 epid to be reused within the same session
- Turn on user adjustable inject size for all operations
- Use pre-allocated memory pool for RMA requests
- Add support for lazy connection
- Various bug fixes

## RXM

- Add support for completion counters
- Fix MR mode handling
- Add support for FI_MULTI_RECV
- Considerable performance optimizations
- Report correct MR key size based on core provider's size
- Fixes to endpoint address reporting to avoid wildcard addresses
- Ensure progress after core provider returns EAGAIN on transfers
- Fix crash when running over sockets provider
- Bug fixes handling large message transfers
- Report data ordering and limits based on the core provider's values
- Set mode bits and capabilities correctly
- General code restructuring and cleanups
- Various additional bug fixes
- Handle different API versions correctly
- Expand support for tagged message transfers
- Add support for auto progress on data transfers

## SHM

- Initial release of shared memory provider
- See the fi_shm.7 man page for details on available features and limitations

## Sockets

- Scalability enhancements
- Fix issue associating a connection with an AV entry that could result in
   application hangs
- Add support for new persistent memory capabilities
- Fix fi_cq_signal to unblock threads waiting on cq sread calls
- Fix epoll_wait loop handling to avoid out of memory errors
- Add support for TCP keepalives, controllable via environment variables
- Reduce the number of threads allocated for handling connections
- Several code cleanups in response to static code analysis reports
- Fix reporting multiple completion events for the same request in error cases

## usNIC

- Minor adjustments to match new core MR mode bits functionality
- Several code cleanups in response to static code analysis reports

## Verbs

- Code cleanups and simplifications
- General code optimizations to improve performance
- Fix handling of wildcard addresses
- Check for fatal errors during connection establishment
- Support larger inject sizes
- Fix double locking issue
- Add support for memory registration caching (disabled by default)
- Enable setting thread affinity for CM threads
- Fix hangs in MPI closing RDM endpoints
- Add support for different CQ formats
- Fix RMA read operations over iWarp devices
- Optimize CM progress handling
- Several bug fixes

v1.5.3, Wed Dec 20, 2017
========================

## Core

- Handle malloc failures
- Ensure global lock is initialized on Windows
- Fix spelling and formatting errors in man pages

## GNI

- Fix segfault when using FI_MULTI_RECV
- Fix rcache issue handling overlapping memory regions

## NetDir

- Fix fi_getname
- Remove FI_LOCAL_MR mode bit, which was being reported erroneously
- Avoid crashing in fi_join

## PSM

- Fix print format mismatches
- Remove 15 second startup delay when no hardware is installed
- Preserve FI_MR_SCALABLE mode bit for backwards compatibility

## PSM2

- Fix print format mismatches
- Allow all to all communication between scalable endpoints
- Preserve FI_MR_SCALABLE mode bit for backwards compatibility
- Fix reference counting issue with opened domains
- Fix segfault for RMA/atomic operations to local scalable endpoints
- Fix resource counting related issues for Tx/Rx contexts
- Allow completion suppression when fi_context is non-NULL
- Use correct queue for triggered operations with scalable endpoints

## RXM

- Fix out of bounds access to receive IOVs
- Serialize access to connection map
- Fix CQ error handling
- Fix issue being unable to associate an fi_addr with a connection
- Fix bug matching unexpected tagged messages
- Indicate that FI_RMA is supported
- Return correct r/w ordering size limits

## Sockets

- Fix check for invalid connection handle
- Fix crash in fi_av_remove

## Util

- Fix number of bits used for connection index

## Verbs

- Fix incorrect CQ entry data for MSG endpoints
- Properly check for errors from getifaddrs
- Retry getifaddrs on failure because of busy netlink sockets
- Ack CM events on error paths

v1.5.2, Wed Nov 8, 2017
=======================

## Core

- Fix Power PC 32-bit build

## RXM

- Remove dependency on shared receive contexts
- Switch to automatic data progress
- Fix removing addresses from AV

## Sockets

- Fix incorrect reporting of counter attributes

## Verbs

- Fix reporting attributes based on device limits
- Fix incorrect CQ size reported for iWarp NICs
- Update man page with known issues for specific NICs
- Fix FI_RX_CQ_DATA mode check
- Disable on-demand paging by default (can cause data corruption)
- Disable loopback (localhost) addressing (causing failures in MPI)

v1.5.1, Wed Oct 4, 2017
=======================

## Core

- Fix initialization used by DL providers to avoid crash
- Add checks for null hints and improperly terminated strings
- Check for invalid core names passed to fabric open
- Provide consistent provider ordering when using DL providers
- Fix OFI_LIKELY definitions when GNUC is not present

## GNI

- Add ability to detect local PE rank
- Fix compiler/config problems
- Fix CQ read error corruption
- Remove tests of deprecated interfaces

## PSM

- Fix CQ corruption reporting errors
- Always generate a completion on error

## PSM2

- Fix CQ corruption reporting errors
- Always generate a completion on error
- Add checks to handle out of memory errors
- Add NULL check for iov in atomic readv/writev calls
- Fix FI_PEEK src address matching
- Fix bug in scalable endpoint address resolution
- Fix segfault bug in RMA completion generation

## Sockets

- Fix missing FI_CLAIM src address data on completion
- Fix CQ corruption reporting errors
- Fix serialization issue wrt out of order CPU writes to Tx ring buffer

## Verbs

- Allow modifying rnr retry timeout to improve performance
- Add checks to handle out of memory errors
- Fix crash using atomic operations for MSG EPs

v1.5.0, Wed Aug 9, 2017
============================

The 1.5 release includes updates to the libfabric API and ABI.  As a
result, the ABI bumps from 1.0 to 1.1.  All changes are backwards
compatible with previous versions of the interface.  The following
features were added to the libfabric API.  (Note that individual
providers may not support all new features).  For full details
see the man pages.

- Authorization keys
  Authorization keys, commonly referred to as job keys, are used to
  isolate processes from communicating with other processes for security
  purposes.
- Multicast support
  Datagram endpoints can now support multicast communication.
- (Experimental) socket-like endpoint types
  New FI_SOCK_STREAM and FI_SOCK_DGRAM endpoint types are introduced.
  These endpoint types target support of cloud and enterprise based
  middleware and applications.
- Tagged atomic support
  Atomic operations can now target tagged receive buffers, in
  addition to RMA buffers.
- (Experimental) deferred work queues
  Deferred work queues are enhanced triggered operations.  They
  target support for collective-based operations.
- New mode bits: FI_RESTRICTED_COMP and FI_NOTIFY_FLAGS_ONLY
  These mode bits support optimized completion processing to
  minimize software overhead.
- Multi-threaded error reporting
  Reading CQ and EQ errors now allows the application to provide the
  error buffer, eliminating the need for the application to synchronize
  between multiple threads when handling errors.
- FI_SOURCE_ERR capability
  This feature allows the provider to validate and report the source
  address for any received messages.
- FI_ADDR_STR string based addressing
  Applications can now request and use addresses provided using a
  standardized string format.  This makes it easier to pass full
  addressing data through a command line, or handle address exchange
  through text files.
- Communication scope capabilities: FI_LOCAL_COMM and FI_REMOTE_COMM
  Used to indicate if an application requires communication with
  peers on the same node and/or remote nodes.
- New memory registration modes
  The FI_BASIC_MR and FI_SCALABLE_MR memory registration modes have
  been replaced by more refined registration mode bits.  This allows
  applications to make better use of provider hardware capabilities
  when dealing with registered memory regions.
- New mode bit: FI_CONTEXT2
  Some providers need more than the size provided by the FI_CONTEXT
  mode bit setting.  To accommodate such providers, an FI_CONTEXT2
  mode bit was added.  This mode bit doubles the amount of context
  space that an application allocates on behalf of the provider.
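
To make the FI_ADDR_STR item above more concrete, here is a minimal sketch of
splitting such a string into its parts, for instance when reading addresses
from a command line or a text file. It assumes the URI-like layout
`address_format://node:service?key=value` described in fi_getinfo(3); the
function name `parse_fi_addr_str` and the sample addresses are illustrative
only, and real providers may use fields this sketch ignores.

```python
def parse_fi_addr_str(addr):
    """Split an FI_ADDR_STR-style string into its parts (illustrative only)."""
    fmt, sep, rest = addr.partition("://")
    result = {"format": fmt, "node": None, "service": None, "params": {}}
    if not sep:
        # Only an address format was given, e.g. "fi_sockaddr_in".
        return result

    rest, _, query = rest.partition("?")
    # key=value pairs after '?', separated by '&'
    result["params"] = dict(p.partition("=")[::2] for p in query.split("&") if p)

    # Ignore any provider-specific path fields after the first '/'.
    hostport = rest.split("/", 1)[0]
    if hostport.startswith("["):
        # Bracketed IPv6 literal, e.g. [fe80::6:12]:7471
        node, _, tail = hostport[1:].partition("]")
        service = tail[1:] if tail.startswith(":") else ""
    else:
        node, _, service = hostport.partition(":")

    result["node"] = node or None
    result["service"] = service or None
    return result


print(parse_fi_addr_str("fi_sockaddr_in://10.31.6.12:7471"))
```

Python's `urllib.parse` is deliberately not used here: it does not accept
underscores in a URI scheme, so names such as `fi_sockaddr_in` would not be
recognized as one.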

## BGQ provider notes

- The OFI 1.5 BGQ provider officially supports the Open Fabrics Interfaces
  utilized by the MPICH implementation of the MPI-3 standard.  In addition
  to the MPICH test suite it has been tested by several scientific applications
  running MPICH on BGQ at scale, and several bugs in the provider and MPICH
  have been identified and fixed.  At least one INCITE project is attempting
  to use it for production science.  Support of this provider is a high
  priority for ALCF, and MPICH users on BGQ are encouraged to utilize it to
  compare function and performance against the PAMI-based default toolchain.
  Any discovered bugs will be quickly addressed with high priority.  Results
  so far have shown significant point-to-point and RMA latency improvements
  over PAMI as well as RMA functional improvements at scale.  The only
  potential drawback is collective performance degradation against the PAMI
  optimizations, but at certain message and partition sizes, performance has
  been observed to be comparable or even better.

## MLX provider notes

- New provider to replace the deprecated mxm provider.
- Targets Mellanox InfiniBand fabrics, through the UCX library.
- Supports RDM endpoints with the tagged interfaces.
- Requires FI_CONTEXT mode support.
- See fi_mlx.7 man page for more details.

## NetDir provider notes

- New provider for Windows that runs over the NetworkDirect API.
- Supports FI_EP_MSG endpoints, with FI_MSG and FI_RMA interfaces.
- Supports shared receive contexts
- Supports FI_SOCKADDR, FI_SOCKADDR_IN, and FI_SOCKADDR_IN6 addressing
- Asynchronous operations make forward progress automatically

## PSM provider notes

- Improve the name server functionality and move to the utility code
- Handle updated mr_mode definitions
- Add support of 32 and 64 bit atomic values

## PSM2 provider notes

- Add option to adjust the locking level
- Improve the name server functionality and move to the utility code
- Add support for string address format
- Add an environment variable for message inject size
- Handle FI_DISCARD in tagged receive functions
- Handle updated mr_mode definitions
- Add support for scalable endpoint
- Add support of 32 and 64 bit atomic values
- Add FI_SOURCE_ERR to the supported caps
- Improve the method of checking device existence

## Sockets provider notes

- Updated and enhanced atomic operation support.
- Add support for experimental deferred work queue operations.
- Fixed counter signaling when used with wait sets.
- Improved support on Windows.
- Cleaned up event reporting for destroyed endpoints.
- Fixed several possible crash scenarios.
- Fixed handling socket disconnect events which could hang the provider.

## RxM provider notes

- Add OFI RxM provider. It is a utility provider that supports RDM
  endpoints emulated over a base provider that supports only MSG
  endpoints.
- The provider was earlier experimental. It's functional from this
  release onwards.
- Please refer to the man page of the provider for more info.

## UDP provider notes

- Add support for multicast data transfers

## usNIC provider notes

- Only requires libibverbs when necessary
- Updated to handle 1.5 interface changes.

## Verbs provider notes

- Fix an issue where user-requested tx and rx context sizes larger than
  the defaults were not honored.
- Introduce env variables for setting default tx, rx context sizes
  and iov limits.
- Report correct completion ordering supported by MSG endpoints.


v1.4.2, Fri May 12, 2017
========================

## Core

- Fix for OS X clock_gettime() portability issue.

## PSM provider notes

- Updated default counter wait object for improved performance
- Fix multi-threaded RMA progress stalls

## PSM2 provider notes

- Updated default counter wait object for improved performance
- Fix multi-threaded RMA progress stalls

## Sockets provider notes

- Fix error in fi_cq_sreadfrom aborting before timeout expires
- Set atomic iov count correctly inside fi_atomicv

## Verbs provider notes

- Fix handling of apps that call fork.  Move ibv_fork_init() before
  calling any other verbs call.
- Fix crash in fi_write when connection is not yet established and
  write data size is below inline threshold.
- Fix issues not handling multiple ipoib interfaces
- Reduce lock contention on buffer pools in send/completion handling
  code.
- To see verbs provider in fi_info output, configure the corresponding
  IPoIB interface with an IP address. This is a change in behavior from
  previous versions. Please refer to the fi_verbs man page for more info.

v1.4.1, Fri Feb  3, 2017
========================

## PSM provider notes

- Defer initialization of the PSM library to allow runtime selection from
  different versions of the same provider before fi_getinfo is called.

## PSM2 provider notes

- Defer initialization of the PSM2 library to allow runtime selection from
  different versions of the same provider before fi_getinfo is called.
- General bug fixes.

## UDP provider notes

- Fix setting address format in fi_getinfo call.

## usNIC provider notes

- Fixed compilation issues with newer versions of libibverbs.

v1.4.0, Fri Oct 28, 2016
========================

- Add new options, `-f` and `-d`, to fi_info that can be used to
  specify hints about the fabric and domain name. Change port to `-P`
  and provider to `-p` to be more in line with fi_pingpong.
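
The option letters above can be sketched as a small command builder, which may
be handy in test scripts. Only the four flags named in this entry (`-p`, `-f`,
`-d`, `-P`) are used; the helper name `fi_info_cmd` and the sample provider
and domain values are hypothetical.

```python
def fi_info_cmd(provider=None, fabric=None, domain=None, port=None):
    """Build an fi_info argument list from optional hints (illustrative)."""
    cmd = ["fi_info"]
    options = (("-p", provider), ("-f", fabric), ("-d", domain), ("-P", port))
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# Example: restrict output to one provider and one domain name.
print(" ".join(fi_info_cmd(provider="sockets", domain="eth0")))
```

The resulting list can be handed to `subprocess.run` on a system where the
`fi_info` utility is installed.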

## GNI provider notes

- General bug fixes, plugged memory leaks, performance improvements,
  improved error handling and warning messages, etc.
- Additional API support:
  - FI_THREAD_COMPLETION
  - FI_RMA_EVENT
  - iov length up to 8 for messaging data transfers
- Provider-specific API support:
  - Aries native AXOR atomic operation
  - Memory registration cache flush operation
- Memory registration cache improvements:
  - IOMMU notifier support
  - Alternatives to the internal cache
  - Additional tuning knobs
- On-node optimization for rendezvous message communication via XPMEM
- Internal fixes to support accelerators and KNL processors
- Better support for running in CCM mode (in support of fabtests)

## MXM provider

- The mxm provider has been deprecated and will be replaced in a
  future release.

## PSM provider notes

- General bug fixes
- Use utility provider for EQ, wait object, and poll set
- Allow multi-recv to post buffer larger than message size limit

## PSM2 provider notes

- General bug fixes
- Add support for multi-iov RMA read and atomic operations
- Allow multi-recv to post buffer larger than message size limit

## Sockets provider notes

- General code cleanup and bug fixes
- Set tx/rx op_flags correctly to be consistent with manpage
- Restructure struct sock_ep to support alias ep
- Refactor CQ/Cntr bindings, CQ completion generation, and counter
  increments
- Copy compare data to internal buffer when FI_INJECT is set in
  fi_compare_atomic
- Correctly handle triggered operation when FI_INJECT is set or
  triggered op is enqueued or counter is incremented. Initialize
  counter threshold to INT_MAX
- Refactor and cleanup connection management code, add locks to avoid
  race between main thread and progress thread, add logic to correctly
  handle FI_SHUTDOWN and FI_REJECT
- Set fabric name as network address in the format of a.b.c.d/e and
  domain name as network interface name
- Remove sock_compare_addr and add two utility functions ofi_equals_ipaddr
  and ofi_equals_sockaddr in fi.h
- Refactor fi_getinfo to handle corner cases and add logic to check whether
  a given src_addr matches any local interface addr
- Restructure acquiring/releasing the list_lock in progress thread so
  that it is only acquired once per iteration
- Refactor connection management of MSG ep so that it uses TCP instead
  of UDP for connection management msg and new port for every MSG
  endpoint
- Add sock_cq_sanitize_flags function to make sure the only flags returned
  in CQ events are the ones listed in the manpage
- Update fi_poll semantics for counters so that it returns success if the
  counter value is different from the last-read-value
- Allow multiple threads to wait on one counter
- Update code to use ofi_util_mr - the new MR structure added to util code
- Fix fi_av_insert not to report error when the number of inserted addr
  exceeds the count attribute in fi_av_attr
- Add garbage collection of AV indices after fi_av_remove, add ep list
  in AV and cleanup conn map during fi_av_remove
- Use correct fi_tx_attr/fi_rx_attr for scalable ep
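
The `a.b.c.d/e` fabric naming mentioned in the list above can be illustrated
with a small sketch. It assumes `a.b.c.d` is the masked IPv4 subnet address
and `e` is the prefix length; `format_network_name` is a hypothetical helper
written for this note, not a libfabric function.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of libfabric): build an "a.b.c.d/e" string
 * from an IPv4 interface address and a prefix length.
 * Returns 0 on success, -1 on invalid input or a too-small output buffer.
 */
static int format_network_name(const char *ip, unsigned prefix_len,
                               char *out, size_t out_len)
{
    struct in_addr addr;
    char net[INET_ADDRSTRLEN];
    uint32_t mask;
    int n;

    if (prefix_len > 32 || inet_pton(AF_INET, ip, &addr) != 1)
        return -1;

    /* Avoid an undefined 32-bit shift when the prefix length is 0. */
    mask = prefix_len == 0 ? 0 : (uint32_t)0xFFFFFFFFu << (32 - prefix_len);

    /* Clear the host bits so only the network part remains. */
    addr.s_addr = htonl(ntohl(addr.s_addr) & mask);

    if (inet_ntop(AF_INET, &addr, net, sizeof(net)) == NULL)
        return -1;

    n = snprintf(out, out_len, "%s/%u", net, prefix_len);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Under these assumptions, an interface at `192.168.1.37` with a 24-bit prefix
would map to the name `192.168.1.0/24`.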

## UDP provider notes

- Enhance parameter checks for several function calls.
- Fix memory leak freeing CQ structure.
- Bind to a source address when enabling endpoint.
- Reduce reported resource limits (domain attributes).

## usNIC provider notes

- Fix handling of EP_MSG peers on different IP subnets [PR #1988]
- Fix handling of CM data. Fixes a bug where data received would
  overwrite parts of the connection management structure [PR #1991]
- Fix bug in CM connect/accept handling that would cause a seg fault
  if data was sent as part of a connection request [PR #1991]
- Fix invalid completion lengths in the MSG and RDM endpoint
  implementations of fi_recvv and fi_recvmsg [PR #2026]
- Implement the FI_CM_DATA_SIZE option for fi_getopt on passive
  endpoints [PR #2033]
- Add fi_reject implementation that supports data exchange [PR #2038]
- Fix fi_av_straddr bug that reported port in network order [PR #2244]
- Report -FI_EOPBADSTATE if the size left functions are used on an
  endpoint which has not been enabled [PR #2266]
- Change the domain/fabric naming. The fabric is now represented as
  the network address in the form of a.b.c.d/e and the domain name is
  the usNIC device name. For more information see fi_usnic(7) [PR
  #2287]
- Fix the domain name matching in fi_getinfo/fi_domain [PR #2298]
- Fix issue with AV where it is fully closed before pending
  asynchronous inserts can finish, leading to invalid data accesses [PR
  #2397]
- Free all data associated with AV when fi_av_close is called [PR
  #2397]
- Fail with -FI_EINVAL if a value of FI_ADDR_NOTAVAIL is given to
  fi_av_lookup.  [PR #2397]
- Verify AV attributes and return an error if anything that is
  unsupported is requested (FI_AV_TABLE, named AVs, FI_READ, etc.) [PR
  #2397]
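
The fi_av_straddr fix listed above concerns byte order: socket addresses
store the port in network order, so it has to be converted before printing.
The sketch below shows that conversion in isolation. `format_sockaddr` and
its `ip:port` output format are illustrative assumptions, not the usNIC
provider's actual code or output format.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of libfabric): print an IPv4 socket address
 * as "ip:port". sin_port is stored in network byte order, so ntohs() is
 * needed to get the host-order value a reader expects to see.
 * Returns 0 on success, -1 on failure or a too-small output buffer.
 */
static int format_sockaddr(const struct sockaddr_in *sin,
                           char *out, size_t out_len)
{
    char ip[INET_ADDRSTRLEN];
    int n;

    if (inet_ntop(AF_INET, &sin->sin_addr, ip, sizeof(ip)) == NULL)
        return -1;

    n = snprintf(out, out_len, "%s:%u", ip, (unsigned)ntohs(sin->sin_port));
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Printing `sin_port` directly, without `ntohs()`, would show a byte-swapped
port number on little-endian hosts.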

## Verbs provider notes

- Add fork support. It is enabled by default and can be turned off by
  setting the FI_FORK_UNSAFE variable to "yes". This can improve
  performance of memory registrations but also makes fork unsafe. The
  following are the limitations of fork support:
  - Fabric resources like endpoint, CQ, EQ, etc. should not be used in
    the forked process.
  - The memory registered using fi_mr_reg has to be page aligned since
    ibv_reg_mr marks the entire page that a memory region belongs to
    as not to be re-mapped when the process is forked (MADV_DONTFORK).
- Fix a bug where source address info was not being returned in
  fi_info when destination node is specified.

- verbs/MSG
  - Add fi_getopt for passive endpoints.
  - Add support for shared RX contexts.
- verbs/RDM
  - General bug fixes
  - Add FI_MSG capability
  - Add FI_PEEK and FI_CLAIM flags support
  - Add completion flags support
  - Add selective completion support
  - Add fi_cq_readerr support
  - Add possibility to set IPoIB network interface via FI_VERBS_IFACE
    environment variable
  - Add large data transfer support (> 1 GB)
  - Add FI_AV_TABLE support
  - Add fi_cntr support
  - Add environment variables for the provider tuning:
    FI_VERBS_RDM_BUFFER_NUM, FI_VERBS_RDM_BUFFER_SIZE,
    FI_VERBS_RDM_RNDV_SEG_SIZE, FI_VERBS_RDM_CQREAD_BUNCH_SIZE,
    FI_VERBS_RDM_THREAD_TIMEOUT, FI_VERBS_RDM_EAGER_SEND_OPCODE
  - Add iWarp support
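
The fork-support limitation above requires memory registered with fi_mr_reg
to be page aligned. A minimal sketch of allocating such a buffer on a POSIX
system follows. The helper names are hypothetical, and rounding the length up
to whole pages is an extra assumption made here so that no unrelated data
shares the last page; the release note itself only mentions alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: round len up to a multiple of page (page > 0). */
static size_t round_up_to_page(size_t len, size_t page)
{
    return (len + page - 1) / page * page;
}

/*
 * Hypothetical helper: allocate a buffer that starts on a page boundary and
 * spans whole pages. On success returns the buffer (release it with free())
 * and stores the padded size in *alloc_len if alloc_len is non-NULL.
 * Returns NULL on failure.
 */
static void *alloc_page_aligned(size_t len, size_t *alloc_len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    size_t padded;

    /* Reject empty requests and lengths that would overflow when padded. */
    if (page <= 0 || len == 0 || len > SIZE_MAX - (size_t)page)
        return NULL;

    padded = round_up_to_page(len, (size_t)page);
    if (posix_memalign(&buf, (size_t)page, padded) != 0)
        return NULL;

    if (alloc_len != NULL)
        *alloc_len = padded;
    return buf;
}
```

A buffer obtained this way, together with the padded length, is what an
application would then hand to fi_mr_reg.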

v1.3.0, Mon Apr 11, 2016
========================

## General notes

* [See a list of provider features for this
  release](https://github.com/ofiwg/libfabric/wiki/Provider-Feature-Matrix-v1.3.0)

## GNI provider notes

- CLE 5.2UP04 required for building GNI provider
- General bug fixes, plugged memory leaks, etc.
- Improved error handling, warning messages, etc.
- Added support for the following APIs:
  - fi_endpoint: fi_getopt, fi_setopt, fi_rx_size_left, fi_tx_size_left, fi_stx_context
  - fi_cq: fi_sread, fi_sreadfrom
  - fi_msg: FI_MULTI_RECV (flag)
  - fi_domain: FI_PROGRESS_AUTO (flag)
  - fi_direct: FI_DIRECT
- Added support for FI_EP_DGRAM (datagram endpoint):
  - fi_msg, fi_tagged, fi_rma
- Memory registration improvements:
  - Improved performance
  - Additional domain open_ops
- Initial support for Cray Cluster Compatibility Mode (CCM)
- Implemented strict API checking
- Added hash list implementation for tag matching (available by domain open_ops)

Note: The current version of fabtests does not work with the GNI
provider due to the job launch mechanism on Cray XC systems.  Please
see the [GNI provider
wiki](https://github.com/ofi-cray/libfabric-cray/wiki) for
alternatives to validating your installation.

## MXM provider notes

- Initial release

## PSM provider notes

- Remove PSM2 related code.

## PSM2 provider notes

- Add support for multi-iov send, tagged send, and RMA write.
- Use utility provider for EQ, wait object, and poll set.

## Sockets provider notes

- General code cleanup
- Enable FABRIC_DIRECT
- Enable sockets-provider to run on FreeBSD
- Add support for fi_trywait
- Add support for map_addr in shared-av creation
- Add shared-av support on OSX
- Allow FI_AV_UNSPEC type during av_open
- Use loop-back address as source address if gethostname fails
- Disable control-msg ack for inject operations that do not expect completions
- Increase max_atomic_msg_size to 4096 bytes
- Remove check for cq_size availability while calculating tx/rx_size_left
- Use util-buffer pool for overflow entries in progress engine.
- Synchronize accesses to memory-registration operations
- Fix an issue that caused out-of-order arrival of messages
- Fix a bug in processing RMA access error
- Fix a bug that caused starvation in processing receive operations
- Add reference counting for pollset
- Fix a bug in connection port assignment

## UDP provider notes

- Initial release

## usNIC provider notes

- Implement fi_recvv and fi_recvmsg for FI_EP_RDM. [PR #1594]
- Add support for FI_INJECT flag in the FI_EP_RDM implementation of fi_sendv.
  [PR #1594]
- Fix crashes that occur in the FI_EP_RDM and the FI_EP_MSG implementations
  when messages are posted with the maximum IOV count.  [PR #1784]
- Fix crashes that occur in the FI_EP_RDM and the FI_EP_MSG implementations
  when posting messages with IOVs of varying lengths.  [PR #1784]
- Handle FI_PEEK flag in fi_eq_sread. [PR #1758]
- Return -FI_ENOSYS if a named AV is requested. [PR #1749]
- The ethernet header does not count against the MTU. Update reported
  max_msg_size when using FI_EP_DGRAM to reflect this. [PR #1738]
- Set the DF (do not fragment) bit in the IP header. [PR #1665]
- Fix crashes that may occur from improper handling of receive state tracking
  [PR #1809]
- Fortify the receive side of libnl communication [PR #1655]
- Fix handling of fi_info with passive endpoints. Connections opened on a
  passive endpoint now inherit the properties of the fi_info struct used to
  open the passive endpoint. [PR #1806]
- Implement pollsets. [PR #1835]
- Add version 2 of the usnic getinfo extension [PR #1866]
- Implement waitsets [PR #1893]
- Implement fi_trywait [PR #1893]
- Fix progress thread deadlock [PR #1893]
- Implement FD based CQ sread [PR #1893]

## Verbs provider notes

- Add support for fi_trywait
- Support building on OSes which have older versions of librdmacm (v1.0.16 or
  earlier). However, the functionality of the provider is not guaranteed when
  the user passes AF_IB addresses.
- Added a workaround to support posting more than 'verbs send work queue length'
  number of fi_inject calls at a time.
- Make CQ reads thread safe.
- Support the case where the user creates only a send or recv queue for the
  endpoint.
- Fix an issue where RMA reads were not working on iWARP cards.
- verbs/RDM
  - Add support for RMA operations.
  - Add support for fi_cq_sread and fi_cq_sreadfrom
  - Rework connection management to make it work with fabtests and also allow
    connection to self.
  - Other bug fixes and performance improvements.

v1.2.0, Thu Jan 7, 2016
=======================

## General notes

- Added GNI provider
- Added PSM2 provider

## GNI provider notes
- Initial release

## PSM provider notes
- General bug fixes
- Support auto progress mode
- Support more threading modes
- Only set FI_CONTEXT mode if FI_TAGGED or FI_MSG is used
- Support Intel Omni-Path Fabric via the psm2-compat library

## PSM2 provider notes
- Initial addition

## Sockets provider notes

- General bug fixes and code cleanup
- Update memory registration to support 32-bit builds and fix build warnings
- Initiate conn-msg on the same tx_ctx as the tx operation for scalable ep
- Fix av mask calculation for scalable ep
- Mask out context-id during connection lookup for scalable ep
- Increase buffered receive limit
- Ignore FI_INJECT flag for atomic read operation
- Return -FI_EINVAL instead of -FI_ENODATA for fi_endpoint for invalid attributes
- Set default tag format to FI_TAG_GENERIC
- Set src/dest iov len correctly for readv operations
- Fix random crashes while closing shared contexts
- Fix an out of bound access when large multi-recv limit is specified by user
- Reset tag field in CQ entry for send completion
- Do not set prov_name in fabric_attr
- Validate flags in CQ/Cntr bind operations
- Scalability enhancements
- Increase mr_key size to 64 bit
- Use red-black tree for mr_key lookup

## usNIC provider notes
- The usNIC provider does not yet support asynchronous memory registration.
  Return -FI_EOPNOTSUPP if an event queue is bound to a domain with FI_REG_MR.
- Set fi_usnic_info::ui_version correctly in calls to
  fi_usnic_ops_fabric::getinfo().
- Improve fi_cq_sread performance.
- Return -FI_EINVAL from av_open when given invalid parameters.
- Fix bug in fi_av_remove that could lead to a seg fault.
- Implement fi_av_insertsvc.
- Report FI_PROTO_RUDP as protocol for EP_RDM.

## Verbs provider notes

- Add support for RDM EPs. Currently only FI_TAGGED capability is supported.
  RDM and MSG EPs are reported in separate domains since they don't share
  CQs. The RDM endpoint feature is currently experimental and no guarantees are
  given with regard to its functionality.
- Refactor the code into several files to enable adding RDM support.
- Consolidate send code paths to improve maintainability.
- Fix a bug in fi_getinfo where wild card address was not used when service
  argument is given.
- Fix fi_getinfo to always return -FI_ENODATA in case of failure.
- Add support for fi_eq_write.
- Other misc bug fixes.

v1.1.1, Fri Oct 2, 2015
=======================

## General notes

## PSM provider notes

- General bug fixes
- Proper termination of the name server thread
- Add UUID and PSM epid to debug output
- Add environment variable to control psm_ep_close timeout
- Code refactoring of AM-based messaging
- Check more fields of the hints passed to fi_getinfo
- Generate error CQ entries for empty result of recv with FI_SEEK flag
- Correctly handle overlapped local buffers in atomics
- Handle duplicated addresses in fi_av_insert
- Fix the return value of fi_cq_readerr
- Call AM progress function only when AM is used
- Detect MPI runs and turn off name server thread automatically

## Sockets provider notes

- General clean-up and restructuring
- Add fallback mechanism for getting source address
- Fix fi_getinfo to use user provided capabilities from hints
- Fix hostname and port number and add checks in sock_av_insertsym
- Add retry for connection timeout
- Release av resources in the error path
- Remove separate read/write CQ to be consistent with the man page
- Increase default connection map size and add environment variables to specify
  AV, CQ, EQ and connection map size to run large scale tests
- Fix FI_PEEK operation to be consistent with the man page
- Fix remote write event not to generate CQ event
- Fix CSWAP operation to return initial value
- Use size_t for min_multi_recv and buffered_len
- Set address size correctly in fi_getname/fi_getpeer
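
The CSWAP fix above refers to compare-and-swap returning the value the target
held before the operation. The sketch below models that semantic with C11
atomics on a local variable; it is not the sockets provider's implementation,
and `cswap_u64` is a hypothetical name used only for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: if *target equals compare, store swap into it.
 * In both cases return the value *target held before the operation.
 */
static uint64_t cswap_u64(_Atomic uint64_t *target,
                          uint64_t compare, uint64_t swap)
{
    uint64_t expected = compare;

    /*
     * On success, expected still equals the initial value (== compare).
     * On failure, the call writes the current value into expected.
     */
    atomic_compare_exchange_strong(target, &expected, swap);
    return expected;
}
```

The caller can tell whether the swap took place by comparing the returned
value with `compare`.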

## usNIC provider notes

- Fix EP_RDM reassembly issue for large messages
- Return correct number of read completions on error
- Fix EP_RDM and EP_MSG data corruption issue when packets are actually
  corrupted on the wire
- Fix EP_RDM and EP_MSG fi_tx_size_left/fi_rx_size_left functions

## Verbs provider notes

- Add more logging for errors
- Bug fixes

v1.1.0, Wed Aug 5, 2015
=======================

## General notes

- Added fi_info utility tool
- Added unified global environment variable support
- Fixed configure issues with the clang/llvm compiler suite

## PSM provider notes

- General bug fixes
- Move processing of triggered ops outside of AM handlers
- Generate CQ entries for cancelled operations
- Remove environment variable FI_PSM_VERSION_CHECK
- Fix multi-recv completion generation
- Environment variable tweaks

## Sockets provider notes

- General bug fixes and code cleanup
- Add triggered operation support
- Generate error completion event for successful fi_cancel
- Support fi_cancel for tx operations
- Enable option for setting affinity to progress thread
- Improve error handling during connection management
- Avoid reverse lookup for every received message
- Avoid polling all connections while checking for incoming message
- Use fast_lock for progress engine's list_lock
- Handle disconnected sockets
- Add rx entry pool
- Mark tx entry as completed only if data is sent out to wire
- Add rx control context for every tx context for progressing control messages
- Set source address when addressing information is not passed by the application
- Reset return value after polling CQ ring buffer
- Reset FI_TRIGGER flag while triggering triggered operations
- Ensure progress of control context

## usNIC provider notes

- General bug fixes
- Add support for fi_getname/fi_setname, fi_cq_sread
- Change FI_PREFIX behavior per fi_getinfo(3)
- Fix to report correct lengths in all completions
- Support fi_inject() with FI_PREFIX
- Properly support iov_limit
- Support FI_MORE
- Fixed fi_tx_size_left() and fi_rx_size_left() usage
- Fixed obscure error when posting cq_size operations without reading
  a completion

## Verbs provider notes

- AF_IB addresses can now be passed as node argument to fi_getinfo
- Added support for fi_setname and migrating passive EP to active EP
- Detect and report multiple verbs devices if present
- Bug fixes

v1.0.0, Sun May 3, 2015
=======================

Initial public release, including the following providers:

- PSM
- Sockets
- usNIC
- Verbs