File: compute.rst

package info (click to toggle)
apache-arrow 23.0.1-1
  • links: PTS
  • area: main
  • in suites: sid
  • size: 76,220 kB
  • sloc: cpp: 654,608; python: 70,522; ruby: 45,964; ansic: 18,742; sh: 7,365; makefile: 669; javascript: 125; xml: 41
file content (2018 lines) | stat: -rw-r--r-- 147,345 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
1440
1441
1442
1443
1444
1445
1446
1447
1448
1449
1450
1451
1452
1453
1454
1455
1456
1457
1458
1459
1460
1461
1462
1463
1464
1465
1466
1467
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
1510
1511
1512
1513
1514
1515
1516
1517
1518
1519
1520
1521
1522
1523
1524
1525
1526
1527
1528
1529
1530
1531
1532
1533
1534
1535
1536
1537
1538
1539
1540
1541
1542
1543
1544
1545
1546
1547
1548
1549
1550
1551
1552
1553
1554
1555
1556
1557
1558
1559
1560
1561
1562
1563
1564
1565
1566
1567
1568
1569
1570
1571
1572
1573
1574
1575
1576
1577
1578
1579
1580
1581
1582
1583
1584
1585
1586
1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
1597
1598
1599
1600
1601
1602
1603
1604
1605
1606
1607
1608
1609
1610
1611
1612
1613
1614
1615
1616
1617
1618
1619
1620
1621
1622
1623
1624
1625
1626
1627
1628
1629
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
1644
1645
1646
1647
1648
1649
1650
1651
1652
1653
1654
1655
1656
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1671
1672
1673
1674
1675
1676
1677
1678
1679
1680
1681
1682
1683
1684
1685
1686
1687
1688
1689
1690
1691
1692
1693
1694
1695
1696
1697
1698
1699
1700
1701
1702
1703
1704
1705
1706
1707
1708
1709
1710
1711
1712
1713
1714
1715
1716
1717
1718
1719
1720
1721
1722
1723
1724
1725
1726
1727
1728
1729
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1781
1782
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1814
1815
1816
1817
1818
1819
1820
1821
1822
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832
1833
1834
1835
1836
1837
1838
1839
1840
1841
1842
1843
1844
1845
1846
1847
1848
1849
1850
1851
1852
1853
1854
1855
1856
1857
1858
1859
1860
1861
1862
1863
1864
1865
1866
1867
1868
1869
1870
1871
1872
1873
1874
1875
1876
1877
1878
1879
1880
1881
1882
1883
1884
1885
1886
1887
1888
1889
1890
1891
1892
1893
1894
1895
1896
1897
1898
1899
1900
1901
1902
1903
1904
1905
1906
1907
1908
1909
1910
1911
1912
1913
1914
1915
1916
1917
1918
1919
1920
1921
1922
1923
1924
1925
1926
1927
1928
1929
1930
1931
1932
1933
1934
1935
1936
1937
1938
1939
1940
1941
1942
1943
1944
1945
1946
1947
1948
1949
1950
1951
1952
1953
1954
1955
1956
1957
1958
1959
1960
1961
1962
1963
1964
1965
1966
1967
1968
1969
1970
1971
1972
1973
1974
1975
1976
1977
1978
1979
1980
1981
1982
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements.  See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership.  The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied.  See the License for the
.. specific language governing permissions and limitations
.. under the License.

.. default-domain:: cpp
.. highlight:: cpp
.. cpp:namespace:: arrow::compute

.. _compute-cpp:

=================
Compute Functions
=================

The generic Compute API
=======================

.. seealso::
   :doc:`Compute Functions API reference <api/compute>`

Functions and function registry
-------------------------------

Functions represent compute operations over inputs of possibly varying
types.  Internally, a function is implemented by one or several
"kernels", depending on the concrete input types (for example, a function
adding values from two inputs can have different kernels depending on
whether the inputs are integral or floating-point).

Functions are stored in a global :class:`FunctionRegistry` where
they can be looked up by name.

Compute Initialization
----------------------

The compute library requires a call to :func:`arrow::compute::Initialize`
in order to register the individual functions into the global :class:`FunctionRegistry`,
otherwise only the functions required for Arrow core functionality will be available.

.. note::
   The set of functions required for Arrow core functionality are an implementation detail
   of the library, and should not be considered stable.

Input shapes
------------

Computation inputs are represented as a general :class:`Datum` class,
which is a tagged union of several shapes of data such as :class:`Scalar`,
:class:`Array` and :class:`ChunkedArray`.  Many compute functions support
both array (chunked or not) and scalar inputs, however some will mandate
particular input types. For example, while ``array_sort_indices`` requires its
first and only input to be an array, the generalized ``sort_indices``
function accepts an array, chunked array, record batch or table.

.. _invoking-compute-functions:

Invoking functions
------------------

Compute functions can be invoked by name using
:func:`arrow::compute::CallFunction`::

   std::shared_ptr<arrow::Array> numbers_array = ...;
   std::shared_ptr<arrow::Scalar> increment = ...;
   arrow::Datum incremented_datum;

   ARROW_ASSIGN_OR_RAISE(incremented_datum,
                         arrow::compute::CallFunction("add", {numbers_array, increment}));
   std::shared_ptr<Array> incremented_array = std::move(incremented_datum).make_array();

(note this example uses implicit conversion from ``std::shared_ptr<Array>``
to ``Datum``)

Many compute functions are also available directly as concrete APIs, here
:func:`arrow::compute::Add`::

   std::shared_ptr<arrow::Array> numbers_array = ...;
   std::shared_ptr<arrow::Scalar> increment = ...;
   arrow::Datum incremented_datum;

   ARROW_ASSIGN_OR_RAISE(incremented_datum,
                         arrow::compute::Add(numbers_array, increment));
   std::shared_ptr<Array> incremented_array = std::move(incremented_datum).make_array();

Some functions accept or require an options structure that determines the
exact semantics of the function::

   ScalarAggregateOptions scalar_aggregate_options;
   scalar_aggregate_options.skip_nulls = false;

   std::shared_ptr<arrow::Array> array = ...;
   arrow::Datum min_max;

   ARROW_ASSIGN_OR_RAISE(min_max,
                         arrow::compute::CallFunction("min_max", {array},
                                                      &scalar_aggregate_options));

   // Unpack struct scalar result (a two-field {"min", "max"} scalar)
   std::shared_ptr<arrow::Scalar> min_value, max_value;
   min_value = min_max.scalar_as<arrow::StructScalar>().value[0];
   max_value = min_max.scalar_as<arrow::StructScalar>().value[1];

However, :ref:`Grouped Aggregations <grouped-aggregations-group-by>` are
not invocable via ``CallFunction``.

.. seealso::
   :doc:`Compute API reference <api/compute>`

Implicit casts
==============

Functions may require conversion of their arguments before execution if a
kernel does not match the argument types precisely. For example comparison
of dictionary encoded arrays is not directly supported by any kernel, but an
implicit cast can be made allowing comparison against the decoded array.

Each function may define implicit cast behaviour as appropriate. For example
comparison and arithmetic kernels require identically typed arguments, and
support execution against differing numeric types by promoting their arguments
to numeric type which can accommodate any value from either input.

.. _common-numeric-type:

Common numeric type
-------------------

The common numeric type of a set of input numeric types is the smallest numeric
type which can accommodate any value of any input. If any input is a floating
point type the common numeric type is the widest floating point type among the
inputs. Otherwise the common numeric type is integral and is signed if any input
is signed. For example:

+-------------------+----------------------+------------------------------------------------+
| Input types       | Common numeric type  | Notes                                          |
+===================+======================+================================================+
| int32, int32      | int32                |                                                |
+-------------------+----------------------+------------------------------------------------+
| int16, int32      | int32                | Max width is 32, promote LHS to int32          |
+-------------------+----------------------+------------------------------------------------+
| uint16, int32     | int32                | One input signed, override unsigned            |
+-------------------+----------------------+------------------------------------------------+
| uint32, int32     | int64                | Widen to accommodate range of uint32           |
+-------------------+----------------------+------------------------------------------------+
| uint16, uint32    | uint32               | All inputs unsigned, maintain unsigned         |
+-------------------+----------------------+------------------------------------------------+
| int16, uint32     | int64                |                                                |
+-------------------+----------------------+------------------------------------------------+
| uint64, int16     | int64                | int64 cannot accommodate all uint64 values     |
+-------------------+----------------------+------------------------------------------------+
| float32, int32    | float32              | Promote RHS to float32                         |
+-------------------+----------------------+------------------------------------------------+
| float32, float64  | float64              |                                                |
+-------------------+----------------------+------------------------------------------------+
| float32, int64    | float32              | int64 is wider, still promotes to float32      |
+-------------------+----------------------+------------------------------------------------+

In particular, note that comparing a ``uint64`` column to an ``int16`` column
may emit an error if one of the ``uint64`` values cannot be expressed as the
common type ``int64`` (for example, ``2 ** 63``).

.. _compute-function-list:

Available functions
===================

Type categories
---------------

To avoid exhaustively listing supported types, the tables below use a number
of general type categories:

* "Numeric": Integer types (Int8, etc.) and Floating-point types (Float32,
  Float64, sometimes Float16).  Some functions also accept Decimal input.

* "Temporal": Date types (Date32, Date64), Time types (Time32, Time64),
  Timestamp, Duration, Interval.

* "Binary-like": Binary, LargeBinary, sometimes also FixedSizeBinary.

* "String-like": String, LargeString.

* "List-like": List, LargeList, ListView, LargeListView, and sometimes also
  FixedSizeList.

* "Nested": List-likes (including FixedSizeList), Struct, Union, and
  related types like Map.

If you are unsure whether a function supports a concrete input type, we
recommend you try it out.  Unsupported input types return a ``TypeError``
:class:`Status`.

.. _aggregation-option-list:

Aggregations
------------

Scalar aggregations operate on a (chunked) array or scalar value and reduce
the input to a single output value.

+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| Function name      | Arity   | Input types                                   | Output type            | Options class                    | Notes      |
+====================+=========+===============================================+========================+==================================+============+
| all                | Unary   | Boolean                                       | Scalar Boolean         | :struct:`ScalarAggregateOptions` | \(1)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| any                | Unary   | Boolean                                       | Scalar Boolean         | :struct:`ScalarAggregateOptions` | \(1)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| approximate_median | Unary   | Numeric                                       | Scalar Float64         | :struct:`ScalarAggregateOptions` |            |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| count              | Unary   | Any                                           | Scalar Int64           | :struct:`CountOptions`           | \(2)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| count_all          | Nullary |                                               | Scalar Int64           |                                  |            |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| count_distinct     | Unary   | Non-nested types                              | Scalar Int64           | :struct:`CountOptions`           | \(2)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| first              | Unary   | Numeric, Binary                               | Scalar Input type      | :struct:`ScalarAggregateOptions` | \(3)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| first_last         | Unary   | Numeric, Binary                               | Scalar Struct          | :struct:`ScalarAggregateOptions` | \(3)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| index              | Unary   | Any                                           | Scalar Int64           | :struct:`IndexOptions`           | \(4)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| kurtosis           | Unary   | Numeric                                       | Scalar Float64         | :struct:`SkewOptions`            | \(12)      |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| last               | Unary   | Numeric, Binary                               | Scalar Input type      | :struct:`ScalarAggregateOptions` | \(3)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| max                | Unary   | Non-nested types                              | Scalar Input type      | :struct:`ScalarAggregateOptions` |            |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| mean               | Unary   | Numeric                                       | Scalar Decimal/Float64 | :struct:`ScalarAggregateOptions` | \(5)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| min                | Unary   | Non-nested types                              | Scalar Input type      | :struct:`ScalarAggregateOptions` |            |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| min_max            | Unary   | Non-nested types                              | Scalar Struct          | :struct:`ScalarAggregateOptions` | \(6)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| mode               | Unary   | Numeric                                       | Struct                 | :struct:`ModeOptions`            | \(7)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| pivot_wider        | Binary  | Binary, String, Integer (Arg 0); Any (Arg 1)  | Scalar Struct          | :struct:`PivotWiderOptions`      | \(8)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| product            | Unary   | Numeric                                       | Scalar Numeric         | :struct:`ScalarAggregateOptions` | \(9)       |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| quantile           | Unary   | Numeric                                       | Scalar Numeric         | :struct:`QuantileOptions`        | \(11)      |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| skew               | Unary   | Numeric                                       | Scalar Float64         | :struct:`SkewOptions`            | \(12)      |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| stddev             | Unary   | Numeric                                       | Scalar Float64         | :struct:`VarianceOptions`        | \(12)      |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| sum                | Unary   | Numeric                                       | Scalar Numeric         | :struct:`ScalarAggregateOptions` | \(9) \(10) |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| tdigest            | Unary   | Numeric                                       | Float64                | :struct:`TDigestOptions`         | \(13)      |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+
| variance           | Unary   | Numeric                                       | Scalar Float64         | :struct:`VarianceOptions`        | \(12)      |
+--------------------+---------+-----------------------------------------------+------------------------+----------------------------------+------------+

* \(1) If null values are taken into account, by setting the
  ScalarAggregateOptions parameter skip_nulls = false, then `Kleene logic`_
  logic is applied. The min_count option is not respected.

* \(2) CountMode controls whether only non-null values are counted (the
  default), only null values are counted, or all values are counted.

* \(3) Result is based on the ordering of input data.

* \(4) Returns -1 if the value is not found. The index of a null value
  is always -1, regardless of whether there are nulls in the input.

* \(5) For decimal inputs, the resulting decimal will have the same
  precision and scale. The result is rounded away from zero.

* \(6) Output is a ``{"min": input type, "max": input type}`` Struct.

  Of the interval types, only the month interval is supported, as the day-time
  and month-day-nano types are not sortable.

* \(7) Output is an array of ``{"mode": input type, "count": Int64}`` Struct.
  It contains the *N* most common elements in the input, in descending
  order, where *N* is given in :member:`ModeOptions::n`.
  If two values have the same count, the smallest one comes first.
  Note that the output can have less than *N* elements if the input has
  less than *N* distinct values.

* \(8) The first input contains the pivot key, while the second input contains
  the values to be pivoted. The output is a Struct with one field for each key
  in :member:`PivotOptions::key_names`.

* \(9) Output is Int64, UInt64, Float64, or Decimal128/256, depending on the
  input type.

* \(10) For Decimal input, the output precision is increased to the maximum
  precision for the input type's width. For instance, an array of
  ``decimal128(3, 2)`` will return a ``decimal128(38, 2)`` scalar.

* \(11) Output is Float64 or input type, depending on QuantileOptions.

* \(12) Decimal arguments are cast to Float64 first.

* \(13) tdigest/t-digest computes approximate quantiles, and so only needs a
  fixed amount of memory. See the `reference implementation
  <https://github.com/tdunning/t-digest>`_ for details.

  Decimal arguments are cast to Float64 first.

.. _grouped-aggregations-group-by:

Grouped Aggregations ("group by")
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Grouped aggregations are not directly invocable, but are used as part of a
SQL-style "group by" operation. Like scalar aggregations, grouped aggregations
reduce multiple input values to a single output value. Instead of aggregating
all values of the input, however, grouped aggregations partition the input
values on some set of "key" columns, then aggregate each group individually,
emitting one output value per input group.

As an example, for the following table:

+------------------+-----------------+
| Column ``key``   | Column ``x``    |
+==================+=================+
| "a"              | 2               |
+------------------+-----------------+
| "a"              | 5               |
+------------------+-----------------+
| "b"              | null            |
+------------------+-----------------+
| "b"              | null            |
+------------------+-----------------+
| null             | null            |
+------------------+-----------------+
| null             | 9               |
+------------------+-----------------+

we can compute a sum of the column ``x``, grouped on the column ``key``.
This gives us three groups, with the following results. Note that null is
treated as a distinct key value.

+------------------+-----------------------+
| Column ``key``   | Column ``sum(x)``     |
+==================+=======================+
| "a"              | 7                     |
+------------------+-----------------------+
| "b"              | null                  |
+------------------+-----------------------+
| null             | 9                     |
+------------------+-----------------------+

The supported aggregation functions are as follows. All function names are
prefixed with ``hash_``, which differentiates them from their scalar
equivalents above and reflects how they are implemented internally.

+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| Function name           | Arity   | Input types                                  | Output type            | Options class                    | Notes     |
+=========================+=========+==============================================+========================+==================================+===========+
| hash_all                | Unary   | Boolean                                      | Boolean                | :struct:`ScalarAggregateOptions` | \(1)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_any                | Unary   | Boolean                                      | Boolean                | :struct:`ScalarAggregateOptions` | \(1)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_approximate_median | Unary   | Numeric                                      | Float64                | :struct:`ScalarAggregateOptions` |           |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_count              | Unary   | Any                                          | Int64                  | :struct:`CountOptions`           | \(2)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_count_all          | Nullary |                                              | Int64                  |                                  |           |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_count_distinct     | Unary   | Any                                          | Int64                  | :struct:`CountOptions`           | \(2)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_distinct           | Unary   | Any                                          | List of input type     | :struct:`CountOptions`           | \(2) \(3) |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_first              | Unary   | Numeric, Binary                              | Input type             | :struct:`ScalarAggregateOptions` | \(11)     |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_first_last         | Unary   | Numeric, Binary                              | Struct                 | :struct:`ScalarAggregateOptions` | \(11)     |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_kurtosis           | Unary   | Numeric                                      | Float64                | :struct:`SkewOptions`            | \(9)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_last               | Unary   | Numeric, Binary                              | Input type             | :struct:`ScalarAggregateOptions` | \(11)     |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_list               | Unary   | Any                                          | List of input type     |                                  | \(3)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_max                | Unary   | Non-nested, non-binary/string-like           | Input type             | :struct:`ScalarAggregateOptions` |           |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_mean               | Unary   | Numeric                                      | Decimal/Float64        | :struct:`ScalarAggregateOptions` | \(4)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_min                | Unary   | Non-nested, non-binary/string-like           | Input type             | :struct:`ScalarAggregateOptions` |           |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_min_max            | Unary   | Non-nested types                             | Struct                 | :struct:`ScalarAggregateOptions` | \(5)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_one                | Unary   | Any                                          | Input type             |                                  | \(6)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_pivot_wider        | Binary  | Binary, String, Integer (Arg 0); Any (Arg 1) | Struct                 | :struct:`PivotWiderOptions`      | \(7)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_product            | Unary   | Numeric                                      | Numeric                | :struct:`ScalarAggregateOptions` | \(8)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_skew               | Unary   | Numeric                                      | Float64                | :struct:`SkewOptions`            | \(9)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_stddev             | Unary   | Numeric                                      | Float64                | :struct:`VarianceOptions`        | \(9)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_sum                | Unary   | Numeric                                      | Numeric                | :struct:`ScalarAggregateOptions` | \(8)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_tdigest            | Unary   | Numeric                                      | FixedSizeList[Float64] | :struct:`TDigestOptions`         | \(10)     |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+
| hash_variance           | Unary   | Numeric                                      | Float64                | :struct:`VarianceOptions`        | \(9)      |
+-------------------------+---------+----------------------------------------------+------------------------+----------------------------------+-----------+

* \(1) If null values are taken into account, by setting the
  :member:`ScalarAggregateOptions::skip_nulls` to false, then `Kleene logic`_
  logic is applied. The min_count option is not respected.

* \(2) CountMode controls whether only non-null values are counted
  (the default), only null values are counted, or all values are
  counted. For hash_distinct, it instead controls whether null values
  are emitted. This never affects the grouping keys, only group values
  (i.e. you may get a group where the key is null).

* \(3) ``hash_distinct`` and ``hash_list`` gather the grouped values
  into a list array.

* \(4) For decimal inputs, the resulting decimal will have the same
  precision and scale. The result is rounded away from zero.

* \(5) Output is a ``{"min": input type, "max": input type}`` Struct array.

  Of the interval types, only the month interval is supported, as the day-time
  and month-day-nano types are not sortable.

* \(6) ``hash_one`` returns one arbitrary value from the input for each
  group. The function is biased towards non-null values: if there is at least
  one non-null value for a certain group, that value is returned, and only if
  all the values are ``null`` for the group will the function return ``null``.

* \(7) The first input contains the pivot key, while the second input contains
  the values to be pivoted. The output is a Struct with one field for each key
  in :member:`PivotOptions::key_names`.

* \(8) Output is Int64, UInt64, Float64, or Decimal128/256, depending on the
  input type.

* \(9) Decimal arguments are cast to Float64 first.

* \(10) T-digest computes approximate quantiles, and so only needs a
  fixed amount of memory. See the `reference implementation
  <https://github.com/tdunning/t-digest>`_ for details.

* \(11) Result is based on ordering of the input data.


Element-wise ("scalar") functions
---------------------------------

All element-wise functions accept both arrays and scalars as input.  The
semantics for unary functions are as follow:

* scalar inputs produce a scalar output
* array inputs produce an array output

Binary functions have the following semantics (which is sometimes called
"broadcasting" in other systems such as NumPy):

* ``(scalar, scalar)`` inputs produce a scalar output
* ``(array, array)`` inputs produce an array output (and both inputs must
  be of the same length)
* ``(scalar, array)`` and ``(array, scalar)`` produce an array output.
  The scalar input is handled as if it were an array of the same length N
  as the other input, with the same value repeated N times.

Arithmetic functions
~~~~~~~~~~~~~~~~~~~~

These functions expect inputs of numeric type and apply a given arithmetic
operation to each element(s) gathered from the input(s).  If any of the
input element(s) is null, the corresponding output element is null.
For binary functions, input(s) will be cast to the
:ref:`common numeric type <common-numeric-type>`
(and dictionary decoded, if applicable) before the operation is applied.

The default variant of these functions does not detect overflow (the result
then typically wraps around).  Most functions are also available in an
overflow-checking variant, suffixed ``_checked``, which returns
an ``Invalid`` :class:`Status` when overflow is detected.

For functions which support decimal inputs (currently ``add``, ``subtract``,
``multiply``, and ``divide`` and their checked variants), decimals of different
precisions/scales will be promoted appropriately. Mixed decimal and
floating-point arguments will cast all arguments to floating-point, while mixed
decimal and integer arguments will cast all arguments to decimals.
Mixed time resolution temporal inputs will be cast to finest input resolution.

+------------------+--------+-------------------------+-------------------------------+-------+
| Function name    | Arity  | Input types             | Output type                   | Notes |
+==================+========+=========================+===============================+=======+
| abs              | Unary  | Numeric/Duration        | Numeric/Duration              |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| abs_checked      | Unary  | Numeric/Duration        | Numeric/Duration              |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| add              | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| add_checked      | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| divide           | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| divide_checked   | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| exp              | Unary  | Numeric                 | Float32/Float64               |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| expm1            | Unary  | Numeric                 | Float32/Float64               |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| multiply         | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| multiply_checked | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| negate           | Unary  | Numeric/Duration        | Numeric/Duration              |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| negate_checked   | Unary  | Signed Numeric/Duration | Signed Numeric/Duration       |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| power            | Binary | Numeric                 | Numeric                       |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| power_checked    | Binary | Numeric                 | Numeric                       |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| sign             | Unary  | Numeric/Duration        | Int8/Float16/Float32/Float64  | \(2)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| sqrt             | Unary  | Numeric                 | Numeric                       |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| sqrt_checked     | Unary  | Numeric                 | Numeric                       |       |
+------------------+--------+-------------------------+-------------------------------+-------+
| subtract         | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+
| subtract_checked | Binary | Numeric/Temporal        | Numeric/Temporal              | \(1)  |
+------------------+--------+-------------------------+-------------------------------+-------+

* \(1) Precision and scale of computed DECIMAL results

  +------------+---------------------------------------------+
  | Operation  | Result precision and scale                  |
  +============+=============================================+
  | | add      | | scale = max(s1, s2)                       |
  | | subtract | | precision = max(p1-s1, p2-s2) + 1 + scale |
  +------------+---------------------------------------------+
  | multiply   | | scale = s1 + s2                           |
  |            | | precision = p1 + p2 + 1                   |
  +------------+---------------------------------------------+
  | divide     | | scale = max(4, s1 + p2 - s2 + 1)          |
  |            | | precision = p1 - s1 + s2 + scale          |
  +------------+---------------------------------------------+

  It's compatible with Redshift's decimal promotion rules. All decimal digits
  are preserved for ``add``, ``subtract`` and ``multiply`` operations. The result
  precision of ``divide`` is at least the sum of precisions of both operands with
  enough scale kept. Error is returned if the result precision is beyond the
  decimal value range.

* \(2) Output is any of (-1,1) for nonzero inputs and 0 for zero input.  NaN
  values return NaN.  Integral and decimal values return signedness as Int8 and
  floating-point values return it with the same type as the input values.

Bit-wise functions
~~~~~~~~~~~~~~~~~~

+--------------------------+------------+--------------------+---------------------+
| Function name            | Arity      | Input types        | Output type         |
+==========================+============+====================+=====================+
| bit_wise_and             | Binary     | Numeric            | Numeric             |
+--------------------------+------------+--------------------+---------------------+
| bit_wise_not             | Unary      | Numeric            | Numeric             |
+--------------------------+------------+--------------------+---------------------+
| bit_wise_or              | Binary     | Numeric            | Numeric             |
+--------------------------+------------+--------------------+---------------------+
| bit_wise_xor             | Binary     | Numeric            | Numeric             |
+--------------------------+------------+--------------------+---------------------+
| shift_left               | Binary     | Numeric            | Numeric             |
+--------------------------+------------+--------------------+---------------------+
| shift_left_checked       | Binary     | Numeric            | Numeric (1)         |
+--------------------------+------------+--------------------+---------------------+
| shift_right              | Binary     | Numeric            | Numeric             |
+--------------------------+------------+--------------------+---------------------+
| shift_right_checked      | Binary     | Numeric            | Numeric (1)         |
+--------------------------+------------+--------------------+---------------------+

* \(1) An error is emitted if the shift amount (i.e. the second input) is
  out of bounds for the data type.  However, an overflow when shifting the
  first input is not error (truncated bits are silently discarded).

Rounding functions
~~~~~~~~~~~~~~~~~~

Rounding functions displace numeric inputs to an approximate value with a simpler
representation based on the rounding criterion.

+-------------------+------------+-------------+-------------------------+----------------------------------+--------+
| Function name     | Arity      | Input types | Output type             | Options class                    | Notes  |
+===================+============+=============+=========================+==================================+========+
| ceil              | Unary      | Numeric     | Float32/Float64/Decimal |                                  |        |
+-------------------+------------+-------------+-------------------------+----------------------------------+--------+
| floor             | Unary      | Numeric     | Float32/Float64/Decimal |                                  |        |
+-------------------+------------+-------------+-------------------------+----------------------------------+--------+
| round             | Unary      | Numeric     | Input Type              | :struct:`RoundOptions`           | (1)(2) |
+-------------------+------------+-------------+-------------------------+----------------------------------+--------+
| round_to_multiple | Unary      | Numeric     | Input Type              | :struct:`RoundToMultipleOptions` | (1)(3) |
+-------------------+------------+-------------+-------------------------+----------------------------------+--------+
| round_binary      | Binary     | Numeric     | Input Type              | :struct:`RoundBinaryOptions`     | (1)(4) |
+-------------------+------------+-------------+-------------------------+----------------------------------+--------+
| trunc             | Unary      | Numeric     | Float32/Float64/Decimal |                                  |        |
+-------------------+------------+-------------+-------------------------+----------------------------------+--------+

* \(1)  By default rounding functions change a value to the nearest
  integer using HALF_TO_EVEN to resolve ties.  Options are available to control
  the rounding criterion.  All ``round`` functions have the
  ``round_mode`` option to set the rounding mode.
* \(2) Round to a number of digits where the ``ndigits`` option of
  :struct:`RoundOptions` specifies the rounding precision in terms of number
  of digits.  A negative value corresponds to digits in the non-fractional
  part.  For example, -2 corresponds to rounding to the nearest multiple of
  100 (zeroing the ones and tens digits).  Default value of ``ndigits`` is 0
  which rounds to the nearest integer. For integer inputs a non-negative
  ``ndigits`` value is ignored and the input is returned unchanged. For integer
  inputs, if ``-ndigits`` is larger than the maximum number of digits the
  input type can hold, an error is returned.
* \(3) Round to a multiple where the ``multiple`` option of
  :struct:`RoundToMultipleOptions` specifies the rounding scale.  The rounding
  multiple has to be a positive value and can be casted to input type.
  For example, 100 corresponds to rounding to the nearest multiple of 100
  (zeroing the ones and tens digits). Default value of ``multiple`` is 1 which
  rounds to the nearest integer.
* \(4) Round the first input to multiple of the second input. The rounding
  multiple has to be a positive value and can be casted to the first input type.
  For example, 100 corresponds to rounding to the nearest multiple of 100
  (zeroing the ones and tens digits).

For ``round`` functions, the following rounding modes are available.
Tie-breaking modes are prefixed with HALF and round non-ties to the nearest integer.
The example values are given for default values of ``ndigits`` and ``multiple``.

+-----------------------+--------------------------------------------------------------+---------------------------+
| ``round_mode``        | Operation performed                                          | Example values            |
+=======================+==============================================================+===========================+
| DOWN                  | Round to nearest integer less than or equal in magnitude;    | 3.2 -> 3, 3.7 -> 3,       |
|                       | also known as ``floor(x)``                                   | -3.2 -> -4, -3.7 -> -4    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| UP                    | Round to nearest integer greater than or equal in magnitude; | 3.2 -> 4, 3.7 -> 4,       |
|                       | also known as ``ceil(x)``                                    | -3.2 -> -3, -3.7 -> -3    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| TOWARDS_ZERO          | Get the integral part without fractional digits;             | 3.2 -> 3, 3.7 -> 3,       |
|                       | also known as ``trunc(x)``                                   | -3.2 -> -3, -3.7 -> -3    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| TOWARDS_INFINITY      | Round negative values with ``DOWN`` rule,                    | 3.2 -> 4, 3.7 -> 4,       |
|                       | round positive values with ``UP`` rule                       | -3.2 -> -4, -3.7 -> -4    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| HALF_DOWN             | Round ties with ``DOWN`` rule                                | 3.5 -> 3, 4.5 -> 4,       |
|                       |                                                              | -3.5 -> -4, -4.5 -> -5    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| HALF_UP               | Round ties with ``UP`` rule                                  | 3.5 -> 4, 4.5 -> 5,       |
|                       |                                                              | -3.5 -> -3, -4.5 -> -4    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| HALF_TOWARDS_ZERO     | Round ties with ``TOWARDS_ZERO`` rule                        | 3.5 -> 3, 4.5 -> 4,       |
|                       |                                                              | -3.5 -> -3, -4.5 -> -4    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| HALF_TOWARDS_INFINITY | Round ties with ``TOWARDS_INFINITY`` rule                    | 3.5 -> 4, 4.5 -> 5,       |
|                       |                                                              | -3.5 -> -4, -4.5 -> -5    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| HALF_TO_EVEN          | Round ties to nearest even integer                           | 3.5 -> 4, 4.5 -> 4,       |
|                       |                                                              | -3.5 -> -4, -4.5 -> -4    |
+-----------------------+--------------------------------------------------------------+---------------------------+
| HALF_TO_ODD           | Round ties to nearest odd integer                            | 3.5 -> 3, 4.5 -> 5,       |
|                       |                                                              | -3.5 -> -3, -4.5 -> -5    |
+-----------------------+--------------------------------------------------------------+---------------------------+

The following table gives examples of how ``ndigits`` (for the ``round``
and ``round_binary`` functions) and ``multiple`` (for ``round_to_multiple``)
influence the operation performed, respectively.

+--------------------+-------------------+---------------------------+
| Round ``multiple`` | Round ``ndigits`` | Operation performed       |
+====================+===================+===========================+
| 1                  | 0                 | Round to integer          |
+--------------------+-------------------+---------------------------+
| 0.001              | 3                 | Round to 3 decimal places |
+--------------------+-------------------+---------------------------+
| 10                 | -1                | Round to multiple of 10   |
+--------------------+-------------------+---------------------------+
| 2                  | NA                | Round to multiple of 2    |
+--------------------+-------------------+---------------------------+

Logarithmic functions
~~~~~~~~~~~~~~~~~~~~~

Logarithmic functions are also supported, and also offer ``_checked``
variants that check for domain errors if needed.

Decimal values are accepted, but are cast to Float64 first.

+--------------------------+------------+-------------------------+---------------------+
| Function name            | Arity      | Input types             | Output type         |
+==========================+============+=========================+=====================+
| ln                       | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| ln_checked               | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| log10                    | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| log10_checked            | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| log1p                    | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| log1p_checked            | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| log2                     | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| log2_checked             | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| logb                     | Binary     | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| logb_checked             | Binary     | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+

Trigonometric functions
~~~~~~~~~~~~~~~~~~~~~~~

Trigonometric functions are also supported, and also offer ``_checked``
variants that check for domain errors if needed.

Decimal values are accepted, but are cast to Float64 first.

+--------------------------+------------+-------------------------+---------------------+
| Function name            | Arity      | Input types             | Output type         |
+==========================+============+=========================+=====================+
| acos                     | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| acos_checked             | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| asin                     | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| asin_checked             | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| atan                     | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| atan2                    | Binary     | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| cos                      | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| cos_checked              | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| sin                      | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| sin_checked              | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| tan                      | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| tan_checked              | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+

Hyperbolic trigonometric functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Hyperbolic trigonometric functions are also supported, and, where applicable, also offer
``_checked`` variants that check for domain errors if needed.

Decimal values are accepted, but are cast to Float64 first.

+--------------------------+------------+-------------------------+---------------------+
| Function name            | Arity      | Input types             | Output type         |
+==========================+============+=========================+=====================+
| acosh                    | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| acosh_checked            | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| asinh                    | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| atanh                    | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| atanh_checked            | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| cosh                     | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| sinh                     | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+
| tanh                     | Unary      | Float32/Float64/Decimal | Float32/Float64     |
+--------------------------+------------+-------------------------+---------------------+


Comparisons
~~~~~~~~~~~

These functions expect two inputs of numeric type (in which case they will be
cast to the :ref:`common numeric type <common-numeric-type>` before comparison),
or two inputs of Binary- or String-like types, or two inputs of Temporal types.
If any input is dictionary encoded it will be expanded for the purposes of
comparison. If any of the input elements in a pair is null, the corresponding
output element is null. Decimal arguments will be promoted in the same way as
for ``add`` and ``subtract``.

+----------------+------------+---------------------------------------------+---------------------+
| Function names | Arity      | Input types                                 | Output type         |
+================+============+=============================================+=====================+
| equal          | Binary     | Numeric, Temporal, Binary- and String-like  | Boolean             |
+----------------+------------+---------------------------------------------+---------------------+
| greater        | Binary     | Numeric, Temporal, Binary- and String-like  | Boolean             |
+----------------+------------+---------------------------------------------+---------------------+
| greater_equal  | Binary     | Numeric, Temporal, Binary- and String-like  | Boolean             |
+----------------+------------+---------------------------------------------+---------------------+
| less           | Binary     | Numeric, Temporal, Binary- and String-like  | Boolean             |
+----------------+------------+---------------------------------------------+---------------------+
| less_equal     | Binary     | Numeric, Temporal, Binary- and String-like  | Boolean             |
+----------------+------------+---------------------------------------------+---------------------+
| not_equal      | Binary     | Numeric, Temporal, Binary- and String-like  | Boolean             |
+----------------+------------+---------------------------------------------+---------------------+

These functions take any number of inputs of numeric type (in which case they
will be cast to the :ref:`common numeric type <common-numeric-type>` before
comparison) or of temporal types. If any input is dictionary encoded it will be
expanded for the purposes of comparison.

+------------------+------------+---------------------------------------------+---------------------+---------------------------------------+-------+
| Function names   | Arity      | Input types                                 | Output type         | Options class                         | Notes |
+==================+============+=============================================+=====================+=======================================+=======+
| max_element_wise | Varargs    | Numeric, Temporal, Binary- and String-like  | Numeric or Temporal | :struct:`ElementWiseAggregateOptions` | \(1)  |
+------------------+------------+---------------------------------------------+---------------------+---------------------------------------+-------+
| min_element_wise | Varargs    | Numeric, Temporal, Binary- and String-like  | Numeric or Temporal | :struct:`ElementWiseAggregateOptions` | \(1)  |
+------------------+------------+---------------------------------------------+---------------------+---------------------------------------+-------+

* \(1) By default, nulls are skipped (but the kernel can be configured to propagate nulls).
  For floating point values, NaN will be taken over null but not over any other value.
  For binary- and string-like values, only identical type parameters are supported.

Logical functions
~~~~~~~~~~~~~~~~~~

The normal behaviour for these functions is to emit a null if any of the
inputs is null (similar to the semantics of ``NaN`` in floating-point
computations).

Some of them are also available in a `Kleene logic`_ variant (suffixed
``_kleene``) where null is taken to mean "undefined".  This is the
interpretation of null used in SQL systems as well as R and Julia,
for example.

For the Kleene logic variants, therefore:

* "true AND null", "null AND true" give "null" (the result is undefined)
* "true OR null", "null OR true" give "true"
* "false AND null", "null AND false" give "false"
* "false OR null", "null OR false" give "null" (the result is undefined)

+--------------------------+------------+--------------------+---------------------+
| Function name            | Arity      | Input types        | Output type         |
+==========================+============+====================+=====================+
| and                      | Binary     | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+
| and_kleene               | Binary     | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+
| and_not                  | Binary     | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+
| and_not_kleene           | Binary     | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+
| invert                   | Unary      | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+
| or                       | Binary     | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+
| or_kleene                | Binary     | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+
| xor                      | Binary     | Boolean            | Boolean             |
+--------------------------+------------+--------------------+---------------------+

.. _Kleene logic: https://en.wikipedia.org/wiki/Three-valued_logic#Kleene_and_Priest_logics

String predicates
~~~~~~~~~~~~~~~~~

These functions classify the input string elements according to their character
contents.  An empty string element emits false in the output.  For ASCII
variants of the functions (prefixed ``ascii_``), a string element with non-ASCII
characters emits false in the output.

The first set of functions operates on a character-per-character basis,
and emit true in the output if the input contains only characters of a
given class:

+--------------------+-------+-------------+-------------+-------------------------+-------+
| Function name      | Arity | Input types | Output type | Matched character class | Notes |
+====================+=======+=============+=============+=========================+=======+
| ascii_is_alnum     | Unary | String-like | Boolean     | Alphanumeric ASCII      |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| ascii_is_alpha     | Unary | String-like | Boolean     | Alphabetic ASCII        |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| ascii_is_decimal   | Unary | String-like | Boolean     | Decimal ASCII           | \(1)  |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| ascii_is_lower     | Unary | String-like | Boolean     | Lowercase ASCII         | \(2)  |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| ascii_is_printable | Unary | String-like | Boolean     | Printable ASCII         |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| ascii_is_space     | Unary | String-like | Boolean     | Whitespace ASCII        |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| ascii_is_upper     | Unary | String-like | Boolean     | Uppercase ASCII         | \(2)  |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_alnum      | Unary | String-like | Boolean     | Alphanumeric Unicode    |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_alpha      | Unary | String-like | Boolean     | Alphabetic Unicode      |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_decimal    | Unary | String-like | Boolean     | Decimal Unicode         |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_digit      | Unary | String-like | Boolean     | Unicode digit           | \(3)  |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_lower      | Unary | String-like | Boolean     | Lowercase Unicode       | \(2)  |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_numeric    | Unary | String-like | Boolean     | Numeric Unicode         | \(4)  |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_printable  | Unary | String-like | Boolean     | Printable Unicode       |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_space      | Unary | String-like | Boolean     | Whitespace Unicode      |       |
+--------------------+-------+-------------+-------------+-------------------------+-------+
| utf8_is_upper      | Unary | String-like | Boolean     | Uppercase Unicode       | \(2)  |
+--------------------+-------+-------------+-------------+-------------------------+-------+

* \(1) Also matches all numeric ASCII characters and all ASCII digits.

* \(2) Non-cased characters, such as punctuation, do not match.

* \(3) This is currently the same as ``utf8_is_decimal``.

* \(4) Unlike ``utf8_is_decimal``, non-decimal numeric characters also match.

The second set of functions also consider the character order in a string
element:

+--------------------------+------------+--------------------+---------------------+---------+
| Function name            | Arity      | Input types        | Output type         | Notes   |
+==========================+============+====================+=====================+=========+
| ascii_is_title           | Unary      | String-like        | Boolean             | \(1)    |
+--------------------------+------------+--------------------+---------------------+---------+
| utf8_is_title            | Unary      | String-like        | Boolean             | \(1)    |
+--------------------------+------------+--------------------+---------------------+---------+

* \(1) Output is true iff the input string element is title-cased, i.e. any
  word starts with an uppercase character, followed by lowercase characters.
  Word boundaries are defined by non-cased characters.

The third set of functions examines string elements on a byte-per-byte basis:

+--------------------------+------------+--------------------+---------------------+---------+
| Function name            | Arity      | Input types        | Output type         | Notes   |
+==========================+============+====================+=====================+=========+
| string_is_ascii          | Unary      | String-like        | Boolean             | \(1)    |
+--------------------------+------------+--------------------+---------------------+---------+

* \(1) Output is true iff the input string element contains only ASCII characters,
  i.e. only bytes in [0, 127].

String transforms
~~~~~~~~~~~~~~~~~

+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| Function name           | Arity  | Input types                             | Output type            | Options class                     | Notes |
+=========================+========+=========================================+========================+===================================+=======+
| ascii_capitalize        | Unary  | String-like                             | String-like            |                                   | \(1)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| ascii_lower             | Unary  | String-like                             | String-like            |                                   | \(1)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| ascii_reverse           | Unary  | String-like                             | String-like            |                                   | \(2)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| ascii_swapcase          | Unary  | String-like                             | String-like            |                                   | \(1)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| ascii_title             | Unary  | String-like                             | String-like            |                                   | \(1)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| ascii_upper             | Unary  | String-like                             | String-like            |                                   | \(1)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| binary_length           | Unary  | Binary- or String-like                  | Int32 or Int64         |                                   | \(3)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| binary_repeat           | Binary | Binary/String (Arg 0); Integral (Arg 1) | Binary- or String-like |                                   | \(4)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| binary_replace_slice    | Unary  | String-like                             | Binary- or String-like | :struct:`ReplaceSliceOptions`     | \(5)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| binary_reverse          | Unary  | Binary                                  | Binary                 |                                   | \(6)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| replace_substring       | Unary  | String-like                             | String-like            | :struct:`ReplaceSubstringOptions` | \(7)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| replace_substring_regex | Unary  | String-like                             | String-like            | :struct:`ReplaceSubstringOptions` | \(8)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_capitalize         | Unary  | String-like                             | String-like            |                                   | \(9)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_length             | Unary  | String-like                             | Int32 or Int64         |                                   | \(10) |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_lower              | Unary  | String-like                             | String-like            |                                   | \(9)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_replace_slice      | Unary  | String-like                             | String-like            | :struct:`ReplaceSliceOptions`     | \(7)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_reverse            | Unary  | String-like                             | String-like            |                                   | \(11) |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_swapcase           | Unary  | String-like                             | String-like            |                                   | \(9)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_title              | Unary  | String-like                             | String-like            |                                   | \(9)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+
| utf8_upper              | Unary  | String-like                             | String-like            |                                   | \(9)  |
+-------------------------+--------+-----------------------------------------+------------------------+-----------------------------------+-------+

* \(1) Each ASCII character in the input is converted to lowercase or
  uppercase.  Non-ASCII characters are left untouched.

* \(2) ASCII input is reversed to the output. If non-ASCII characters
  are present, ``Invalid`` :class:`Status` will be returned.

* \(3) Output is the physical length in bytes of each input element.  Output
  type is Int32 for Binary/String, Int64 for LargeBinary/LargeString.

* \(4) Repeat the input binary string a given number of times.

* \(5) Replace the slice of the substring from :member:`ReplaceSliceOptions::start`
  (inclusive) to :member:`ReplaceSliceOptions::stop` (exclusive) by
  :member:`ReplaceSubstringOptions::replacement`. The binary kernel measures the
  slice in bytes, while the UTF8 kernel measures the slice in codeunits.

* \(6) Perform a byte-level reverse.

* \(7) Replace non-overlapping substrings that match to
  :member:`ReplaceSubstringOptions::pattern` by
  :member:`ReplaceSubstringOptions::replacement`. If
  :member:`ReplaceSubstringOptions::max_replacements` != -1, it determines the
  maximum number of replacements made, counting from the left.

* \(8) Replace non-overlapping substrings that match to the regular expression
  :member:`ReplaceSubstringOptions::pattern` by
  :member:`ReplaceSubstringOptions::replacement`, using the Google RE2 library. If
  :member:`ReplaceSubstringOptions::max_replacements` != -1, it determines the
  maximum number of replacements made, counting from the left. Note that if the
  pattern contains groups, backreferencing can be used.

* \(9) Each UTF8-encoded character in the input is converted to lowercase or
  uppercase.

* \(10) Output is the number of characters (not bytes) of each input element.
  Output type is Int32 for String, Int64 for LargeString.

* \(11) Each UTF8-encoded code unit is written in reverse order to the output.
  If the input is not valid UTF8, then the output is undefined (but the size of output
  buffers will be preserved).

String padding
~~~~~~~~~~~~~~

These functions append/prepend a given padding byte (ASCII) or codepoint (UTF8) in
order to center (center), right-align (lpad), or left-align (rpad) a string.

+--------------------------+------------+-------------------------+---------------------+----------------------------------------+
| Function name            | Arity      | Input types             | Output type         | Options class                          |
+==========================+============+=========================+=====================+========================================+
| ascii_center             | Unary      | String-like             | String-like         | :struct:`PadOptions`                   |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+
| ascii_lpad               | Unary      | String-like             | String-like         | :struct:`PadOptions`                   |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+
| ascii_rpad               | Unary      | String-like             | String-like         | :struct:`PadOptions`                   |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+
| utf8_center              | Unary      | String-like             | String-like         | :struct:`PadOptions`                   |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+
| utf8_lpad                | Unary      | String-like             | String-like         | :struct:`PadOptions`                   |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+
| utf8_rpad                | Unary      | String-like             | String-like         | :struct:`PadOptions`                   |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+
| utf8_zero_fill           | Unary      | String-like             | String-like         | :struct:`ZeroFillOptions`              |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+

String trimming
~~~~~~~~~~~~~~~

These functions trim off characters on both sides (trim), or the left (ltrim) or right side (rtrim).

+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| Function name            | Arity      | Input types             | Output type         | Options class                          | Notes   |
+==========================+============+=========================+=====================+========================================+=========+
| ascii_ltrim              | Unary      | String-like             | String-like         | :struct:`TrimOptions`                  | \(1)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| ascii_ltrim_whitespace   | Unary      | String-like             | String-like         |                                        | \(2)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| ascii_rtrim              | Unary      | String-like             | String-like         | :struct:`TrimOptions`                  | \(1)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| ascii_rtrim_whitespace   | Unary      | String-like             | String-like         |                                        | \(2)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| ascii_trim               | Unary      | String-like             | String-like         | :struct:`TrimOptions`                  | \(1)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| ascii_trim_whitespace    | Unary      | String-like             | String-like         |                                        | \(2)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| utf8_ltrim               | Unary      | String-like             | String-like         | :struct:`TrimOptions`                  | \(3)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| utf8_ltrim_whitespace    | Unary      | String-like             | String-like         |                                        | \(4)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| utf8_rtrim               | Unary      | String-like             | String-like         | :struct:`TrimOptions`                  | \(3)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| utf8_rtrim_whitespace    | Unary      | String-like             | String-like         |                                        | \(4)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| utf8_trim                | Unary      | String-like             | String-like         | :struct:`TrimOptions`                  | \(3)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+
| utf8_trim_whitespace     | Unary      | String-like             | String-like         |                                        | \(4)    |
+--------------------------+------------+-------------------------+---------------------+----------------------------------------+---------+

* \(1) Only characters specified in :member:`TrimOptions::characters` will be
  trimmed off. Both the input string and the ``characters`` argument are
  interpreted as ASCII characters.

* \(2) Only trim off ASCII whitespace characters (``'\t'``, ``'\n'``, ``'\v'``,
  ``'\f'``, ``'\r'``  and ``' '``).

* \(3) Only characters specified in :member:`TrimOptions::characters` will be
  trimmed off.

* \(4) Only trim off Unicode whitespace characters.

String splitting
~~~~~~~~~~~~~~~~

These functions split strings into lists of strings.  All kernels can optionally
be configured with a ``max_splits`` and a ``reverse`` parameter, where
``max_splits == -1`` means no limit (the default).  When ``reverse`` is true,
the splitting is done starting from the end of the string; this is only relevant
when a positive ``max_splits`` is given.

+--------------------------+------------+-------------------------+-------------------+----------------------------------+---------+
| Function name            | Arity      | Input types             | Output type       | Options class                    | Notes   |
+==========================+============+=========================+===================+==================================+=========+
| ascii_split_whitespace   | Unary      | String-like             | List-like         | :struct:`SplitOptions`           | \(1)    |
+--------------------------+------------+-------------------------+-------------------+----------------------------------+---------+
| split_pattern            | Unary      | Binary- or String-like  | List-like         | :struct:`SplitPatternOptions`    | \(2)    |
+--------------------------+------------+-------------------------+-------------------+----------------------------------+---------+
| split_pattern_regex      | Unary      | Binary- or String-like  | List-like         | :struct:`SplitPatternOptions`    | \(3)    |
+--------------------------+------------+-------------------------+-------------------+----------------------------------+---------+
| utf8_split_whitespace    | Unary      | String-like             | List-like         | :struct:`SplitOptions`           | \(4)    |
+--------------------------+------------+-------------------------+-------------------+----------------------------------+---------+

* \(1) A non-zero length sequence of ASCII defined whitespace bytes
  (``'\t'``, ``'\n'``, ``'\v'``, ``'\f'``, ``'\r'``  and ``' '``) is seen
  as separator.

* \(2) The string is split when an exact pattern is found (the pattern itself
  is not included in the output).

* \(3) The string is split when a regex match is found (the matched
  substring itself is not included in the output).

* \(4) A non-zero length sequence of Unicode defined whitespace codepoints
  is seen as separator.

String component extraction
~~~~~~~~~~~~~~~~~~~~~~~~~~~

+--------------------+-------+------------------------+-------------+-----------------------------------+-------+
| Function name      | Arity | Input types            | Output type | Options class                     | Notes |
+====================+=======+========================+=============+===================================+=======+
| extract_regex      | Unary | Binary- or String-like | Struct      | :struct:`ExtractRegexOptions`     | \(1)  |
+--------------------+-------+------------------------+-------------+-----------------------------------+-------+
| extract_regex_span | Unary | Binary- or String-like | Struct      | :struct:`ExtractRegexSpanOptions` | \(2)  |
+--------------------+-------+------------------------+-------------+-----------------------------------+-------+

* \(1) Extract substrings defined by a regular expression using the Google RE2
  library.  The output struct field names refer to the named capture groups,
  e.g. 'letter' and 'digit' for the regular expression
  ``(?P<letter>[ab])(?P<digit>\\d)``.

* \(2) Extract the offset and length of substrings defined by a regular expression
  using the Google RE2 library.  The output struct field names refer to the named
  capture groups, e.g. 'letter' and 'digit' for the regular expression
  ``(?P<letter>[ab])(?P<digit>\\d)``. Each output struct field is a fixed size list
  of two integers: the index to the start of the captured group and the length
  of the captured group, respectively.

String joining
~~~~~~~~~~~~~~

These functions do the inverse of string splitting.

+--------------------------+---------+----------------------------------+------------------------+------------------------+-----------------------+---------+
| Function name            | Arity   | Input type 1                     | Input type 2           | Output type            | Options class         | Notes   |
+==========================+=========+==================================+========================+========================+=======================+=========+
| binary_join              | Binary  | List of Binary- or String-like   | String-like            | String-like            |                       | \(1)    |
+--------------------------+---------+----------------------------------+------------------------+------------------------+-----------------------+---------+
| binary_join_element_wise | Varargs | Binary- or String-like (varargs) | Binary- or String-like | Binary- or String-like | :struct:`JoinOptions` | \(2)    |
+--------------------------+---------+----------------------------------+------------------------+------------------------+-----------------------+---------+

* \(1) The first input must be an array, while the second can be a scalar or array.
  Each list of values in the first input is joined using each second input
  as separator.  If any input list is null or contains a null, the corresponding
  output will be null.

* \(2) All arguments are concatenated element-wise, with the last argument treated
  as the separator (scalars are recycled in either case). Null separators emit
  null. If any other argument is null, by default the corresponding output will be
  null, but it can instead either be skipped or replaced with a given string.

String Slicing
~~~~~~~~~~~~~~

This function transforms each sequence of the array to a subsequence, according
to start and stop indices, and a non-zero step (defaulting to 1).  Slicing
semantics follow Python slicing semantics: the start index is inclusive,
the stop index exclusive; if the step is negative, the sequence is followed
in reverse order.

+--------------------------+------------+-------------------------+-------------------------+--------------------------+---------+
| Function name            | Arity      | Input types             | Output type             | Options class            | Notes   |
+==========================+============+=========================+=========================+==========================+=========+
| binary_slice             | Unary      | Binary-like             | Binary-like             | :struct:`SliceOptions`   | \(1)    |
+--------------------------+------------+-------------------------+-------------------------+--------------------------+---------+
| utf8_slice_codeunits     | Unary      | String-like             | String-like             | :struct:`SliceOptions`   | \(2)    |
+--------------------------+------------+-------------------------+-------------------------+--------------------------+---------+

* \(1) Slice string into a substring defined by (``start``, ``stop``, ``step``)
  as given by :struct:`SliceOptions` where ``start`` and ``stop`` are measured
  in bytes. Null inputs emit null.
* \(2) Slice string into a substring defined by (``start``, ``stop``, ``step``)
  as given by :struct:`SliceOptions` where ``start`` and ``stop`` are measured
  in codeunits. Null inputs emit null.

Containment tests
~~~~~~~~~~~~~~~~~

+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| Function name         | Arity | Input types                       | Output type    | Options class                   | Notes |
+=======================+=======+===================================+================+=================================+=======+
| count_substring       | Unary | Binary- or String-like            | Int32 or Int64 | :struct:`MatchSubstringOptions` | \(1)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| count_substring_regex | Unary | Binary- or String-like            | Int32 or Int64 | :struct:`MatchSubstringOptions` | \(1)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| ends_with             | Unary | Binary- or String-like            | Boolean        | :struct:`MatchSubstringOptions` | \(2)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| find_substring        | Unary | Binary- and String-like           | Int32 or Int64 | :struct:`MatchSubstringOptions` | \(3)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| find_substring_regex  | Unary | Binary- and String-like           | Int32 or Int64 | :struct:`MatchSubstringOptions` | \(3)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| index_in              | Unary | Boolean, Null, Numeric, Temporal, | Int32          | :struct:`SetLookupOptions`      | \(4)  |
|                       |       | Binary- and String-like           |                |                                 |       |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| is_in                 | Unary | Boolean, Null, Numeric, Temporal, | Boolean        | :struct:`SetLookupOptions`      | \(5)  |
|                       |       | Binary- and String-like           |                |                                 |       |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| match_like            | Unary | Binary- or String-like            | Boolean        | :struct:`MatchSubstringOptions` | \(6)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| match_substring       | Unary | Binary- or String-like            | Boolean        | :struct:`MatchSubstringOptions` | \(7)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| match_substring_regex | Unary | Binary- or String-like            | Boolean        | :struct:`MatchSubstringOptions` | \(8)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| starts_with           | Unary | Binary- or String-like            | Boolean        | :struct:`MatchSubstringOptions` | \(2)  |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+

* \(1) Output is the number of occurrences of
  :member:`MatchSubstringOptions::pattern` in the corresponding input
  string. Output type is Int32 for Binary/String, Int64
  for LargeBinary/LargeString.

* \(2) Output is true iff :member:`MatchSubstringOptions::pattern`
  is a suffix/prefix of the corresponding input.

* \(3) Output is the index of the first occurrence of
  :member:`MatchSubstringOptions::pattern` in the corresponding input
  string, otherwise -1. Output type is Int32 for Binary/String, Int64
  for LargeBinary/LargeString.

* \(4) Output is the index of the corresponding input element in
  :member:`SetLookupOptions::value_set`, if found there.  Otherwise,
  output is null.

* \(5) Output is true iff the corresponding input element is equal to one
  of the elements in :member:`SetLookupOptions::value_set`.

* \(6) Output is true iff the SQL-style LIKE pattern
  :member:`MatchSubstringOptions::pattern` fully matches the
  corresponding input element. That is, ``%`` will match any number of
  characters, ``_`` will match exactly one character, and any other
  character matches itself. To match a literal percent sign or
  underscore, precede the character with a backslash.

* \(7) Output is true iff :member:`MatchSubstringOptions::pattern`
  is a substring of the corresponding input element.

* \(8) Output is true iff :member:`MatchSubstringOptions::pattern`
  matches the corresponding input element at any position.

Categorizations
~~~~~~~~~~~~~~~

+-------------------+------------+-------------------------+---------------------+------------------------+---------+
| Function name     | Arity      | Input types             | Output type         | Options class          | Notes   |
+===================+============+=========================+=====================+========================+=========+
| is_finite         | Unary      | Null, Numeric           | Boolean             |                        | \(1)    |
+-------------------+------------+-------------------------+---------------------+------------------------+---------+
| is_inf            | Unary      | Null, Numeric           | Boolean             |                        | \(2)    |
+-------------------+------------+-------------------------+---------------------+------------------------+---------+
| is_nan            | Unary      | Null, Numeric           | Boolean             |                        | \(3)    |
+-------------------+------------+-------------------------+---------------------+------------------------+---------+
| is_null           | Unary      | Any                     | Boolean             | :struct:`NullOptions`  | \(4)    |
+-------------------+------------+-------------------------+---------------------+------------------------+---------+
| is_valid          | Unary      | Any                     | Boolean             |                        | \(5)    |
+-------------------+------------+-------------------------+---------------------+------------------------+---------+
| true_unless_null  | Unary      | Any                     | Boolean             |                        | \(6)    |
+-------------------+------------+-------------------------+---------------------+------------------------+---------+

* \(1) Output is true iff the corresponding input element is finite (neither Infinity,
  -Infinity, nor NaN). Hence, for Decimal and integer inputs this always returns true.

* \(2) Output is true iff the corresponding input element is Infinity/-Infinity.
  Hence, for Decimal and integer inputs this always returns false.

* \(3) Output is true iff the corresponding input element is NaN.
  Hence, for Decimal and integer inputs this always returns false.

* \(4) Output is true iff the corresponding input element is null. NaN values
  can also be considered null by setting :member:`NullOptions::nan_is_null`.

* \(5) Output is true iff the corresponding input element is non-null, else false.

* \(6) Output is true iff the corresponding input element is non-null, else null.
       Mostly intended for expression simplification/guarantees.

.. _cpp-compute-scalar-selections:

Selecting / multiplexing
~~~~~~~~~~~~~~~~~~~~~~~~

For each "row" of input values, these functions emit one of the input values,
depending on a condition.

+------------------+------------+---------------------------------------------------+---------------------+---------+
| Function name    | Arity      | Input types                                       | Output type         | Notes   |
+==================+============+===================================================+=====================+=========+
| case_when        | Varargs    | Struct of Boolean (Arg 0), Any (rest)             | Input type          | \(1)    |
+------------------+------------+---------------------------------------------------+---------------------+---------+
| choose           | Varargs    | Integral (Arg 0), Fixed-width/Binary-like (rest)  | Input type          | \(2)    |
+------------------+------------+---------------------------------------------------+---------------------+---------+
| coalesce         | Varargs    | Any                                               | Input type          | \(3)    |
+------------------+------------+---------------------------------------------------+---------------------+---------+
| if_else          | Ternary    | Boolean (Arg 0), Any (rest)                       | Input type          | \(4)    |
+------------------+------------+---------------------------------------------------+---------------------+---------+

* \(1) This function acts like a SQL "case when" statement or switch-case. The
  input is a "condition" value, which is a struct of Booleans, followed by the
  values for each "branch". There must be either exactly one value argument for
  each child of the condition struct, or one more value argument than children
  (in which case we have an "else" or "default" value). The output is of the
  same type as the value inputs; each row will be the corresponding value from
  the first value datum for which the corresponding Boolean is true, or the
  corresponding value from the "default" input, or null otherwise.

  Note that currently, while all types are supported, dictionaries will be
  unpacked.

* \(2) The first input must be an integral type. The rest of the arguments can be
  any type, but must all be the same type or promotable to a common type. Each
  value of the first input (the 'index') is used as a zero-based index into the
  remaining arguments (i.e. index 0 is the second argument, index 1 is the third
  argument, etc.), and the value of the output for that row will be the
  corresponding value of the selected input at that row. If the index is null,
  then the output will also be null.

* \(3) Each row of the output will be the corresponding value of the first
  input which is non-null for that row, otherwise null.

* \(4) First input must be a Boolean scalar or array. Second and third inputs
  could be scalars or arrays and must be of the same type. Output is an array
  (or scalar if all inputs are scalar) of the same type as the second/ third
  input. If the nulls present on the first input, they will be promoted to the
  output, otherwise nulls will be chosen based on the first input values.

  Also see: :ref:`replace_with_mask <cpp-compute-vector-replace-functions>`.

Structural transforms
~~~~~~~~~~~~~~~~~~~~~

+---------------------+------------+-------------+------------------+------------------------------+--------+
| Function name       | Arity      | Input types | Output type      | Options class                | Notes  |
+=====================+============+=============+==================+==============================+========+
| list_value_length   | Unary      | List-like   | Int32 or Int64   |                              | \(1)   |
+---------------------+------------+-------------+------------------+------------------------------+--------+
| make_struct         | Varargs    | Any         | Struct           | :struct:`MakeStructOptions`  | \(2)   |
+---------------------+------------+-------------+------------------+------------------------------+--------+

* \(1) Each output element is the length of the corresponding input element
  (null if input is null).  Output type is Int32 for List, ListView, and
  FixedSizeList, Int64 for LargeList and LargeListView.

* \(2) The output struct's field types are the types of its arguments. The
  field names are specified using an instance of :struct:`MakeStructOptions`.
  The output shape will be scalar if all inputs are scalar, otherwise any
  scalars will be broadcast to arrays.

Conversions
~~~~~~~~~~~

A general conversion function named ``cast`` is provided which accepts a large
number of input and output types.  The type to cast to can be passed in a
:struct:`CastOptions` instance.  As an alternative, the same service is
provided by a concrete function :func:`~arrow::compute::Cast`.

+-----------------+------------+--------------------+------------------+--------------------------------+-------+
| Function name   | Arity      | Input types        | Output type      | Options class                  | Notes |
+=================+============+====================+==================+================================+=======+
| ceil_temporal   | Unary      | Temporal           | Temporal         | :struct:`RoundTemporalOptions` |       |
+-----------------+------------+--------------------+------------------+--------------------------------+-------+
| floor_temporal  | Unary      | Temporal           | Temporal         | :struct:`RoundTemporalOptions` |       |
+-----------------+------------+--------------------+------------------+--------------------------------+-------+
| round_temporal  | Unary      | Temporal           | Temporal         | :struct:`RoundTemporalOptions` |       |
+-----------------+------------+--------------------+------------------+--------------------------------+-------+
| cast            | Unary      | Many               | Variable         | :struct:`CastOptions`          |       |
+-----------------+------------+--------------------+------------------+--------------------------------+-------+
| strftime        | Unary      | Temporal           | String           | :struct:`StrftimeOptions`      | \(1)  |
+-----------------+------------+--------------------+------------------+--------------------------------+-------+
| strptime        | Unary      | String-like        | Timestamp        | :struct:`StrptimeOptions`      |       |
+-----------------+------------+--------------------+------------------+--------------------------------+-------+

The conversions available with ``cast`` are listed below.  In all cases, a
null input value is converted into a null output value.

* \(1) Output precision of ``%S`` (seconds) flag depends on the input timestamp
  precision. Timestamps with second precision are represented as integers while
  milliseconds, microsecond and nanoseconds are represented as fixed floating
  point numbers with 3, 6 and 9 decimal places respectively. To obtain integer
  seconds, cast to timestamp with second resolution.
  The character for the decimal point is localized according to the locale.
  See `detailed formatting documentation`_ for descriptions of other flags.

.. _detailed formatting documentation: https://howardhinnant.github.io/date/date.html#to_stream_formatting

**Truth value extraction**

+-----------------------------+------------------------------------+--------------+
| Input type                  | Output type                        | Notes        |
+=============================+====================================+==============+
| Binary- and String-like     | Boolean                            | \(1)         |
+-----------------------------+------------------------------------+--------------+
| Numeric                     | Boolean                            | \(2)         |
+-----------------------------+------------------------------------+--------------+

* \(1) Output is true iff the corresponding input value has non-zero length.

* \(2) Output is true iff the corresponding input value is non-zero.

**Same-kind conversion**

+-----------------------------+------------------------------------+--------------+
| Input type                  | Output type                        | Notes        |
+=============================+====================================+==============+
| Int32                       | 32-bit Temporal                    | \(1)         |
+-----------------------------+------------------------------------+--------------+
| Int64                       | 64-bit Temporal                    | \(1)         |
+-----------------------------+------------------------------------+--------------+
| (Large)Binary               | (Large)String                      | \(2)         |
+-----------------------------+------------------------------------+--------------+
| (Large)String               | (Large)Binary                      | \(3)         |
+-----------------------------+------------------------------------+--------------+
| Numeric                     | Numeric                            | \(4) \(5)    |
+-----------------------------+------------------------------------+--------------+
| 32-bit Temporal             | Int32                              | \(1)         |
+-----------------------------+------------------------------------+--------------+
| 64-bit Temporal             | Int64                              | \(1)         |
+-----------------------------+------------------------------------+--------------+
| Temporal                    | Temporal                           | \(4) \(5)    |
+-----------------------------+------------------------------------+--------------+

* \(1) No-operation cast: the raw values are kept identical, only
  the type is changed.

* \(2) Validates the contents if :member:`CastOptions::allow_invalid_utf8`
  is false.

* \(3) No-operation cast: only the type is changed.

* \(4) Overflow and truncation checks are enabled depending on
  the given :struct:`CastOptions`.

* \(5) Not all such casts have been implemented.

**String representations**

+-----------------------------+------------------------------------+---------+
| Input type                  | Output type                        | Notes   |
+=============================+====================================+=========+
| Boolean                     | String-like                        |         |
+-----------------------------+------------------------------------+---------+
| Numeric                     | String-like                        |         |
+-----------------------------+------------------------------------+---------+

**Generic conversions**

+-----------------------------+------------------------------------+---------+
| Input type                  | Output type                        | Notes   |
+=============================+====================================+=========+
| Dictionary                  | Dictionary value type              | \(1)    |
+-----------------------------+------------------------------------+---------+
| Extension                   | Extension storage type             |         |
+-----------------------------+------------------------------------+---------+
| Struct                      | Struct                             | \(2)    |
+-----------------------------+------------------------------------+---------+
| List-like                   | List-like or (Large)ListView       | \(3)    |
+-----------------------------+------------------------------------+---------+
| (Large)ListView             | List-like or (Large)ListView       | \(4)    |
+-----------------------------+------------------------------------+---------+
| Map                         | Map or List of two-field struct    | \(5)    |
+-----------------------------+------------------------------------+---------+
| Null                        | Any                                |         |
+-----------------------------+------------------------------------+---------+
| Any                         | Extension                          | \(6)    |
+-----------------------------+------------------------------------+---------+

* \(1) The dictionary indices are unchanged, the dictionary values are
  cast from the input value type to the output value type (if a conversion
  is available).

* \(2) Fields are casted primarily by matching field names between the input
  and output type. For duplicate field names, their relative order is preserved,
  with each input field used for one or zero output fields. This allows casting
  to a subset or re-ordered field names. If a nullable field in the output type
  has no matching field name in the input type, it will be filled with nulls.
  Casting a field from nullable to not null is supported if the input data
  contains zero nulls.

* \(3) The list offsets are unchanged, the list values are cast from the
  input value type to the output value type (if a conversion is
  available). If the output type is (Large)ListView, then sizes are
  derived from the offsets.

* \(4) If output type is list-like, offsets (consequently, the values array)
  might have to be rebuilt to be sorted and spaced adequately. If output type is
  a list-view type, the offsets and sizes are unchanged. In any case, the list
  values are cast from the input value type to the output value type (if a
  conversion is available).

* \(5) Offsets are unchanged, the keys and values are cast from respective input
  to output types (if a conversion is available). If output type is a list of
  struct, the key field is output as the first field and the value field the
  second field, regardless of field names chosen.

* \(6) Any input type that can be cast to the resulting extension's storage type.
  This excludes extension types, unless being cast to the same extension type.

Temporal component extraction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These functions extract datetime components (year, month, day, etc) from temporal types.
For timestamps inputs with non-empty timezone, localized timestamp components will be returned.

+--------------------+------------+-------------------+---------------+----------------------------+-------+
| Function name      | Arity      | Input types       | Output type   | Options class              | Notes |
+====================+============+===================+===============+============================+=======+
| day                | Unary      | Temporal          | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| day_of_week        | Unary      | Temporal          | Int64         | :struct:`DayOfWeekOptions` | \(1)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| day_of_year        | Unary      | Temporal          | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| hour               | Unary      | Timestamp, Time   | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| is_dst             | Unary      | Timestamp         | Boolean       |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| iso_week           | Unary      | Temporal          | Int64         |                            | \(2)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| iso_year           | Unary      | Temporal          | Int64         |                            | \(2)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| iso_calendar       | Unary      | Temporal          | Struct        |                            | \(3)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| is_leap_year       | Unary      | Timestamp, Date   | Boolean       |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| microsecond        | Unary      | Timestamp, Time   | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| millisecond        | Unary      | Timestamp, Time   | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| minute             | Unary      | Timestamp, Time   | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| month              | Unary      | Temporal          | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| nanosecond         | Unary      | Timestamp, Time   | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| quarter            | Unary      | Temporal          | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| second             | Unary      | Timestamp, Time   | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| subsecond          | Unary      | Timestamp, Time   | Float64       |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| us_week            | Unary      | Temporal          | Int64         |                            | \(4)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| us_year            | Unary      | Temporal          | Int64         |                            | \(4)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| week               | Unary      | Timestamp         | Int64         | :struct:`WeekOptions`      | \(5)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| year               | Unary      | Temporal          | Int64         |                            |       |
+--------------------+------------+-------------------+---------------+----------------------------+-------+
| year_month_day     | Unary      | Temporal          | Struct        |                            | \(6)  |
+--------------------+------------+-------------------+---------------+----------------------------+-------+

* \(1) Outputs the number of the day of the week. By default week begins on Monday
  represented by 0 and ends on Sunday represented by 6. Day numbering can start with 0 or 1 based on
  :member:`DayOfWeekOptions::count_from_zero` parameter. :member:`DayOfWeekOptions::week_start` can be
  used to set the starting day of the week using ISO convention (Monday=1, Sunday=7).
  :member:`DayOfWeekOptions::week_start` parameter is not affected by :member:`DayOfWeekOptions::count_from_zero`.

* \(2) First ISO week has the majority (4 or more) of it's days in January. ISO year
  starts with the first ISO week. ISO week starts on Monday.
  See `ISO 8601 week date definition`_ for more details.

* \(3) Output is a ``{"iso_year": output type, "iso_week": output type, "iso_day_of_week":  output type}`` Struct.

* \(4) First US week has the majority (4 or more) of its days in January. US year
  starts with the first US week. US week starts on Sunday.

* \(5) Returns week number allowing for setting several parameters.
  If :member:`WeekOptions::week_starts_monday` is true, the week starts with Monday, else Sunday if false.
  If :member:`WeekOptions::count_from_zero` is true, dates from the current year that fall into the last ISO week
  of the previous year are numbered as week 0, else week 52 or 53 if false.
  If :member:`WeekOptions::first_week_is_fully_in_year` is true, the first week (week 1) must fully be in January;
  else if false, a week that begins on December 29, 30, or 31 is considered the first week of the new year.

* \(6) Output is a ``{"year": int64(), "month": int64(), "day": int64()}`` Struct.

.. _ISO 8601 week date definition: https://en.wikipedia.org/wiki/ISO_week_date#First_week

Temporal difference
~~~~~~~~~~~~~~~~~~~

These functions compute the difference between two timestamps in the
specified unit. The difference is determined by the number of
boundaries crossed, not the span of time. For example, the difference
in days between 23:59:59 on one day and 00:00:01 on the next day is
one day (since midnight was crossed), not zero days (even though less
than 24 hours elapsed). Additionally, if the timestamp has a defined
timezone, the difference is calculated in the local timezone. For
instance, the difference in years between "2019-12-31 18:00:00-0500"
and "2019-12-31 23:00:00-0500" is zero years, because the local year
is the same, even though the UTC years would be different.

+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| Function name                   | Arity      | Input types       | Output type           | Options class              |
+=================================+============+===================+=======================+============================+
| day_time_interval_between       | Binary     | Temporal          | DayTime interval      |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| days_between                    | Binary     | Timestamp, Date   | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| hours_between                   | Binary     | Temporal          | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| microseconds_between            | Binary     | Temporal          | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| milliseconds_between            | Binary     | Temporal          | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| minutes_between                 | Binary     | Temporal          | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| month_day_nano_interval_between | Binary     | Temporal          | MonthDayNano interval |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| month_interval_between          | Binary     | Timestamp, Date   | Month interval        |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| nanoseconds_between             | Binary     | Temporal          | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| quarters_between                | Binary     | Timestamp, Date   | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| seconds_between                 | Binary     | Temporal          | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| weeks_between                   | Binary     | Timestamp, Date   | Int64                 | :struct:`DayOfWeekOptions` |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+
| years_between                   | Binary     | Timestamp, Date   | Int64                 |                            |
+---------------------------------+------------+-------------------+-----------------------+----------------------------+

Timezone handling
~~~~~~~~~~~~~~~~~

``assume_timezone`` function is meant to be used when an external system produces
"timezone-naive" timestamps which need to be converted to "timezone-aware"
timestamps (see for example the `definition
<https://docs.python.org/3/library/datetime.html#aware-and-naive-objects>`__
in the Python documentation).

Input timestamps are assumed to be relative to the timezone given in
:member:`AssumeTimezoneOptions::timezone`. They are converted to
UTC-relative timestamps with the timezone metadata set to the above value.
An error is returned if the timestamps already have the timezone metadata set.

``local_timestamp`` function converts UTC-relative timestamps to local "timezone-naive"
timestamps. The timezone is taken from the timezone metadata of the input
timestamps. This function is the inverse of ``assume_timezone``. Please note:
**all temporal functions already operate on timestamps as if they were in local
time of the metadata provided timezone**. Using ``local_timestamp`` is only meant to be
used when an external system expects local timestamps.

+-----------------+-------+-------------+---------------+---------------------------------+-------+
| Function name   | Arity | Input types | Output type   | Options class                   | Notes |
+=================+=======+=============+===============+=================================+=======+
| assume_timezone | Unary | Timestamp   | Timestamp     | :struct:`AssumeTimezoneOptions` | \(1)  |
+-----------------+-------+-------------+---------------+---------------------------------+-------+
| local_timestamp | Unary | Timestamp   | Timestamp     |                                 | \(2)  |
+-----------------+-------+-------------+---------------+---------------------------------+-------+

* \(1) In addition to the timezone value, :struct:`AssumeTimezoneOptions`
  allows choosing the behaviour when a timestamp is ambiguous or nonexistent
  in the given timezone (because of DST shifts).

Random number generation
~~~~~~~~~~~~~~~~~~~~~~~~

This function generates an array of uniformly-distributed double-precision numbers
in range [0, 1). The options provide the length of the output and the algorithm for
generating the random numbers, using either a seed or a system-provided, platform-specific
random generator.

+--------------------+------------+---------------+-------------------------+
| Function name      | Arity      | Output type   | Options class           |
+====================+============+===============+=========================+
| random             | Nullary    | Float64       | :struct:`RandomOptions` |
+--------------------+------------+---------------+-------------------------+


Array-wise ("vector") functions
-------------------------------

Cumulative Functions
~~~~~~~~~~~~~~~~~~~~

Cumulative functions are vector functions that perform a running accumulation on
their input using a given binary associative operation with an identity element
(a monoid) and output an array containing the corresponding intermediate running
values. The input is expected to be of numeric type. By default these functions
do not detect overflow. They are also available in an overflow-checking variant,
suffixed ``_checked``, which returns an ``Invalid`` :class:`Status` when
overflow is detected.

+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| Function name           | Arity | Input types | Output type | Options class                  | Notes     |
+=========================+=======+=============+=============+================================+===========+
| cumulative_sum          | Unary | Numeric     | Numeric     | :struct:`CumulativeOptions`    | \(1)      |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| cumulative_sum_checked  | Unary | Numeric     | Numeric     | :struct:`CumulativeOptions`    | \(1)      |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| cumulative_prod         | Unary | Numeric     | Numeric     | :struct:`CumulativeOptions`    | \(1)      |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| cumulative_prod_checked | Unary | Numeric     | Numeric     | :struct:`CumulativeOptions`    | \(1)      |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| cumulative_max          | Unary | Numeric     | Numeric     | :struct:`CumulativeOptions`    | \(1)      |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| cumulative_min          | Unary | Numeric     | Numeric     | :struct:`CumulativeOptions`    | \(1)      |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| cumulative_mean         | Unary | Numeric     | Float64     | :struct:`CumulativeOptions`    | \(1) \(2) |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+

* \(1) CumulativeOptions has two optional parameters. The first parameter
  :member:`CumulativeOptions::start` is a starting value for the running
  accumulation. It has a default value of 0 for ``sum``, 1 for ``prod``, min of
  input type for ``max``, and max of input type for ``min``. Specified values of
  ``start`` must be castable to the input type. The second parameter
  :member:`CumulativeOptions::skip_nulls` is a boolean. When set to
  false (the default), the first encountered null is propagated. When set to
  true, each null in the input produces a corresponding null in the output and
  doesn't affect the accumulation forward.

* \(2) :member:`CumulativeOptions::start` is ignored.

Statistical functions
~~~~~~~~~~~~~~~~~~~~~

+-------------------------+-------+-------------+-------------+--------------------------------+-----------+
| Function name           | Arity | Input types | Output type | Options class                  | Notes     |
+=========================+=======+=============+=============+================================+===========+
| winsorize               | Unary | Numeric     | Numeric     | :struct:`WinsorizeOptions`     | \(1)      |
+-------------------------+-------+-------------+-------------+--------------------------------+-----------+

* \(1) Clamp values in the lower and upper quantiles to reduce the statistical
  influence of outliers. The quantiles can be configured in :struct:`WinsorizeOptions`.

Associative transforms
~~~~~~~~~~~~~~~~~~~~~~

+-------------------+-------+-----------------------------------+-------------+-------+
| Function name     | Arity | Input types                       | Output type | Notes |
+===================+=======+===================================+=============+=======+
| dictionary_encode | Unary | Boolean, Null, Numeric,           | Dictionary  | \(1)  |
|                   |       | Temporal, Binary- and String-like |             |       |
+-------------------+-------+-----------------------------------+-------------+-------+
| unique            | Unary | Boolean, Null, Numeric,           | Input type  | \(2)  |
|                   |       | Temporal, Binary- and String-like |             |       |
+-------------------+-------+-----------------------------------+-------------+-------+
| value_counts      | Unary | Boolean, Null, Numeric,           | Input type  | \(3)  |
|                   |       | Temporal, Binary- and String-like |             |       |
+-------------------+-------+-----------------------------------+-------------+-------+

* \(1) Output is ``Dictionary(Int32, input type)``. It is a no-op if input is
  already a Dictionary array.

* \(2) Duplicates are removed from the output while the original order is
  maintained.

* \(3) Output is a ``{"values": input type, "counts": Int64}`` Struct.
  Each output element corresponds to a unique value in the input, along
  with the number of times this value has appeared.

Selections
~~~~~~~~~~

These functions select and return a subset of their input.

+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+
| Function name       | Arity  | Input type 1   | Input type 2 | Output type         | Options class                       | Notes |
+=====================+========+================+==============+=====================+=====================================+=======+
| array_filter        | Binary | Any            | Boolean      | Input type 1        | :struct:`FilterOptions`             | \(2)  |
+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+
| array_take          | Binary | Any            | Integer      | Input type 1        | :struct:`TakeOptions`               | \(3)  |
+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+
| drop_null           | Unary  | Any            |              | Input type 1        |                                     | \(1)  |
+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+
| filter              | Binary | Any            | Boolean      | Input type 1        | :struct:`FilterOptions`             | \(2)  |
+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+
| inverse_permutation | Unary  | Signed Integer |              | Signed Integer \(4) | :struct:`InversePermutationOptions` | \(5)  |
+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+
| scatter             | Binary | Any            | Integer      | Input type 1        | :struct:`ScatterOptions`            | \(6)  |
+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+
| take                | Binary | Any            | Integer      | Input type 1        | :struct:`TakeOptions`               | \(3)  |
+---------------------+--------+----------------+--------------+---------------------+-------------------------------------+-------+

* \(1) Each element in the input is appended to the output iff it is non-null.
  If the input is a record batch or table, any null value in a column drops
  the entire row.

* \(2) Each element in input 1 (the values) is appended to the output iff
  the corresponding element in input 2 (the filter) is true.  How
  nulls in the filter are handled can be configured using FilterOptions.

* \(3) For each element *i* in input 2 (the indices), the *i*'th element
  in input 1 (the values) is appended to the output.

* \(4) The output type is specified in :struct:`InversePermutationOptions`.

* \(5) For *indices[i] = x*, *inverse_permutation[x] = i*. And *inverse_permutation[x]
  = null* if *x* does not appear in the input indices. Indices must be in the range
  of *[0, max_index]*, or null, which will be ignored. If multiple indices point to the
  same value, the last one is used.

* \(6) For *indices[i] = x*, *output[x] = values[i]*. And *output[x] = null*
  if *x* does not appear in the input indices. Indices must be in the range
  of *[0, max_index]*, or null, in which case the corresponding value will be
  ignored. If multiple indices point to the same value, the last one is used.

Containment tests
~~~~~~~~~~~~~~~~~

This function returns the indices at which array elements are non-null and non-zero.

+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+
| Function name         | Arity | Input types                       | Output type    | Options class                   | Notes |
+=======================+=======+===================================+================+=================================+=======+
| indices_nonzero       | Unary | Boolean, Null, Numeric, Decimal   | UInt64         |                                 |       |
+-----------------------+-------+-----------------------------------+----------------+---------------------------------+-------+

Sorts and partitions
~~~~~~~~~~~~~~~~~~~~

By default, in these functions, nulls are considered greater than any other value
(they will be sorted or partitioned at the end of the array).  Floating-point
NaN values are considered greater than any other non-null value, but smaller
than nulls.  This behaviour can be changed using the ``null_placement`` setting
in the respective option classes.

.. note::
   Binary- and String-like inputs are ordered lexicographically as bytestrings,
   even for String types.

+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+
| Function name         | Arity      | Input types                                             | Output type       | Options class                 | Notes          |
+=======================+============+=========================================================+===================+===============================+================+
| array_sort_indices    | Unary      | Boolean, Numeric, Temporal, Binary- and String-like     | UInt64            | :struct:`ArraySortOptions`    | \(1) \(2)      |
+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+
| partition_nth_indices | Unary      | Boolean, Numeric, Temporal, Binary- and String-like     | UInt64            | :struct:`PartitionNthOptions` | \(3)           |
+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+
| rank                  | Unary      | Boolean, Numeric, Temporal, Binary- and String-like     | UInt64            | :struct:`RankOptions`         | \(4)           |
+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+
| rank_normal           | Unary      | Boolean, Numeric, Temporal, Binary- and String-like     | Float64           | :struct:`RankQuantileOptions` | \(5)           |
+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+
| rank_quantile         | Unary      | Boolean, Numeric, Temporal, Binary- and String-like     | Float64           | :struct:`RankQuantileOptions` | \(5)           |
+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+
| select_k_unstable     | Unary      | Boolean, Numeric, Temporal, Binary- and String-like     | UInt64            | :struct:`SelectKOptions`      | \(6) \(7)      |
+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+
| sort_indices          | Unary      | Boolean, Numeric, Temporal, Binary- and String-like     | UInt64            | :struct:`SortOptions`         | \(1) \(6)      |
+-----------------------+------------+---------------------------------------------------------+-------------------+-------------------------------+----------------+


* \(1) The output is an array of indices into the input, that define a
  stable sort of the input.

* \(2) The input must be an array. The default order is ascending.

* \(3) The output is an array of indices into the input array, that define
  a partial non-stable sort such that the *N*'th index points to the *N*'th
  element in sorted order, and all indices before the *N*'th point to
  elements less or equal to elements at or after the *N*'th (similar to
  :func:`std::nth_element`).  *N* is given in
  :member:`PartitionNthOptions::pivot`.

* \(4) The output is a one-based numerical array of ranks.

* \(5) The output of ``rank_quantile`` is an array of quantiles strictly between
  0 and 1. The ouput of ``rank_normal`` is an array of finite real values
  corresponding to points in the normal distribution that reflect the input's
  quantile ranks.

* \(6) The input can be an array, chunked array, record batch or
  table. If the input is a record batch or table, one or more sort
  keys must be specified.

* \(7) The output is an array of indices into the input, that define a
  non-stable sort of the input.

.. _cpp-compute-vector-structural-transforms:

Structural transforms
~~~~~~~~~~~~~~~~~~~~~

+---------------------+------------+-------------------------------------+------------------+------------------------------+--------+
| Function name       | Arity      | Input types                         | Output type      | Options class                | Notes  |
+=====================+============+=====================================+==================+==============================+========+
| list_element        | Binary     | List-like (Arg 0), Integral (Arg 1) | List value type  |                              | \(1)   |
+---------------------+------------+-------------------------------------+------------------+------------------------------+--------+
| list_flatten        | Unary      | List-like                           | List value type  |                              | \(2)   |
+---------------------+------------+-------------------------------------+------------------+------------------------------+--------+
| list_parent_indices | Unary      | List-like                           | Int64            |                              | \(3)   |
+---------------------+------------+-------------------------------------+------------------+------------------------------+--------+
| list_slice          | Unary      | List-like                           | List-like        | :struct:`ListSliceOptions`   | \(4)   |
+---------------------+------------+-------------------------------------+------------------+------------------------------+--------+
| map_lookup          | Unary      | Map                                 | Computed         | :struct:`MapLookupOptions`   | \(5)   |
+---------------------+------------+-------------------------------------+------------------+------------------------------+--------+
| struct_field        | Unary      | Struct or Union                     | Computed         | :struct:`StructFieldOptions` | \(6)   |
+---------------------+------------+-------------------------------------+------------------+------------------------------+--------+

* \(1) Output is an array of the same length as the input list array. The
  output values are the values at the specified index of each child list.

* \(2) The top level of nesting is removed: all values in the list child array,
  including nulls, are appended to the output.  However, nulls in the parent
  list array are discarded.

* \(3) For each value in the list child array, the index at which it is found
  in the list-like array is appended to the output. Indices of null lists in the
  parent array might still be present in the output if they are non-empty null
  lists. If the parent is a list-view, child array values that are not used by
  any non-null list-view are null in the output.

* \(4) For each list element, compute the slice of that list element, then
  return another list-like array of those slices. Can return either a
  fixed or variable size list-like array, as determined by options provided.

* \(5) Extract either the ``FIRST``, ``LAST`` or ``ALL`` items from a
  map whose key match the given query key passed via options.
  The output type is an Array of items for the ``FIRST``/``LAST`` options
  and an Array of List of items for the ``ALL`` option.

* \(6) Extract a child value based on a sequence of indices passed in
  the options. The validity bitmap of the result will be the
  intersection of all intermediate validity bitmaps. For example, for
  an array with type ``struct<a: int32, b: struct<c: int64, d:
  float64>>``:

  * An empty sequence of indices yields the original value unchanged.
  * The index ``0`` yields an array of type ``int32`` whose validity
    bitmap is the intersection of the bitmap for the outermost struct
    and the bitmap for the child ``a``.
  * The index ``1, 1`` yields an array of type ``float64`` whose
    validity bitmap is the intersection of the bitmaps for the
    outermost struct, for struct ``b``, and for the child ``d``.

  For unions, a validity bitmap is synthesized based on the type
  codes. Also, the index is always the child index and not a type code.
  Hence for array with type ``sparse_union<2: int32, 7: utf8>``:

  * The index ``0`` yields an array of type ``int32``, which is valid
    at an index *n* if and only if the child array ``a`` is valid at
    index *n* and the type code at index *n* is 2.
  * The indices ``2`` and ``7`` are invalid.

.. _cpp-compute-vector-replace-functions:

Replace functions
~~~~~~~~~~~~~~~~~

These functions create a copy of the first input with some elements
replaced, based on the remaining inputs.

+--------------------------+------------+-----------------------+--------------+--------------+--------------+-------+
| Function name            | Arity      | Input type 1          | Input type 2 | Input type 3 | Output type  | Notes |
+==========================+============+=======================+==============+==============+==============+=======+
| fill_null_backward       | Unary      | Fixed-width or binary |              |              | Input type 1 | \(1)  |
+--------------------------+------------+-----------------------+--------------+--------------+--------------+-------+
| fill_null_forward        | Unary      | Fixed-width or binary |              |              | Input type 1 | \(1)  |
+--------------------------+------------+-----------------------+--------------+--------------+--------------+-------+
| replace_with_mask        | Ternary    | Fixed-width or binary | Boolean      | Input type 1 | Input type 1 | \(2)  |
+--------------------------+------------+-----------------------+--------------+--------------+--------------+-------+

* \(1) Valid values are carried forward/backward to fill null values.
* \(2) Each element in input 1 for which the corresponding Boolean in input 2
  is true is replaced with the next value from input 3. A null in input 2
  results in a corresponding null in the output.

  Also see: :ref:`if_else <cpp-compute-scalar-selections>`.

Pairwise functions
~~~~~~~~~~~~~~~~~~
Pairwise functions are unary vector functions that perform a binary operation on
a pair of elements in the input array, typically on adjacent elements. The n-th
output is computed by applying the binary operation to the n-th and (n-p)-th inputs,
where p is the period. The default period is 1, in which case the binary
operation is applied to adjacent pairs of inputs. The period can also be
negative, in which case the n-th output is computed by applying the binary
operation to the n-th and (n+abs(p))-th inputs.

+------------------------+-------+----------------------+----------------------+--------------------------------+----------+
| Function name          | Arity | Input types          | Output type          | Options class                  | Notes    |
+========================+=======+======================+======================+================================+==========+
| pairwise_diff          | Unary | Numeric/Temporal     | Numeric/Temporal     | :struct:`PairwiseOptions`      | \(1)(2)  |
+------------------------+-------+----------------------+----------------------+--------------------------------+----------+
| pairwise_diff_checked  | Unary | Numeric/Temporal     | Numeric/Temporal     | :struct:`PairwiseOptions`      | \(1)(3)  |
+------------------------+-------+----------------------+----------------------+--------------------------------+----------+

* \(1) Computes the first order difference of an array, It internally calls
  the scalar function ``Subtract`` (or the checked variant) to compute
  differences, so its behavior and supported types are the same as
  ``Subtract``. The period can be specified in :struct:`PairwiseOptions`.
* \(2) Wraps around the result when overflow is detected.
* \(3) Returns an ``Invalid`` :class:`Status` when overflow is detected.