1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170
|
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# substrait::{ExtensionTypeVariation, ExtensionType}s
# for wrapping types which appear in the arrow type system but
# are not first-class in substrait. These include:
# - null
# - unsigned integers
# - half-precision floating point numbers
# - 32-bit times and 64-bit dates
# - timestamps with units other than microseconds
# - timestamps with timezones other than UTC
# - 256-bit decimals
# - sparse and dense unions
# - dictionary encoded types
# - durations
# - string and binary with 64 bit offsets
# - list with 64-bit offsets
# - interval<months: i32>
# - interval<days: i32, millis: i32>
# - interval<months: i32, days: i32, nanos: i64>
# - arrow::ExtensionTypes
#
# These types fall into several categories of behavior:
# Certain Arrow data types are, from Substrait's point of view, encodings.
# These include dictionary, the view types (e.g. binary view, list view),
# and REE.
#
# These types are not logically distinct from the type they are encoding.
# Specifically, the types meet the following criteria:
# * There is no value in the decoded type that cannot be represented
# as a value in the encoded type and vice versa.
# * Functions have the same meaning when applied to the encoded type
#
# Note: if two types have a different range (e.g. string and large_string) then
# they do not satisfy the above criteria and are not encodings.
#
# These types will never have a Substrait equivalent. In the Substrait point
# of view these are execution details.
# The following types are encodings:
# binary_view
# list_view
# dictionary
# ree
# Arrow-cpp's Substrait serde does not yet handle parameterized UDTs. This means
# the following types are not yet supported but may be supported in the future.
# We define them below in case other implementations support them in the meantime.
# decimal256
# large_list
# fixed_size_list
# duration
# Other types are not encodings, but are not first-class in Substrait. These
# types are often similar to existing Substrait types but define a different range
# of values. For example, unsigned integer types are very similar to their integer
# counterparts, but have a different range of values. These types are defined here
# as extension types.
#
# A full description of the types, along with their specified range, can be found
# in Schema.fbs
#
# Consumers should take care when supporting the below types. Should Substrait decide
# later to support these types, the consumer will need to make sure to continue supporting
# the extension type names as aliases for proper backwards compatibility.
types:
- name: "null"
structure: {}
- name: interval_month
structure:
months: i32
- name: interval_day_milli
structure:
days: i32
millis: i32
- name: interval_month_day_nano
structure:
months: i32
days: i32
nanos: i64
# All unsigned integer literals are encoded as user defined literals with
# a google.protobuf.UInt64Value message.
- name: u8
structure: {}
- name: u16
structure: {}
- name: u32
structure: {}
- name: u64
structure: {}
# fp16 literals are encoded as user defined literals with
# a google.protobuf.UInt32Value message where the lower 16 bits are
# the fp16 value.
- name: fp16
structure: {}
# 64-bit integers are big. Even though date64 stores ms and not days it
# can still represent about 50x more dates than date32. Since it has a
# different range of values, it is an extension type.
#
# date64 literals are encoded as user defined literals with
# a google.protobuf.Int64Value message.
- name: date_millis
structure: {}
# time literals are encoded as user defined literals with
# a google.protobuf.Int32Value message (for time_seconds/time_millis)
# or a google.protobuf.Int64Value message (for time_nanos).
- name: time_seconds
structure: {}
- name: time_millis
structure: {}
- name: time_nanos
structure: {}
# Large string literals are encoded using a
# google.protobuf.StringValue message.
- name: large_string
structure: {}
# Large binary literals are encoded using a
# google.protobuf.BytesValue message.
- name: large_binary
structure: {}
# We cannot generate these today because they are parameterized UDTs and
# substrait-cpp does not yet support parameterized UDTs.
- name: decimal256
structure: {}
parameters:
- name: precision
type: integer
min: 0
max: 76
- name: scale
type: integer
min: 0
max: 76
- name: large_list
structure: {}
parameters:
- name: value_type
type: dataType
- name: fixed_size_list
structure: {}
parameters:
- name: value_type
type: dataType
- name: dimension
type: integer
min: 0
- name: duration
structure: {}
parameters:
- name: unit
type: string
|