1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148
|
// This test checks that result of shuffle(T, int) and shuffle(T, T, int) with constant indexes is shufflevector.
// RUN: %{ispc} --target=sse4.2-i8x16 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=sse4.2-i16x8 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=sse4.2-i32x4 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=sse4.2-i32x8 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx1-i32x4 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx1-i32x8 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx1-i32x16 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx1-i64x4 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx2-i8x32 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx2-i16x16 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx2-i32x4 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx2-i32x8 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx2-i32x16 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx2-i64x4 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx512skx-x4 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx512skx-x8 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx512skx-x16 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx512skx-x32 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s --check-prefix=CHECK-AVX512-32
// RUN: %{ispc} --target=avx512icl-x4 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx512icl-x8 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx512icl-x16 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s
// RUN: %{ispc} --target=avx512icl-x32 --nowrap -O2 --emit-llvm-text %s -o - | FileCheck %s --check-prefix=CHECK-AVX512-32
// For avx512-x32/x64 targets shuf1_int64 and shuf1_double uses implementation based on masked.gather.
// Optimizer can eliminate shufflevector completely and use GEP+gather so we're not checking it here.
// REQUIRES: X86_ENABLED
template <typename T>
unmasked void shuf1(uniform T a[], uniform T ret[], int perm) {
ret[programIndex] = shuffle(a[programIndex], perm);
}
// CHECK-LABEL: shuf1_int8
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf1_int8
// CHECK-AVX512-32: shufflevector
unmasked void shuf1_int8(uniform int8 a[], uniform int8 ret[]) {
shuf1(a, ret, programCount - 1 - programIndex);
}
// CHECK-LABEL: shuf1_int16
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf1_int16
// CHECK-AVX512-32: shufflevector
unmasked void shuf1_int16(uniform int16 a[], uniform int16 ret[]) {
shuf1(a, ret, programCount - 1 - programIndex);
}
// CHECK-LABEL: shuf1_float16
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf1_float16
// CHECK-AVX512-32: shufflevector
unmasked void shuf1_float16(uniform float16 a[], uniform float16 ret[]) {
shuf1(a, ret, programCount - 1 - programIndex);
}
// CHECK-LABEL: shuf1_int
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf1_int
// CHECK-AVX512-32: shufflevector
unmasked void shuf1_int(uniform int a[], uniform int ret[]) {
shuf1(a, ret, programCount - 1 - programIndex);
}
// CHECK-LABEL: shuf1_float
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf1_float
// CHECK-AVX512-32: shufflevector
unmasked void shuf1_float(uniform float a[], uniform float ret[]) {
shuf1(a, ret, programCount - 1 - programIndex);
}
// CHECK-LABEL: shuf1_int64
// CHECK: shufflevector
unmasked void shuf1_int64(uniform int64 a[], uniform int64 ret[]) {
shuf1(a, ret, programCount - 1 - programIndex);
}
// CHECK-LABEL: shuf1_double
// CHECK: shufflevector
unmasked void shuf1_double(uniform double a[], uniform double ret[]) {
shuf1(a, ret, programCount - 1 - programIndex);
}
template <typename T>
unmasked void shuf2(uniform T a[], uniform T ret[], int perm) {
T aa = a[programIndex];
T bb = aa + programCount;
ret[programIndex] = shuffle(aa, bb, perm);
}
// CHECK-LABEL: shuf2_int8
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf2_int8
// CHECK-AVX512-32: shufflevector
unmasked void shuf2_int8(uniform int8 a[], uniform int8 ret[]) {
shuf2(a, ret, programCount + 1);
}
// CHECK-LABEL: shuf2_int16
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf2_int16
// CHECK-AVX512-32: shufflevector
unmasked void shuf2_int16(uniform int16 a[], uniform int16 ret[]) {
shuf2(a, ret, programCount + 1);
}
// CHECK-LABEL: shuf2_float16
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf2_float16
// CHECK-AVX512-32: shufflevector
unmasked void shuf2_float16(uniform float16 a[], uniform float16 ret[]) {
shuf2(a, ret, programCount + 1);
}
// CHECK-LABEL: shuf2_int
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf2_int
// CHECK-AVX512-32: shufflevector
unmasked void shuf2_int(uniform int a[], uniform int ret[]) {
shuf2(a, ret, programCount + 1);
}
// CHECK-LABEL: shuf2_float
// CHECK: shufflevector
// CHECK-AVX512-32-LABEL: shuf2_float
// CHECK-AVX512-32: shufflevector
unmasked void shuf2_float(uniform float a[], uniform float ret[]) {
shuf2(a, ret, programCount + 1);
}
// CHECK-LABEL: shuf2_int64
// CHECK: shufflevector
unmasked void shuf2_int64(uniform int64 a[], uniform int64 ret[]) {
shuf2(a, ret, programCount + 1);
}
// CHECK-LABEL: shuf2_double
// CHECK: shufflevector
unmasked void shuf2_double(uniform double a[], uniform double ret[]) {
shuf2(a, ret, programCount + 1);
}
|