1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139
|
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -O3 -x86-asm-syntax=intel -mtriple=x86_64 -mcpu=skylake-avx512 < %s | FileCheck %s
declare <7 x i1> @llvm.get.active.lane.mask.v7i1.i64(i64, i64)
declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64, i64)
declare <32 x i1> @llvm.get.active.lane.mask.v32i1.i64(i64, i64)
declare <64 x i1> @llvm.get.active.lane.mask.v64i1.i64(i64, i64)
declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i32(i32, i32)
declare <64 x i1> @llvm.get.active.lane.mask.v64i1.i32(i32, i32)
define <7 x i1> @create_mask7(i64 %0) {
; CHECK-LABEL: create_mask7:
; CHECK: # %bb.0:
; CHECK-NEXT: mov rax, rdi
; CHECK-NEXT: vpbroadcastq zmm0, rsi
; CHECK-NEXT: vpcmpnleuq k0, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kshiftrb k1, k0, 6
; CHECK-NEXT: kmovd ecx, k1
; CHECK-NEXT: kshiftrb k1, k0, 5
; CHECK-NEXT: kmovd edx, k1
; CHECK-NEXT: kshiftrb k1, k0, 4
; CHECK-NEXT: kmovd esi, k1
; CHECK-NEXT: kshiftrb k1, k0, 3
; CHECK-NEXT: kmovd edi, k1
; CHECK-NEXT: kshiftrb k1, k0, 2
; CHECK-NEXT: kmovd r8d, k1
; CHECK-NEXT: kshiftrb k1, k0, 1
; CHECK-NEXT: kmovd r9d, k1
; CHECK-NEXT: kmovd r10d, k0
; CHECK-NEXT: and r10b, 1
; CHECK-NEXT: and r9b, 1
; CHECK-NEXT: add r9b, r9b
; CHECK-NEXT: or r9b, r10b
; CHECK-NEXT: and r8b, 1
; CHECK-NEXT: shl r8b, 2
; CHECK-NEXT: or r8b, r9b
; CHECK-NEXT: and dil, 1
; CHECK-NEXT: shl dil, 3
; CHECK-NEXT: or dil, r8b
; CHECK-NEXT: and sil, 1
; CHECK-NEXT: shl sil, 4
; CHECK-NEXT: or sil, dil
; CHECK-NEXT: and dl, 1
; CHECK-NEXT: shl dl, 5
; CHECK-NEXT: or dl, sil
; CHECK-NEXT: shl cl, 6
; CHECK-NEXT: or cl, dl
; CHECK-NEXT: and cl, 127
; CHECK-NEXT: mov byte ptr [rax], cl
; CHECK-NEXT: vzeroupper
; CHECK-NEXT: ret
%2 = call <7 x i1> @llvm.get.active.lane.mask.v7i1.i64(i64 0, i64 %0)
ret <7 x i1> %2
}
define <16 x i1> @create_mask16(i64 %0) {
; CHECK-LABEL: create_mask16:
; CHECK: # %bb.0:
; CHECK-NEXT: vpbroadcastq zmm0, rdi
; CHECK-NEXT: vpcmpnleuq k0, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleuq k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckbw k0, k1, k0
; CHECK-NEXT: vpmovm2b xmm0, k0
; CHECK-NEXT: vzeroupper
; CHECK-NEXT: ret
%2 = call <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 0, i64 %0)
ret <16 x i1> %2
}
define <32 x i1> @create_mask32(i64 %0) {
; CHECK-LABEL: create_mask32:
; CHECK: # %bb.0:
; CHECK-NEXT: vpbroadcastq zmm0, rdi
; CHECK-NEXT: vpcmpnleuq k0, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleuq k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleuq k2, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckbw k0, k1, k0
; CHECK-NEXT: vpcmpnleuq k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckbw k1, k1, k2
; CHECK-NEXT: kunpckwd k0, k1, k0
; CHECK-NEXT: vpmovm2b ymm0, k0
; CHECK-NEXT: ret
%2 = call <32 x i1> @llvm.get.active.lane.mask.v32i1.i64(i64 0, i64 %0)
ret <32 x i1> %2
}
define <64 x i1> @create_mask64(i64 %0) {
; CHECK-LABEL: create_mask64:
; CHECK: # %bb.0:
; CHECK-NEXT: vpbroadcastq zmm0, rdi
; CHECK-NEXT: vpcmpnleuq k0, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleuq k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleuq k2, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckbw k0, k1, k0
; CHECK-NEXT: vpcmpnleuq k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckbw k1, k1, k2
; CHECK-NEXT: vpcmpnleuq k2, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckwd k0, k1, k0
; CHECK-NEXT: vpcmpnleuq k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckbw k1, k1, k2
; CHECK-NEXT: vpcmpnleuq k2, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleuq k3, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckbw k2, k3, k2
; CHECK-NEXT: kunpckwd k1, k2, k1
; CHECK-NEXT: kunpckdq k0, k1, k0
; CHECK-NEXT: vpmovm2b zmm0, k0
; CHECK-NEXT: ret
%2 = call <64 x i1> @llvm.get.active.lane.mask.v64i1.i64(i64 0, i64 %0)
ret <64 x i1> %2
}
define <16 x i1> @create_mask16_i32(i32 %0) {
; CHECK-LABEL: create_mask16_i32:
; CHECK: # %bb.0:
; CHECK-NEXT: vpbroadcastd zmm0, edi
; CHECK-NEXT: vpcmpnleud k0, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpmovm2b xmm0, k0
; CHECK-NEXT: vzeroupper
; CHECK-NEXT: ret
%2 = call <16 x i1> @llvm.get.active.lane.mask.v16i1.i32(i32 0, i32 %0)
ret <16 x i1> %2
}
define <64 x i1> @create_mask64_i32(i32 %0) {
; CHECK-LABEL: create_mask64_i32:
; CHECK: # %bb.0:
; CHECK-NEXT: vpbroadcastd zmm0, edi
; CHECK-NEXT: vpcmpnleud k0, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleud k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: vpcmpnleud k2, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckwd k0, k1, k0
; CHECK-NEXT: vpcmpnleud k1, zmm0, zmmword ptr [rip + {{\.?LCPI[0-9]+_[0-9]+}}]
; CHECK-NEXT: kunpckwd k1, k1, k2
; CHECK-NEXT: kunpckdq k0, k1, k0
; CHECK-NEXT: vpmovm2b zmm0, k0
; CHECK-NEXT: ret
%2 = call <64 x i1> @llvm.get.active.lane.mask.v64i1.i32(i32 0, i32 %0)
ret <64 x i1> %2
}
|