;=========================== begin_copyright_notice ============================
;
; Copyright (C) 2022 Intel Corporation
;
; SPDX-License-Identifier: MIT
;
;============================ end_copyright_notice =============================
;
; RUN: igc_opt -adv-codemotion-cm=1 -igc-advcodemotion -S < %s | FileCheck %s
; ------------------------------------------------
; AdvCodeMotion
; ------------------------------------------------
define spir_kernel void @test(i32 addrspace(1)* %dst, <8 x i32> %r0, <8 x i32> %payloadHeader, i16 %localIdX, i16 %localIdY, i16 %localIdZ, <3 x i32> %globalSize, <3 x i32> %enqueuedLocalSize, <3 x i32> %localSize, i8* %privateBase, i32 %bufferOffset) #0 {
; CHECK-LABEL: @test(
; CHECK-NEXT: entry:
; CHECK-NEXT: [[TMP0:%.*]] = extractelement <8 x i32> [[R0:%.*]], i32 1
; CHECK-NEXT: [[TMP1:%.*]] = extractelement <3 x i32> [[GLOBALSIZE:%.*]], i32 0
; CHECK-NEXT: [[TMP2:%.*]] = extractelement <3 x i32> [[LOCALSIZE:%.*]], i32 0
; CHECK-NEXT: [[TMP3:%.*]] = extractelement <3 x i32> [[ENQUEUEDLOCALSIZE:%.*]], i32 0
; CHECK-NEXT: [[TMP4:%.*]] = mul i32 [[TMP3]], [[TMP0]]
; CHECK-NEXT: [[TMP5:%.*]] = zext i16 [[LOCALIDX:%.*]] to i32
; CHECK-NEXT: [[TMP6:%.*]] = add i32 [[TMP5]], [[TMP4]]
; CHECK-NEXT: [[TMP7:%.*]] = extractelement <8 x i32> [[PAYLOADHEADER:%.*]], i32 0
; CHECK-NEXT: [[TMP8:%.*]] = add i32 [[TMP6]], [[TMP7]]
; CHECK-NEXT: br label [[BB3:%.*]]
; CHECK: bb1:
; CHECK-NEXT: [[A:%.*]] = phi i32 [ [[B:%.*]], [[BB3]] ], [ [[AI:%.*]], [[BB1:%.*]] ]
; CHECK-NEXT: [[LC:%.*]] = phi i32 [ [[BI:%.*]], [[BB3]] ], [ [[LC]], [[BB1]] ]
; CHECK-NEXT: [[AI]] = add i32 [[A]], 1
; CHECK-NEXT: [[AC:%.*]] = icmp ne i32 [[TMP8]], [[AI]]
; CHECK-NEXT: [[CC:%.*]] = icmp eq i32 [[AI]], [[LC]]
; CHECK-NEXT: br i1 [[CC]], label [[BB2:%.*]], label [[BB1]]
; CHECK: bb2:
; CHECK-NEXT: [[AAA:%.*]] = add i32 [[TMP1]], [[TMP2]]
; CHECK-NEXT: [[ACC:%.*]] = icmp eq i32 [[AAA]], 0
; CHECK-NEXT: [[TMP9:%.*]] = and i1 [[AC]], [[ACC]]
; CHECK-NEXT: br i1 [[TMP9]], label [[TBB2:%.*]], label [[FBB2:%.*]]
; CHECK: bb3:
; CHECK-NEXT: [[B]] = phi i32 [ -1, [[ENTRY:%.*]] ], [ [[BI]], [[JOIN2:%.*]] ]
; CHECK-NEXT: [[BL:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[BLI:%.*]], [[JOIN2]] ]
; CHECK-NEXT: [[BI]] = add i32 [[B]], [[TMP2]]
; CHECK-NEXT: [[BLI]] = add i32 [[BL]], [[TMP2]]
; CHECK-NEXT: [[BC:%.*]] = icmp ult i32 [[BLI]], [[TMP1]]
; CHECK-NEXT: br i1 [[BC]], label [[BB1]], label [[END:%.*]]
; CHECK: fbb2:
; CHECK-NEXT: br label [[JOIN2]]
; CHECK: tbb2:
; CHECK-NEXT: [[BBB:%.*]] = add i32 [[TMP1]], [[TMP2]]
; CHECK-NEXT: br label [[JOIN2]]
; CHECK: join2:
; CHECK-NEXT: [[J2PHI:%.*]] = phi i32 [ [[BBB]], [[TBB2]] ], [ 0, [[FBB2]] ]
; CHECK-NEXT: [[ORPHI:%.*]] = phi i32 [ 1, [[TBB2]] ], [ [[TMP1]], [[FBB2]] ]
; CHECK-NEXT: store i32 [[J2PHI]], i32 addrspace(1)* [[DST:%.*]], align 4
; CHECK-NEXT: store i32 [[ORPHI]], i32 addrspace(1)* [[DST]], align 4
; CHECK-NEXT: store i32 -1, i32 addrspace(1)* [[DST]], align 4
; CHECK-NEXT: br label [[BB3]]
; CHECK: end:
; CHECK-NEXT: store i32 [[TMP8]], i32 addrspace(1)* [[DST]], align 4
; CHECK-NEXT: ret void
entry:
%0 = extractelement <8 x i32> %r0, i32 1
%1 = extractelement <3 x i32> %globalSize, i32 0
%2 = extractelement <3 x i32> %localSize, i32 0
%3 = extractelement <3 x i32> %enqueuedLocalSize, i32 0
%4 = mul i32 %3, %0
%5 = zext i16 %localIdX to i32
%6 = add i32 %5, %4
%7 = extractelement <8 x i32> %payloadHeader, i32 0
%8 = add i32 %6, %7
br label %bb3
bb1: ; preds = %bb3, %bb1
%a = phi i32 [ %b, %bb3 ], [ %ai, %bb1 ]
%lc = phi i32 [ %bi, %bb3 ], [ %lc, %bb1 ]
%ai = add i32 %a, 1
%ac = icmp ne i32 %8, %ai
%cc = icmp eq i32 %ai, %lc
br i1 %cc, label %bb2, label %bb1
bb2: ; preds = %bb1
br i1 %ac, label %bb4, label %fbb
bb3: ; preds = %join, %entry
%b = phi i32 [ -1, %entry ], [ %bi, %join ]
%bl = phi i32 [ 0, %entry ], [ %bli, %join ]
%bi = add i32 %b, %2
%bli = add i32 %bl, %2
%bc = icmp ult i32 %bli, %1
br i1 %bc, label %bb1, label %end
bb4: ; preds = %bb2
%aaa = add i32 %1, %2
%acc = icmp eq i32 %aaa, 0
br i1 %acc, label %tbb2, label %fbb2
fbb: ; preds = %bb2
br label %join
fbb2: ; preds = %bb4
br label %join2
tbb2: ; preds = %bb4
%bbb = add i32 %1, %2
br label %join2
join2: ; preds = %tbb2, %fbb2
%j2phi = phi i32 [ %bbb, %tbb2 ], [ 0, %fbb2 ]
%orphi = phi i32 [ 1, %tbb2 ], [ 0, %fbb2 ]
store i32 %j2phi, i32 addrspace(1)* %dst, align 4
%oropt = or i32 %1, %orphi
store i32 %oropt, i32 addrspace(1)* %dst, align 4
br label %join
join: ; preds = %join2, %fbb
%jphi = phi i32 [ -1, %join2 ], [ 0, %fbb ]
store i32 %jphi, i32 addrspace(1)* %dst, align 4
br label %bb3
end: ; preds = %bb3
store i32 %8, i32 addrspace(1)* %dst, align 4
ret void
}
; Function Attrs: nounwind readnone speculatable
declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
; Function Attrs: convergent nounwind readnone
declare spir_func i32 @__builtin_IB_get_local_size(i32) local_unnamed_addr #2
; Function Attrs: convergent nounwind readnone
declare spir_func i32 @__builtin_IB_get_global_size(i32) local_unnamed_addr #2
; Function Attrs: nounwind readnone speculatable
declare void @llvm.dbg.value(metadata, metadata, metadata) #1
attributes #0 = { convergent noinline nounwind optnone }
attributes #1 = { nounwind readnone speculatable }
attributes #2 = { convergent nounwind readnone }
!igc.functions = !{!0}
!0 = !{void (i32 addrspace(1)*, <8 x i32>, <8 x i32>, i16, i16, i16, <3 x i32>, <3 x i32>, <3 x i32>, i8*, i32)* @test, !1}
!1 = !{!2, !3}
!2 = !{!"function_type", i32 0}
!3 = !{!"implicit_arg_desc", !4, !5, !6, !7, !8, !9}
!4 = !{i32 0}
!5 = !{i32 1}
!6 = !{i32 4}
!7 = !{i32 5}
!8 = !{i32 12}
!9 = !{i32 14, !10}
!10 = !{!"explicit_arg_num", i32 0}