; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -passes=slp-vectorizer,dce -S -mtriple=i386-apple-macosx10.8.0 -mcpu=corei7-avx | FileCheck %s
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128-n8:16:32-S128"
target triple = "i386-apple-macosx10.8.0"
; Sums the fourth powers of D[2*i] and D[2*i+1] over 100 iterations. The two
; scalar load/fmul chains are expected to become <2 x double> operations.
define double @foo(ptr nocapture %D) {
; CHECK-LABEL: @foo(
; CHECK-NEXT: br label [[TMP1:%.*]]
; CHECK: 1:
; CHECK-NEXT: [[I_02:%.*]] = phi i32 [ 0, [[TMP0:%.*]] ], [ [[TMP11:%.*]], [[TMP1]] ]
; CHECK-NEXT: [[SUM_01:%.*]] = phi double [ 0.000000e+00, [[TMP0]] ], [ [[TMP10:%.*]], [[TMP1]] ]
; CHECK-NEXT: [[TMP2:%.*]] = shl nsw i32 [[I_02]], 1
; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds double, ptr [[D:%.*]], i32 [[TMP2]]
; CHECK-NEXT: [[TMP4:%.*]] = load <2 x double>, ptr [[TMP3]], align 4
; CHECK-NEXT: [[TMP5:%.*]] = fmul <2 x double> [[TMP4]], [[TMP4]]
; CHECK-NEXT: [[TMP6:%.*]] = fmul <2 x double> [[TMP5]], [[TMP5]]
; CHECK-NEXT: [[TMP7:%.*]] = extractelement <2 x double> [[TMP6]], i32 0
; CHECK-NEXT: [[TMP8:%.*]] = extractelement <2 x double> [[TMP6]], i32 1
; CHECK-NEXT: [[TMP9:%.*]] = fadd double [[TMP7]], [[TMP8]]
; CHECK-NEXT: [[TMP10]] = fadd double [[SUM_01]], [[TMP9]]
; CHECK-NEXT: [[TMP11]] = add nsw i32 [[I_02]], 1
; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i32 [[TMP11]], 100
; CHECK-NEXT: br i1 [[EXITCOND]], label [[TMP12:%.*]], label [[TMP1]]
; CHECK: 12:
; CHECK-NEXT: ret double [[TMP10]]
;
br label %1
; <label>:1 ; preds = %1, %0
%i.02 = phi i32 [ 0, %0 ], [ %10, %1 ]
%sum.01 = phi double [ 0.000000e+00, %0 ], [ %9, %1 ]
%2 = shl nsw i32 %i.02, 1
%3 = getelementptr inbounds double, ptr %D, i32 %2
%4 = load double, ptr %3, align 4
%A4 = fmul double %4, %4
%A42 = fmul double %A4, %A4
%5 = or disjoint i32 %2, 1
%6 = getelementptr inbounds double, ptr %D, i32 %5
%7 = load double, ptr %6, align 4
%A7 = fmul double %7, %7
%A72 = fmul double %A7, %A7
%8 = fadd double %A42, %A72
%9 = fadd double %sum.01, %8
%10 = add nsw i32 %i.02, 1
%exitcond = icmp eq i32 %10, 100
br i1 %exitcond, label %11, label %1
; <label>:11 ; preds = %1
ret double %9
}
; The compare is already a vector op; the two-element 'and' of its lanes is
; expected to be left unchanged.
define i1 @two_wide_fcmp_reduction(<2 x double> %a0) {
; CHECK-LABEL: @two_wide_fcmp_reduction(
; CHECK-NEXT: [[A:%.*]] = fcmp ogt <2 x double> [[A0:%.*]], <double 1.000000e+00, double 1.000000e+00>
; CHECK-NEXT: [[B:%.*]] = extractelement <2 x i1> [[A]], i32 0
; CHECK-NEXT: [[C:%.*]] = extractelement <2 x i1> [[A]], i32 1
; CHECK-NEXT: [[D:%.*]] = and i1 [[B]], [[C]]
; CHECK-NEXT: ret i1 [[D]]
;
%a = fcmp ogt <2 x double> %a0, <double 1.0, double 1.0>
%b = extractelement <2 x i1> %a, i32 0
%c = extractelement <2 x i1> %a, i32 1
%d = and i1 %b, %c
ret i1 %d
}
; The add is already a vector op; the two-element 'fadd fast' of its lanes is
; expected to be left unchanged.
define double @fadd_reduction(<2 x double> %a0) {
; CHECK-LABEL: @fadd_reduction(
; CHECK-NEXT: [[A:%.*]] = fadd fast <2 x double> [[A0:%.*]], <double 1.000000e+00, double 1.000000e+00>
; CHECK-NEXT: [[B:%.*]] = extractelement <2 x double> [[A]], i32 0
; CHECK-NEXT: [[C:%.*]] = extractelement <2 x double> [[A]], i32 1
; CHECK-NEXT: [[D:%.*]] = fadd fast double [[B]], [[C]]
; CHECK-NEXT: ret double [[D]]
;
%a = fadd fast <2 x double> %a0, <double 1.000000e+00, double 1.000000e+00>
%b = extractelement <2 x double> %a, i32 0
%c = extractelement <2 x double> %a, i32 1
%d = fadd fast double %b, %c
ret double %d
}
; PR43745 https://bugs.llvm.org/show_bug.cgi?id=43745
define i1 @fcmp_lt_gt(double %a, double %b, double %c) {
; CHECK-LABEL: @fcmp_lt_gt(
; CHECK-NEXT: entry:
; CHECK-NEXT: [[FNEG:%.*]] = fneg double [[B:%.*]]
; CHECK-NEXT: [[MUL:%.*]] = fmul double [[A:%.*]], 2.000000e+00
; CHECK-NEXT: [[TMP0:%.*]] = insertelement <2 x double> poison, double [[C:%.*]], i32 1
; CHECK-NEXT: [[TMP1:%.*]] = insertelement <2 x double> [[TMP0]], double [[FNEG]], i32 0
; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <2 x double> [[TMP1]], <2 x double> poison, <2 x i32> <i32 1, i32 poison>
; CHECK-NEXT: [[TMP3:%.*]] = insertelement <2 x double> [[TMP2]], double [[B]], i32 1
; CHECK-NEXT: [[TMP4:%.*]] = fsub <2 x double> [[TMP1]], [[TMP3]]
; CHECK-NEXT: [[TMP5:%.*]] = insertelement <2 x double> poison, double [[MUL]], i32 0
; CHECK-NEXT: [[TMP6:%.*]] = shufflevector <2 x double> [[TMP5]], <2 x double> poison, <2 x i32> zeroinitializer
; CHECK-NEXT: [[TMP7:%.*]] = fdiv <2 x double> [[TMP4]], [[TMP6]]
; CHECK-NEXT: [[TMP8:%.*]] = extractelement <2 x double> [[TMP7]], i32 1
; CHECK-NEXT: [[CMP:%.*]] = fcmp olt double [[TMP8]], 0x3EB0C6F7A0B5ED8D
; CHECK-NEXT: [[TMP9:%.*]] = extractelement <2 x double> [[TMP7]], i32 0
; CHECK-NEXT: [[CMP4:%.*]] = fcmp olt double [[TMP9]], 0x3EB0C6F7A0B5ED8D
; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[CMP]], [[CMP4]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[CLEANUP:%.*]], label [[LOR_LHS_FALSE:%.*]]
; CHECK: lor.lhs.false:
; CHECK-NEXT: [[TMP10:%.*]] = fcmp ule <2 x double> [[TMP7]], <double 1.000000e+00, double 1.000000e+00>
; CHECK-NEXT: [[TMP11:%.*]] = extractelement <2 x i1> [[TMP10]], i32 0
; CHECK-NEXT: [[TMP12:%.*]] = extractelement <2 x i1> [[TMP10]], i32 1
; CHECK-NEXT: [[NOT_OR_COND9:%.*]] = or i1 [[TMP11]], [[TMP12]]
; CHECK-NEXT: ret i1 [[NOT_OR_COND9]]
; CHECK: cleanup:
; CHECK-NEXT: ret i1 false
;
entry:
%fneg = fneg double %b
%add = fsub double %c, %b
%mul = fmul double %a, 2.000000e+00
%div = fdiv double %add, %mul
%sub = fsub double %fneg, %c
%div3 = fdiv double %sub, %mul
%cmp = fcmp olt double %div, 0x3EB0C6F7A0B5ED8D
%cmp4 = fcmp olt double %div3, 0x3EB0C6F7A0B5ED8D
%or.cond = and i1 %cmp, %cmp4
br i1 %or.cond, label %cleanup, label %lor.lhs.false
lor.lhs.false:
%cmp5 = fcmp ule double %div, 1.000000e+00
%cmp7 = fcmp ule double %div3, 1.000000e+00
%not.or.cond9 = or i1 %cmp7, %cmp5
ret i1 %not.or.cond9
cleanup:
ret i1 false
}
; Same arithmetic as @fcmp_lt_gt in a single block; the two 'fcmp uge'
; compares are expected to be combined into one <2 x double> compare.
define i1 @fcmp_lt(double %a, double %b, double %c) {
; CHECK-LABEL: @fcmp_lt(
; CHECK-NEXT: [[FNEG:%.*]] = fneg double [[B:%.*]]
; CHECK-NEXT: [[MUL:%.*]] = fmul double [[A:%.*]], 2.000000e+00
; CHECK-NEXT: [[TMP1:%.*]] = insertelement <2 x double> poison, double [[C:%.*]], i32 1
; CHECK-NEXT: [[TMP2:%.*]] = insertelement <2 x double> [[TMP1]], double [[FNEG]], i32 0
; CHECK-NEXT: [[TMP3:%.*]] = shufflevector <2 x double> [[TMP2]], <2 x double> poison, <2 x i32> <i32 1, i32 poison>
; CHECK-NEXT: [[TMP4:%.*]] = insertelement <2 x double> [[TMP3]], double [[B]], i32 1
; CHECK-NEXT: [[TMP5:%.*]] = fsub <2 x double> [[TMP2]], [[TMP4]]
; CHECK-NEXT: [[TMP6:%.*]] = insertelement <2 x double> poison, double [[MUL]], i32 0
; CHECK-NEXT: [[TMP7:%.*]] = shufflevector <2 x double> [[TMP6]], <2 x double> poison, <2 x i32> zeroinitializer
; CHECK-NEXT: [[TMP8:%.*]] = fdiv <2 x double> [[TMP5]], [[TMP7]]
; CHECK-NEXT: [[TMP9:%.*]] = fcmp uge <2 x double> [[TMP8]], <double 0x3EB0C6F7A0B5ED8D, double 0x3EB0C6F7A0B5ED8D>
; CHECK-NEXT: [[TMP10:%.*]] = extractelement <2 x i1> [[TMP9]], i32 0
; CHECK-NEXT: [[TMP11:%.*]] = extractelement <2 x i1> [[TMP9]], i32 1
; CHECK-NEXT: [[NOT_OR_COND:%.*]] = or i1 [[TMP10]], [[TMP11]]
; CHECK-NEXT: ret i1 [[NOT_OR_COND]]
;
%fneg = fneg double %b
%add = fsub double %c, %b
%mul = fmul double %a, 2.000000e+00
%div = fdiv double %add, %mul
%sub = fsub double %fneg, %c
%div3 = fdiv double %sub, %mul
%cmp = fcmp uge double %div, 0x3EB0C6F7A0B5ED8D
%cmp4 = fcmp uge double %div3, 0x3EB0C6F7A0B5ED8D
%not.or.cond = or i1 %cmp4, %cmp
ret i1 %not.or.cond
}