1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506
|
; RUN: llvm-upgrade < %s | llvm-as | opt -instcombine | \
; RUN: llc -march=ppc32 -mcpu=g5 | not grep vperm
; RUN: llvm-upgrade < %s | llvm-as | llc -march=ppc32 -mcpu=g5 > %t
; RUN: grep vsldoi %t | count 2
; RUN: grep vmrgh %t | count 7
; RUN: grep vmrgl %t | count 6
; RUN: grep vpkuhum %t | count 1
; RUN: grep vpkuwum %t | count 1
void %VSLDOI_xy(<8 x short>* %A, <8 x short>* %B) {
entry:
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=1]
%tmp2 = load <8 x short>* %B ; <<8 x short>> [#uses=1]
%tmp = cast <8 x short> %tmp to <16 x sbyte> ; <<16 x sbyte>> [#uses=11]
%tmp2 = cast <8 x short> %tmp2 to <16 x sbyte> ; <<16 x sbyte>> [#uses=5]
%tmp = extractelement <16 x sbyte> %tmp, uint 5 ; <sbyte> [#uses=1]
%tmp3 = extractelement <16 x sbyte> %tmp, uint 6 ; <sbyte> [#uses=1]
%tmp4 = extractelement <16 x sbyte> %tmp, uint 7 ; <sbyte> [#uses=1]
%tmp5 = extractelement <16 x sbyte> %tmp, uint 8 ; <sbyte> [#uses=1]
%tmp6 = extractelement <16 x sbyte> %tmp, uint 9 ; <sbyte> [#uses=1]
%tmp7 = extractelement <16 x sbyte> %tmp, uint 10 ; <sbyte> [#uses=1]
%tmp8 = extractelement <16 x sbyte> %tmp, uint 11 ; <sbyte> [#uses=1]
%tmp9 = extractelement <16 x sbyte> %tmp, uint 12 ; <sbyte> [#uses=1]
%tmp10 = extractelement <16 x sbyte> %tmp, uint 13 ; <sbyte> [#uses=1]
%tmp11 = extractelement <16 x sbyte> %tmp, uint 14 ; <sbyte> [#uses=1]
%tmp12 = extractelement <16 x sbyte> %tmp, uint 15 ; <sbyte> [#uses=1]
%tmp13 = extractelement <16 x sbyte> %tmp2, uint 0 ; <sbyte> [#uses=1]
%tmp14 = extractelement <16 x sbyte> %tmp2, uint 1 ; <sbyte> [#uses=1]
%tmp15 = extractelement <16 x sbyte> %tmp2, uint 2 ; <sbyte> [#uses=1]
%tmp16 = extractelement <16 x sbyte> %tmp2, uint 3 ; <sbyte> [#uses=1]
%tmp17 = extractelement <16 x sbyte> %tmp2, uint 4 ; <sbyte> [#uses=1]
%tmp18 = insertelement <16 x sbyte> undef, sbyte %tmp, uint 0 ; <<16 x sbyte>> [#uses=1]
%tmp19 = insertelement <16 x sbyte> %tmp18, sbyte %tmp3, uint 1 ; <<16 x sbyte>> [#uses=1]
%tmp20 = insertelement <16 x sbyte> %tmp19, sbyte %tmp4, uint 2 ; <<16 x sbyte>> [#uses=1]
%tmp21 = insertelement <16 x sbyte> %tmp20, sbyte %tmp5, uint 3 ; <<16 x sbyte>> [#uses=1]
%tmp22 = insertelement <16 x sbyte> %tmp21, sbyte %tmp6, uint 4 ; <<16 x sbyte>> [#uses=1]
%tmp23 = insertelement <16 x sbyte> %tmp22, sbyte %tmp7, uint 5 ; <<16 x sbyte>> [#uses=1]
%tmp24 = insertelement <16 x sbyte> %tmp23, sbyte %tmp8, uint 6 ; <<16 x sbyte>> [#uses=1]
%tmp25 = insertelement <16 x sbyte> %tmp24, sbyte %tmp9, uint 7 ; <<16 x sbyte>> [#uses=1]
%tmp26 = insertelement <16 x sbyte> %tmp25, sbyte %tmp10, uint 8 ; <<16 x sbyte>> [#uses=1]
%tmp27 = insertelement <16 x sbyte> %tmp26, sbyte %tmp11, uint 9 ; <<16 x sbyte>> [#uses=1]
%tmp28 = insertelement <16 x sbyte> %tmp27, sbyte %tmp12, uint 10 ; <<16 x sbyte>> [#uses=1]
%tmp29 = insertelement <16 x sbyte> %tmp28, sbyte %tmp13, uint 11 ; <<16 x sbyte>> [#uses=1]
%tmp30 = insertelement <16 x sbyte> %tmp29, sbyte %tmp14, uint 12 ; <<16 x sbyte>> [#uses=1]
%tmp31 = insertelement <16 x sbyte> %tmp30, sbyte %tmp15, uint 13 ; <<16 x sbyte>> [#uses=1]
%tmp32 = insertelement <16 x sbyte> %tmp31, sbyte %tmp16, uint 14 ; <<16 x sbyte>> [#uses=1]
%tmp33 = insertelement <16 x sbyte> %tmp32, sbyte %tmp17, uint 15 ; <<16 x sbyte>> [#uses=1]
%tmp33 = cast <16 x sbyte> %tmp33 to <8 x short> ; <<8 x short>> [#uses=1]
store <8 x short> %tmp33, <8 x short>* %A
ret void
}
void %VSLDOI_xx(<8 x short>* %A, <8 x short>* %B) {
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=1]
%tmp2 = load <8 x short>* %A ; <<8 x short>> [#uses=1]
%tmp = cast <8 x short> %tmp to <16 x sbyte> ; <<16 x sbyte>> [#uses=11]
%tmp2 = cast <8 x short> %tmp2 to <16 x sbyte> ; <<16 x sbyte>> [#uses=5]
%tmp = extractelement <16 x sbyte> %tmp, uint 5 ; <sbyte> [#uses=1]
%tmp3 = extractelement <16 x sbyte> %tmp, uint 6 ; <sbyte> [#uses=1]
%tmp4 = extractelement <16 x sbyte> %tmp, uint 7 ; <sbyte> [#uses=1]
%tmp5 = extractelement <16 x sbyte> %tmp, uint 8 ; <sbyte> [#uses=1]
%tmp6 = extractelement <16 x sbyte> %tmp, uint 9 ; <sbyte> [#uses=1]
%tmp7 = extractelement <16 x sbyte> %tmp, uint 10 ; <sbyte> [#uses=1]
%tmp8 = extractelement <16 x sbyte> %tmp, uint 11 ; <sbyte> [#uses=1]
%tmp9 = extractelement <16 x sbyte> %tmp, uint 12 ; <sbyte> [#uses=1]
%tmp10 = extractelement <16 x sbyte> %tmp, uint 13 ; <sbyte> [#uses=1]
%tmp11 = extractelement <16 x sbyte> %tmp, uint 14 ; <sbyte> [#uses=1]
%tmp12 = extractelement <16 x sbyte> %tmp, uint 15 ; <sbyte> [#uses=1]
%tmp13 = extractelement <16 x sbyte> %tmp2, uint 0 ; <sbyte> [#uses=1]
%tmp14 = extractelement <16 x sbyte> %tmp2, uint 1 ; <sbyte> [#uses=1]
%tmp15 = extractelement <16 x sbyte> %tmp2, uint 2 ; <sbyte> [#uses=1]
%tmp16 = extractelement <16 x sbyte> %tmp2, uint 3 ; <sbyte> [#uses=1]
%tmp17 = extractelement <16 x sbyte> %tmp2, uint 4 ; <sbyte> [#uses=1]
%tmp18 = insertelement <16 x sbyte> undef, sbyte %tmp, uint 0 ; <<16 x sbyte>> [#uses=1]
%tmp19 = insertelement <16 x sbyte> %tmp18, sbyte %tmp3, uint 1 ; <<16 x sbyte>> [#uses=1]
%tmp20 = insertelement <16 x sbyte> %tmp19, sbyte %tmp4, uint 2 ; <<16 x sbyte>> [#uses=1]
%tmp21 = insertelement <16 x sbyte> %tmp20, sbyte %tmp5, uint 3 ; <<16 x sbyte>> [#uses=1]
%tmp22 = insertelement <16 x sbyte> %tmp21, sbyte %tmp6, uint 4 ; <<16 x sbyte>> [#uses=1]
%tmp23 = insertelement <16 x sbyte> %tmp22, sbyte %tmp7, uint 5 ; <<16 x sbyte>> [#uses=1]
%tmp24 = insertelement <16 x sbyte> %tmp23, sbyte %tmp8, uint 6 ; <<16 x sbyte>> [#uses=1]
%tmp25 = insertelement <16 x sbyte> %tmp24, sbyte %tmp9, uint 7 ; <<16 x sbyte>> [#uses=1]
%tmp26 = insertelement <16 x sbyte> %tmp25, sbyte %tmp10, uint 8 ; <<16 x sbyte>> [#uses=1]
%tmp27 = insertelement <16 x sbyte> %tmp26, sbyte %tmp11, uint 9 ; <<16 x sbyte>> [#uses=1]
%tmp28 = insertelement <16 x sbyte> %tmp27, sbyte %tmp12, uint 10 ; <<16 x sbyte>> [#uses=1]
%tmp29 = insertelement <16 x sbyte> %tmp28, sbyte %tmp13, uint 11 ; <<16 x sbyte>> [#uses=1]
%tmp30 = insertelement <16 x sbyte> %tmp29, sbyte %tmp14, uint 12 ; <<16 x sbyte>> [#uses=1]
%tmp31 = insertelement <16 x sbyte> %tmp30, sbyte %tmp15, uint 13 ; <<16 x sbyte>> [#uses=1]
%tmp32 = insertelement <16 x sbyte> %tmp31, sbyte %tmp16, uint 14 ; <<16 x sbyte>> [#uses=1]
%tmp33 = insertelement <16 x sbyte> %tmp32, sbyte %tmp17, uint 15 ; <<16 x sbyte>> [#uses=1]
%tmp33 = cast <16 x sbyte> %tmp33 to <8 x short> ; <<8 x short>> [#uses=1]
store <8 x short> %tmp33, <8 x short>* %A
ret void
}
void %VPERM_promote(<8 x short>* %A, <8 x short>* %B) {
entry:
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=1]
%tmp = cast <8 x short> %tmp to <4 x int> ; <<4 x int>> [#uses=1]
%tmp2 = load <8 x short>* %B ; <<8 x short>> [#uses=1]
%tmp2 = cast <8 x short> %tmp2 to <4 x int> ; <<4 x int>> [#uses=1]
%tmp3 = call <4 x int> %llvm.ppc.altivec.vperm( <4 x int> %tmp, <4 x int> %tmp2, <16 x sbyte> < sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14, sbyte 14 > ) ; <<4 x int>> [#uses=1]
%tmp3 = cast <4 x int> %tmp3 to <8 x short> ; <<8 x short>> [#uses=1]
store <8 x short> %tmp3, <8 x short>* %A
ret void
}
declare <4 x int> %llvm.ppc.altivec.vperm(<4 x int>, <4 x int>, <16 x sbyte>)
void %tb_l(<16 x sbyte>* %A, <16 x sbyte>* %B) {
entry:
%tmp = load <16 x sbyte>* %A ; <<16 x sbyte>> [#uses=8]
%tmp2 = load <16 x sbyte>* %B ; <<16 x sbyte>> [#uses=8]
%tmp = extractelement <16 x sbyte> %tmp, uint 8 ; <sbyte> [#uses=1]
%tmp3 = extractelement <16 x sbyte> %tmp2, uint 8 ; <sbyte> [#uses=1]
%tmp4 = extractelement <16 x sbyte> %tmp, uint 9 ; <sbyte> [#uses=1]
%tmp5 = extractelement <16 x sbyte> %tmp2, uint 9 ; <sbyte> [#uses=1]
%tmp6 = extractelement <16 x sbyte> %tmp, uint 10 ; <sbyte> [#uses=1]
%tmp7 = extractelement <16 x sbyte> %tmp2, uint 10 ; <sbyte> [#uses=1]
%tmp8 = extractelement <16 x sbyte> %tmp, uint 11 ; <sbyte> [#uses=1]
%tmp9 = extractelement <16 x sbyte> %tmp2, uint 11 ; <sbyte> [#uses=1]
%tmp10 = extractelement <16 x sbyte> %tmp, uint 12 ; <sbyte> [#uses=1]
%tmp11 = extractelement <16 x sbyte> %tmp2, uint 12 ; <sbyte> [#uses=1]
%tmp12 = extractelement <16 x sbyte> %tmp, uint 13 ; <sbyte> [#uses=1]
%tmp13 = extractelement <16 x sbyte> %tmp2, uint 13 ; <sbyte> [#uses=1]
%tmp14 = extractelement <16 x sbyte> %tmp, uint 14 ; <sbyte> [#uses=1]
%tmp15 = extractelement <16 x sbyte> %tmp2, uint 14 ; <sbyte> [#uses=1]
%tmp16 = extractelement <16 x sbyte> %tmp, uint 15 ; <sbyte> [#uses=1]
%tmp17 = extractelement <16 x sbyte> %tmp2, uint 15 ; <sbyte> [#uses=1]
%tmp18 = insertelement <16 x sbyte> undef, sbyte %tmp, uint 0 ; <<16 x sbyte>> [#uses=1]
%tmp19 = insertelement <16 x sbyte> %tmp18, sbyte %tmp3, uint 1 ; <<16 x sbyte>> [#uses=1]
%tmp20 = insertelement <16 x sbyte> %tmp19, sbyte %tmp4, uint 2 ; <<16 x sbyte>> [#uses=1]
%tmp21 = insertelement <16 x sbyte> %tmp20, sbyte %tmp5, uint 3 ; <<16 x sbyte>> [#uses=1]
%tmp22 = insertelement <16 x sbyte> %tmp21, sbyte %tmp6, uint 4 ; <<16 x sbyte>> [#uses=1]
%tmp23 = insertelement <16 x sbyte> %tmp22, sbyte %tmp7, uint 5 ; <<16 x sbyte>> [#uses=1]
%tmp24 = insertelement <16 x sbyte> %tmp23, sbyte %tmp8, uint 6 ; <<16 x sbyte>> [#uses=1]
%tmp25 = insertelement <16 x sbyte> %tmp24, sbyte %tmp9, uint 7 ; <<16 x sbyte>> [#uses=1]
%tmp26 = insertelement <16 x sbyte> %tmp25, sbyte %tmp10, uint 8 ; <<16 x sbyte>> [#uses=1]
%tmp27 = insertelement <16 x sbyte> %tmp26, sbyte %tmp11, uint 9 ; <<16 x sbyte>> [#uses=1]
%tmp28 = insertelement <16 x sbyte> %tmp27, sbyte %tmp12, uint 10 ; <<16 x sbyte>> [#uses=1]
%tmp29 = insertelement <16 x sbyte> %tmp28, sbyte %tmp13, uint 11 ; <<16 x sbyte>> [#uses=1]
%tmp30 = insertelement <16 x sbyte> %tmp29, sbyte %tmp14, uint 12 ; <<16 x sbyte>> [#uses=1]
%tmp31 = insertelement <16 x sbyte> %tmp30, sbyte %tmp15, uint 13 ; <<16 x sbyte>> [#uses=1]
%tmp32 = insertelement <16 x sbyte> %tmp31, sbyte %tmp16, uint 14 ; <<16 x sbyte>> [#uses=1]
%tmp33 = insertelement <16 x sbyte> %tmp32, sbyte %tmp17, uint 15 ; <<16 x sbyte>> [#uses=1]
store <16 x sbyte> %tmp33, <16 x sbyte>* %A
ret void
}
void %th_l(<8 x short>* %A, <8 x short>* %B) {
entry:
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=4]
%tmp2 = load <8 x short>* %B ; <<8 x short>> [#uses=4]
%tmp = extractelement <8 x short> %tmp, uint 4 ; <short> [#uses=1]
%tmp3 = extractelement <8 x short> %tmp2, uint 4 ; <short> [#uses=1]
%tmp4 = extractelement <8 x short> %tmp, uint 5 ; <short> [#uses=1]
%tmp5 = extractelement <8 x short> %tmp2, uint 5 ; <short> [#uses=1]
%tmp6 = extractelement <8 x short> %tmp, uint 6 ; <short> [#uses=1]
%tmp7 = extractelement <8 x short> %tmp2, uint 6 ; <short> [#uses=1]
%tmp8 = extractelement <8 x short> %tmp, uint 7 ; <short> [#uses=1]
%tmp9 = extractelement <8 x short> %tmp2, uint 7 ; <short> [#uses=1]
%tmp10 = insertelement <8 x short> undef, short %tmp, uint 0 ; <<8 x short>> [#uses=1]
%tmp11 = insertelement <8 x short> %tmp10, short %tmp3, uint 1 ; <<8 x short>> [#uses=1]
%tmp12 = insertelement <8 x short> %tmp11, short %tmp4, uint 2 ; <<8 x short>> [#uses=1]
%tmp13 = insertelement <8 x short> %tmp12, short %tmp5, uint 3 ; <<8 x short>> [#uses=1]
%tmp14 = insertelement <8 x short> %tmp13, short %tmp6, uint 4 ; <<8 x short>> [#uses=1]
%tmp15 = insertelement <8 x short> %tmp14, short %tmp7, uint 5 ; <<8 x short>> [#uses=1]
%tmp16 = insertelement <8 x short> %tmp15, short %tmp8, uint 6 ; <<8 x short>> [#uses=1]
%tmp17 = insertelement <8 x short> %tmp16, short %tmp9, uint 7 ; <<8 x short>> [#uses=1]
store <8 x short> %tmp17, <8 x short>* %A
ret void
}
void %tw_l(<4 x int>* %A, <4 x int>* %B) {
entry:
%tmp = load <4 x int>* %A ; <<4 x int>> [#uses=2]
%tmp2 = load <4 x int>* %B ; <<4 x int>> [#uses=2]
%tmp = extractelement <4 x int> %tmp, uint 2 ; <int> [#uses=1]
%tmp3 = extractelement <4 x int> %tmp2, uint 2 ; <int> [#uses=1]
%tmp4 = extractelement <4 x int> %tmp, uint 3 ; <int> [#uses=1]
%tmp5 = extractelement <4 x int> %tmp2, uint 3 ; <int> [#uses=1]
%tmp6 = insertelement <4 x int> undef, int %tmp, uint 0 ; <<4 x int>> [#uses=1]
%tmp7 = insertelement <4 x int> %tmp6, int %tmp3, uint 1 ; <<4 x int>> [#uses=1]
%tmp8 = insertelement <4 x int> %tmp7, int %tmp4, uint 2 ; <<4 x int>> [#uses=1]
%tmp9 = insertelement <4 x int> %tmp8, int %tmp5, uint 3 ; <<4 x int>> [#uses=1]
store <4 x int> %tmp9, <4 x int>* %A
ret void
}
void %tb_h(<16 x sbyte>* %A, <16 x sbyte>* %B) {
entry:
%tmp = load <16 x sbyte>* %A ; <<16 x sbyte>> [#uses=8]
%tmp2 = load <16 x sbyte>* %B ; <<16 x sbyte>> [#uses=8]
%tmp = extractelement <16 x sbyte> %tmp, uint 0 ; <sbyte> [#uses=1]
%tmp3 = extractelement <16 x sbyte> %tmp2, uint 0 ; <sbyte> [#uses=1]
%tmp4 = extractelement <16 x sbyte> %tmp, uint 1 ; <sbyte> [#uses=1]
%tmp5 = extractelement <16 x sbyte> %tmp2, uint 1 ; <sbyte> [#uses=1]
%tmp6 = extractelement <16 x sbyte> %tmp, uint 2 ; <sbyte> [#uses=1]
%tmp7 = extractelement <16 x sbyte> %tmp2, uint 2 ; <sbyte> [#uses=1]
%tmp8 = extractelement <16 x sbyte> %tmp, uint 3 ; <sbyte> [#uses=1]
%tmp9 = extractelement <16 x sbyte> %tmp2, uint 3 ; <sbyte> [#uses=1]
%tmp10 = extractelement <16 x sbyte> %tmp, uint 4 ; <sbyte> [#uses=1]
%tmp11 = extractelement <16 x sbyte> %tmp2, uint 4 ; <sbyte> [#uses=1]
%tmp12 = extractelement <16 x sbyte> %tmp, uint 5 ; <sbyte> [#uses=1]
%tmp13 = extractelement <16 x sbyte> %tmp2, uint 5 ; <sbyte> [#uses=1]
%tmp14 = extractelement <16 x sbyte> %tmp, uint 6 ; <sbyte> [#uses=1]
%tmp15 = extractelement <16 x sbyte> %tmp2, uint 6 ; <sbyte> [#uses=1]
%tmp16 = extractelement <16 x sbyte> %tmp, uint 7 ; <sbyte> [#uses=1]
%tmp17 = extractelement <16 x sbyte> %tmp2, uint 7 ; <sbyte> [#uses=1]
%tmp18 = insertelement <16 x sbyte> undef, sbyte %tmp, uint 0 ; <<16 x sbyte>> [#uses=1]
%tmp19 = insertelement <16 x sbyte> %tmp18, sbyte %tmp3, uint 1 ; <<16 x sbyte>> [#uses=1]
%tmp20 = insertelement <16 x sbyte> %tmp19, sbyte %tmp4, uint 2 ; <<16 x sbyte>> [#uses=1]
%tmp21 = insertelement <16 x sbyte> %tmp20, sbyte %tmp5, uint 3 ; <<16 x sbyte>> [#uses=1]
%tmp22 = insertelement <16 x sbyte> %tmp21, sbyte %tmp6, uint 4 ; <<16 x sbyte>> [#uses=1]
%tmp23 = insertelement <16 x sbyte> %tmp22, sbyte %tmp7, uint 5 ; <<16 x sbyte>> [#uses=1]
%tmp24 = insertelement <16 x sbyte> %tmp23, sbyte %tmp8, uint 6 ; <<16 x sbyte>> [#uses=1]
%tmp25 = insertelement <16 x sbyte> %tmp24, sbyte %tmp9, uint 7 ; <<16 x sbyte>> [#uses=1]
%tmp26 = insertelement <16 x sbyte> %tmp25, sbyte %tmp10, uint 8 ; <<16 x sbyte>> [#uses=1]
%tmp27 = insertelement <16 x sbyte> %tmp26, sbyte %tmp11, uint 9 ; <<16 x sbyte>> [#uses=1]
%tmp28 = insertelement <16 x sbyte> %tmp27, sbyte %tmp12, uint 10 ; <<16 x sbyte>> [#uses=1]
%tmp29 = insertelement <16 x sbyte> %tmp28, sbyte %tmp13, uint 11 ; <<16 x sbyte>> [#uses=1]
%tmp30 = insertelement <16 x sbyte> %tmp29, sbyte %tmp14, uint 12 ; <<16 x sbyte>> [#uses=1]
%tmp31 = insertelement <16 x sbyte> %tmp30, sbyte %tmp15, uint 13 ; <<16 x sbyte>> [#uses=1]
%tmp32 = insertelement <16 x sbyte> %tmp31, sbyte %tmp16, uint 14 ; <<16 x sbyte>> [#uses=1]
%tmp33 = insertelement <16 x sbyte> %tmp32, sbyte %tmp17, uint 15 ; <<16 x sbyte>> [#uses=1]
store <16 x sbyte> %tmp33, <16 x sbyte>* %A
ret void
}
void %th_h(<8 x short>* %A, <8 x short>* %B) {
entry:
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=4]
%tmp2 = load <8 x short>* %B ; <<8 x short>> [#uses=4]
%tmp = extractelement <8 x short> %tmp, uint 0 ; <short> [#uses=1]
%tmp3 = extractelement <8 x short> %tmp2, uint 0 ; <short> [#uses=1]
%tmp4 = extractelement <8 x short> %tmp, uint 1 ; <short> [#uses=1]
%tmp5 = extractelement <8 x short> %tmp2, uint 1 ; <short> [#uses=1]
%tmp6 = extractelement <8 x short> %tmp, uint 2 ; <short> [#uses=1]
%tmp7 = extractelement <8 x short> %tmp2, uint 2 ; <short> [#uses=1]
%tmp8 = extractelement <8 x short> %tmp, uint 3 ; <short> [#uses=1]
%tmp9 = extractelement <8 x short> %tmp2, uint 3 ; <short> [#uses=1]
%tmp10 = insertelement <8 x short> undef, short %tmp, uint 0 ; <<8 x short>> [#uses=1]
%tmp11 = insertelement <8 x short> %tmp10, short %tmp3, uint 1 ; <<8 x short>> [#uses=1]
%tmp12 = insertelement <8 x short> %tmp11, short %tmp4, uint 2 ; <<8 x short>> [#uses=1]
%tmp13 = insertelement <8 x short> %tmp12, short %tmp5, uint 3 ; <<8 x short>> [#uses=1]
%tmp14 = insertelement <8 x short> %tmp13, short %tmp6, uint 4 ; <<8 x short>> [#uses=1]
%tmp15 = insertelement <8 x short> %tmp14, short %tmp7, uint 5 ; <<8 x short>> [#uses=1]
%tmp16 = insertelement <8 x short> %tmp15, short %tmp8, uint 6 ; <<8 x short>> [#uses=1]
%tmp17 = insertelement <8 x short> %tmp16, short %tmp9, uint 7 ; <<8 x short>> [#uses=1]
store <8 x short> %tmp17, <8 x short>* %A
ret void
}
void %tw_h(<4 x int>* %A, <4 x int>* %B) {
entry:
%tmp = load <4 x int>* %A ; <<4 x int>> [#uses=2]
%tmp2 = load <4 x int>* %B ; <<4 x int>> [#uses=2]
%tmp = extractelement <4 x int> %tmp2, uint 0 ; <int> [#uses=1]
%tmp3 = extractelement <4 x int> %tmp, uint 0 ; <int> [#uses=1]
%tmp4 = extractelement <4 x int> %tmp2, uint 1 ; <int> [#uses=1]
%tmp5 = extractelement <4 x int> %tmp, uint 1 ; <int> [#uses=1]
%tmp6 = insertelement <4 x int> undef, int %tmp, uint 0 ; <<4 x int>> [#uses=1]
%tmp7 = insertelement <4 x int> %tmp6, int %tmp3, uint 1 ; <<4 x int>> [#uses=1]
%tmp8 = insertelement <4 x int> %tmp7, int %tmp4, uint 2 ; <<4 x int>> [#uses=1]
%tmp9 = insertelement <4 x int> %tmp8, int %tmp5, uint 3 ; <<4 x int>> [#uses=1]
store <4 x int> %tmp9, <4 x int>* %A
ret void
}
void %tw_h_flop(<4 x int>* %A, <4 x int>* %B) {
%tmp = load <4 x int>* %A ; <<4 x int>> [#uses=2]
%tmp2 = load <4 x int>* %B ; <<4 x int>> [#uses=2]
%tmp = extractelement <4 x int> %tmp, uint 0 ; <int> [#uses=1]
%tmp3 = extractelement <4 x int> %tmp2, uint 0 ; <int> [#uses=1]
%tmp4 = extractelement <4 x int> %tmp, uint 1 ; <int> [#uses=1]
%tmp5 = extractelement <4 x int> %tmp2, uint 1 ; <int> [#uses=1]
%tmp6 = insertelement <4 x int> undef, int %tmp, uint 0 ; <<4 x int>> [#uses=1]
%tmp7 = insertelement <4 x int> %tmp6, int %tmp3, uint 1 ; <<4 x int>> [#uses=1]
%tmp8 = insertelement <4 x int> %tmp7, int %tmp4, uint 2 ; <<4 x int>> [#uses=1]
%tmp9 = insertelement <4 x int> %tmp8, int %tmp5, uint 3 ; <<4 x int>> [#uses=1]
store <4 x int> %tmp9, <4 x int>* %A
ret void
}
void %VMRG_UNARY_tb_l(<16 x sbyte>* %A, <16 x sbyte>* %B) {
entry:
%tmp = load <16 x sbyte>* %A ; <<16 x sbyte>> [#uses=16]
%tmp = extractelement <16 x sbyte> %tmp, uint 8 ; <sbyte> [#uses=1]
%tmp3 = extractelement <16 x sbyte> %tmp, uint 8 ; <sbyte> [#uses=1]
%tmp4 = extractelement <16 x sbyte> %tmp, uint 9 ; <sbyte> [#uses=1]
%tmp5 = extractelement <16 x sbyte> %tmp, uint 9 ; <sbyte> [#uses=1]
%tmp6 = extractelement <16 x sbyte> %tmp, uint 10 ; <sbyte> [#uses=1]
%tmp7 = extractelement <16 x sbyte> %tmp, uint 10 ; <sbyte> [#uses=1]
%tmp8 = extractelement <16 x sbyte> %tmp, uint 11 ; <sbyte> [#uses=1]
%tmp9 = extractelement <16 x sbyte> %tmp, uint 11 ; <sbyte> [#uses=1]
%tmp10 = extractelement <16 x sbyte> %tmp, uint 12 ; <sbyte> [#uses=1]
%tmp11 = extractelement <16 x sbyte> %tmp, uint 12 ; <sbyte> [#uses=1]
%tmp12 = extractelement <16 x sbyte> %tmp, uint 13 ; <sbyte> [#uses=1]
%tmp13 = extractelement <16 x sbyte> %tmp, uint 13 ; <sbyte> [#uses=1]
%tmp14 = extractelement <16 x sbyte> %tmp, uint 14 ; <sbyte> [#uses=1]
%tmp15 = extractelement <16 x sbyte> %tmp, uint 14 ; <sbyte> [#uses=1]
%tmp16 = extractelement <16 x sbyte> %tmp, uint 15 ; <sbyte> [#uses=1]
%tmp17 = extractelement <16 x sbyte> %tmp, uint 15 ; <sbyte> [#uses=1]
%tmp18 = insertelement <16 x sbyte> undef, sbyte %tmp, uint 0 ; <<16 x sbyte>> [#uses=1]
%tmp19 = insertelement <16 x sbyte> %tmp18, sbyte %tmp3, uint 1 ; <<16 x sbyte>> [#uses=1]
%tmp20 = insertelement <16 x sbyte> %tmp19, sbyte %tmp4, uint 2 ; <<16 x sbyte>> [#uses=1]
%tmp21 = insertelement <16 x sbyte> %tmp20, sbyte %tmp5, uint 3 ; <<16 x sbyte>> [#uses=1]
%tmp22 = insertelement <16 x sbyte> %tmp21, sbyte %tmp6, uint 4 ; <<16 x sbyte>> [#uses=1]
%tmp23 = insertelement <16 x sbyte> %tmp22, sbyte %tmp7, uint 5 ; <<16 x sbyte>> [#uses=1]
%tmp24 = insertelement <16 x sbyte> %tmp23, sbyte %tmp8, uint 6 ; <<16 x sbyte>> [#uses=1]
%tmp25 = insertelement <16 x sbyte> %tmp24, sbyte %tmp9, uint 7 ; <<16 x sbyte>> [#uses=1]
%tmp26 = insertelement <16 x sbyte> %tmp25, sbyte %tmp10, uint 8 ; <<16 x sbyte>> [#uses=1]
%tmp27 = insertelement <16 x sbyte> %tmp26, sbyte %tmp11, uint 9 ; <<16 x sbyte>> [#uses=1]
%tmp28 = insertelement <16 x sbyte> %tmp27, sbyte %tmp12, uint 10 ; <<16 x sbyte>> [#uses=1]
%tmp29 = insertelement <16 x sbyte> %tmp28, sbyte %tmp13, uint 11 ; <<16 x sbyte>> [#uses=1]
%tmp30 = insertelement <16 x sbyte> %tmp29, sbyte %tmp14, uint 12 ; <<16 x sbyte>> [#uses=1]
%tmp31 = insertelement <16 x sbyte> %tmp30, sbyte %tmp15, uint 13 ; <<16 x sbyte>> [#uses=1]
%tmp32 = insertelement <16 x sbyte> %tmp31, sbyte %tmp16, uint 14 ; <<16 x sbyte>> [#uses=1]
%tmp33 = insertelement <16 x sbyte> %tmp32, sbyte %tmp17, uint 15 ; <<16 x sbyte>> [#uses=1]
store <16 x sbyte> %tmp33, <16 x sbyte>* %A
ret void
}
void %VMRG_UNARY_th_l(<8 x short>* %A, <8 x short>* %B) {
entry:
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=8]
%tmp = extractelement <8 x short> %tmp, uint 4 ; <short> [#uses=1]
%tmp3 = extractelement <8 x short> %tmp, uint 4 ; <short> [#uses=1]
%tmp4 = extractelement <8 x short> %tmp, uint 5 ; <short> [#uses=1]
%tmp5 = extractelement <8 x short> %tmp, uint 5 ; <short> [#uses=1]
%tmp6 = extractelement <8 x short> %tmp, uint 6 ; <short> [#uses=1]
%tmp7 = extractelement <8 x short> %tmp, uint 6 ; <short> [#uses=1]
%tmp8 = extractelement <8 x short> %tmp, uint 7 ; <short> [#uses=1]
%tmp9 = extractelement <8 x short> %tmp, uint 7 ; <short> [#uses=1]
%tmp10 = insertelement <8 x short> undef, short %tmp, uint 0 ; <<8 x short>> [#uses=1]
%tmp11 = insertelement <8 x short> %tmp10, short %tmp3, uint 1 ; <<8 x short>> [#uses=1]
%tmp12 = insertelement <8 x short> %tmp11, short %tmp4, uint 2 ; <<8 x short>> [#uses=1]
%tmp13 = insertelement <8 x short> %tmp12, short %tmp5, uint 3 ; <<8 x short>> [#uses=1]
%tmp14 = insertelement <8 x short> %tmp13, short %tmp6, uint 4 ; <<8 x short>> [#uses=1]
%tmp15 = insertelement <8 x short> %tmp14, short %tmp7, uint 5 ; <<8 x short>> [#uses=1]
%tmp16 = insertelement <8 x short> %tmp15, short %tmp8, uint 6 ; <<8 x short>> [#uses=1]
%tmp17 = insertelement <8 x short> %tmp16, short %tmp9, uint 7 ; <<8 x short>> [#uses=1]
store <8 x short> %tmp17, <8 x short>* %A
ret void
}
void %VMRG_UNARY_tw_l(<4 x int>* %A, <4 x int>* %B) {
entry:
%tmp = load <4 x int>* %A ; <<4 x int>> [#uses=4]
%tmp = extractelement <4 x int> %tmp, uint 2 ; <int> [#uses=1]
%tmp3 = extractelement <4 x int> %tmp, uint 2 ; <int> [#uses=1]
%tmp4 = extractelement <4 x int> %tmp, uint 3 ; <int> [#uses=1]
%tmp5 = extractelement <4 x int> %tmp, uint 3 ; <int> [#uses=1]
%tmp6 = insertelement <4 x int> undef, int %tmp, uint 0 ; <<4 x int>> [#uses=1]
%tmp7 = insertelement <4 x int> %tmp6, int %tmp3, uint 1 ; <<4 x int>> [#uses=1]
%tmp8 = insertelement <4 x int> %tmp7, int %tmp4, uint 2 ; <<4 x int>> [#uses=1]
%tmp9 = insertelement <4 x int> %tmp8, int %tmp5, uint 3 ; <<4 x int>> [#uses=1]
store <4 x int> %tmp9, <4 x int>* %A
ret void
}
void %VMRG_UNARY_tb_h(<16 x sbyte>* %A, <16 x sbyte>* %B) {
entry:
%tmp = load <16 x sbyte>* %A ; <<16 x sbyte>> [#uses=16]
%tmp = extractelement <16 x sbyte> %tmp, uint 0 ; <sbyte> [#uses=1]
%tmp3 = extractelement <16 x sbyte> %tmp, uint 0 ; <sbyte> [#uses=1]
%tmp4 = extractelement <16 x sbyte> %tmp, uint 1 ; <sbyte> [#uses=1]
%tmp5 = extractelement <16 x sbyte> %tmp, uint 1 ; <sbyte> [#uses=1]
%tmp6 = extractelement <16 x sbyte> %tmp, uint 2 ; <sbyte> [#uses=1]
%tmp7 = extractelement <16 x sbyte> %tmp, uint 2 ; <sbyte> [#uses=1]
%tmp8 = extractelement <16 x sbyte> %tmp, uint 3 ; <sbyte> [#uses=1]
%tmp9 = extractelement <16 x sbyte> %tmp, uint 3 ; <sbyte> [#uses=1]
%tmp10 = extractelement <16 x sbyte> %tmp, uint 4 ; <sbyte> [#uses=1]
%tmp11 = extractelement <16 x sbyte> %tmp, uint 4 ; <sbyte> [#uses=1]
%tmp12 = extractelement <16 x sbyte> %tmp, uint 5 ; <sbyte> [#uses=1]
%tmp13 = extractelement <16 x sbyte> %tmp, uint 5 ; <sbyte> [#uses=1]
%tmp14 = extractelement <16 x sbyte> %tmp, uint 6 ; <sbyte> [#uses=1]
%tmp15 = extractelement <16 x sbyte> %tmp, uint 6 ; <sbyte> [#uses=1]
%tmp16 = extractelement <16 x sbyte> %tmp, uint 7 ; <sbyte> [#uses=1]
%tmp17 = extractelement <16 x sbyte> %tmp, uint 7 ; <sbyte> [#uses=1]
%tmp18 = insertelement <16 x sbyte> undef, sbyte %tmp, uint 0 ; <<16 x sbyte>> [#uses=1]
%tmp19 = insertelement <16 x sbyte> %tmp18, sbyte %tmp3, uint 1 ; <<16 x sbyte>> [#uses=1]
%tmp20 = insertelement <16 x sbyte> %tmp19, sbyte %tmp4, uint 2 ; <<16 x sbyte>> [#uses=1]
%tmp21 = insertelement <16 x sbyte> %tmp20, sbyte %tmp5, uint 3 ; <<16 x sbyte>> [#uses=1]
%tmp22 = insertelement <16 x sbyte> %tmp21, sbyte %tmp6, uint 4 ; <<16 x sbyte>> [#uses=1]
%tmp23 = insertelement <16 x sbyte> %tmp22, sbyte %tmp7, uint 5 ; <<16 x sbyte>> [#uses=1]
%tmp24 = insertelement <16 x sbyte> %tmp23, sbyte %tmp8, uint 6 ; <<16 x sbyte>> [#uses=1]
%tmp25 = insertelement <16 x sbyte> %tmp24, sbyte %tmp9, uint 7 ; <<16 x sbyte>> [#uses=1]
%tmp26 = insertelement <16 x sbyte> %tmp25, sbyte %tmp10, uint 8 ; <<16 x sbyte>> [#uses=1]
%tmp27 = insertelement <16 x sbyte> %tmp26, sbyte %tmp11, uint 9 ; <<16 x sbyte>> [#uses=1]
%tmp28 = insertelement <16 x sbyte> %tmp27, sbyte %tmp12, uint 10 ; <<16 x sbyte>> [#uses=1]
%tmp29 = insertelement <16 x sbyte> %tmp28, sbyte %tmp13, uint 11 ; <<16 x sbyte>> [#uses=1]
%tmp30 = insertelement <16 x sbyte> %tmp29, sbyte %tmp14, uint 12 ; <<16 x sbyte>> [#uses=1]
%tmp31 = insertelement <16 x sbyte> %tmp30, sbyte %tmp15, uint 13 ; <<16 x sbyte>> [#uses=1]
%tmp32 = insertelement <16 x sbyte> %tmp31, sbyte %tmp16, uint 14 ; <<16 x sbyte>> [#uses=1]
%tmp33 = insertelement <16 x sbyte> %tmp32, sbyte %tmp17, uint 15 ; <<16 x sbyte>> [#uses=1]
store <16 x sbyte> %tmp33, <16 x sbyte>* %A
ret void
}
void %VMRG_UNARY_th_h(<8 x short>* %A, <8 x short>* %B) {
entry:
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=8]
%tmp = extractelement <8 x short> %tmp, uint 0 ; <short> [#uses=1]
%tmp3 = extractelement <8 x short> %tmp, uint 0 ; <short> [#uses=1]
%tmp4 = extractelement <8 x short> %tmp, uint 1 ; <short> [#uses=1]
%tmp5 = extractelement <8 x short> %tmp, uint 1 ; <short> [#uses=1]
%tmp6 = extractelement <8 x short> %tmp, uint 2 ; <short> [#uses=1]
%tmp7 = extractelement <8 x short> %tmp, uint 2 ; <short> [#uses=1]
%tmp8 = extractelement <8 x short> %tmp, uint 3 ; <short> [#uses=1]
%tmp9 = extractelement <8 x short> %tmp, uint 3 ; <short> [#uses=1]
%tmp10 = insertelement <8 x short> undef, short %tmp, uint 0 ; <<8 x short>> [#uses=1]
%tmp11 = insertelement <8 x short> %tmp10, short %tmp3, uint 1 ; <<8 x short>> [#uses=1]
%tmp12 = insertelement <8 x short> %tmp11, short %tmp4, uint 2 ; <<8 x short>> [#uses=1]
%tmp13 = insertelement <8 x short> %tmp12, short %tmp5, uint 3 ; <<8 x short>> [#uses=1]
%tmp14 = insertelement <8 x short> %tmp13, short %tmp6, uint 4 ; <<8 x short>> [#uses=1]
%tmp15 = insertelement <8 x short> %tmp14, short %tmp7, uint 5 ; <<8 x short>> [#uses=1]
%tmp16 = insertelement <8 x short> %tmp15, short %tmp8, uint 6 ; <<8 x short>> [#uses=1]
%tmp17 = insertelement <8 x short> %tmp16, short %tmp9, uint 7 ; <<8 x short>> [#uses=1]
store <8 x short> %tmp17, <8 x short>* %A
ret void
}
void %VMRG_UNARY_tw_h(<4 x int>* %A, <4 x int>* %B) {
entry:
%tmp = load <4 x int>* %A ; <<4 x int>> [#uses=4]
%tmp = extractelement <4 x int> %tmp, uint 0 ; <int> [#uses=1]
%tmp3 = extractelement <4 x int> %tmp, uint 0 ; <int> [#uses=1]
%tmp4 = extractelement <4 x int> %tmp, uint 1 ; <int> [#uses=1]
%tmp5 = extractelement <4 x int> %tmp, uint 1 ; <int> [#uses=1]
%tmp6 = insertelement <4 x int> undef, int %tmp, uint 0 ; <<4 x int>> [#uses=1]
%tmp7 = insertelement <4 x int> %tmp6, int %tmp3, uint 1 ; <<4 x int>> [#uses=1]
%tmp8 = insertelement <4 x int> %tmp7, int %tmp4, uint 2 ; <<4 x int>> [#uses=1]
%tmp9 = insertelement <4 x int> %tmp8, int %tmp5, uint 3 ; <<4 x int>> [#uses=1]
store <4 x int> %tmp9, <4 x int>* %A
ret void
}
void %VPCKUHUM_unary(<8 x short>* %A, <8 x short>* %B) {
entry:
%tmp = load <8 x short>* %A ; <<8 x short>> [#uses=2]
%tmp = cast <8 x short> %tmp to <16 x sbyte> ; <<16 x sbyte>> [#uses=8]
%tmp3 = cast <8 x short> %tmp to <16 x sbyte> ; <<16 x sbyte>> [#uses=8]
%tmp = extractelement <16 x sbyte> %tmp, uint 1 ; <sbyte> [#uses=1]
%tmp4 = extractelement <16 x sbyte> %tmp, uint 3 ; <sbyte> [#uses=1]
%tmp5 = extractelement <16 x sbyte> %tmp, uint 5 ; <sbyte> [#uses=1]
%tmp6 = extractelement <16 x sbyte> %tmp, uint 7 ; <sbyte> [#uses=1]
%tmp7 = extractelement <16 x sbyte> %tmp, uint 9 ; <sbyte> [#uses=1]
%tmp8 = extractelement <16 x sbyte> %tmp, uint 11 ; <sbyte> [#uses=1]
%tmp9 = extractelement <16 x sbyte> %tmp, uint 13 ; <sbyte> [#uses=1]
%tmp10 = extractelement <16 x sbyte> %tmp, uint 15 ; <sbyte> [#uses=1]
%tmp11 = extractelement <16 x sbyte> %tmp3, uint 1 ; <sbyte> [#uses=1]
%tmp12 = extractelement <16 x sbyte> %tmp3, uint 3 ; <sbyte> [#uses=1]
%tmp13 = extractelement <16 x sbyte> %tmp3, uint 5 ; <sbyte> [#uses=1]
%tmp14 = extractelement <16 x sbyte> %tmp3, uint 7 ; <sbyte> [#uses=1]
%tmp15 = extractelement <16 x sbyte> %tmp3, uint 9 ; <sbyte> [#uses=1]
%tmp16 = extractelement <16 x sbyte> %tmp3, uint 11 ; <sbyte> [#uses=1]
%tmp17 = extractelement <16 x sbyte> %tmp3, uint 13 ; <sbyte> [#uses=1]
%tmp18 = extractelement <16 x sbyte> %tmp3, uint 15 ; <sbyte> [#uses=1]
%tmp19 = insertelement <16 x sbyte> undef, sbyte %tmp, uint 0 ; <<16 x sbyte>> [#uses=1]
%tmp20 = insertelement <16 x sbyte> %tmp19, sbyte %tmp4, uint 1 ; <<16 x sbyte>> [#uses=1]
%tmp21 = insertelement <16 x sbyte> %tmp20, sbyte %tmp5, uint 2 ; <<16 x sbyte>> [#uses=1]
%tmp22 = insertelement <16 x sbyte> %tmp21, sbyte %tmp6, uint 3 ; <<16 x sbyte>> [#uses=1]
%tmp23 = insertelement <16 x sbyte> %tmp22, sbyte %tmp7, uint 4 ; <<16 x sbyte>> [#uses=1]
%tmp24 = insertelement <16 x sbyte> %tmp23, sbyte %tmp8, uint 5 ; <<16 x sbyte>> [#uses=1]
%tmp25 = insertelement <16 x sbyte> %tmp24, sbyte %tmp9, uint 6 ; <<16 x sbyte>> [#uses=1]
%tmp26 = insertelement <16 x sbyte> %tmp25, sbyte %tmp10, uint 7 ; <<16 x sbyte>> [#uses=1]
%tmp27 = insertelement <16 x sbyte> %tmp26, sbyte %tmp11, uint 8 ; <<16 x sbyte>> [#uses=1]
%tmp28 = insertelement <16 x sbyte> %tmp27, sbyte %tmp12, uint 9 ; <<16 x sbyte>> [#uses=1]
%tmp29 = insertelement <16 x sbyte> %tmp28, sbyte %tmp13, uint 10 ; <<16 x sbyte>> [#uses=1]
%tmp30 = insertelement <16 x sbyte> %tmp29, sbyte %tmp14, uint 11 ; <<16 x sbyte>> [#uses=1]
%tmp31 = insertelement <16 x sbyte> %tmp30, sbyte %tmp15, uint 12 ; <<16 x sbyte>> [#uses=1]
%tmp32 = insertelement <16 x sbyte> %tmp31, sbyte %tmp16, uint 13 ; <<16 x sbyte>> [#uses=1]
%tmp33 = insertelement <16 x sbyte> %tmp32, sbyte %tmp17, uint 14 ; <<16 x sbyte>> [#uses=1]
%tmp34 = insertelement <16 x sbyte> %tmp33, sbyte %tmp18, uint 15 ; <<16 x sbyte>> [#uses=1]
%tmp34 = cast <16 x sbyte> %tmp34 to <8 x short> ; <<8 x short>> [#uses=1]
store <8 x short> %tmp34, <8 x short>* %A
ret void
}
void %VPCKUWUM_unary(<4 x int>* %A, <4 x int>* %B) {
entry:
%tmp = load <4 x int>* %A ; <<4 x int>> [#uses=2]
%tmp = cast <4 x int> %tmp to <8 x short> ; <<8 x short>> [#uses=4]
%tmp3 = cast <4 x int> %tmp to <8 x short> ; <<8 x short>> [#uses=4]
%tmp = extractelement <8 x short> %tmp, uint 1 ; <short> [#uses=1]
%tmp4 = extractelement <8 x short> %tmp, uint 3 ; <short> [#uses=1]
%tmp5 = extractelement <8 x short> %tmp, uint 5 ; <short> [#uses=1]
%tmp6 = extractelement <8 x short> %tmp, uint 7 ; <short> [#uses=1]
%tmp7 = extractelement <8 x short> %tmp3, uint 1 ; <short> [#uses=1]
%tmp8 = extractelement <8 x short> %tmp3, uint 3 ; <short> [#uses=1]
%tmp9 = extractelement <8 x short> %tmp3, uint 5 ; <short> [#uses=1]
%tmp10 = extractelement <8 x short> %tmp3, uint 7 ; <short> [#uses=1]
%tmp11 = insertelement <8 x short> undef, short %tmp, uint 0 ; <<8 x short>> [#uses=1]
%tmp12 = insertelement <8 x short> %tmp11, short %tmp4, uint 1 ; <<8 x short>> [#uses=1]
%tmp13 = insertelement <8 x short> %tmp12, short %tmp5, uint 2 ; <<8 x short>> [#uses=1]
%tmp14 = insertelement <8 x short> %tmp13, short %tmp6, uint 3 ; <<8 x short>> [#uses=1]
%tmp15 = insertelement <8 x short> %tmp14, short %tmp7, uint 4 ; <<8 x short>> [#uses=1]
%tmp16 = insertelement <8 x short> %tmp15, short %tmp8, uint 5 ; <<8 x short>> [#uses=1]
%tmp17 = insertelement <8 x short> %tmp16, short %tmp9, uint 6 ; <<8 x short>> [#uses=1]
%tmp18 = insertelement <8 x short> %tmp17, short %tmp10, uint 7 ; <<8 x short>> [#uses=1]
%tmp18 = cast <8 x short> %tmp18 to <4 x int> ; <<4 x int>> [#uses=1]
store <4 x int> %tmp18, <4 x int>* %A
ret void
}
|