//@ compile-flags: -Copt-level=3 -Z merge-functions=disabled
//@ only-x86_64
//@ min-llvm-version: 20
//@ ignore-std-debug-assertions (`ptr::swap_nonoverlapping` has one which blocks some optimizations)
#![crate_type = "lib"]
use std::mem::swap;
type RGB48 = [u16; 3];
// CHECK-LABEL: @swap_rgb48_manually(
#[no_mangle]
pub fn swap_rgb48_manually(x: &mut RGB48, y: &mut RGB48) {
    // FIXME: See #115212 for why this has an alloca again
    // CHECK: alloca [6 x i8], align 2
    // CHECK: call void @llvm.memcpy.p0.p0.i64({{.+}}, i64 6, i1 false)
    // CHECK: call void @llvm.memcpy.p0.p0.i64({{.+}}, i64 6, i1 false)
    // CHECK: call void @llvm.memcpy.p0.p0.i64({{.+}}, i64 6, i1 false)
    let temp = *x;
    *x = *y;
    *y = temp;
}
// CHECK-LABEL: @swap_rgb48
#[no_mangle]
pub fn swap_rgb48(x: &mut RGB48, y: &mut RGB48) {
    // CHECK-NOT: alloca
    // Swapping `i48` might be cleaner in LLVM-IR here, but `i32`+`i16` isn't bad,
    // and is closer to the assembly it generates anyway.
    // CHECK-NOT: load{{ }}
    // CHECK: load i32{{.+}}align 2
    // CHECK-NEXT: load i32{{.+}}align 2
    // CHECK-NEXT: store i32{{.+}}align 2
    // CHECK-NEXT: store i32{{.+}}align 2
    // CHECK: load i16{{.+}}align 2
    // CHECK-NEXT: load i16{{.+}}align 2
    // CHECK-NEXT: store i16{{.+}}align 2
    // CHECK-NEXT: store i16{{.+}}align 2
    // CHECK-NOT: store{{ }}
    swap(x, y)
}
type RGBA64 = [u16; 4];
// CHECK-LABEL: @swap_rgba64
#[no_mangle]
pub fn swap_rgba64(x: &mut RGBA64, y: &mut RGBA64) {
    // CHECK-NOT: alloca
    // CHECK-DAG: %[[XVAL:.+]] = load i64, ptr %x, align 2
    // CHECK-DAG: %[[YVAL:.+]] = load i64, ptr %y, align 2
    // CHECK-DAG: store i64 %[[YVAL]], ptr %x, align 2
    // CHECK-DAG: store i64 %[[XVAL]], ptr %y, align 2
    swap(x, y)
}
// CHECK-LABEL: @swap_vecs
#[no_mangle]
pub fn swap_vecs(x: &mut Vec<u32>, y: &mut Vec<u32>) {
    // CHECK-NOT: alloca
    // There are plenty more loads and stores than just these,
    // but at least one sure better be 64-bit (for size or capacity).
    // CHECK: load i64
    // CHECK: load i64
    // CHECK: store i64
    // CHECK: store i64
    // CHECK: ret void
    swap(x, y)
}
// CHECK-LABEL: @swap_slices
#[no_mangle]
pub fn swap_slices<'a>(x: &mut &'a [u32], y: &mut &'a [u32]) {
    // CHECK-NOT: alloca
    // CHECK: load ptr
    // CHECK: load i64
    // CHECK: call void @llvm.memcpy.p0.p0.i64({{.+}}, i64 16, i1 false)
    // CHECK: store ptr
    // CHECK: store i64
    swap(x, y)
}
type RGB24 = [u8; 3];
// CHECK-LABEL: @swap_rgb24_slices
#[no_mangle]
pub fn swap_rgb24_slices(x: &mut [RGB24], y: &mut [RGB24]) {
    // CHECK-NOT: alloca
    // CHECK: mul nuw nsw i64 %{{x|y}}.1, 3
    // CHECK: load <{{[0-9]+}} x i64>
    // CHECK: store <{{[0-9]+}} x i64>
    // CHECK-COUNT-2: load i32
    // CHECK-COUNT-2: store i32
    // CHECK-COUNT-2: load i16
    // CHECK-COUNT-2: store i16
    // CHECK-COUNT-2: load i8
    // CHECK-COUNT-2: store i8
    if x.len() == y.len() {
        x.swap_with_slice(y);
    }
}
type RGBA32 = [u8; 4];
// CHECK-LABEL: @swap_rgba32_slices
#[no_mangle]
pub fn swap_rgba32_slices(x: &mut [RGBA32], y: &mut [RGBA32]) {
    // CHECK-NOT: alloca
    // Because the size in bytes is a multiple of 4, we can skip the smallest sizes.
    // CHECK: load <{{[0-9]+}} x i64>
    // CHECK: store <{{[0-9]+}} x i64>
    // CHECK-COUNT-2: load i32
    // CHECK-COUNT-2: store i32
    // CHECK-NOT: load i16
    // CHECK-NOT: store i16
    // CHECK-NOT: load i8
    // CHECK-NOT: store i8
    if x.len() == y.len() {
        x.swap_with_slice(y);
    }
}
// Strings have a non-power-of-two size, but have pointer alignment,
// so we swap usizes instead of dropping all the way down to bytes.
const _: () = assert!(!std::mem::size_of::<String>().is_power_of_two());
// CHECK-LABEL: @swap_string_slices
#[no_mangle]
pub fn swap_string_slices(x: &mut [String], y: &mut [String]) {
    // CHECK-NOT: alloca
    // CHECK: load <{{[0-9]+}} x i64>{{.+}}, align 8,
    // CHECK: store <{{[0-9]+}} x i64>{{.+}}, align 8,
    if x.len() == y.len() {
        x.swap_with_slice(y);
    }
}
#[repr(C, packed)]
pub struct Packed {
    pub first: bool,
    pub second: usize,
}
// CHECK-LABEL: @swap_packed_structs
#[no_mangle]
pub fn swap_packed_structs(x: &mut Packed, y: &mut Packed) {
    // CHECK-NOT: alloca
    // CHECK-NOT: load
    // CHECK-NOT: store
    // CHECK: %[[A:.+]] = load i64, ptr %x, align 1,
    // CHECK-NEXT: %[[B:.+]] = load i64, ptr %y, align 1,
    // CHECK-NEXT: store i64 %[[B]], ptr %x, align 1,
    // CHECK-NEXT: store i64 %[[A]], ptr %y, align 1,
    // CHECK-NOT: load
    // CHECK-NOT: store
    // CHECK: %[[C:.+]] = load i8, ptr %[[X8:.+]], align 1,
    // CHECK-NEXT: %[[D:.+]] = load i8, ptr %[[Y8:.+]], align 1,
    // CHECK-NEXT: store i8 %[[D]], ptr %[[X8]], align 1,
    // CHECK-NEXT: store i8 %[[C]], ptr %[[Y8]], align 1,
    // CHECK-NOT: load
    // CHECK-NOT: store
    // CHECK: ret void
    swap(x, y)
}