// RUN: mlir-opt %s -test-transform-dialect-interpreter -split-input-file | FileCheck %s
// Promote all three matmul operands into newly allocated full-tile buffers.
func.func @promote_subview_matmul(%arg0: memref<?x?xf32, strided<[?, 1], offset: ?>>,
%arg1: memref<?x?xf32, strided<[?, 1], offset: ?>>,
%arg2: memref<?x?xf32, strided<[?, 1], offset: ?>>) {
%c2000 = arith.constant 2000 : index
%c3000 = arith.constant 3000 : index
%c4000 = arith.constant 4000 : index
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%0 = memref.dim %arg0, %c0 : memref<?x?xf32, strided<[?, 1], offset: ?>>
%1 = memref.dim %arg0, %c1 : memref<?x?xf32, strided<[?, 1], offset: ?>>
%2 = memref.dim %arg1, %c1 : memref<?x?xf32, strided<[?, 1], offset: ?>>
scf.for %arg3 = %c0 to %0 step %c2000 {
scf.for %arg4 = %c0 to %2 step %c3000 {
scf.for %arg5 = %c0 to %1 step %c4000 {
%3 = memref.subview %arg0[%arg3, %arg5][%c2000, %c4000][%c1, %c1] :
memref<?x?xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
%4 = memref.subview %arg1[%arg5, %arg4][%c4000, %c3000][%c1, %c1] :
memref<?x?xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
%5 = memref.subview %arg2[%arg3, %arg4][%c2000, %c3000][%c1, %c1] :
memref<?x?xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
linalg.matmul ins(%3, %4: memref<?x?xf32, strided<[?, ?], offset: ?>>,
memref<?x?xf32, strided<[?, ?], offset: ?>>)
outs(%5: memref<?x?xf32, strided<[?, ?], offset: ?>>)
}
}
}
return
}
// CHECK-LABEL: func @promote_subview_matmul
// CHECK-DAG: %[[c0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[c2000:.*]] = arith.constant 2000 : index
// CHECK-DAG: %[[c3000:.*]] = arith.constant 3000 : index
// CHECK-DAG: %[[c4000:.*]] = arith.constant 4000 : index
// CHECK: scf.for {{.*}} = %[[c0]] to {{.*}} step %[[c2000]] {
// CHECK: scf.for {{.*}} = %[[c0]] to {{.*}} step %[[c3000]] {
// CHECK: scf.for {{.*}} = %[[c0]] to {{.*}} step %[[c4000]] {
// CHECK: %[[s0:.*]] = memref.subview {{.*}}: memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: %[[s1:.*]] = memref.subview {{.*}}: memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: %[[s2:.*]] = memref.subview {{.*}}: memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// Full-tile buffer sizes in bytes (f32 is 4 bytes per element):
//   2000 x 4000 x 4 = 32000000, 4000 x 3000 x 4 = 48000000, 2000 x 3000 x 4 = 24000000.
// CHECK: %[[a0:.*]] = memref.alloc() : memref<32000000xi8>
// CHECK: %[[v0:.*]] = memref.view %[[a0]]{{.*}} : memref<32000000xi8> to memref<?x?xf32>
// CHECK: %[[l0:.*]] = memref.subview %[[v0]][0, 0] [%{{.*}}, %{{.*}}] [1, 1]
// CHECK-SAME: memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
// CHECK: %[[a1:.*]] = memref.alloc() : memref<48000000xi8>
// CHECK: %[[v1:.*]] = memref.view %[[a1]]{{.*}} : memref<48000000xi8> to memref<?x?xf32>
// CHECK: %[[l1:.*]] = memref.subview %[[v1]][0, 0] [%{{.*}}, %{{.*}}] [1, 1]
// CHECK-SAME: memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
// CHECK: %[[a2:.*]] = memref.alloc() : memref<24000000xi8>
// CHECK: %[[v2:.*]] = memref.view %[[a2]]{{.*}} : memref<24000000xi8> to memref<?x?xf32>
// CHECK: %[[l2:.*]] = memref.subview %[[v2]][0, 0] [%{{.*}}, %{{.*}}] [1, 1]
// CHECK-SAME: memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
// CHECK: memref.copy %[[s0]], %[[l0]] : memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: memref.copy %[[s1]], %[[l1]] : memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: memref.copy %[[s2]], %[[l2]] : memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: linalg.matmul
// CHECK-SAME: ins(%[[v0]], %[[v1]] : memref<?x?xf32>, memref<?x?xf32>)
// CHECK-SAME: outs(%[[v2]] : memref<?x?xf32>)
transform.sequence failures(propagate) {
^bb0(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = transform.structured.promote %0 { operands_to_promote = [0, 1, 2], use_full_tiles_by_default } : (!transform.any_op) -> !transform.any_op
}
// -----
// Promote only the first matmul operand; the other two operands keep using their subviews.
func.func @promote_first_subview_matmul(%arg0: memref<?x?xf32, strided<[?, 1], offset: ?>>,
%arg1: memref<?x?xf32, strided<[?, 1], offset: ?>>,
%arg2: memref<?x?xf32, strided<[?, 1], offset: ?>>) {
%c2000 = arith.constant 2000 : index
%c3000 = arith.constant 3000 : index
%c4000 = arith.constant 4000 : index
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%0 = memref.dim %arg0, %c0 : memref<?x?xf32, strided<[?, 1], offset: ?>>
%1 = memref.dim %arg0, %c1 : memref<?x?xf32, strided<[?, 1], offset: ?>>
%2 = memref.dim %arg1, %c1 : memref<?x?xf32, strided<[?, 1], offset: ?>>
scf.for %arg3 = %c0 to %0 step %c2000 {
scf.for %arg4 = %c0 to %2 step %c3000 {
scf.for %arg5 = %c0 to %1 step %c4000 {
%3 = memref.subview %arg0[%arg3, %arg5][%c2000, %c4000][%c1, %c1] :
memref<?x?xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
%4 = memref.subview %arg1[%arg5, %arg4][%c4000, %c3000][%c1, %c1] :
memref<?x?xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
%5 = memref.subview %arg2[%arg3, %arg4][%c2000, %c3000][%c1, %c1] :
memref<?x?xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
linalg.matmul {__internal_linalg_transform__ = "_promote_first_view_"}
ins(%3, %4: memref<?x?xf32, strided<[?, ?], offset: ?>>,
memref<?x?xf32, strided<[?, ?], offset: ?>>)
outs(%5: memref<?x?xf32, strided<[?, ?], offset: ?>>)
}
}
}
return
}
// CHECK-LABEL: func @promote_first_subview_matmul
// CHECK-DAG: %[[c0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[c2000:.*]] = arith.constant 2000 : index
// CHECK-DAG: %[[c3000:.*]] = arith.constant 3000 : index
// CHECK-DAG: %[[c4000:.*]] = arith.constant 4000 : index
// CHECK: scf.for {{.*}} = %[[c0]] to {{.*}} step %[[c2000]] {
// CHECK: scf.for {{.*}} = %[[c0]] to {{.*}} step %[[c3000]] {
// CHECK: scf.for {{.*}} = %[[c0]] to {{.*}} step %[[c4000]] {
// CHECK: %[[s0:.*]] = memref.subview {{.*}}: memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: %[[s1:.*]] = memref.subview {{.*}}: memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: %[[s2:.*]] = memref.subview {{.*}}: memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: %[[a0:.*]] = memref.alloc() : memref<32000000xi8>
// CHECK: %[[v0:.*]] = memref.view %[[a0]]{{.*}} : memref<32000000xi8> to memref<?x?xf32>
// CHECK: %[[l0:.*]] = memref.subview %[[v0]][0, 0] [%{{.*}}, %{{.*}}] [1, 1] : memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
// CHECK-NOT: memref.alloc
// CHECK-NOT: memref.view
// CHECK-NOT: memref.subview
// CHECK: memref.copy %[[s0]], %[[l0]] : memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK-NOT: memref.copy
// CHECK: linalg.matmul
// CHECK-SAME: ins(%[[v0]], %[[s1]] : memref<?x?xf32>, memref<?x?xf32, strided<[?, ?], offset: ?>>)
// CHECK-SAME: outs(%[[s2]] : memref<?x?xf32, strided<[?, ?], offset: ?>>)
transform.with_pdl_patterns {
^bb0(%arg0: !transform.any_op):
sequence %arg0 : !transform.any_op failures(propagate) {
^bb0(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.matmul"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = transform.structured.promote %0 { operands_to_promote = [0], use_full_tiles_by_default } : (!transform.any_op) -> !transform.any_op
}
}
// -----
// Promote the fill output (operand 1) into a buffer allocated with 32-byte alignment.
func.func @aligned_promote_fill(%arg0: memref<?x?xf32, strided<[?, 1], offset: ?>>) {
%c2000 = arith.constant 2000 : index
%c4000 = arith.constant 4000 : index
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%cf = arith.constant 1.0 : f32
%3 = memref.subview %arg0[%c0, %c0][%c2000, %c4000][%c1, %c1] :
memref<?x?xf32, strided<[?, 1], offset: ?>> to memref<?x?xf32, strided<[?, ?], offset: ?>>
linalg.fill
ins(%cf : f32) outs(%3 : memref<?x?xf32, strided<[?, ?], offset: ?>>)
return
}
// CHECK-LABEL: func @aligned_promote_fill
// CHECK: %[[cf:.*]] = arith.constant 1.{{.*}} : f32
// CHECK: %[[s0:.*]] = memref.subview {{.*}}: memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: %[[a0:.*]] = memref.alloc() {alignment = 32 : i64} : memref<32000000xi8>
// CHECK: %[[v0:.*]] = memref.view %[[a0]]{{.*}} : memref<32000000xi8> to memref<?x?xf32>
// CHECK: %[[l0:.*]] = memref.subview %[[v0]][0, 0] [%{{.*}}, %{{.*}}] [1, 1] : memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
// CHECK: linalg.fill ins({{.*}} : f32) outs(%[[v0]] : memref<?x?xf32>)
// CHECK: memref.copy %[[s0]], %[[l0]] : memref<?x?xf32, strided{{.*}}> to memref<?x?xf32, strided{{.*}}>
// CHECK: linalg.fill ins(%[[cf]] : f32) outs(%[[v0]] : memref<?x?xf32>)
transform.with_pdl_patterns {
^bb0(%arg0: !transform.any_op):
sequence %arg0 : !transform.any_op failures(propagate) {
^bb0(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = transform.structured.promote %0 { operands_to_promote = [1], use_full_tile_buffers = [false, true], alignment = 32} : (!transform.any_op) -> !transform.any_op
}
}
// -----
// Aligned promotion of a fill output with complex<f32> elements (8 bytes each):
//   2000 x 4000 x 8 = 64000000 bytes.
func.func @aligned_promote_fill_complex(%arg0: memref<?x?xcomplex<f32>, strided<[?, 1], offset: ?>>) {
%c2000 = arith.constant 2000 : index
%c4000 = arith.constant 4000 : index
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%cf = arith.constant 1.0 : f32
%cc = complex.create %cf, %cf : complex<f32>
%3 = memref.subview %arg0[%c0, %c0][%c2000, %c4000][%c1, %c1] :
memref<?x?xcomplex<f32>, strided<[?, 1], offset: ?>> to memref<?x?xcomplex<f32>, strided<[?, ?], offset: ?>>
linalg.fill ins(%cc : complex<f32>)
outs(%3 : memref<?x?xcomplex<f32>, strided<[?, ?], offset: ?>>)
return
}
// CHECK-LABEL: func @aligned_promote_fill_complex
// CHECK: %[[cc:.*]] = complex.create {{.*}} : complex<f32>
// CHECK: %[[s0:.*]] = memref.subview {{.*}}: memref<?x?xcomplex<f32>, strided{{.*}}> to memref<?x?xcomplex<f32>, strided{{.*}}>
// CHECK: %[[a0:.*]] = memref.alloc() {alignment = 32 : i64} : memref<64000000xi8>
// CHECK: %[[v0:.*]] = memref.view %[[a0]]{{.*}} : memref<64000000xi8> to memref<?x?xcomplex<f32>>
// CHECK: %[[l0:.*]] = memref.subview %[[v0]][0, 0] [%{{.*}}, %{{.*}}] [1, 1] : memref<?x?xcomplex<f32>> to memref<?x?xcomplex<f32>, strided<[?, 1], offset: ?>>
// CHECK: linalg.fill ins({{.*}} : complex<f32>) outs(%[[v0]] : memref<?x?xcomplex<f32>>)
// CHECK: memref.copy %[[s0]], %[[l0]] : memref<?x?xcomplex<f32>, strided{{.*}}> to memref<?x?xcomplex<f32>, strided{{.*}}>
// CHECK: linalg.fill ins(%[[cc]] : complex<f32>) outs(%[[v0]] : memref<?x?xcomplex<f32>>)
transform.with_pdl_patterns {
^bb0(%arg0: !transform.any_op):
sequence %arg0 : !transform.any_op failures(propagate) {
^bb0(%arg1: !transform.any_op):
%0 = transform.structured.match ops{["linalg.fill"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = transform.structured.promote %0 { operands_to_promote = [1], use_full_tile_buffers = [false, true], alignment = 32} : (!transform.any_op) -> !transform.any_op
}
}