# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
# RUN: llvm-mca -mtriple=riscv64 -mcpu=andes-nx45 -mattr=+b,+zbc -timeline -iterations=1 < %s | FileCheck %s
# Two independent ALU instructions can be dispatched in the same cycle.
add a0, a0, a0
sub a1, a1, a1
# ALU instructions with a register dependency can't be dispatched in the same cycle.
addw a0, a0, a0
subw a0, a0, a0
// ALU and Shift
slli a0, a0, 4
slliw a0, a0, 4
srl a0, a0, a0
srlw a0, a0, a0
// MDU
mul a0, a0, a0
mulw a0, a0, a0
div a0, a0, a0
divw a0, a0, a0
// Memory
lb a0, 4(a1)
lh a0, 4(a1)
lw a0, 4(a1)
ld a0, 4(a1)
flw fa0, 4(a1)
fld fa0, 4(a1)
sb a0, 4(a1)
sh a0, 4(a1)
sw a0, 4(a1)
sd a0, 4(a1)
// Atomic Memory
amoswap.w a0, a1, (a0)
amoswap.d a0, a1, (a0)
lr.w a0, (a0)
lr.d a0, (a0)
sc.w a0, a1, (a0)
sc.d a0, a1, (a0)
// CSR
csrrw a0, mstatus, zero
// Bitmanip
sh1add a0, a0, a0
sh1add.uw a0, a0, a0
rori a0, a0, 4
roriw a0, a0, 4
rol a0, a0, a0
rolw a0, a0, a0
clz a0, a0
clzw a0, a0
clmul a0, a0, a0
bclri a0, a0, 4
bclr a0, a0, a0
bexti a0, a0, 4
bext a0, a0, a0
# CHECK: Iterations: 1
# CHECK-NEXT: Instructions: 42
# CHECK-NEXT: Total Cycles: 158
# CHECK-NEXT: Total uOps: 42
# CHECK: Dispatch Width: 2
# CHECK-NEXT: uOps Per Cycle: 0.27
# CHECK-NEXT: IPC: 0.27
# CHECK-NEXT: Block RThroughput: 80.0
# CHECK: Instruction Info:
# CHECK-NEXT: [1]: #uOps
# CHECK-NEXT: [2]: Latency
# CHECK-NEXT: [3]: RThroughput
# CHECK-NEXT: [4]: MayLoad
# CHECK-NEXT: [5]: MayStore
# CHECK-NEXT: [6]: HasSideEffects (U)
# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
# CHECK-NEXT: 1 1 0.50 add a0, a0, a0
# CHECK-NEXT: 1 1 0.50 sub a1, a1, a1
# CHECK-NEXT: 1 1 0.50 addw a0, a0, a0
# CHECK-NEXT: 1 1 0.50 subw a0, a0, a0
# CHECK-NEXT: 1 1 0.50 slli a0, a0, 4
# CHECK-NEXT: 1 1 0.50 slliw a0, a0, 4
# CHECK-NEXT: 1 1 0.50 srl a0, a0, a0
# CHECK-NEXT: 1 1 0.50 srlw a0, a0, a0
# CHECK-NEXT: 1 3 1.00 mul a0, a0, a0
# CHECK-NEXT: 1 3 1.00 mulw a0, a0, a0
# CHECK-NEXT: 1 39 39.00 div a0, a0, a0
# CHECK-NEXT: 1 39 39.00 divw a0, a0, a0
# CHECK-NEXT: 1 5 1.00 * lb a0, 4(a1)
# CHECK-NEXT: 1 5 1.00 * lh a0, 4(a1)
# CHECK-NEXT: 1 3 1.00 * lw a0, 4(a1)
# CHECK-NEXT: 1 3 1.00 * ld a0, 4(a1)
# CHECK-NEXT: 1 3 1.00 * flw fa0, 4(a1)
# CHECK-NEXT: 1 3 1.00 * fld fa0, 4(a1)
# CHECK-NEXT: 1 1 1.00 * sb a0, 4(a1)
# CHECK-NEXT: 1 1 1.00 * sh a0, 4(a1)
# CHECK-NEXT: 1 1 1.00 * sw a0, 4(a1)
# CHECK-NEXT: 1 1 1.00 * sd a0, 4(a1)
# CHECK-NEXT: 1 9 1.00 * * amoswap.w a0, a1, (a0)
# CHECK-NEXT: 1 9 1.00 * * amoswap.d a0, a1, (a0)
# CHECK-NEXT: 1 9 1.00 * lr.w a0, (a0)
# CHECK-NEXT: 1 9 1.00 * lr.d a0, (a0)
# CHECK-NEXT: 1 3 1.00 * sc.w a0, a1, (a0)
# CHECK-NEXT: 1 3 1.00 * sc.d a0, a1, (a0)
# CHECK-NEXT: 1 1 1.00 U csrrw a0, mstatus, zero
# CHECK-NEXT: 1 1 0.50 sh1add a0, a0, a0
# CHECK-NEXT: 1 1 0.50 sh1add.uw a0, a0, a0
# CHECK-NEXT: 1 1 0.50 rori a0, a0, 4
# CHECK-NEXT: 1 1 0.50 roriw a0, a0, 4
# CHECK-NEXT: 1 1 0.50 rol a0, a0, a0
# CHECK-NEXT: 1 1 0.50 rolw a0, a0, a0
# CHECK-NEXT: 1 3 0.50 clz a0, a0
# CHECK-NEXT: 1 3 0.50 clzw a0, a0
# CHECK-NEXT: 1 3 0.50 clmul a0, a0, a0
# CHECK-NEXT: 1 1 0.50 bclri a0, a0, 4
# CHECK-NEXT: 1 1 0.50 bclr a0, a0, a0
# CHECK-NEXT: 1 1 0.50 bexti a0, a0, 4
# CHECK-NEXT: 1 1 0.50 bext a0, a0, a0
# CHECK: Resources:
# CHECK-NEXT: [0.0] - Andes45ALU
# CHECK-NEXT: [0.1] - Andes45ALU
# CHECK-NEXT: [1] - Andes45CSR
# CHECK-NEXT: [2] - Andes45FDIV
# CHECK-NEXT: [3] - Andes45FMAC
# CHECK-NEXT: [4] - Andes45FMISC
# CHECK-NEXT: [5] - Andes45FMV
# CHECK-NEXT: [6] - Andes45LSU
# CHECK-NEXT: [7] - Andes45MDU
# CHECK: Resource pressure per iteration:
# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7]
# CHECK-NEXT: 10.00 11.00 1.00 - - - - 16.00 80.00
# CHECK: Resource pressure by instruction:
# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7] Instructions:
# CHECK-NEXT: - 1.00 - - - - - - - add a0, a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - sub a1, a1, a1
# CHECK-NEXT: - 1.00 - - - - - - - addw a0, a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - subw a0, a0, a0
# CHECK-NEXT: - 1.00 - - - - - - - slli a0, a0, 4
# CHECK-NEXT: 1.00 - - - - - - - - slliw a0, a0, 4
# CHECK-NEXT: - 1.00 - - - - - - - srl a0, a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - srlw a0, a0, a0
# CHECK-NEXT: - - - - - - - - 1.00 mul a0, a0, a0
# CHECK-NEXT: - - - - - - - - 1.00 mulw a0, a0, a0
# CHECK-NEXT: - - - - - - - - 39.00 div a0, a0, a0
# CHECK-NEXT: - - - - - - - - 39.00 divw a0, a0, a0
# CHECK-NEXT: - - - - - - - 1.00 - lb a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - lh a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - lw a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - ld a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - flw fa0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - fld fa0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - sb a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - sh a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - sw a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - sd a0, 4(a1)
# CHECK-NEXT: - - - - - - - 1.00 - amoswap.w a0, a1, (a0)
# CHECK-NEXT: - - - - - - - 1.00 - amoswap.d a0, a1, (a0)
# CHECK-NEXT: - - - - - - - 1.00 - lr.w a0, (a0)
# CHECK-NEXT: - - - - - - - 1.00 - lr.d a0, (a0)
# CHECK-NEXT: - - - - - - - 1.00 - sc.w a0, a1, (a0)
# CHECK-NEXT: - - - - - - - 1.00 - sc.d a0, a1, (a0)
# CHECK-NEXT: - - 1.00 - - - - - - csrrw a0, mstatus, zero
# CHECK-NEXT: - 1.00 - - - - - - - sh1add a0, a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - sh1add.uw a0, a0, a0
# CHECK-NEXT: - 1.00 - - - - - - - rori a0, a0, 4
# CHECK-NEXT: 1.00 - - - - - - - - roriw a0, a0, 4
# CHECK-NEXT: - 1.00 - - - - - - - rol a0, a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - rolw a0, a0, a0
# CHECK-NEXT: - 1.00 - - - - - - - clz a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - clzw a0, a0
# CHECK-NEXT: - 1.00 - - - - - - - clmul a0, a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - bclri a0, a0, 4
# CHECK-NEXT: - 1.00 - - - - - - - bclr a0, a0, a0
# CHECK-NEXT: 1.00 - - - - - - - - bexti a0, a0, 4
# CHECK-NEXT: - 1.00 - - - - - - - bext a0, a0, a0
# CHECK: Timeline view:
# CHECK-NEXT: 0123456789 0123456789 012
# CHECK-NEXT: Index 0123456789 0123456789 0123456789
# CHECK: [0,0] DE . . . . . . . . . . . add a0, a0, a0
# CHECK-NEXT: [0,1] DE . . . . . . . . . . . sub a1, a1, a1
# CHECK-NEXT: [0,2] .DE . . . . . . . . . . . addw a0, a0, a0
# CHECK-NEXT: [0,3] . DE . . . . . . . . . . . subw a0, a0, a0
# CHECK-NEXT: [0,4] . DE. . . . . . . . . . . slli a0, a0, 4
# CHECK-NEXT: [0,5] . DE . . . . . . . . . . slliw a0, a0, 4
# CHECK-NEXT: [0,6] . DE . . . . . . . . . . srl a0, a0, a0
# CHECK-NEXT: [0,7] . .DE . . . . . . . . . . srlw a0, a0, a0
# CHECK-NEXT: [0,8] . . DeeE . . . . . . . . . mul a0, a0, a0
# CHECK-NEXT: [0,9] . . DeeE . . . . . . . . . mulw a0, a0, a0
# CHECK-NEXT: [0,10] . . . DeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeE div a0, a0, a0
# CHECK-NEXT: Truncated display due to cycle limit
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
# CHECK: [0] [1] [2] [3]
# CHECK-NEXT: 0. 1 0.0 0.0 0.0 add a0, a0, a0
# CHECK-NEXT: 1. 1 0.0 0.0 0.0 sub a1, a1, a1
# CHECK-NEXT: 2. 1 0.0 0.0 0.0 addw a0, a0, a0
# CHECK-NEXT: 3. 1 0.0 0.0 0.0 subw a0, a0, a0
# CHECK-NEXT: 4. 1 0.0 0.0 0.0 slli a0, a0, 4
# CHECK-NEXT: 5. 1 0.0 0.0 0.0 slliw a0, a0, 4
# CHECK-NEXT: 6. 1 0.0 0.0 0.0 srl a0, a0, a0
# CHECK-NEXT: 7. 1 0.0 0.0 0.0 srlw a0, a0, a0
# CHECK-NEXT: 8. 1 0.0 0.0 0.0 mul a0, a0, a0
# CHECK-NEXT: 9. 1 0.0 0.0 0.0 mulw a0, a0, a0
# CHECK-NEXT: 10. 1 0.0 0.0 0.0 div a0, a0, a0
# CHECK-NEXT: 11. 1 0.0 0.0 0.0 divw a0, a0, a0
# CHECK-NEXT: 12. 1 0.0 0.0 0.0 lb a0, 4(a1)
# CHECK-NEXT: 13. 1 0.0 0.0 0.0 lh a0, 4(a1)
# CHECK-NEXT: 14. 1 0.0 0.0 0.0 lw a0, 4(a1)
# CHECK-NEXT: 15. 1 0.0 0.0 0.0 ld a0, 4(a1)
# CHECK-NEXT: 16. 1 0.0 0.0 0.0 flw fa0, 4(a1)
# CHECK-NEXT: 17. 1 0.0 0.0 0.0 fld fa0, 4(a1)
# CHECK-NEXT: 18. 1 0.0 0.0 0.0 sb a0, 4(a1)
# CHECK-NEXT: 19. 1 0.0 0.0 0.0 sh a0, 4(a1)
# CHECK-NEXT: 20. 1 0.0 0.0 0.0 sw a0, 4(a1)
# CHECK-NEXT: 21. 1 0.0 0.0 0.0 sd a0, 4(a1)
# CHECK-NEXT: 22. 1 0.0 0.0 0.0 amoswap.w a0, a1, (a0)
# CHECK-NEXT: 23. 1 0.0 0.0 0.0 amoswap.d a0, a1, (a0)
# CHECK-NEXT: 24. 1 0.0 0.0 0.0 lr.w a0, (a0)
# CHECK-NEXT: 25. 1 0.0 0.0 0.0 lr.d a0, (a0)
# CHECK-NEXT: 26. 1 0.0 0.0 0.0 sc.w a0, a1, (a0)
# CHECK-NEXT: 27. 1 0.0 0.0 0.0 sc.d a0, a1, (a0)
# CHECK-NEXT: 28. 1 0.0 0.0 0.0 csrrw a0, mstatus, zero
# CHECK-NEXT: 29. 1 0.0 0.0 0.0 sh1add a0, a0, a0
# CHECK-NEXT: 30. 1 0.0 0.0 0.0 sh1add.uw a0, a0, a0
# CHECK-NEXT: 31. 1 0.0 0.0 0.0 rori a0, a0, 4
# CHECK-NEXT: 32. 1 0.0 0.0 0.0 roriw a0, a0, 4
# CHECK-NEXT: 33. 1 0.0 0.0 0.0 rol a0, a0, a0
# CHECK-NEXT: 34. 1 0.0 0.0 0.0 rolw a0, a0, a0
# CHECK-NEXT: 35. 1 0.0 0.0 0.0 clz a0, a0
# CHECK-NEXT: 36. 1 0.0 0.0 0.0 clzw a0, a0
# CHECK-NEXT: 37. 1 0.0 0.0 0.0 clmul a0, a0, a0
# CHECK-NEXT: 38. 1 0.0 0.0 0.0 bclri a0, a0, 4
# CHECK-NEXT: 39. 1 0.0 0.0 0.0 bclr a0, a0, a0
# CHECK-NEXT: 40. 1 0.0 0.0 0.0 bexti a0, a0, 4
# CHECK-NEXT: 41. 1 0.0 0.0 0.0 bext a0, a0, a0
# CHECK-NEXT: 1 0.0 0.0 0.0 <total>