# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=bdver2 -timeline -timeline-max-iterations=1 -register-file-stats < %s | FileCheck %s
# These are dependency-breaking one-idioms.
# Much like zero-idioms, except that they produce all-ones and still consume execution resources.
# perf stat reports a throughput of 2.00 IPC.
pcmpeqb %mm2, %mm2
pcmpeqd %mm2, %mm2
pcmpeqw %mm2, %mm2
pcmpeqb %xmm2, %xmm2
pcmpeqd %xmm2, %xmm2
pcmpeqq %xmm2, %xmm2
pcmpeqw %xmm2, %xmm2
vpcmpeqb %xmm3, %xmm3, %xmm3
vpcmpeqd %xmm3, %xmm3, %xmm3
vpcmpeqq %xmm3, %xmm3, %xmm3
vpcmpeqw %xmm3, %xmm3, %xmm3
vpcmpeqb %xmm3, %xmm3, %xmm5
vpcmpeqd %xmm3, %xmm3, %xmm5
vpcmpeqq %xmm3, %xmm3, %xmm5
vpcmpeqw %xmm3, %xmm3, %xmm5
# FIXME: their handling is broken in llvm-mca: it predicts an IPC of 1.11 instead of the measured 2.00.
# CHECK: Iterations: 100
# CHECK-NEXT: Instructions: 1500
# CHECK-NEXT: Total Cycles: 1353
# CHECK-NEXT: Total uOps: 1500
# CHECK: Dispatch Width: 4
# CHECK-NEXT: uOps Per Cycle: 1.11
# CHECK-NEXT: IPC: 1.11
# CHECK-NEXT: Block RThroughput: 13.5
# CHECK: Instruction Info:
# CHECK-NEXT: [1]: #uOps
# CHECK-NEXT: [2]: Latency
# CHECK-NEXT: [3]: RThroughput
# CHECK-NEXT: [4]: MayLoad
# CHECK-NEXT: [5]: MayStore
# CHECK-NEXT: [6]: HasSideEffects (U)
# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
# CHECK-NEXT: 1 2 0.50 pcmpeqb %mm2, %mm2
# CHECK-NEXT: 1 2 0.50 pcmpeqd %mm2, %mm2
# CHECK-NEXT: 1 2 0.50 pcmpeqw %mm2, %mm2
# CHECK-NEXT: 1 2 1.00 pcmpeqb %xmm2, %xmm2
# CHECK-NEXT: 1 2 1.00 pcmpeqd %xmm2, %xmm2
# CHECK-NEXT: 1 2 1.00 pcmpeqq %xmm2, %xmm2
# CHECK-NEXT: 1 2 1.00 pcmpeqw %xmm2, %xmm2
# CHECK-NEXT: 1 2 1.00 vpcmpeqb %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 1 2 1.00 vpcmpeqd %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 1 2 1.00 vpcmpeqq %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 1 2 1.00 vpcmpeqw %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 1 2 1.00 vpcmpeqb %xmm3, %xmm3, %xmm5
# CHECK-NEXT: 1 2 1.00 vpcmpeqd %xmm3, %xmm3, %xmm5
# CHECK-NEXT: 1 2 1.00 vpcmpeqq %xmm3, %xmm3, %xmm5
# CHECK-NEXT: 1 2 1.00 vpcmpeqw %xmm3, %xmm3, %xmm5
# CHECK: Register File statistics:
# CHECK-NEXT: Total number of mappings created: 1500
# CHECK-NEXT: Max number of mappings used: 69
# CHECK: * Register File #1 -- PdFpuPRF:
# CHECK-NEXT: Number of physical registers: 160
# CHECK-NEXT: Total number of mappings created: 1500
# CHECK-NEXT: Max number of mappings used: 69
# CHECK: * Register File #2 -- PdIntegerPRF:
# CHECK-NEXT: Number of physical registers: 96
# CHECK-NEXT: Total number of mappings created: 0
# CHECK-NEXT: Max number of mappings used: 0
# CHECK: Resources:
# CHECK-NEXT: [0.0] - PdAGLU01
# CHECK-NEXT: [0.1] - PdAGLU01
# CHECK-NEXT: [1] - PdBranch
# CHECK-NEXT: [2] - PdCount
# CHECK-NEXT: [3] - PdDiv
# CHECK-NEXT: [4] - PdEX0
# CHECK-NEXT: [5] - PdEX1
# CHECK-NEXT: [6] - PdFPCVT
# CHECK-NEXT: [7.0] - PdFPFMA
# CHECK-NEXT: [7.1] - PdFPFMA
# CHECK-NEXT: [8.0] - PdFPMAL
# CHECK-NEXT: [8.1] - PdFPMAL
# CHECK-NEXT: [9] - PdFPMMA
# CHECK-NEXT: [10] - PdFPSTO
# CHECK-NEXT: [11] - PdFPU0
# CHECK-NEXT: [12] - PdFPU1
# CHECK-NEXT: [13] - PdFPU2
# CHECK-NEXT: [14] - PdFPU3
# CHECK-NEXT: [15] - PdFPXBR
# CHECK-NEXT: [16.0] - PdLoad
# CHECK-NEXT: [16.1] - PdLoad
# CHECK-NEXT: [17] - PdMul
# CHECK-NEXT: [18] - PdStore
# CHECK: Resource pressure per iteration:
# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]
# CHECK-NEXT: - - - - - - - - - - 13.50 13.50 - - 7.50 7.50 - - - - - - -
# CHECK: Resource pressure by instruction:
# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:
# CHECK-NEXT: - - - - - - - - - - 0.50 0.50 - - 0.50 0.50 - - - - - - - pcmpeqb %mm2, %mm2
# CHECK-NEXT: - - - - - - - - - - 0.50 0.50 - - 0.50 0.50 - - - - - - - pcmpeqd %mm2, %mm2
# CHECK-NEXT: - - - - - - - - - - 0.50 0.50 - - 0.50 0.50 - - - - - - - pcmpeqw %mm2, %mm2
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - pcmpeqb %xmm2, %xmm2
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - pcmpeqd %xmm2, %xmm2
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - pcmpeqq %xmm2, %xmm2
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - pcmpeqw %xmm2, %xmm2
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqb %xmm3, %xmm3, %xmm3
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqd %xmm3, %xmm3, %xmm3
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqq %xmm3, %xmm3, %xmm3
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqw %xmm3, %xmm3, %xmm3
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqb %xmm3, %xmm3, %xmm5
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqd %xmm3, %xmm3, %xmm5
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqq %xmm3, %xmm3, %xmm5
# CHECK-NEXT: - - - - - - - - - - 1.00 1.00 - - 0.50 0.50 - - - - - - - vpcmpeqw %xmm3, %xmm3, %xmm5
# CHECK: Timeline view:
# CHECK-NEXT: 0123456
# CHECK-NEXT: Index 0123456789
# CHECK: [0,0] DeeER. . .. pcmpeqb %mm2, %mm2
# CHECK-NEXT: [0,1] DeeER. . .. pcmpeqd %mm2, %mm2
# CHECK-NEXT: [0,2] D=eeER . .. pcmpeqw %mm2, %mm2
# CHECK-NEXT: [0,3] D==eeER . .. pcmpeqb %xmm2, %xmm2
# CHECK-NEXT: [0,4] .DeeE-R . .. pcmpeqd %xmm2, %xmm2
# CHECK-NEXT: [0,5] .D==eeER . .. pcmpeqq %xmm2, %xmm2
# CHECK-NEXT: [0,6] .D===eeER . .. pcmpeqw %xmm2, %xmm2
# CHECK-NEXT: [0,7] .D=====eeER .. vpcmpeqb %xmm3, %xmm3, %xmm3
# CHECK-NEXT: [0,8] . D===eeE-R .. vpcmpeqd %xmm3, %xmm3, %xmm3
# CHECK-NEXT: [0,9] . D======eeER .. vpcmpeqq %xmm3, %xmm3, %xmm3
# CHECK-NEXT: [0,10] . D=====eeE-R .. vpcmpeqw %xmm3, %xmm3, %xmm3
# CHECK-NEXT: [0,11] . D=======eeER .. vpcmpeqb %xmm3, %xmm3, %xmm5
# CHECK-NEXT: [0,12] . D=======eeER.. vpcmpeqd %xmm3, %xmm3, %xmm5
# CHECK-NEXT: [0,13] . D========eeER. vpcmpeqq %xmm3, %xmm3, %xmm5
# CHECK-NEXT: [0,14] . D=========eeER vpcmpeqw %xmm3, %xmm3, %xmm5
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
# CHECK: [0] [1] [2] [3]
# CHECK-NEXT: 0. 1 1.0 1.0 0.0 pcmpeqb %mm2, %mm2
# CHECK-NEXT: 1. 1 1.0 1.0 0.0 pcmpeqd %mm2, %mm2
# CHECK-NEXT: 2. 1 2.0 2.0 0.0 pcmpeqw %mm2, %mm2
# CHECK-NEXT: 3. 1 3.0 3.0 0.0 pcmpeqb %xmm2, %xmm2
# CHECK-NEXT: 4. 1 1.0 1.0 1.0 pcmpeqd %xmm2, %xmm2
# CHECK-NEXT: 5. 1 3.0 0.0 0.0 pcmpeqq %xmm2, %xmm2
# CHECK-NEXT: 6. 1 4.0 4.0 0.0 pcmpeqw %xmm2, %xmm2
# CHECK-NEXT: 7. 1 6.0 6.0 0.0 vpcmpeqb %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 8. 1 4.0 4.0 1.0 vpcmpeqd %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 9. 1 7.0 1.0 0.0 vpcmpeqq %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 10. 1 6.0 6.0 1.0 vpcmpeqw %xmm3, %xmm3, %xmm3
# CHECK-NEXT: 11. 1 8.0 8.0 0.0 vpcmpeqb %xmm3, %xmm3, %xmm5
# CHECK-NEXT: 12. 1 8.0 8.0 0.0 vpcmpeqd %xmm3, %xmm3, %xmm5
# CHECK-NEXT: 13. 1 9.0 2.0 0.0 vpcmpeqq %xmm3, %xmm3, %xmm5
# CHECK-NEXT: 14. 1 10.0 10.0 0.0 vpcmpeqw %xmm3, %xmm3, %xmm5
# CHECK-NEXT: 1 4.9 3.8 0.2 <total>