File: mkl-benchmark.txt

package info (click to toggle)
visp 3.6.0-5
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 119,296 kB
  • sloc: cpp: 500,914; ansic: 52,904; xml: 22,642; python: 7,365; java: 4,247; sh: 482; makefile: 237; objc: 145
file content (513 lines) | stat: -rw-r--r-- 33,071 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
Default matrix/vector min size to enable Blas/Lapack optimization: 8
Used matrix/vector min size to enable Blas/Lapack optimization: 0

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
perfMatrixMultiplication is a Catch v2.9.2 host application.
Run with -? for options

-------------------------------------------------------------------------------
Benchmark matrix-matrix multiplication
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:217
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(3x3)x(3x3) - Naive code                                100          223    2.0516 ms
                                                      89 ns        88 ns        89 ns
                                                       1 ns         0 ns         3 ns

(3x3)x(3x3) - ViSP                                      100          103      2.06 ms
                                                     202 ns       200 ns       210 ns
                                                      17 ns         1 ns        40 ns

(6x6)x(6x6) - Naive code                                100           95     2.052 ms
                                                     217 ns       216 ns       219 ns
                                                       5 ns         3 ns         9 ns

(6x6)x(6x6) - ViSP                                      100          114    2.0406 ms
                                                     179 ns       178 ns       185 ns
                                                      10 ns         0 ns        24 ns

(8x8)x(8x8) - Naive code                                100           54     2.079 ms
                                                     388 ns       387 ns       390 ns
                                                       6 ns         0 ns        14 ns

(8x8)x(8x8) - ViSP                                      100           84    2.0664 ms
                                                     245 ns       244 ns       253 ns
                                                      16 ns         0 ns        38 ns

(10x10)x(10x10) - Naive code                            100           28    2.0496 ms
                                                     673 ns       672 ns       678 ns
                                                      10 ns         0 ns        26 ns

(10x10)x(10x10) - ViSP                                  100           50     2.045 ms
                                                     385 ns       356 ns       426 ns
                                                     174 ns       135 ns       239 ns

(20x20)x(20x20) - Naive code                            100            4    2.4388 ms
                                                   5.698 us     5.516 us     6.146 us
                                                   1.362 us       498 ns     2.484 us

(20x20)x(20x20) - ViSP                                  100           26    2.0774 ms
                                                     994 ns       950 ns     1.057 us
                                                     269 ns       209 ns       355 ns

(6x200)x(200x6) - Naive code                            100            3    2.2197 ms
                                                   7.398 us     7.266 us     7.735 us
                                                     975 ns       239 ns     1.757 us

(6x200)x(200x6) - ViSP                                  100           32    2.0992 ms
                                                     615 ns       592 ns       669 ns
                                                     171 ns        86 ns       304 ns

(200x6)x(6x200) - Naive code                            100            1    17.175 ms
                                                 165.571 us   160.191 us     173.5 us
                                                  32.984 us    24.704 us    44.521 us

(200x6)x(6x200) - ViSP                                  100            1    2.8158 ms
                                                  29.848 us    28.934 us    31.884 us
                                                   6.576 us     3.754 us    12.739 us

(207x119)x(119x207) - Naive code                        100            1   503.475 ms
                                                 5.26692 ms   5.21554 ms   5.33146 ms
                                                 293.196 us    245.95 us   386.767 us

(207x119)x(119x207) - ViSP                              100            1   24.0501 ms
                                                 241.335 us   232.314 us   253.677 us
                                                  53.573 us     41.88 us    67.573 us

(83x201)x(201x83) - Naive code                          100            1   135.879 ms
                                                 1.36202 ms   1.35929 ms   1.36931 ms
                                                  21.007 us     9.253 us    44.017 us

(83x201)x(201x83) - ViSP                                100            1     8.335 ms
                                                  70.298 us    67.956 us    74.124 us
                                                  14.996 us    10.355 us    21.024 us

(600x400)x(400x600) - Naive code                        100            1    15.4613 s
                                                 171.254 ms       166 ms   178.327 ms
                                                 31.0091 ms   24.7589 ms   39.2442 ms

(600x400)x(400x600) - ViSP                              100            1   647.363 ms
                                                 6.38457 ms   6.34948 ms   6.41008 ms
                                                 151.093 us   109.895 us   216.753 us

(400x600)x(600x400) - Naive code                        100            1     10.185 s
                                                 102.595 ms   102.154 ms    103.28 ms
                                                 2.76177 ms   1.93992 ms   3.67911 ms

(400x600)x(600x400) - ViSP                              100            1   424.223 ms
                                                 4.21401 ms   4.18659 ms   4.23194 ms
                                                 111.623 us    77.423 us   160.784 us


-------------------------------------------------------------------------------
Benchmark matrix-rotation matrix multiplication
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:294
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(3x3)x(3x3) - Naive code                                100          162    2.0412 ms
                                                     118 ns       118 ns       119 ns
                                                       2 ns         0 ns         5 ns

(3x3)x(3x3) - ViSP                                      100          221    2.0332 ms
                                                      86 ns        85 ns        87 ns
                                                       2 ns         0 ns         4 ns


-------------------------------------------------------------------------------
Benchmark matrix-homogeneous matrix multiplication
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:379
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(4x4)x(4x4) - Naive code                                100          135     2.052 ms
                                                     146 ns       146 ns       148 ns
                                                       3 ns         0 ns         8 ns

(4x4)x(4x4) - ViSP                                      100          174    2.0358 ms
                                                     112 ns       112 ns       113 ns
                                                       2 ns         0 ns         5 ns


-------------------------------------------------------------------------------
Benchmark matrix-vector multiplication
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:467
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(3x3)x(3x1) - Naive code                                100          273    2.0475 ms
                                                      73 ns        71 ns        76 ns
                                                      10 ns         1 ns        19 ns

(3x3)x(3x1) - ViSP                                      100          169    2.0449 ms
                                                     139 ns       129 ns       153 ns
                                                      61 ns        45 ns        79 ns

(6x6)x(6x1) - Naive code                                100          213    2.0448 ms
                                                      91 ns        91 ns        91 ns
                                                       1 ns         0 ns         3 ns

(6x6)x(6x1) - ViSP                                      100          158    2.0382 ms
                                                     123 ns       123 ns       126 ns
                                                       5 ns         1 ns        11 ns

(8x8)x(8x1) - Naive code                                100          179    2.0406 ms
                                                     107 ns       107 ns       108 ns
                                                       1 ns         0 ns         4 ns

(8x8)x(8x1) - ViSP                                      100          157     2.041 ms
                                                     124 ns       124 ns       126 ns
                                                       4 ns         0 ns         7 ns

(10x10)x(10x1) - Naive code                             100          132     2.046 ms
                                                     133 ns       133 ns       135 ns
                                                       2 ns         0 ns         5 ns

(10x10)x(10x1) - ViSP                                   100          136      2.04 ms
                                                     137 ns       136 ns       141 ns
                                                      10 ns         0 ns        25 ns

(20x20)x(20x1) - Naive code                             100           59    2.0473 ms
                                                     311 ns       310 ns       313 ns
                                                       4 ns         0 ns        11 ns

(20x20)x(20x1) - ViSP                                   100          106    2.0564 ms
                                                     180 ns       178 ns       185 ns
                                                      12 ns         0 ns        29 ns

(6x200)x(200x1) - Naive code                            100           23    2.1183 ms
                                                     794 ns       793 ns       795 ns
                                                       3 ns         0 ns         7 ns

(6x200)x(200x1) - ViSP                                  100           93    2.0646 ms
                                                     201 ns       199 ns       211 ns
                                                      20 ns         2 ns        47 ns

(200x6)x(6x1) - Naive code                              100           16    2.0624 ms
                                                   1.164 us     1.161 us     1.176 us
                                                      25 ns         1 ns        59 ns

(200x6)x(6x1) - ViSP                                    100           25    2.1075 ms
                                                     781 ns       773 ns       822 ns
                                                      81 ns         0 ns       193 ns

(207x119)x(119x1) - Naive code                          100            2     3.609 ms
                                                  16.763 us    16.421 us    17.584 us
                                                   2.573 us     1.043 us     4.592 us

(207x119)x(119x1) - ViSP                                100            6    2.2386 ms
                                                   3.653 us     3.578 us     3.834 us
                                                     571 ns       275 ns       983 ns

(83x201)x(201x1) - Naive code                           100            2    2.2302 ms
                                                  10.029 us     9.913 us    10.605 us
                                                   1.149 us         8 ns     2.742 us

(83x201)x(201x1) - ViSP                                 100           10     2.199 ms
                                                   2.092 us     2.069 us     2.195 us
                                                     211 ns        26 ns       500 ns

(600x400)x(400x1) - Naive code                          100            1   22.7039 ms
                                                 215.771 us   213.856 us   220.459 us
                                                  14.681 us     7.299 us    25.746 us

(600x400)x(400x1) - ViSP                                100            1    3.4367 ms
                                                   34.26 us    33.546 us    35.702 us
                                                   4.943 us     2.727 us      7.98 us

(400x600)x(600x1) - Naive code                          100            1   18.0648 ms
                                                  169.05 us   166.772 us   174.014 us
                                                  16.283 us     8.605 us    28.822 us

(400x600)x(600x1) - ViSP                                100            1    3.3765 ms
                                                    32.3 us    31.727 us    33.925 us
                                                   4.274 us       644 ns     8.951 us


-------------------------------------------------------------------------------
Benchmark AtA
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:548
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(3x3) - Naive code                                      100          219    2.0367 ms
                                                      81 ns        81 ns        82 ns
                                                       1 ns         0 ns         2 ns

(3x3) - ViSP                                            100           87    2.0445 ms
                                                     196 ns       194 ns       202 ns
                                                      13 ns         0 ns        32 ns

(6x6) - Naive code                                      100          111    2.0535 ms
                                                     166 ns       165 ns       167 ns
                                                       3 ns         0 ns         7 ns

(6x6) - ViSP                                            100           85    2.0485 ms
                                                     223 ns       220 ns       233 ns
                                                      21 ns         0 ns        50 ns

(8x8) - Naive code                                      100           67    2.0435 ms
                                                     279 ns       279 ns       282 ns
                                                       5 ns         1 ns        12 ns

(8x8) - ViSP                                            100           77    2.0482 ms
                                                     252 ns       245 ns       266 ns
                                                      47 ns        26 ns        77 ns

(10x10) - Naive code                                    100           40     2.056 ms
                                                     464 ns       463 ns       467 ns
                                                       7 ns         0 ns        17 ns

(10x10) - ViSP                                          100           63    2.0538 ms
                                                     300 ns       298 ns       311 ns
                                                      22 ns         2 ns        53 ns

(20x20) - Naive code                                    100            7    2.3093 ms
                                                   3.044 us     3.041 us     3.059 us
                                                      30 ns         3 ns        71 ns

(20x20) - ViSP                                          100           23    2.0999 ms
                                                     815 ns       806 ns       859 ns
                                                      87 ns         2 ns       208 ns

(6x200) - Naive code                                    100            1   10.4797 ms
                                                  97.175 us    96.383 us    98.895 us
                                                   5.682 us     2.742 us     9.619 us

(6x200) - ViSP                                          100            1    2.6835 ms
                                                  25.388 us    24.922 us    27.178 us
                                                   4.216 us       750 ns     9.909 us

(200x6) - Naive code                                    100            5    2.0935 ms
                                                   4.184 us     4.177 us     4.201 us
                                                      47 ns         4 ns        86 ns

(200x6) - ViSP                                          100           27    2.0871 ms
                                                     690 ns       672 ns       733 ns
                                                     140 ns        69 ns       229 ns

(207x119) - Naive code                                  100            1   144.588 ms
                                                 1.45281 ms   1.44668 ms   1.46487 ms
                                                  42.491 us    24.651 us     67.98 us

(207x119) - ViSP                                        100            1   14.9142 ms
                                                 131.627 us   130.423 us   134.817 us
                                                   8.986 us     2.199 us    16.924 us

(83x201) - Naive code                                   100            1   152.614 ms
                                                 1.49636 ms   1.49332 ms   1.50178 ms
                                                   20.07 us    12.992 us    29.432 us

(83x201) - ViSP                                         100            1     16.69 ms
                                                 144.707 us     143.3 us   147.362 us
                                                   9.528 us     5.805 us    16.117 us

(600x400) - Naive code                                  100            1    5.72741 s
                                                 57.3315 ms   57.1886 ms    57.538 ms
                                                 870.225 us   658.677 us   1.14161 ms

(600x400) - ViSP                                        100            1   435.461 ms
                                                 4.30212 ms   4.27026 ms   4.31975 ms
                                                 117.438 us    75.304 us   184.055 us

(400x600) - Naive code                                  100            1    7.82654 s
                                                 78.6463 ms   78.4704 ms   78.8933 ms
                                                 1.05031 ms   816.214 us   1.43298 ms

(400x600) - ViSP                                        100            1   660.716 ms
                                                 6.53438 ms   6.49527 ms   6.57244 ms
                                                 195.587 us   120.116 us   296.007 us


-------------------------------------------------------------------------------
Benchmark AAt
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:619
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(3x3) - Naive code                                      100          217    2.0398 ms
                                                      83 ns        83 ns        84 ns
                                                       2 ns         0 ns         4 ns

(3x3) - ViSP                                            100           91    2.0566 ms
                                                     211 ns       205 ns       223 ns
                                                      38 ns        17 ns        68 ns

(6x6) - Naive code                                      100          110     2.046 ms
                                                     193 ns       182 ns       211 ns
                                                      72 ns        51 ns       102 ns

(6x6) - ViSP                                            100           79     2.054 ms
                                                     267 ns       260 ns       281 ns
                                                      45 ns        25 ns        75 ns

(8x8) - Naive code                                      100           63    2.0412 ms
                                                     337 ns       319 ns       366 ns
                                                     113 ns        78 ns       165 ns

(8x8) - ViSP                                            100           67    2.0636 ms
                                                     279 ns       268 ns       302 ns
                                                      77 ns        46 ns       143 ns

(10x10) - Naive code                                    100           41    2.0828 ms
                                                     459 ns       454 ns       479 ns
                                                      48 ns        16 ns       109 ns

(10x10) - ViSP                                          100           46    2.0838 ms
                                                     363 ns       339 ns       410 ns
                                                     163 ns        96 ns       264 ns

(20x20) - Naive code                                    100            7    2.2001 ms
                                                    4.88 us     4.665 us     5.071 us
                                                   1.032 us       914 ns      1.14 us

(20x20) - ViSP                                          100           22    2.1208 ms
                                                     875 ns       845 ns       951 ns
                                                     240 ns       116 ns       404 ns

(6x200) - Naive code                                    100            5    2.1265 ms
                                                   4.142 us     4.042 us     4.384 us
                                                     758 ns       371 ns     1.295 us

(6x200) - ViSP                                          100           20      2.08 ms
                                                     947 ns       913 ns     1.026 us
                                                     259 ns       120 ns       434 ns

(200x6) - Naive code                                    100            1    9.9795 ms
                                                 100.619 us    96.876 us   105.807 us
                                                  22.349 us    17.575 us    29.378 us

(200x6) - ViSP                                          100            1     2.872 ms
                                                  26.514 us    25.036 us    32.238 us
                                                  12.726 us     3.037 us    29.115 us


-------------------------------------------------------------------------------
Benchmark matrix-velocity twist multiplication
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:690
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(6x6)x(6x6) - Naive code                                100           70     2.065 ms
                                                     262 ns       257 ns       275 ns
                                                      35 ns         6 ns        64 ns

(6x6)x(6x6) - ViSP                                      100          106    2.0458 ms
                                                     172 ns       169 ns       179 ns
                                                      16 ns         0 ns        37 ns

(20x6)x(6x6) - Naive code                               100           29    2.0851 ms
                                                     729 ns       696 ns       774 ns
                                                     196 ns       151 ns       248 ns

(20x6)x(6x6) - ViSP                                     100           60     2.058 ms
                                                     328 ns       313 ns       358 ns
                                                     103 ns        61 ns       160 ns

(207x6)x(6x6) - Naive code                              100            4    2.3788 ms
                                                   5.941 us     5.705 us     6.301 us
                                                   1.462 us      1.09 us     2.053 us

(207x6)x(6x6) - ViSP                                    100            9    2.1888 ms
                                                   2.209 us     2.158 us     2.343 us
                                                     381 ns       117 ns       744 ns

(600x6)x(6x6) - Naive code                              100            2    3.3366 ms
                                                  14.785 us    14.589 us    15.278 us
                                                   1.464 us       516 ns     2.755 us

(600x6)x(6x6) - ViSP                                    100            4    2.6204 ms
                                                   6.292 us     6.222 us     6.489 us
                                                     552 ns       179 ns     1.162 us

(1201x6)x(6x6) - Naive code                             100            1    3.1746 ms
                                                  28.902 us    28.787 us    29.142 us
                                                     815 ns       485 ns     1.563 us

(1201x6)x(6x6) - ViSP                                   100            2     2.491 ms
                                                  12.312 us    11.953 us    13.049 us
                                                   2.508 us      1.43 us     4.146 us


-------------------------------------------------------------------------------
Benchmark matrix-force twist multiplication
-------------------------------------------------------------------------------
/home/fspindle/visp-ws/visp-fspindle/modules/core/test/math/perfMatrixMultiplication.cpp:775
...............................................................................

benchmark name                                  samples       iterations    estimated
                                                mean          low mean      high mean
                                                std dev       low std dev   high std dev
-------------------------------------------------------------------------------
(6x6)x(6x6) - Naive code                                100           74    2.0646 ms
                                                     253 ns       253 ns       256 ns
                                                       6 ns         1 ns        15 ns

(6x6)x(6x6) - ViSP                                      100          114    2.0406 ms
                                                     166 ns       166 ns       168 ns
                                                       4 ns         0 ns        11 ns

(20x6)x(6x6) - Naive code                               100           32    2.0608 ms
                                                     624 ns       621 ns       631 ns
                                                      20 ns         5 ns        36 ns

(20x6)x(6x6) - ViSP                                     100           66    2.0592 ms
                                                     292 ns       290 ns       298 ns
                                                      13 ns         2 ns        31 ns

(207x6)x(6x6) - Naive code                              100            4       2.3 ms
                                                    5.46 us      5.28 us     5.871 us
                                                    1.32 us       729 ns     2.362 us

(207x6)x(6x6) - ViSP                                    100           10     2.263 ms
                                                   2.145 us      2.14 us     2.164 us
                                                      40 ns         2 ns        95 ns

(600x6)x(6x6) - Naive code                              100            2      3.23 ms
                                                  14.517 us    14.484 us    14.633 us
                                                     266 ns        24 ns       602 ns

(600x6)x(6x6) - ViSP                                    100            4    2.5576 ms
                                                   5.914 us     5.881 us     6.063 us
                                                     299 ns         7 ns       706 ns

(1201x6)x(6x6) - Naive code                             100            1    3.1943 ms
                                                  30.849 us    29.807 us     32.58 us
                                                   6.719 us      4.79 us     9.907 us

(1201x6)x(6x6) - ViSP                                   100            2     2.477 ms
                                                  11.978 us    11.768 us    12.505 us
                                                   1.594 us       629 ns     2.894 us


===============================================================================
All tests passed (60 assertions in 8 test cases)