1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287
|
mxm_demo
Prob =
struct with fields:
A: [9000x9000 double]
name: 'ND/nd3k'
title: 'ND problem set, matrix nd3k'
id: 936
date: '2003'
author: 'author unknown'
ed: 'T. Davis'
kind: '2D/3D problem'
Prob2 =
struct with fields:
name: 'Freescale/Freescale2'
title: 'circuit simulation matrix from Freescale'
A: [2999349x2999349 double]
Zeros: [2999349x2999349 double]
id: 2662
date: '2015'
author: 'K. Gullapalli'
ed: 'T. Davis'
kind: 'circuit simulation matrix'
notes: [4x59 char]
hypersparse.cs.tamu.edu
MATLAB version: 9.9 release: (R2020b)
GraphBLAS version: 4.0.1 (Jan 4, 2021)
-------------------------------------------------
Testing single-threaded performance of C=A*B:
-------------------------------------------------
=== builtin: double (real) vs GraphBLAS: single
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 3.7771 GrB: 3.5406 speedup: 1.07 err: 1.63154e-07
trial 2: builtin: 3.7923 GrB: 3.5586 speedup: 1.07 err: 1.63154e-07
trial 3: builtin: 3.8008 GrB: 3.5582 speedup: 1.07 err: 1.63154e-07
trial 4: builtin: 3.7916 GrB: 3.5497 speedup: 1.07 err: 1.63154e-07
average: builtin: 3.7904 GrB: 3.5518 speedup: 1.07
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.2234 GrB: 0.0773 speedup: 2.89 err: 3.60006e-08
trial 2: builtin: 0.2225 GrB: 0.0660 speedup: 3.37 err: 3.60006e-08
trial 3: builtin: 0.2102 GrB: 0.0661 speedup: 3.18 err: 3.60006e-08
trial 4: builtin: 0.2221 GrB: 0.0662 speedup: 3.36 err: 3.60006e-08
average: builtin: 0.2196 GrB: 0.0689 speedup: 3.19
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.0416 GrB: 0.0611 speedup: 0.68 err: 4.86966e-08
trial 2: builtin: 0.0372 GrB: 0.0611 speedup: 0.61 err: 4.86966e-08
trial 3: builtin: 0.0375 GrB: 0.0611 speedup: 0.61 err: 4.86966e-08
trial 4: builtin: 0.0372 GrB: 0.0611 speedup: 0.61 err: 4.86966e-08
average: builtin: 0.0384 GrB: 0.0611 speedup: 0.63
=== builtin: double (real) vs GraphBLAS: double
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 3.7769 GrB: 3.8013 speedup: 0.99 err: 0
trial 2: builtin: 3.7918 GrB: 3.8230 speedup: 0.99 err: 0
trial 3: builtin: 3.7907 GrB: 3.8220 speedup: 0.99 err: 0
trial 4: builtin: 3.7894 GrB: 3.8224 speedup: 0.99 err: 0
average: builtin: 3.7872 GrB: 3.8172 speedup: 0.99
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.2227 GrB: 0.0814 speedup: 2.74 err: 0
trial 2: builtin: 0.2061 GrB: 0.0653 speedup: 3.16 err: 0
trial 3: builtin: 0.2054 GrB: 0.0654 speedup: 3.14 err: 0
trial 4: builtin: 0.2057 GrB: 0.0653 speedup: 3.15 err: 0
average: builtin: 0.2100 GrB: 0.0694 speedup: 3.03
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.0379 GrB: 0.0653 speedup: 0.58 err: 0
trial 2: builtin: 0.0372 GrB: 0.0650 speedup: 0.57 err: 0
trial 3: builtin: 0.0377 GrB: 0.0653 speedup: 0.58 err: 0
trial 4: builtin: 0.0371 GrB: 0.0649 speedup: 0.57 err: 0
average: builtin: 0.0375 GrB: 0.0651 speedup: 0.58
=== builtin: double complex vs GraphBLAS: single complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 8.5948 GrB: 4.9319 speedup: 1.74 err: 1.70413e-07
trial 2: builtin: 8.6180 GrB: 4.9453 speedup: 1.74 err: 1.70413e-07
trial 3: builtin: 8.6200 GrB: 4.9454 speedup: 1.74 err: 1.70413e-07
trial 4: builtin: 8.6130 GrB: 4.9449 speedup: 1.74 err: 1.70413e-07
average: builtin: 8.6115 GrB: 4.9419 speedup: 1.74
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.3120 GrB: 0.0911 speedup: 3.43 err: 4.56897e-08
trial 2: builtin: 0.2903 GrB: 0.0752 speedup: 3.86 err: 4.56897e-08
trial 3: builtin: 0.2899 GrB: 0.0756 speedup: 3.83 err: 4.56897e-08
trial 4: builtin: 0.2896 GrB: 0.0752 speedup: 3.85 err: 4.56897e-08
average: builtin: 0.2955 GrB: 0.0793 speedup: 3.73
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.1341 GrB: 0.0786 speedup: 1.71 err: 5.75158e-08
trial 2: builtin: 0.1348 GrB: 0.0788 speedup: 1.71 err: 5.75158e-08
trial 3: builtin: 0.1344 GrB: 0.0788 speedup: 1.71 err: 5.75158e-08
trial 4: builtin: 0.1344 GrB: 0.0787 speedup: 1.71 err: 5.75158e-08
average: builtin: 0.1344 GrB: 0.0787 speedup: 1.71
=== builtin: double complex vs GraphBLAS: double complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 8.6011 GrB: 5.4293 speedup: 1.58 err: 0
trial 2: builtin: 8.6057 GrB: 5.4520 speedup: 1.58 err: 0
trial 3: builtin: 8.6198 GrB: 5.4410 speedup: 1.58 err: 0
trial 4: builtin: 8.6164 GrB: 5.4510 speedup: 1.58 err: 0
average: builtin: 8.6108 GrB: 5.4433 speedup: 1.58
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.3115 GrB: 0.1522 speedup: 2.05 err: 0
trial 2: builtin: 0.2987 GrB: 0.1311 speedup: 2.28 err: 0
trial 3: builtin: 0.2888 GrB: 0.1310 speedup: 2.20 err: 0
trial 4: builtin: 0.2886 GrB: 0.1310 speedup: 2.20 err: 0
average: builtin: 0.2969 GrB: 0.1363 speedup: 2.18
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.1372 GrB: 0.1406 speedup: 0.98 err: 0
trial 2: builtin: 0.1343 GrB: 0.1411 speedup: 0.95 err: 0
trial 3: builtin: 0.1344 GrB: 0.1406 speedup: 0.96 err: 0
trial 4: builtin: 0.1343 GrB: 0.1411 speedup: 0.95 err: 0
average: builtin: 0.1351 GrB: 0.1409 speedup: 0.96
-------------------------------------------------
Testing performance of C=A*B using 20 threads:
-------------------------------------------------
=== builtin: double (real) vs GraphBLAS: single
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 3.7822 GrB: 0.2777 speedup: 13.62 err: 1.66754e-07
trial 2: builtin: 3.8007 GrB: 0.2139 speedup: 17.77 err: 1.66754e-07
trial 3: builtin: 3.8010 GrB: 0.2176 speedup: 17.47 err: 1.66754e-07
trial 4: builtin: 3.8017 GrB: 0.2331 speedup: 16.31 err: 1.66754e-07
average: builtin: 3.7964 GrB: 0.2356 speedup: 16.12
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.2214 GrB: 0.0194 speedup: 11.41 err: 3.59323e-08
trial 2: builtin: 0.2209 GrB: 0.0088 speedup: 25.22 err: 3.59853e-08
trial 3: builtin: 0.2086 GrB: 0.0078 speedup: 26.91 err: 3.59694e-08
trial 4: builtin: 0.2207 GrB: 0.0082 speedup: 26.99 err: 3.59819e-08
average: builtin: 0.2179 GrB: 0.0110 speedup: 19.77
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.0411 GrB: 0.0143 speedup: 2.87 err: 4.87039e-08
trial 2: builtin: 0.0377 GrB: 0.0127 speedup: 2.98 err: 4.86783e-08
trial 3: builtin: 0.0373 GrB: 0.0123 speedup: 3.02 err: 4.86985e-08
trial 4: builtin: 0.0374 GrB: 0.0140 speedup: 2.68 err: 4.87141e-08
average: builtin: 0.0384 GrB: 0.0133 speedup: 2.88
=== builtin: double (real) vs GraphBLAS: double
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 3.7809 GrB: 0.3733 speedup: 10.13 err: 0
trial 2: builtin: 3.8024 GrB: 0.3289 speedup: 11.56 err: 0
trial 3: builtin: 3.8035 GrB: 0.2579 speedup: 14.75 err: 0
trial 4: builtin: 3.8014 GrB: 0.2569 speedup: 14.80 err: 0
average: builtin: 3.7971 GrB: 0.3042 speedup: 12.48
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.2223 GrB: 0.0272 speedup: 8.18 err: 1.98831e-18
trial 2: builtin: 0.2055 GrB: 0.0118 speedup: 17.41 err: 1.62657e-18
trial 3: builtin: 0.2060 GrB: 0.0107 speedup: 19.25 err: 2.37914e-18
trial 4: builtin: 0.2056 GrB: 0.0108 speedup: 18.99 err: 1.94308e-18
average: builtin: 0.2098 GrB: 0.0151 speedup: 13.87
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.0372 GrB: 0.0214 speedup: 1.74 err: 6.77465e-18
trial 2: builtin: 0.0371 GrB: 0.0188 speedup: 1.97 err: 5.30567e-18
trial 3: builtin: 0.0378 GrB: 0.0172 speedup: 2.19 err: 5.5798e-18
trial 4: builtin: 0.0375 GrB: 0.0159 speedup: 2.36 err: 4.66091e-18
average: builtin: 0.0374 GrB: 0.0183 speedup: 2.04
=== builtin: double complex vs GraphBLAS: single complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 8.5904 GrB: 0.3827 speedup: 22.45 err: 1.67076e-07
trial 2: builtin: 8.6116 GrB: 0.4502 speedup: 19.13 err: 1.67076e-07
trial 3: builtin: 8.6200 GrB: 0.2959 speedup: 29.13 err: 1.67076e-07
trial 4: builtin: 8.6117 GrB: 0.2875 speedup: 29.95 err: 1.67076e-07
average: builtin: 8.6084 GrB: 0.3541 speedup: 24.31
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.3085 GrB: 0.0249 speedup: 12.39 err: 4.57594e-08
trial 2: builtin: 0.2876 GrB: 0.0102 speedup: 28.11 err: 4.57466e-08
trial 3: builtin: 0.2876 GrB: 0.0103 speedup: 27.82 err: 4.57482e-08
trial 4: builtin: 0.2873 GrB: 0.0102 speedup: 28.18 err: 4.57895e-08
average: builtin: 0.2927 GrB: 0.0139 speedup: 21.04
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.1337 GrB: 0.0223 speedup: 5.99 err: 5.73863e-08
trial 2: builtin: 0.1344 GrB: 0.0202 speedup: 6.66 err: 5.73787e-08
trial 3: builtin: 0.1343 GrB: 0.0216 speedup: 6.23 err: 5.74014e-08
trial 4: builtin: 0.1344 GrB: 0.0209 speedup: 6.42 err: 5.74188e-08
average: builtin: 0.1342 GrB: 0.0212 speedup: 6.32
=== builtin: double complex vs GraphBLAS: double complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 8.6014 GrB: 0.3149 speedup: 27.31 err: 0
trial 2: builtin: 8.6027 GrB: 0.3432 speedup: 25.07 err: 0
trial 3: builtin: 8.6165 GrB: 0.3425 speedup: 25.16 err: 0
trial 4: builtin: 8.6144 GrB: 0.3351 speedup: 25.71 err: 0
average: builtin: 8.6088 GrB: 0.3339 speedup: 25.78
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.3109 GrB: 0.0371 speedup: 8.37 err: 2.50978e-18
trial 2: builtin: 0.2982 GrB: 0.0176 speedup: 16.96 err: 2.34916e-18
trial 3: builtin: 0.2878 GrB: 0.0170 speedup: 16.97 err: 2.37391e-18
trial 4: builtin: 0.2877 GrB: 0.0155 speedup: 18.60 err: 2.56515e-18
average: builtin: 0.2961 GrB: 0.0218 speedup: 13.59
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.1372 GrB: 0.0251 speedup: 5.47 err: 7.24933e-18
trial 2: builtin: 0.1343 GrB: 0.0257 speedup: 5.23 err: 6.71833e-18
trial 3: builtin: 0.1344 GrB: 0.0246 speedup: 5.47 err: 7.52593e-18
trial 4: builtin: 0.1347 GrB: 0.0251 speedup: 5.38 err: 6.99933e-18
average: builtin: 0.1352 GrB: 0.0251 speedup: 5.39
-------------------------------------------------
Testing performance of C=A*B using 40 threads:
-------------------------------------------------
=== builtin: double (real) vs GraphBLAS: single
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 3.7816 GrB: 0.1910 speedup: 19.80 err: 1.66299e-07
trial 2: builtin: 3.8018 GrB: 0.2026 speedup: 18.77 err: 1.66299e-07
trial 3: builtin: 3.8025 GrB: 0.2173 speedup: 17.50 err: 1.66299e-07
trial 4: builtin: 3.8030 GrB: 0.2190 speedup: 17.37 err: 1.66299e-07
average: builtin: 3.7972 GrB: 0.2075 speedup: 18.30
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.2252 GrB: 0.0176 speedup: 12.81 err: 3.58846e-08
trial 2: builtin: 0.2232 GrB: 0.0074 speedup: 29.98 err: 3.5885e-08
trial 3: builtin: 0.2109 GrB: 0.0073 speedup: 28.85 err: 3.58637e-08
trial 4: builtin: 0.2233 GrB: 0.0059 speedup: 37.74 err: 3.59073e-08
average: builtin: 0.2206 GrB: 0.0096 speedup: 23.07
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.0411 GrB: 0.0127 speedup: 3.25 err: 4.84918e-08
trial 2: builtin: 0.0374 GrB: 0.0117 speedup: 3.19 err: 4.8497e-08
trial 3: builtin: 0.0373 GrB: 0.0104 speedup: 3.57 err: 4.84767e-08
trial 4: builtin: 0.0425 GrB: 0.0104 speedup: 4.07 err: 4.84922e-08
average: builtin: 0.0396 GrB: 0.0113 speedup: 3.50
=== builtin: double (real) vs GraphBLAS: double
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 3.8404 GrB: 0.2141 speedup: 17.94 err: 0
trial 2: builtin: 3.7882 GrB: 0.2379 speedup: 15.92 err: 0
trial 3: builtin: 3.7892 GrB: 0.2248 speedup: 16.85 err: 0
trial 4: builtin: 3.7888 GrB: 0.2227 speedup: 17.01 err: 0
average: builtin: 3.8017 GrB: 0.2249 speedup: 16.90
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.2198 GrB: 0.0228 speedup: 9.62 err: 2.75062e-18
trial 2: builtin: 0.2039 GrB: 0.0083 speedup: 24.68 err: 3.26793e-18
trial 3: builtin: 0.2043 GrB: 0.0078 speedup: 26.23 err: 2.58955e-18
trial 4: builtin: 0.2039 GrB: 0.0076 speedup: 27.00 err: 2.94408e-18
average: builtin: 0.2080 GrB: 0.0116 speedup: 17.91
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.0426 GrB: 0.0139 speedup: 3.07 err: 7.66364e-18
trial 2: builtin: 0.0410 GrB: 0.0109 speedup: 3.74 err: 8.1437e-18
trial 3: builtin: 0.0410 GrB: 0.0126 speedup: 3.25 err: 7.96734e-18
trial 4: builtin: 0.0410 GrB: 0.0127 speedup: 3.23 err: 8.23416e-18
average: builtin: 0.0414 GrB: 0.0125 speedup: 3.30
=== builtin: double complex vs GraphBLAS: single complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 8.6778 GrB: 0.2846 speedup: 30.49 err: 1.71854e-07
trial 2: builtin: 8.6130 GrB: 0.3069 speedup: 28.06 err: 1.71854e-07
trial 3: builtin: 8.6291 GrB: 0.3064 speedup: 28.16 err: 1.71854e-07
trial 4: builtin: 8.6014 GrB: 0.2850 speedup: 30.18 err: 1.71854e-07
average: builtin: 8.6303 GrB: 0.2957 speedup: 29.18
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.3070 GrB: 0.0258 speedup: 11.88 err: 4.58149e-08
trial 2: builtin: 0.2858 GrB: 0.0098 speedup: 29.15 err: 4.57967e-08
trial 3: builtin: 0.2862 GrB: 0.0091 speedup: 31.51 err: 4.5832e-08
trial 4: builtin: 0.2859 GrB: 0.0105 speedup: 27.14 err: 4.57913e-08
average: builtin: 0.2912 GrB: 0.0138 speedup: 21.08
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.1325 GrB: 0.0199 speedup: 6.65 err: 5.74647e-08
trial 2: builtin: 0.1332 GrB: 0.0197 speedup: 6.77 err: 5.74833e-08
trial 3: builtin: 0.1334 GrB: 0.0185 speedup: 7.21 err: 5.74665e-08
trial 4: builtin: 0.1333 GrB: 0.0211 speedup: 6.32 err: 5.74735e-08
average: builtin: 0.1331 GrB: 0.0198 speedup: 6.72
=== builtin: double complex vs GraphBLAS: double complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin: 8.6641 GrB: 0.2967 speedup: 29.20 err: 0
trial 2: builtin: 8.6113 GrB: 0.3463 speedup: 24.87 err: 0
trial 3: builtin: 8.6190 GrB: 0.3444 speedup: 25.02 err: 0
trial 4: builtin: 8.6290 GrB: 0.3700 speedup: 23.32 err: 0
average: builtin: 8.6308 GrB: 0.3394 speedup: 25.43
C=A*x: sparse matrix times sparse vector:
trial 1: builtin: 0.3201 GrB: 0.0364 speedup: 8.79 err: 2.91584e-18
trial 2: builtin: 0.2986 GrB: 0.0143 speedup: 20.93 err: 3.0252e-18
trial 3: builtin: 0.2883 GrB: 0.0142 speedup: 20.27 err: 3.11557e-18
trial 4: builtin: 0.2885 GrB: 0.0148 speedup: 19.44 err: 2.88111e-18
average: builtin: 0.2989 GrB: 0.0199 speedup: 14.99
C=A*x: sparse matrix times dense vector:
trial 1: builtin: 0.1377 GrB: 0.0233 speedup: 5.91 err: 9.77031e-18
trial 2: builtin: 0.1348 GrB: 0.0221 speedup: 6.10 err: 9.39256e-18
trial 3: builtin: 0.1353 GrB: 0.0230 speedup: 5.87 err: 8.59556e-18
trial 4: builtin: 0.1349 GrB: 0.0232 speedup: 5.81 err: 8.17499e-18
average: builtin: 0.1357 GrB: 0.0229 speedup: 5.92
|