File: mxm_demo_DGX_Station.txt

mxm_demo

Prob = 

  struct with fields:

         A: [9000x9000 double]
      name: 'ND/nd3k'
     title: 'ND problem set, matrix nd3k'
        id: 936
      date: '2003'
    author: 'author unknown'
        ed: 'T. Davis'
      kind: '2D/3D problem'


Prob2 = 

  struct with fields:

      name: 'Freescale/Freescale2'
     title: 'circuit simulation matrix from Freescale'
         A: [2999349x2999349 double]
     Zeros: [2999349x2999349 double]
        id: 2662
      date: '2015'
    author: 'K. Gullapalli'
        ed: 'T. Davis'
      kind: 'circuit simulation matrix'
     notes: [4x59 char]

hypersparse.cs.tamu.edu
MATLAB version: 9.9 release: (R2020b)
GraphBLAS version: 4.0.1 (Jan 4, 2021)

-------------------------------------------------
Testing single-threaded performance of C=A*B:
-------------------------------------------------

=== builtin: double (real) vs GraphBLAS: single
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     3.7771 GrB:     3.5406 speedup:       1.07 err: 1.63154e-07
trial 2: builtin:     3.7923 GrB:     3.5586 speedup:       1.07 err: 1.63154e-07
trial 3: builtin:     3.8008 GrB:     3.5582 speedup:       1.07 err: 1.63154e-07
trial 4: builtin:     3.7916 GrB:     3.5497 speedup:       1.07 err: 1.63154e-07
average: builtin:     3.7904 GrB:     3.5518 speedup:       1.07
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.2234 GrB:     0.0773 speedup:       2.89 err: 3.60006e-08
trial 2: builtin:     0.2225 GrB:     0.0660 speedup:       3.37 err: 3.60006e-08
trial 3: builtin:     0.2102 GrB:     0.0661 speedup:       3.18 err: 3.60006e-08
trial 4: builtin:     0.2221 GrB:     0.0662 speedup:       3.36 err: 3.60006e-08
average: builtin:     0.2196 GrB:     0.0689 speedup:       3.19
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.0416 GrB:     0.0611 speedup:       0.68 err: 4.86966e-08
trial 2: builtin:     0.0372 GrB:     0.0611 speedup:       0.61 err: 4.86966e-08
trial 3: builtin:     0.0375 GrB:     0.0611 speedup:       0.61 err: 4.86966e-08
trial 4: builtin:     0.0372 GrB:     0.0611 speedup:       0.61 err: 4.86966e-08
average: builtin:     0.0384 GrB:     0.0611 speedup:       0.63

=== builtin: double (real) vs GraphBLAS: double
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     3.7769 GrB:     3.8013 speedup:       0.99 err: 0
trial 2: builtin:     3.7918 GrB:     3.8230 speedup:       0.99 err: 0
trial 3: builtin:     3.7907 GrB:     3.8220 speedup:       0.99 err: 0
trial 4: builtin:     3.7894 GrB:     3.8224 speedup:       0.99 err: 0
average: builtin:     3.7872 GrB:     3.8172 speedup:       0.99
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.2227 GrB:     0.0814 speedup:       2.74 err: 0
trial 2: builtin:     0.2061 GrB:     0.0653 speedup:       3.16 err: 0
trial 3: builtin:     0.2054 GrB:     0.0654 speedup:       3.14 err: 0
trial 4: builtin:     0.2057 GrB:     0.0653 speedup:       3.15 err: 0
average: builtin:     0.2100 GrB:     0.0694 speedup:       3.03
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.0379 GrB:     0.0653 speedup:       0.58 err: 0
trial 2: builtin:     0.0372 GrB:     0.0650 speedup:       0.57 err: 0
trial 3: builtin:     0.0377 GrB:     0.0653 speedup:       0.58 err: 0
trial 4: builtin:     0.0371 GrB:     0.0649 speedup:       0.57 err: 0
average: builtin:     0.0375 GrB:     0.0651 speedup:       0.58

=== builtin: double complex vs GraphBLAS: single complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     8.5948 GrB:     4.9319 speedup:       1.74 err: 1.70413e-07
trial 2: builtin:     8.6180 GrB:     4.9453 speedup:       1.74 err: 1.70413e-07
trial 3: builtin:     8.6200 GrB:     4.9454 speedup:       1.74 err: 1.70413e-07
trial 4: builtin:     8.6130 GrB:     4.9449 speedup:       1.74 err: 1.70413e-07
average: builtin:     8.6115 GrB:     4.9419 speedup:       1.74
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.3120 GrB:     0.0911 speedup:       3.43 err: 4.56897e-08
trial 2: builtin:     0.2903 GrB:     0.0752 speedup:       3.86 err: 4.56897e-08
trial 3: builtin:     0.2899 GrB:     0.0756 speedup:       3.83 err: 4.56897e-08
trial 4: builtin:     0.2896 GrB:     0.0752 speedup:       3.85 err: 4.56897e-08
average: builtin:     0.2955 GrB:     0.0793 speedup:       3.73
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.1341 GrB:     0.0786 speedup:       1.71 err: 5.75158e-08
trial 2: builtin:     0.1348 GrB:     0.0788 speedup:       1.71 err: 5.75158e-08
trial 3: builtin:     0.1344 GrB:     0.0788 speedup:       1.71 err: 5.75158e-08
trial 4: builtin:     0.1344 GrB:     0.0787 speedup:       1.71 err: 5.75158e-08
average: builtin:     0.1344 GrB:     0.0787 speedup:       1.71

=== builtin: double complex vs GraphBLAS: double complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     8.6011 GrB:     5.4293 speedup:       1.58 err: 0
trial 2: builtin:     8.6057 GrB:     5.4520 speedup:       1.58 err: 0
trial 3: builtin:     8.6198 GrB:     5.4410 speedup:       1.58 err: 0
trial 4: builtin:     8.6164 GrB:     5.4510 speedup:       1.58 err: 0
average: builtin:     8.6108 GrB:     5.4433 speedup:       1.58
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.3115 GrB:     0.1522 speedup:       2.05 err: 0
trial 2: builtin:     0.2987 GrB:     0.1311 speedup:       2.28 err: 0
trial 3: builtin:     0.2888 GrB:     0.1310 speedup:       2.20 err: 0
trial 4: builtin:     0.2886 GrB:     0.1310 speedup:       2.20 err: 0
average: builtin:     0.2969 GrB:     0.1363 speedup:       2.18
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.1372 GrB:     0.1406 speedup:       0.98 err: 0
trial 2: builtin:     0.1343 GrB:     0.1411 speedup:       0.95 err: 0
trial 3: builtin:     0.1344 GrB:     0.1406 speedup:       0.96 err: 0
trial 4: builtin:     0.1343 GrB:     0.1411 speedup:       0.95 err: 0
average: builtin:     0.1351 GrB:     0.1409 speedup:       0.96

-------------------------------------------------
Testing performance of C=A*B using 20 threads:
-------------------------------------------------

=== builtin: double (real) vs GraphBLAS: single
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     3.7822 GrB:     0.2777 speedup:      13.62 err: 1.66754e-07
trial 2: builtin:     3.8007 GrB:     0.2139 speedup:      17.77 err: 1.66754e-07
trial 3: builtin:     3.8010 GrB:     0.2176 speedup:      17.47 err: 1.66754e-07
trial 4: builtin:     3.8017 GrB:     0.2331 speedup:      16.31 err: 1.66754e-07
average: builtin:     3.7964 GrB:     0.2356 speedup:      16.12
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.2214 GrB:     0.0194 speedup:      11.41 err: 3.59323e-08
trial 2: builtin:     0.2209 GrB:     0.0088 speedup:      25.22 err: 3.59853e-08
trial 3: builtin:     0.2086 GrB:     0.0078 speedup:      26.91 err: 3.59694e-08
trial 4: builtin:     0.2207 GrB:     0.0082 speedup:      26.99 err: 3.59819e-08
average: builtin:     0.2179 GrB:     0.0110 speedup:      19.77
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.0411 GrB:     0.0143 speedup:       2.87 err: 4.87039e-08
trial 2: builtin:     0.0377 GrB:     0.0127 speedup:       2.98 err: 4.86783e-08
trial 3: builtin:     0.0373 GrB:     0.0123 speedup:       3.02 err: 4.86985e-08
trial 4: builtin:     0.0374 GrB:     0.0140 speedup:       2.68 err: 4.87141e-08
average: builtin:     0.0384 GrB:     0.0133 speedup:       2.88

=== builtin: double (real) vs GraphBLAS: double
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     3.7809 GrB:     0.3733 speedup:      10.13 err: 0
trial 2: builtin:     3.8024 GrB:     0.3289 speedup:      11.56 err: 0
trial 3: builtin:     3.8035 GrB:     0.2579 speedup:      14.75 err: 0
trial 4: builtin:     3.8014 GrB:     0.2569 speedup:      14.80 err: 0
average: builtin:     3.7971 GrB:     0.3042 speedup:      12.48
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.2223 GrB:     0.0272 speedup:       8.18 err: 1.98831e-18
trial 2: builtin:     0.2055 GrB:     0.0118 speedup:      17.41 err: 1.62657e-18
trial 3: builtin:     0.2060 GrB:     0.0107 speedup:      19.25 err: 2.37914e-18
trial 4: builtin:     0.2056 GrB:     0.0108 speedup:      18.99 err: 1.94308e-18
average: builtin:     0.2098 GrB:     0.0151 speedup:      13.87
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.0372 GrB:     0.0214 speedup:       1.74 err: 6.77465e-18
trial 2: builtin:     0.0371 GrB:     0.0188 speedup:       1.97 err: 5.30567e-18
trial 3: builtin:     0.0378 GrB:     0.0172 speedup:       2.19 err: 5.5798e-18
trial 4: builtin:     0.0375 GrB:     0.0159 speedup:       2.36 err: 4.66091e-18
average: builtin:     0.0374 GrB:     0.0183 speedup:       2.04

=== builtin: double complex vs GraphBLAS: single complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     8.5904 GrB:     0.3827 speedup:      22.45 err: 1.67076e-07
trial 2: builtin:     8.6116 GrB:     0.4502 speedup:      19.13 err: 1.67076e-07
trial 3: builtin:     8.6200 GrB:     0.2959 speedup:      29.13 err: 1.67076e-07
trial 4: builtin:     8.6117 GrB:     0.2875 speedup:      29.95 err: 1.67076e-07
average: builtin:     8.6084 GrB:     0.3541 speedup:      24.31
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.3085 GrB:     0.0249 speedup:      12.39 err: 4.57594e-08
trial 2: builtin:     0.2876 GrB:     0.0102 speedup:      28.11 err: 4.57466e-08
trial 3: builtin:     0.2876 GrB:     0.0103 speedup:      27.82 err: 4.57482e-08
trial 4: builtin:     0.2873 GrB:     0.0102 speedup:      28.18 err: 4.57895e-08
average: builtin:     0.2927 GrB:     0.0139 speedup:      21.04
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.1337 GrB:     0.0223 speedup:       5.99 err: 5.73863e-08
trial 2: builtin:     0.1344 GrB:     0.0202 speedup:       6.66 err: 5.73787e-08
trial 3: builtin:     0.1343 GrB:     0.0216 speedup:       6.23 err: 5.74014e-08
trial 4: builtin:     0.1344 GrB:     0.0209 speedup:       6.42 err: 5.74188e-08
average: builtin:     0.1342 GrB:     0.0212 speedup:       6.32

=== builtin: double complex vs GraphBLAS: double complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     8.6014 GrB:     0.3149 speedup:      27.31 err: 0
trial 2: builtin:     8.6027 GrB:     0.3432 speedup:      25.07 err: 0
trial 3: builtin:     8.6165 GrB:     0.3425 speedup:      25.16 err: 0
trial 4: builtin:     8.6144 GrB:     0.3351 speedup:      25.71 err: 0
average: builtin:     8.6088 GrB:     0.3339 speedup:      25.78
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.3109 GrB:     0.0371 speedup:       8.37 err: 2.50978e-18
trial 2: builtin:     0.2982 GrB:     0.0176 speedup:      16.96 err: 2.34916e-18
trial 3: builtin:     0.2878 GrB:     0.0170 speedup:      16.97 err: 2.37391e-18
trial 4: builtin:     0.2877 GrB:     0.0155 speedup:      18.60 err: 2.56515e-18
average: builtin:     0.2961 GrB:     0.0218 speedup:      13.59
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.1372 GrB:     0.0251 speedup:       5.47 err: 7.24933e-18
trial 2: builtin:     0.1343 GrB:     0.0257 speedup:       5.23 err: 6.71833e-18
trial 3: builtin:     0.1344 GrB:     0.0246 speedup:       5.47 err: 7.52593e-18
trial 4: builtin:     0.1347 GrB:     0.0251 speedup:       5.38 err: 6.99933e-18
average: builtin:     0.1352 GrB:     0.0251 speedup:       5.39

-------------------------------------------------
Testing performance of C=A*B using 40 threads:
-------------------------------------------------

=== builtin: double (real) vs GraphBLAS: single
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     3.7816 GrB:     0.1910 speedup:      19.80 err: 1.66299e-07
trial 2: builtin:     3.8018 GrB:     0.2026 speedup:      18.77 err: 1.66299e-07
trial 3: builtin:     3.8025 GrB:     0.2173 speedup:      17.50 err: 1.66299e-07
trial 4: builtin:     3.8030 GrB:     0.2190 speedup:      17.37 err: 1.66299e-07
average: builtin:     3.7972 GrB:     0.2075 speedup:      18.30
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.2252 GrB:     0.0176 speedup:      12.81 err: 3.58846e-08
trial 2: builtin:     0.2232 GrB:     0.0074 speedup:      29.98 err: 3.5885e-08
trial 3: builtin:     0.2109 GrB:     0.0073 speedup:      28.85 err: 3.58637e-08
trial 4: builtin:     0.2233 GrB:     0.0059 speedup:      37.74 err: 3.59073e-08
average: builtin:     0.2206 GrB:     0.0096 speedup:      23.07
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.0411 GrB:     0.0127 speedup:       3.25 err: 4.84918e-08
trial 2: builtin:     0.0374 GrB:     0.0117 speedup:       3.19 err: 4.8497e-08
trial 3: builtin:     0.0373 GrB:     0.0104 speedup:       3.57 err: 4.84767e-08
trial 4: builtin:     0.0425 GrB:     0.0104 speedup:       4.07 err: 4.84922e-08
average: builtin:     0.0396 GrB:     0.0113 speedup:       3.50

=== builtin: double (real) vs GraphBLAS: double
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     3.8404 GrB:     0.2141 speedup:      17.94 err: 0
trial 2: builtin:     3.7882 GrB:     0.2379 speedup:      15.92 err: 0
trial 3: builtin:     3.7892 GrB:     0.2248 speedup:      16.85 err: 0
trial 4: builtin:     3.7888 GrB:     0.2227 speedup:      17.01 err: 0
average: builtin:     3.8017 GrB:     0.2249 speedup:      16.90
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.2198 GrB:     0.0228 speedup:       9.62 err: 2.75062e-18
trial 2: builtin:     0.2039 GrB:     0.0083 speedup:      24.68 err: 3.26793e-18
trial 3: builtin:     0.2043 GrB:     0.0078 speedup:      26.23 err: 2.58955e-18
trial 4: builtin:     0.2039 GrB:     0.0076 speedup:      27.00 err: 2.94408e-18
average: builtin:     0.2080 GrB:     0.0116 speedup:      17.91
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.0426 GrB:     0.0139 speedup:       3.07 err: 7.66364e-18
trial 2: builtin:     0.0410 GrB:     0.0109 speedup:       3.74 err: 8.1437e-18
trial 3: builtin:     0.0410 GrB:     0.0126 speedup:       3.25 err: 7.96734e-18
trial 4: builtin:     0.0410 GrB:     0.0127 speedup:       3.23 err: 8.23416e-18
average: builtin:     0.0414 GrB:     0.0125 speedup:       3.30

=== builtin: double complex vs GraphBLAS: single complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     8.6778 GrB:     0.2846 speedup:      30.49 err: 1.71854e-07
trial 2: builtin:     8.6130 GrB:     0.3069 speedup:      28.06 err: 1.71854e-07
trial 3: builtin:     8.6291 GrB:     0.3064 speedup:      28.16 err: 1.71854e-07
trial 4: builtin:     8.6014 GrB:     0.2850 speedup:      30.18 err: 1.71854e-07
average: builtin:     8.6303 GrB:     0.2957 speedup:      29.18
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.3070 GrB:     0.0258 speedup:      11.88 err: 4.58149e-08
trial 2: builtin:     0.2858 GrB:     0.0098 speedup:      29.15 err: 4.57967e-08
trial 3: builtin:     0.2862 GrB:     0.0091 speedup:      31.51 err: 4.5832e-08
trial 4: builtin:     0.2859 GrB:     0.0105 speedup:      27.14 err: 4.57913e-08
average: builtin:     0.2912 GrB:     0.0138 speedup:      21.08
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.1325 GrB:     0.0199 speedup:       6.65 err: 5.74647e-08
trial 2: builtin:     0.1332 GrB:     0.0197 speedup:       6.77 err: 5.74833e-08
trial 3: builtin:     0.1334 GrB:     0.0185 speedup:       7.21 err: 5.74665e-08
trial 4: builtin:     0.1333 GrB:     0.0211 speedup:       6.32 err: 5.74735e-08
average: builtin:     0.1331 GrB:     0.0198 speedup:       6.72

=== builtin: double complex vs GraphBLAS: double complex
C=A*B: sparse matrix times sparse matrix:
trial 1: builtin:     8.6641 GrB:     0.2967 speedup:      29.20 err: 0
trial 2: builtin:     8.6113 GrB:     0.3463 speedup:      24.87 err: 0
trial 3: builtin:     8.6190 GrB:     0.3444 speedup:      25.02 err: 0
trial 4: builtin:     8.6290 GrB:     0.3700 speedup:      23.32 err: 0
average: builtin:     8.6308 GrB:     0.3394 speedup:      25.43
C=A*x: sparse matrix times sparse vector:
trial 1: builtin:     0.3201 GrB:     0.0364 speedup:       8.79 err: 2.91584e-18
trial 2: builtin:     0.2986 GrB:     0.0143 speedup:      20.93 err: 3.0252e-18
trial 3: builtin:     0.2883 GrB:     0.0142 speedup:      20.27 err: 3.11557e-18
trial 4: builtin:     0.2885 GrB:     0.0148 speedup:      19.44 err: 2.88111e-18
average: builtin:     0.2989 GrB:     0.0199 speedup:      14.99
C=A*x: sparse matrix times dense vector:
trial 1: builtin:     0.1377 GrB:     0.0233 speedup:       5.91 err: 9.77031e-18
trial 2: builtin:     0.1348 GrB:     0.0221 speedup:       6.10 err: 9.39256e-18
trial 3: builtin:     0.1353 GrB:     0.0230 speedup:       5.87 err: 8.59556e-18
trial 4: builtin:     0.1349 GrB:     0.0232 speedup:       5.81 err: 8.17499e-18
average: builtin:     0.1357 GrB:     0.0229 speedup:       5.92