\newpage
%===============================================================================
\subsection{Serialize/deserialize methods}
%===============================================================================
\label{serialize_deserialize}

{\em Serialization} takes an opaque GraphBLAS object (a vector or matrix) and
encodes it in a single non-opaque array of bytes, the {\em blob}.  The array of
bytes can be written to a file, sent to another process over an MPI channel, or
operated on in any other way that moves the bytes around.  The contents of the
blob cannot be interpreted except by deserializing it back into a vector or
matrix, using the same library that created it (SuiteSparse:GraphBLAS in this
case), and sometimes the same version of that library.
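
As a minimal sketch of the file case, assuming the blob has already been
produced by one of the serialize methods, the bytes can be written and read
back with the standard C library alone.  The helper names \verb'write_blob'
and \verb'read_blob' are hypothetical and are not part of GraphBLAS:

{\footnotesize
\begin{verbatim}
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

// Write blob_size bytes to the file at path.  Returns 0 on success, -1 on
// error.  (hypothetical helper, not a GraphBLAS function)
int write_blob (const char *path, const void *blob, size_t blob_size)
{
    FILE *f = fopen (path, "wb") ;
    if (f == NULL) return (-1) ;
    size_t written = fwrite (blob, 1, blob_size, f) ;
    int close_status = fclose (f) ;
    return ((written == blob_size && close_status == 0) ? 0 : -1) ;
}

// Read an entire file into a newly allocated buffer.  Returns 0 on success,
// -1 on error.  The caller frees *blob_handle with free.
// (hypothetical helper, not a GraphBLAS function)
int read_blob (const char *path, void **blob_handle, size_t *blob_size_handle)
{
    *blob_handle = NULL ;
    *blob_size_handle = 0 ;
    FILE *f = fopen (path, "rb") ;
    if (f == NULL) return (-1) ;
    // find the file size by seeking to the end of the file
    if (fseek (f, 0, SEEK_END) != 0) { fclose (f) ; return (-1) ; }
    long len = ftell (f) ;
    if (len < 0 || fseek (f, 0, SEEK_SET) != 0) { fclose (f) ; return (-1) ; }
    // allocate at least one byte so that an empty file is not an error
    void *buf = malloc (len > 0 ? (size_t) len : 1) ;
    if (buf == NULL) { fclose (f) ; return (-1) ; }
    size_t nread = fread (buf, 1, (size_t) len, f) ;
    fclose (f) ;
    if (nread != (size_t) len) { free (buf) ; return (-1) ; }
    *blob_handle = buf ;
    *blob_size_handle = (size_t) len ;
    return (0) ;
}
\end{verbatim}}

The buffer returned by \verb'read_blob' can then be passed to a deserialize
method together with its size.  This sketch relies on \verb'ftell', so it does
not handle files whose size exceeds the range of a \verb'long'.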

All versions of SuiteSparse:GraphBLAS that implement
serialization/deserialization use essentially the same format for the blob, so
the library versions are compatible with each other.  Version v9.0.0 adds the
\verb'GrB_NAME' and \verb'GrB_EL_TYPE_STRING' to the blob in an upward
compatible manner, so that older versions of SuiteSparse:GraphBLAS can read the blobs
created by v9.0.0; they simply ignore those components.

SuiteSparse:GraphBLAS v10 adds
32/64-bit integers, and can read the blobs created by any prior version of
GraphBLAS (such blobs are deserialized with all 64-bit integers, however).  If an older
version of SuiteSparse:GraphBLAS (v9 or earlier) attempts to deserialize a blob
containing a matrix with 32-bit integers, it will safely report that the blob
is invalid and refuse to deserialize it.  If SuiteSparse:GraphBLAS v10 creates a
serialized blob with all-64-bit integers, then it can be read correctly by
SuiteSparse:GraphBLAS v9, and likely also by earlier versions of the library.

There are two forms of serialization: \verb'GrB*serialize' and
\verb'GxB*serialize'.  For the \verb'GrB' form, the blob must first be
allocated by the user application, and it must be large enough to hold the
serialized matrix or vector.  By contrast \verb'GxB*serialize' allocates
the blob itself.

By default, ZSTD (level 1) compression is used for serialization, but other
options can be selected via the descriptor:
\verb'GrB_set (desc, method,' \verb'GxB_COMPRESSION)', where \verb'method' is an
integer selected from the following options:

\vspace{0.2in}
{\footnotesize
\begin{tabular}{ll}
\hline
method                           &  description \\
\hline
\verb'GxB_COMPRESSION_NONE'      &  no compression \\
\verb'GxB_COMPRESSION_DEFAULT'   &  ZSTD, with default level 1 \\
\verb'GxB_COMPRESSION_LZ4'       &  LZ4 \\
\verb'GxB_COMPRESSION_LZ4HC'     &  LZ4HC, with default level 9 \\
\verb'GxB_COMPRESSION_ZSTD'      &  ZSTD, with default level 1 \\
\hline
\end{tabular} }
\vspace{0.2in}

The LZ4HC method can be modified by adding a level from 0 to 9, with 9 being
the default.  Higher levels lead to a more compact blob, at the cost of extra
computational time.  The level is simply added to the method, so to compress a
vector with LZ4HC at level 6, use:

    {\footnotesize
    \begin{verbatim}
    GrB_set (desc, GxB_COMPRESSION_LZ4HC + 6, GxB_COMPRESSION) ; \end{verbatim}}

The ZSTD method can be specified as level 1 to 19, with 1 being the default.
To compress with ZSTD at level 6, use:

    {\footnotesize
    \begin{verbatim}
    GrB_set (desc, GxB_COMPRESSION_ZSTD + 6, GxB_COMPRESSION) ; \end{verbatim}}
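
The two fragments above assume that the descriptor \verb'desc' already exists.
The following sketch shows one way the whole sequence might look for a vector:
create a descriptor, select ZSTD at a given level, serialize, and free the
descriptor.  The helper name \verb'serialize_zstd' is hypothetical, and the
range check simply mirrors the levels 1 to 19 stated above:

{\footnotesize
\begin{verbatim}
#include "GraphBLAS.h"

// Serialize the vector u with ZSTD compression at the given level.
// (hypothetical helper, not a GraphBLAS function)
GrB_Info serialize_zstd
(
    void **blob_handle,             // the blob, allocated on output
    GrB_Index *blob_size_handle,    // size of the blob on output
    GrB_Vector u,                   // vector to serialize
    int level                       // ZSTD level, 1 to 19
)
{
    if (level < 1 || level > 19) return (GrB_INVALID_VALUE) ;
    GrB_Descriptor desc = NULL ;
    GrB_Info info = GrB_Descriptor_new (&desc) ;
    if (info != GrB_SUCCESS) return (info) ;
    // the level is added to the compression method
    info = GrB_set (desc, GxB_COMPRESSION_ZSTD + level, GxB_COMPRESSION) ;
    if (info == GrB_SUCCESS)
    {
        info = GxB_Vector_serialize (blob_handle, blob_size_handle, u, desc) ;
    }
    GrB_free (&desc) ;
    return (info) ;
}
\end{verbatim}}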

Deserialization of untrusted data is a common security problem; see
\url{https://cwe.mitre.org/data/definitions/502.html}. The deserialization
methods in SuiteSparse:GraphBLAS do a few basic checks so that no out-of-bounds
access occurs during deserialization, but the output matrix or vector itself
may still be corrupted.  If the data is untrusted, use \verb'GxB_*_fprint' with
the print level set to \verb'GxB_SILENT' to
check the matrix or vector after deserializing it:

{\footnotesize
\begin{verbatim}
    info = GxB_Vector_fprint (w, "w deserialized", GxB_SILENT, NULL) ;
    if (info != GrB_SUCCESS) GrB_free (&w) ;
    info = GxB_Matrix_fprint (A, "A deserialized", GxB_SILENT, NULL) ;
    if (info != GrB_SUCCESS) GrB_free (&A) ; \end{verbatim}}

The following methods are described in this section:

\vspace{0.2in}
\noindent
{\footnotesize
\begin{tabular}{lll}
\hline
GraphBLAS function   & purpose                                      & Section \\
\hline
% \verb'GrB_Vector_serializeSize'  & return size of serialized vector & \ref{vector_serialize_size} \\
% \verb'GrB_Vector_serialize'      & serialize a vector               & \ref{vector_serialize} \\
\verb'GxB_Vector_serialize'      & serialize a vector               & \ref{vector_serialize_GxB} \\
% \verb'GrB_Vector_deserialize'    & deserialize a vector             & \ref{vector_deserialize} \\
\verb'GxB_Vector_deserialize'    & deserialize a vector             & \ref{vector_deserialize_GxB} \\
\hline
\verb'GrB_Matrix_serializeSize' & return size of serialized matrix & \ref{matrix_serialize_size} \\
\verb'GrB_Matrix_serialize'     & serialize a matrix               & \ref{matrix_serialize} \\
\verb'GxB_Matrix_serialize'     & serialize a matrix               & \ref{matrix_serialize_GxB} \\
\verb'GrB_Matrix_deserialize'   & deserialize a matrix             & \ref{matrix_deserialize} \\
\verb'GxB_Matrix_deserialize'   & deserialize a matrix             & \ref{matrix_deserialize_GxB} \\
\hline
\verb'GrB_get' & get blob properties & \ref{get_set_blob} \\
\hline
\end{tabular}
}

%-------------------------------------------------------------------------------
% \subsubsection{{\sf GrB\_Vector\_serializeSize:}  return size of serialized vector}
%-------------------------------------------------------------------------------
% \label{vector_serialize_size}

% \begin{mdframed}[userdefinedwidth=6in]
% {\footnotesize
% \begin{verbatim}
% GrB_Info GrB_Vector_serializeSize   // estimate the size of a blob
% (
%    // output:
%    GrB_Index *blob_size_handle,    // upper bound on the required size of the
%                                    // blob on output.
%    // input:
%    GrB_Vector u                    // vector to serialize
%) ;
%\end{verbatim}
%} \end{mdframed}
%
% \verb'GrB_Vector_serializeSize' returns an upper bound on the size of the blob
% needed to serialize a \verb'GrB_Vector' using \verb'GrB_Vector_serialize'.
% After the vector is serialized, the actual size used is returned, and the blob
% may be \verb'realloc''d to that size if desired.
% This method is not required for \verb'GxB_Vector_serialize'.

% \newpage
%-------------------------------------------------------------------------------
% \subsubsection{{\sf GrB\_Vector\_serialize:}      serialize a vector}
%-------------------------------------------------------------------------------
% \label{vector_serialize}

% \begin{mdframed}[userdefinedwidth=6in]
% {\footnotesize
% \begin{verbatim}
% GrB_Info GrB_Vector_serialize       // serialize a GrB_Vector to a blob
% (
%    // output:
%    void *blob,                     // the blob, already allocated in input
%    // input/output:
%    GrB_Index *blob_size_handle,    // size of the blob on input.  On output,
%                                    // the # of bytes used in the blob.
%    // input:
%    GrB_Vector u                    // vector to serialize
% ) ;
% \end{verbatim}
% } \end{mdframed}
%
% \verb'GrB_Vector_serialize' serializes a vector into a single array of bytes
% (the blob), which must be already allocated by the user application.
% On input, \verb'*blob_size_handle' is the size of the allocated blob in bytes.
% On output, it is reduced to the number of bytes actually used to serialize
% the vector.  After calling \verb'GrB_Vector_serialize', the blob may be
% \verb'realloc''d to this revised size if desired (this is optional).
% ZSTD (level 1) compression is used to construct a compact blob.

\newpage
%-------------------------------------------------------------------------------
\subsubsection{{\sf GxB\_Vector\_serialize:}      serialize a vector}
%-------------------------------------------------------------------------------
\label{vector_serialize_GxB}

\begin{mdframed}[userdefinedwidth=6in]
{\footnotesize
\begin{verbatim}
GrB_Info GxB_Vector_serialize       // serialize a GrB_Vector to a blob
(
    // output:
    void **blob_handle,             // the blob, allocated on output
    GrB_Index *blob_size_handle,    // size of the blob on output
    // input:
    GrB_Vector u,                   // vector to serialize
    const GrB_Descriptor desc       // descriptor to select compression method
) ;
\end{verbatim}
} \end{mdframed}

\verb'GxB_Vector_serialize' serializes a vector into a single array of bytes
(the blob), which is \verb'malloc''ed and filled with the serialized vector.
By default, ZSTD (level 1) compression is used, but other options can be
selected via the descriptor.  Serializing a vector is identical to serializing
a matrix; see Section \ref{matrix_serialize_GxB} for more information.

%-------------------------------------------------------------------------------
% \subsubsection{{\sf GrB\_Vector\_deserialize:}    deserialize a vector}
%-------------------------------------------------------------------------------
% \label{vector_deserialize}

% \begin{mdframed}[userdefinedwidth=6in]
% {\footnotesize
% \begin{verbatim}
% GrB_Info GrB_Vector_deserialize     // deserialize blob into a GrB_Vector
% (
%     // output:
%     GrB_Vector *w,      // output vector created from the blob
%     // input:
%     GrB_Type type,      // type of the vector w.  Required if the blob holds a
%                         // vector of user-defined type.  May be NULL if blob
%                         // holds a built-in type; otherwise must match the
%                         // type of w.
%     const void *blob,       // the blob
%     GrB_Index blob_size     // size of the blob
% ) ;
% \end{verbatim}
% } \end{mdframed}
%
% This method creates a vector \verb'w' by deserializing the contents of the
% blob, constructed by either \verb'GrB_Vector_serialize' or
% \verb'GxB_Vector_serialize'.

%-------------------------------------------------------------------------------
\subsubsection{{\sf GxB\_Vector\_deserialize:}    deserialize a vector}
%-------------------------------------------------------------------------------
\label{vector_deserialize_GxB}

\begin{mdframed}[userdefinedwidth=6in]
{\footnotesize
\begin{verbatim}
GrB_Info GxB_Vector_deserialize     // deserialize blob into a GrB_Vector
(
    // output:
    GrB_Vector *w,      // output vector created from the blob
    // input:
    GrB_Type type,      // type of the vector w.  See GxB_Matrix_deserialize.
    const void *blob,       // the blob
    GrB_Index blob_size,    // size of the blob
    const GrB_Descriptor desc
) ;
\end{verbatim}
} \end{mdframed}

This method creates a vector \verb'w' by deserializing the contents of the
blob, constructed by
% either \verb'GrB_Vector_serialize' or
\verb'GxB_Vector_serialize'.
Deserializing a vector is identical to deserializing a matrix;
see Section \ref{matrix_deserialize_GxB} for more information.

A blob created by \verb'GxB_Vector_serialize' is allocated with the
\verb'malloc' function passed to \verb'GxB_init', or the C11 \verb'malloc' if
\verb'GrB_init' was used to initialize GraphBLAS.  The blob must be freed by
the matching \verb'free' method, either the \verb'free' function passed to
\verb'GxB_init' or the C11 \verb'free' if \verb'GrB_init' was used.

\newpage
%-------------------------------------------------------------------------------
\subsubsection{{\sf GrB\_Matrix\_serializeSize:}  return size of serialized matrix}
%-------------------------------------------------------------------------------
\label{matrix_serialize_size}

\begin{mdframed}[userdefinedwidth=6in]
{\footnotesize
\begin{verbatim}
GrB_Info GrB_Matrix_serializeSize   // estimate the size of a blob
(
    // output:
    GrB_Index *blob_size_handle,    // upper bound on the required size of the
                                    // blob on output.
    // input:
    GrB_Matrix A                    // matrix to serialize
) ;
\end{verbatim}
} \end{mdframed}

\verb'GrB_Matrix_serializeSize' returns an upper bound on the size of the blob
needed to serialize a \verb'GrB_Matrix' with \verb'GrB_Matrix_serialize'.
After the matrix is serialized, the actual size used is returned, and the blob
may be \verb'realloc''d to that size if desired.
This method is not required for \verb'GxB_Matrix_serialize'.

%-------------------------------------------------------------------------------
\subsubsection{{\sf GrB\_Matrix\_serialize:}      serialize a matrix}
%-------------------------------------------------------------------------------
\label{matrix_serialize}

\begin{mdframed}[userdefinedwidth=6in]
{\footnotesize
\begin{verbatim}
GrB_Info GrB_Matrix_serialize       // serialize a GrB_Matrix to a blob
(
    // output:
    void *blob,                     // the blob, already allocated on input
    // input/output:
    GrB_Index *blob_size_handle,    // size of the blob on input.  On output,
                                    // the # of bytes used in the blob.
    // input:
    GrB_Matrix A                    // matrix to serialize
) ;
\end{verbatim}
} \end{mdframed}

\verb'GrB_Matrix_serialize' serializes a matrix into a single array of bytes
(the blob), which must be already allocated by the user application.
On input, \verb'*blob_size_handle' is the size of the allocated blob in bytes.
On output, it is reduced to the number of bytes actually used to serialize
the matrix.  After calling \verb'GrB_Matrix_serialize', the blob may be
\verb'realloc''d to this revised size if desired (this is optional).
ZSTD (level 1) compression is used to construct a compact blob.
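
The steps described above (query the upper bound, allocate, serialize, and
optionally shrink) can be sketched as follows.  The helper name
\verb'serialize_prealloc' is hypothetical, and the sketch assumes that the
user application manages the blob with the C library \verb'malloc',
\verb'realloc', and \verb'free':

{\footnotesize
\begin{verbatim}
#include <stdlib.h>
#include "GraphBLAS.h"

// Serialize A into a blob allocated by the user application.
// (hypothetical helper, not a GraphBLAS function)
GrB_Info serialize_prealloc
(
    void **blob_handle,             // the blob, allocated here with malloc
    GrB_Index *blob_size_handle,    // number of bytes used in the blob
    GrB_Matrix A                    // matrix to serialize
)
{
    *blob_handle = NULL ;
    *blob_size_handle = 0 ;
    // get an upper bound on the size of the blob
    GrB_Index blob_size = 0 ;
    GrB_Info info = GrB_Matrix_serializeSize (&blob_size, A) ;
    if (info != GrB_SUCCESS) return (info) ;
    void *blob = malloc ((size_t) blob_size) ;
    if (blob == NULL) return (GrB_OUT_OF_MEMORY) ;
    // on output, blob_size is the number of bytes actually used
    info = GrB_Matrix_serialize (blob, &blob_size, A) ;
    if (info != GrB_SUCCESS) { free (blob) ; return (info) ; }
    // optional: shrink the blob to the size actually used
    if (blob_size > 0)
    {
        void *smaller = realloc (blob, (size_t) blob_size) ;
        if (smaller != NULL) blob = smaller ;
    }
    *blob_handle = blob ;
    *blob_size_handle = blob_size ;
    return (GrB_SUCCESS) ;
}
\end{verbatim}}

Because the application allocates the blob in this form, it also frees the
blob itself, here with the C library \verb'free'.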

\newpage
%-------------------------------------------------------------------------------
\subsubsection{{\sf GxB\_Matrix\_serialize:}      serialize a matrix}
%-------------------------------------------------------------------------------
\label{matrix_serialize_GxB}

\begin{mdframed}[userdefinedwidth=6in]
{\footnotesize
\begin{verbatim}
GrB_Info GxB_Matrix_serialize       // serialize a GrB_Matrix to a blob
(
    // output:
    void **blob_handle,             // the blob, allocated on output
    GrB_Index *blob_size_handle,    // size of the blob on output
    // input:
    GrB_Matrix A,                   // matrix to serialize
    const GrB_Descriptor desc       // descriptor to select compression method
) ;
\end{verbatim}
} \end{mdframed}

\verb'GxB_Matrix_serialize' is identical to \verb'GrB_Matrix_serialize', except
that it does not require a pre-allocated blob.  Instead, it allocates the blob
internally, and fills it with the serialized matrix.  By default, ZSTD (level 1)
compression is used, but other options can be selected via the descriptor.

The blob is allocated with the \verb'malloc' function passed to
\verb'GxB_init', or the C11 \verb'malloc' if \verb'GrB_init' was used
to initialize GraphBLAS.  The blob must be freed by the matching \verb'free'
method, either the \verb'free' function passed to \verb'GxB_init' or
the C11 \verb'free' if \verb'GrB_init' was used.

%-------------------------------------------------------------------------------
\subsubsection{{\sf GrB\_Matrix\_deserialize:}    deserialize a matrix}
%-------------------------------------------------------------------------------
\label{matrix_deserialize}

\begin{mdframed}[userdefinedwidth=6in]
{\footnotesize
\begin{verbatim}
GrB_Info GrB_Matrix_deserialize     // deserialize blob into a GrB_Matrix
(
    // output:
    GrB_Matrix *C,      // output matrix created from the blob
    // input:
    GrB_Type type,      // type of the matrix C.  Required if the blob holds a
                        // matrix of user-defined type.  May be NULL if blob
                        // holds a built-in type; otherwise must match the
                        // type of C.
    const void *blob,       // the blob
    GrB_Index blob_size     // size of the blob
) ;
\end{verbatim}
} \end{mdframed}

This method creates a matrix \verb'C' by deserializing the contents of the
blob, constructed by either \verb'GrB_Matrix_serialize' or
\verb'GxB_Matrix_serialize'.

% extended in the v2.1 C API (type may be NULL):
The \verb'type' may be \verb'NULL' if the blob holds a serialized matrix with a
built-in type.  In this case, the type is determined automatically.  For
user-defined types, the \verb'type' must match the type of the matrix in the
blob.  The \verb'GrB_get' method can be used to query the blob for the name of
this type.
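
As a sketch of the built-in case, the helper below passes \verb'NULL' for the
\verb'type' and then validates the result with \verb'GxB_Matrix_fprint' at the
\verb'GxB_SILENT' print level, as recommended for untrusted data earlier in
this section.  The helper name \verb'deserialize_checked' is hypothetical, and
the sketch assumes the blob does not hold a user-defined type:

{\footnotesize
\begin{verbatim}
#include "GraphBLAS.h"

// Deserialize a blob that holds a matrix of built-in type, then check it.
// (hypothetical helper, not a GraphBLAS function)
GrB_Info deserialize_checked
(
    GrB_Matrix *C,          // output matrix created from the blob
    const void *blob,       // the blob
    GrB_Index blob_size     // size of the blob
)
{
    *C = NULL ;
    // type is NULL: the built-in type is determined from the blob
    GrB_Info info = GrB_Matrix_deserialize (C, NULL, blob, blob_size) ;
    if (info != GrB_SUCCESS) return (info) ;
    // check the matrix without printing anything
    info = GxB_Matrix_fprint (*C, "C deserialized", GxB_SILENT, NULL) ;
    if (info != GrB_SUCCESS) GrB_free (C) ;
    return (info) ;
}
\end{verbatim}}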

%-------------------------------------------------------------------------------
\subsubsection{{\sf GxB\_Matrix\_deserialize:}    deserialize a matrix}
%-------------------------------------------------------------------------------
\label{matrix_deserialize_GxB}

\begin{mdframed}[userdefinedwidth=6in]
{\footnotesize
\begin{verbatim}
GrB_Info GxB_Matrix_deserialize     // deserialize blob into a GrB_Matrix
(
    // output:
    GrB_Matrix *C,      // output matrix created from the blob
    // input:
    GrB_Type type,      // type of the matrix C.  Required if the blob holds a
                        // matrix of user-defined type.  May be NULL if blob
                        // holds a built-in type; otherwise must match the
                        // type of C.
    const void *blob,       // the blob
    GrB_Index blob_size,    // size of the blob
    const GrB_Descriptor desc
) ;
\end{verbatim}
} \end{mdframed}

\verb'GxB_Matrix_deserialize' is identical to \verb'GrB_Matrix_deserialize',
except that it also accepts a descriptor.