File: special.html

package info (click to toggle)
boost1.35 1.35.0-5
  • links: PTS
  • area: main
  • in suites: lenny
  • size: 203,856 kB
  • ctags: 337,867
  • sloc: cpp: 938,683; xml: 56,847; ansic: 41,589; python: 18,999; sh: 11,566; makefile: 664; perl: 494; yacc: 456; asm: 353; csh: 6
file content (439 lines) | stat: -rw-r--r-- 21,031 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!--
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com . 
Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
<link rel="stylesheet" type="text/css" href="style.css">
<title>Serialization - Special Considerations</title>
</head>
<body link="#0000ff" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
  <tr> 
    <td valign="top" width="300"> 
      <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
    </td>
    <td valign="top"> 
      <h1 align="center">Serialization</h1>
      <h2 align="center">Special Considerations</h2>
    </td>
  </tr>
</table>
<hr>
<dl class="page-index">
  <dt><a href="#objecttracking">Object Tracking</a>
  <dt><a href="#export">Exporting Class Serialization</a>
  <dt><a href="#classinfo">Class Information</a>
  <dt><a href="#portability">Archive Portability</a>
  <dl class="page-index">
    <dt><a href="#numerics">Numerics</a>
    <dt><a href="#traits">Traits</a>
  </dl>
  <dt><a href="#binary_archives">Binary Archives</a>
  <dt><a href="#xml_archives">XML Archives</a>
  <dt><a href="#dlls">DLLS - Serialization and Runtime Linking</a>
  <dt><a href="#multi_threading">Multi-Threading</a>
  <dt><a href="#optimizations">Optimizations</a>
  <dt><a href="exceptions.html">Archive Exceptions</a>
  <dt><a href="exception_safety.html">Exception Safety</a>
</dl>

<h3><a name="objecttracking">Object Tracking</a></h3>
Depending on how the class is used and other factors, serialized objects
may be tracked by memory address.  This prevents the same object from being
written to or read from an archive multiple times. These stored addresses
can also be used to delete objects created during a loading process
that has been interrupted by throwing of an exception.  
<p>
This could cause problems in 
progams where the copies of different objects are saved from the same address.
<pre><code>
template&lt;class Archive&gt;
void save(boost::basic_oarchive  &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        A x = a[i];
        ar &lt;&lt; x;
    }
}
</code></pre>
In this case, the data to be saved exists on the stack.  Each iteration
of the loop updates the value on the stack.  So although the data changes
each iteration, the address of the data doesn't.  If a[i] is an array of
objects being tracked by memory address, the library will skip storing
objects after the first as it will be assumed that objects at the same address
are really the same object.
<p>
To help detect such cases, output archive operators expect to be passed
<code style="white-space: normal">const</code> reference arguments.
<p>
Given this, the above code will invoke a compile time assertion.
The obvious fix in this example is to use
<pre><code>
template&lt;class Archive&gt;
void save(boost::basic_oarchive &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        ar &lt;&lt; a[i];
    }
}
</code></pre>
which will compile and run without problem.  
The usage of <code style="white-space: normal">const</code> by the output archive operators
will ensure that the process of serialization doesn't
change the state of the objects being serialized.  An attempt to do this
would constitute augmentation of the concept of saving of state with
some sort of non-obvious side effect. This would almost surely be a mistake 
and a likely source of very subtle bugs.
<p>
Unfortunately, implementation issues currently prevent the detection of this kind of
error when the data item is wrapped as a name-value pair.
<p>
A similar problem can occur when different objects are loaded to and address
which is different from the final location:
<pre><code>
template&lt;class Archive&gt;
void load(boost::basic_oarchive  &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        A x;
        ar &gt;&gt; x;
        std::m_set.insert(x);
    }
}
</code></pre>
In this case, the address of <code>x</code> is the one that is tracked rather than
the address of the new item added to the set.  Left unaddressed
this will break the features that depend on tracking such as loading object through a pointer.
Subtle bugs will be introduced into the program.  This can be
addressed by altering the above code thusly:

<pre><code>
template&lt;class Archive&gt;
void load(boost::basic_iarchive  &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        A x;
        ar &gt;&gt; x;
        std::pair&lt;std::set::const_iterator, bool&gt; result;
        result = std::m_set.insert(x);
        ar.reset_object_address(& (*result.first), &x);
    }
}
</code></pre>
This will adjust the tracking information to reflect the final resting place of 
the moved variable and thereby rectify the above problem.
<p>
If it is known a priori that no pointer
values are duplicated, overhead associated with object tracking can
be eliminated by setting the object tracking class serialization trait
appropriately.
<p>
By default, data types designated primitive by 
<a target="detail" href="traits.html#level">Implementation Level</a>
class serialization trait are never tracked. If it is desired to
track a shared primitive object through a pointer (e.g. a
<code style="white-space: normal">long</code> used as a reference count), It should be wrapped
in a class/struct so that it is an identifiable type.
The alternative of changing the implementation level of a <code style="white-space: normal">long</code>
would affect all <code style="white-space: normal">long</code>s serialized in the whole
program - probably not what one would intend.
<p>
It is possible that we may want to track addresses even though
the object is never serialized through a pointer.  For example,
a virtual base class need be saved/loaded only once.  By setting
this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress 
redundant save/load operations.
<pre><code>
BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always)
</code></pre>

<h3><a name="export">Exporting Class Serialization</a></h3>
<a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described 
<code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware
that code should be instantiated for serialization of a given class even though the
class hasn't been otherwise referred to by the program.  This functionality
is necessary to implement serialization of pointers through a virtual base
class pointer.  That is, a polymorphic pointer.
<p>
This macro specifies a "<b>G</b>lobally <b>U</b>nique <b>ID</b>entifier".  
This is an string which identifies the class to be created when data is loaded.
Generally a text representation of the class name is sufficient for this purpose, 
but in certain cases it maybe necessary to specify a different string by using 
<code style="white-space: normal">BOOST_CLASS_EXPORT_GUID</code> 
rather than a simple 
<code style="white-space: normal">BOOST_CLASS_EXPORT</code>.

<p>
<code style="white-space: normal">BOOST_CLASS_EXPORT</code> would usually 
be specified in the same header file as the class declaration to which it 
corresponds.  That is, <code style="white-space: normal">BOOST_CLASS_EXPORT(T)</code>
is a "trait" of the class T. So a program using this class will look 
something like:

<pre><code>
#include &lt;boost/archive/xml_oarchive.hpp&gt;
.... // any other archive classes
#include "my_class.hpp"  // which contains BOOST_CLASS_EXPORT(my_class)
</code></pre>

These headers can be in any order.  (In boost versions 1.34
and earlier, the archive headers had to go before any headers which
contain <code style="white-space: normal">BOOST_CLASS_EXPORT</code>.)
Any code required to serialize types specified
by <code style="white-space: normal">BOOST_CLASS_EXPORT</code> will be
instantiated for each archive whose header is included.  (note that the code 
is instantiated regardless of whether or not it is actually invoked.) 
If no archive headers are included - no code should be instantiated. 
This will permit <code style="white-space: normal">BOOST_CLASS_EXPORT</code> 
to be a permanent part of the <code style="white-space: normal">my_class.hpp</code> .

<p>
Note that the implementation of this functionality depends upon vendor
specific extensions to the C++ language.  So, there is no guarenteed portability
of programs which use this facility.  However, all C++ compilers which
are tested with boost provide the required extensions.  The library
includes the extra declarations required by each of these compilers.
It's reasonable to expect that future C++ compilers will support
these extensions or something equivalent.

<h3><a name="classinfo">Class Information</a></h3>
By default, for each class serialized, class information is written to the archive.
This information includes version number, implementation level and tracking
behavior.  This is necessary so that the archive can be correctly
deserialized even if a subsequent version of the program changes
some of the current trait values for a class.  The space overhead for
this data is minimal.  There is a little bit of runtime overhead
since each class has to be checked to see if it has already had its
class information included in the archive.  In some cases, even this
might be considered too much.  This extra overhead can be eliminated
by setting the 
<a target="detail" href="traits.html#level">implementation level</a>
class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>. 
<p>
<i>Turning off tracking and class information serialization will result
in pure template inline code that in principle could be optimised down
to a simple stream write/read.</i>  Elimination of all serialization overhead
in this manner comes at a cost.  Once archives are released to users, the
class serialization traits cannot be changed without invalidating the old
archives.  Including the class information in the archive assures us
that they will be readable in the future even if the class definition
is revised.  A light weight structure such as display pixel might be
declared in a header like this:

<pre><code>
#include &lt;boost/serialization/serialization.hpp&gt;
#include &lt;boost/serialization/level.hpp&gt;
#include &lt;boost/serialization/tracking.hpp&gt;

// a pixel is a light weight struct which is used in great numbers.
struct pixel
{
    unsigned char red, green, blue;
    template&lt;class Archive&gt;
    void serialize(Archive &amp; ar, const unsigned int /* version */){
        ar &lt;&lt; red &lt;&lt; green &lt;&lt; blue;
    }
};

// elminate serialization overhead at the cost of
// never being able to increase the version.
BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable);

// eliminate object tracking (even if serialized through a pointer)
// at the risk of a programming error creating duplicate objects.
BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never)
</code></pre>

<h3><a name="portability">Archive Portability</a></h3>
Several archive classes create their data in the form of text or portable a binary format.  
It should be possible to save such an of such a class on one platform and load it on another.  
This is subject to a couple of conditions.
<h4><a name="numerics">Numerics</a></h4>
The architecture of the machine reading the archive must be able hold the data
saved.  For example, the gcc compiler reserves 4 bytes to store a variable of type
<code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes.  
So its possible that   a value could be written that couldn't be represented by the loading program.  This is a
fairly obvious situation and easily handled by using the numeric types in
<a target="cstding" href="../../../boost/cstdint.hpp">&lt;boost/cstdint.hpp&gt;</a>
<P>
A special integral type is <code>std::size_t</code> which is a typedef
of an integral types guaranteed to be large enough
to hold the size of any collection, but its actual size can differ depending
on the platform. The 
<a href="wrappers.html#collection_size_type"><code>collection_size_type</code></a>
wrapper exists to enable a portable serialization of collection sizes by an archive.
Recommended choices for a portable serialization of collection sizes are to 
use either 64-bit or variable length integer representation.


<h4><a name="traits">Traits</a></h4>
Another potential problem is illustrated by the following example:
<pre><code>
template&lt;class T&gt;
struct my_wrapper {
    template&lt;class Archive&gt;
    Archive & serialize ...
};

...

class my_class {
    wchar_t a;
    short unsigned b;
    template<&lt;class Archive&gt;
    Archive & serialize(Archive & ar, unsigned int version){
        ar & my_wrapper(a);
        ar & my_wrapper(b);
    }
};
</code></pre>
If <code style="white-space: normal">my_wrapper</code> uses default serialization
traits there could be a problem.  With the default traits, each time a new type is
added to the archive, bookkeeping information is added. So in this example, the
archive would include such bookkeeping information for 
<code style="white-space: normal">my_wrapper&lt;wchar_t&gt;</code> and for
<code style="white-space: normal">my_wrapper&lt;short_unsigned&gt;</code>.
Or would it?  What about compilers that treat 
<code style="white-space: normal">wchar_t</code> as a
synonym for <code style="white-space: normal">unsigned short</code>?
In this case there is only one distinct type - not two.  If archives are passed between
programs with compilers that differ in their treatment 
of <code style="white-space: normal">wchar_t</code> the load operation will fail
in a catastrophic way.
<p>
One remedy for this is to assign serialization traits to the template
<code style="white-space: normal">my_template</code> such that class
information for instantiations of this template is never serialized.  This 
process is described <a target="detail" href="traits.html#templates">above</a> and
has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>.
Wrappers would typically be assigned such traits.
<p>
Another way to avoid this problem is to assign serialization traits
to all specializations of the template <code style="white-space: normal">my_wrapper</code>
for all primitive types so that class information is never saved.  This is what has
been done for our implementation of serializations for STL collections.

<h3><a name="binary_archives">Binary Archives</a></h3>
Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed 
on output. This creates a problem for binary archives.  The easiest way to handle this is to 
open streams for binary archives in "binary mode" by using the flag 
<code style="white-space: normal">ios::binary</code>.  If this is not done, the archive generated
will be unreadable.
<p>
Unfortunately, no way has been found to detect this error before loading the archive.  Debug builds
will assert when this is detected so that may be helpful in catching this error.

<h3><a name="xml_archives">XML Archives</a></h3>
XML archives present a somewhat special case. 
XML format has a nested structure that maps well to the "recursive class member visitor" pattern 
used by the serialization system. However, XML differs from other formats in that it 
requires a name for each data member. Our goal is to add this information to the 
class serialization specification while still permiting the the serialization code to be 
used with any archive. This is achived by requiring that all data serialized to an XML archive
be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>.
The first member is the name to be used as the XML tag for the
data item while the second is a reference to the data item itself. Any attempt to serialize data
not wrapped in a in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will
be trapped at compile time. The system is implemented in such a way that for other archive classes,
just the value portion of the data is serialized. The name portion is discarded during compilation.
So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will
be guarenteed that all data can be serialized to all archive classes with maximum efficiency.

<h3><a name="dlls">DLLS - Serialization and Runtime Linking</a></h3>
Serialization code can be placed in libraries to be linked at runtime.  That is,
code can be placed in DLLS(Windows) or Shared Libraries(*nix).  
Along with the "export" facility, this
permits a program to written without knowledge of the actual types to be serialized.
This package doesn't include an example of this technique - but coding would be
very similar to the example 
<a href = "../example/demo_pimpl.cpp" target="demo_pimpl">
<code style="white-space: normal">demo_pimpl.cpp</code>
</a>,
<a href = "../example/demo_pimpl_A.cpp" target="demo_pimpl">
<code style="white-space: normal">demo_pimpl_A.cpp</code>
</a>
and
<a href = "../example/demo_pimpl_A.hpp" target="demo_pimpl">
<code style="white-space: normal">demo_pimpl_A.hpp</code>
</a>
where implementation of serializaton is completely separate
from the main program.

<h3><a name="multi_threading">Multi-Threading</a></h3>
The nature of serialization would conflict with multiple thread concurrently
writing/reading from/to a single open archive.  Since each archive is
independent from ever other one, there should be no problem
in having multiple open archives from one or more threads.
<p>
Well, not quite.
<p>
There are a couple of global data structures for holding
information of serializable types.  These structures are
used to dispatch to correct code to handle each pair
of serializable types and archive types.  Since this
information is shared among all archives, there is 
potential for problems.  This has been addressed
carefully implementing the library so that these 
structures are all initialized before 
<code style="white-space: normal">main(...)</code>
is called.  From then on they are never altered.  So
there SHOULD be no problem having mulitple archives
open simultaneously - be it from the same or different
threads.
<p>
Well, almost.
<p>
With dynamically loaded code - DLLS or Shared Libraries, 
these global data structures can be altered when a library
is loaded or unloaded.  That is, in this case, these
globa data structures can be altered after 
<code style="white-space: normal">main(...)</code>
is called.  So if a thread is dynamically loading/unloading
modules which contain serialization code while an 
archive is open there could be problems.  Also, if
such loading/unloading is happening concurrently
in different threads, there could also be problems.
<p>
It might not be easy to control this.  Is possible that
some systems may not actually load modules until they
are actually needed.  So even though we think that
there is not dynamic loading/unloading of such code
it could be occurring as "help" to manage resources.
On such systems, access to archive code would have 
to be syncronized with some multi-threading construct
in order to be functional.

<h3><a name="optimizations">Optimizations</a></h3>
In performance critical applications that serialize large sets of contiguous data of homogeneous
types one wants to avoid the overhead of serializing each element individually, which is
the motivation for the <a href="wrappers.html#arrays"><code>array</code></a>
wrapper.

Serialization functions for data types containing contiguous arrays of homogeneous
types, such as for <code>std::vector</code>, <code>std::valarray</code> or  
<code>boost::multiarray</code> should serialize them using an
<a href="wrappers.html#arrays"><code>array</code></a> wrapper to make use of 
these optimizations.

Archive types that can provide optimized serialization for contiguous arrays of 
homogeneous types should implement these by overloading the serialization of
the  <a href="wrappers.html#arrays"><code>array</code></a> wrapper, as is done
for the binary archives.


<h3><a href="exceptions.html">Archive Exceptions</a></h3>
<h3><a href="exception_safety.html">Exception Safety</a></h3>

<hr>
<p><i>&copy; Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004. 
Distributed under the Boost Software License, Version 1.0. (See
accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
</i></p>
</body>
</html>