******
Engine
******

.. _sec:basics_interface_components_engine:

The Engine abstraction component serves as the base interface to the actual IO systems executing the heavy-load tasks performed when producing and consuming data.

Engine functionality works around two concepts:

1. Variables are published (``Put``) and consumed (``Get``) in "steps", in either "File" random-access mode (all steps are available) or "Streaming" mode (steps become available one at a time, as they are produced).
2. Variables are published (``Put``) and consumed (``Get``) using a "sync" or "deferred" (lazy evaluation) policy.

.. caution::

   The ADIOS2 "step" is a logical abstraction that means different things depending on the application context.
   Examples: "time step", "iteration step", "inner loop step", "interpolation step", "variable section", etc.
   It only indicates how the variables were passed into ADIOS2 (e.g. I/O steps) without the user having to index this information on their own.

.. tip::
   
   Publishing and consuming data is a round-trip in ADIOS2.
   ``Put`` and ``Get`` APIs for write/append and read modes aim to be "symmetric", reusing functions, objects, and semantics as much as possible.

The rest of the section explains the important concepts.

BeginStep
---------

   Begins a logical step and returns the status (via an enum) of the stream to be read/written.
   In streaming engines, ``BeginStep`` is where the receiver tries to acquire a new step in the reading process.
   The full signature allows for mode and timeout parameters.
   See :ref:`Supported Engines` for more information on what each engine allows.
   A simplified signature allows each engine to pick reasonable defaults.

.. code-block:: c++

   // Full signature
   StepStatus BeginStep(const StepMode mode,
                        const float timeoutSeconds = -1.f); 

   // Simplified signature
   StepStatus BeginStep();

EndStep
-------
        
   Ends the logical step and flushes to transports, depending on IO parameters and engine default behavior.


.. tip::
   
   To write portable code for a step-by-step access across ADIOS2 engines (file and streaming engines) use ``BeginStep`` and ``EndStep``.

.. danger:: 
   
   Accessing random steps in read mode (e.g. ``Variable<T>::SetStepSelection`` in file engines) will create a conflict with ``BeginStep`` and ``EndStep`` and will throw an exception.
   In file engines, data is either consumed in a random-access or step-by-step mode, but not both.
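
The portable step loop recommended above can be sketched without ADIOS2 at all. The block below is a toy model, not library code: ``ToyStream`` and ``ConsumeAll`` are hypothetical names, and the ``StepStatus`` enum is a local stand-in that only mirrors three values (``OK``, ``NotReady``, ``EndOfStream``) that ``adios2::StepStatus`` is assumed to provide. It only illustrates the control flow a reader might place around ``BeginStep`` and ``EndStep``.

.. code-block:: c++

   #include <cstddef>
   #include <utility>
   #include <vector>

   // Local stand-in; assumed to mirror three adios2::StepStatus values.
   enum class StepStatus { OK, NotReady, EndOfStream };

   // Toy stream that replays a scripted sequence of statuses (hypothetical).
   class ToyStream
   {
   public:
       explicit ToyStream(std::vector<StepStatus> script)
       : m_Script(std::move(script)) {}

       // Returns the next scripted status, then EndOfStream forever.
       StepStatus BeginStep()
       {
           if (m_Next >= m_Script.size())
               return StepStatus::EndOfStream;
           return m_Script[m_Next++];
       }

       void EndStep() { ++m_Completed; }
       std::size_t CompletedSteps() const { return m_Completed; }

   private:
       std::vector<StepStatus> m_Script;
       std::size_t m_Next = 0;
       std::size_t m_Completed = 0;
   };

   // Consumes steps until the stream ends; retries when a step is not ready.
   inline std::size_t ConsumeAll(ToyStream &stream)
   {
       while (true)
       {
           const StepStatus status = stream.BeginStep();
           if (status == StepStatus::EndOfStream)
               break;
           if (status == StepStatus::NotReady)
               continue; // a real application would wait or do other work
           // ... the Get calls for this step would go here ...
           stream.EndStep();
       }
       return stream.CompletedSteps();
   }

With a real engine the same loop shape would wrap the ``Get`` calls of each step. How long ``BeginStep`` waits before reporting that a step is not ready depends on the engine and on the timeout passed in.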


Close
-----

   Closes the current engine and its underlying transports.
   An ``Engine`` object can't be used after this call.


Put: modes and memory contracts
-------------------------------

``Put`` publishes data in ADIOS2.
It is unavailable unless the ``Engine`` is created in ``Write`` or ``Append`` mode.

The most common signature is the one that passes a ``Variable<T>``
object for the metadata, a ``const`` piece of contiguous memory for
the data, and a mode for either ``Deferred`` (data may be collected at
Put() or not until EndStep/PerformPuts/Close) or ``Sync`` (data is reusable immediately).
This is the most common use case in applications.

1. Deferred (default) or Sync mode, data is contiguous memory 

   .. code-block:: c++

      void Put(Variable<T> variable, const T* data, const adios2::Mode = adios2::Mode::Deferred);

ADIOS2 Engines also provide direct access to their buffer memory.
``Variable<T>::Span`` is based on a subset of the `C++20 std::span <https://en.cppreference.com/w/cpp/container/span>`_, which is a non-owning reference to a block of contiguous memory.
Spans act as a 1D container meant to be filled out by the application.
They provide the standard API of an STL container, providing ``begin()`` and ``end()`` iterators, ``operator[]`` and ``at()``, as well as ``data()`` and ``size()``.

``Variable<T>::Span`` is helpful in situations in which temporaries are needed to create contiguous pieces of memory from non-contiguous pieces (*e.g.* tables, arrays without ghost-cells), or just to save memory as the returned ``Variable<T>::Span`` can be used for computation, thus avoiding an extra copy from user memory into the ADIOS2 buffer.
``Variable<T>::Span`` combines a hybrid ``Sync`` and ``Deferred`` mode, in which the initial value and memory allocations are ``Sync``, while data population and metadata collection are done at EndStep/PerformPuts/Close.
Memory contracts are explained later in this chapter followed by examples.

The following ``Variable<T>::Span`` signatures are available:

2. Return a span setting a default ``T()`` value into a default buffer
 
   .. code-block:: c++
   
      Variable<T>::Span Put(Variable<T> variable);
      
3. Return a span setting an initial fill value into a certain buffer.
   If the returned span is not kept, the ``fillValue`` stays fixed for that block.

   .. code-block:: c++

      Variable<T>::Span Put(Variable<T> variable, const size_t bufferID, const T fillValue);


In summary, the following are the current ``Put`` signatures for publishing data in ADIOS2:

1. ``Deferred`` (default) or ``Sync`` mode, data is contiguous memory put in an ADIOS2 buffer.

   .. code-block:: c++

      void Put(Variable<T> variable, const T* data, const adios2::Mode = adios2::Mode::Deferred);
   
2. Return a span setting a default ``T()`` value into a default ADIOS2 buffer.
   If the returned span is not kept, the default ``T()`` stays fixed for that block (e.g. zeros).
 
   .. code-block:: c++
   
      Variable<T>::Span Put(Variable<T> variable);
   
3. Return a span setting an initial fill value into a certain buffer.
   If the returned span is not kept, the ``fillValue`` stays fixed for that block.

   .. code-block:: c++

      Variable<T>::Span Put(Variable<T> variable, const size_t bufferID, const T fillValue);


The following table summarizes the memory contracts required by ADIOS2 engines between ``Put`` signatures and the data memory coming from an application:

+----------+-------------+----------------------------------------------------+
| Put      | Data Memory | Contract                                           |
+----------+-------------+----------------------------------------------------+
|          | Pointer     | do not modify until PerformPuts/EndStep/Close      |
| Deferred |             |                                                    |
|          | Contents    | consumed at Put or PerformPuts/EndStep/Close       |
+----------+-------------+----------------------------------------------------+
|          | Pointer     | modify after Put                                   |
| Sync     |             |                                                    |
|          | Contents    | consumed at Put                                    |
+----------+-------------+----------------------------------------------------+
|          | Pointer     | modified by new Spans, updated span iterators/data |
| Span     |             |                                                    |
|          | Contents    | consumed at PerformPuts/EndStep/Close              |
+----------+-------------+----------------------------------------------------+
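
The ``Deferred`` and ``Sync`` rows of the table can be made concrete with a small stand-alone model. This is a sketch under stated assumptions, not the ADIOS2 implementation: ``ToyEngine``, ``PublishedAfterLateWrite`` and the local ``Mode`` enum are hypothetical, and the toy always reads deferred data as late as possible, whereas a real engine may consume it at ``Put`` or later.

.. code-block:: c++

   #include <cstddef>
   #include <vector>

   enum class Mode { Deferred, Sync }; // local stand-in for adios2::Mode

   // Toy engine for int arrays (hypothetical, not an ADIOS2 class).
   class ToyEngine
   {
   public:
       void Put(const int *data, std::size_t count, Mode mode = Mode::Deferred)
       {
           if (mode == Mode::Sync)
               m_Buffer.insert(m_Buffer.end(), data, data + count); // contents consumed now
           else
               m_Pending.push_back({data, count}); // only the pointer is remembered
       }

       // Consumes every pending deferred Put, reading user memory at this point.
       void PerformPuts()
       {
           for (const Pending &p : m_Pending)
               m_Buffer.insert(m_Buffer.end(), p.data, p.data + p.count);
           m_Pending.clear();
       }

       const std::vector<int> &Buffer() const { return m_Buffer; }

   private:
       struct Pending
       {
           const int *data;
           std::size_t count;
       };
       std::vector<Pending> m_Pending;
       std::vector<int> m_Buffer;
   };

   // Returns what the toy publishes when user data changes right after Put.
   inline int PublishedAfterLateWrite(Mode mode)
   {
       std::vector<int> data = {10};
       ToyEngine engine;
       engine.Put(data.data(), data.size(), mode);
       data[0] = 99; // fine after Sync, a contract violation after Deferred
       engine.PerformPuts();
       return engine.Buffer()[0];
   }

In this model ``PublishedAfterLateWrite(Mode::Sync)`` yields the value present at ``Put``, while the ``Deferred`` call publishes the later value. Because a real engine is free to consume deferred data at either point, the only portable choice is to leave the data untouched until ``PerformPuts``, ``EndStep`` or ``Close``.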


.. note::

   In Fortran (array) and Python (numpy array) avoid operations that modify the internal structure of an array (size) to preserve the address. 
   
   
Each ``Engine`` gives a concrete meaning to each function signature, but all of them must follow the same memory contracts for the "data pointer" (the memory address itself) and the "data contents" (the memory bits, i.e. the values).
   
1. **Put in Deferred or lazy evaluation mode (default)**: this is the preferred mode as it allows ``Put`` calls to be "grouped" before potential data transport at the first encounter of ``PerformPuts``, ``EndStep`` or ``Close``.
   
     .. code-block:: c++
         
         Put(variable, data);
         Put(variable, data, adios2::Mode::Deferred);
         

   Deferred memory contracts:

   - "data pointer": do not modify (e.g. resize) until the first call to ``PerformPuts``, ``EndStep`` or ``Close``.
      
   - "data contents" may be consumed immediately or at first call to
     ``PerformPuts``, ``EndStep`` or ``Close``.  Do not modify data contents after Put.


   Usage:

      .. code-block:: c++
         
         // recommended use: 
         // set "data pointer" and "data contents"
         // before Put
         data[0] = 10;  
         
         // Puts data pointer into adios2 engine
         // associated with current variable metadata
         engine.Put(variable, data);
         
         // Modifying data after Put(Deferred) may result in different
         // results with different engines
         // Any resize of data after Put(Deferred) may result in
         // memory corruption or segmentation faults
         data[1] = 10; 
         
         // "data contents" must not have been changed
         // "data pointer" must be the same as in Put
         engine.EndStep();   
         //engine.PerformPuts();  
         //engine.Close();
         
         // now data pointer can be reused or modified
        
   .. tip::

      It's recommended practice to set all data contents before ``Put`` in deferred mode to minimize the risk of modifying the data pointer (not just the contents) before PerformPuts/EndStep/Close.


2. **Put in Sync mode**: this is the special case; the data pointer becomes reusable right after ``Put``.
   Only use it if absolutely necessary (*e.g.* memory-bound application or out-of-scope data, temporaries).
   
      .. code-block:: c++
         
         Put(variable, data, adios2::Mode::Sync);
         

   Sync memory contracts:
      
   - "data pointer" and "data contents" can be modified after this call.
   
   
   Usage:

      .. code-block:: c++
         
         // set "data pointer" and "data contents"
         // before Put in Sync mode
         data[0] = 10;  
         
         // Puts data pointer into adios2 engine
         // associated with current variable metadata
         engine.Put(variable, data, adios2::Mode::Sync);
         
         // data pointer and contents can be reused
         // in application 
   
   
3. **Put returning a Span**: signature that allows access to adios2 internal buffer. 

   Use cases: 
   
   -  population from non-contiguous memory structures
   -  memory-bound applications 


   Limitations:
   
   -  does not allow operations (compression)
   -  must keep engine and variables within scope of span usage 
     


   Span memory contracts:

   - "data pointer": provided by the engine and returned by ``span.data()``; it might change when a new span is generated. It follows the iterator invalidation rules of ``std::vector``. Use ``span.data()`` or the iterators ``span.begin()`` and ``span.end()`` to keep an updated data pointer.
      
   - span "data contents" are published at the first call to ``PerformPuts``, ``EndStep`` or ``Close``


   Usage:

       .. code-block:: c++
         
         // return a span into a block of memory
         // set memory to default T()
         adios2::Variable<int32_t>::Span span1 = engine.Put(var1);

         // just like with std::vector::data()
         // iterator invalidation rules
         // dataPtr might become invalid
         // always use span1.data() directly
         int32_t* dataPtr = span1.data();

         // set memory value to -1 in buffer 0
         adios2::Variable<float>::Span span2 = engine.Put(var2, 0, -1.f);

         // not returning a span just sets a constant value
         engine.Put(var3);
         engine.Put(var4, 0, 2);
         
         // fill span1
         span1[0] = 0;
         span1[1] = 1;
         span1[2] = 2;
         
         // fill span2
         span2[1] = 1;
         span2[2] = 2;
         
         // here collect all spans
         // they become invalid
         engine.EndStep();
         //engine.PerformPuts();  
         //engine.Close();
         
         // var1 = { 0, 1, 2 };
         // var2 = { -1., 1., 2.};
         // var3 = { 0, 0, 0};
         // var4 = { 2, 2, 2};
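
The advice to call ``span.data()`` again instead of caching the pointer follows from how a growing buffer behaves. The sketch below is a stand-alone model built on ``std::vector``; ``ToyBuffer`` and ``WriteAfterSecondAllocation`` are hypothetical names, and storing an offset per block is an assumption made for illustration, not a statement about how ADIOS2 implements spans.

.. code-block:: c++

   #include <cstddef>
   #include <vector>

   // Toy buffer that hands out offsets into one growing std::vector.
   class ToyBuffer
   {
   public:
       // Appends `count` elements set to `fill`; returns the block's offset.
       std::size_t Allocate(std::size_t count, double fill = 0.0)
       {
           const std::size_t offset = m_Data.size();
           m_Data.resize(offset + count, fill); // may reallocate and move earlier blocks
           return offset;
       }

       // Always-current pointer to a block, analogous to calling span.data() again.
       double *Data(std::size_t offset) { return m_Data.data() + offset; }

   private:
       std::vector<double> m_Data;
   };

   // Writes into the first block only after a second block has been created.
   inline double WriteAfterSecondAllocation()
   {
       ToyBuffer buffer;
       const std::size_t first = buffer.Allocate(3); // zero-filled block
       buffer.Allocate(1000, 1e-6);                  // growth may move the first block
       buffer.Data(first)[2] = 2.0;                  // pointer re-fetched, so the write is safe
       return buffer.Data(first)[2];
   }

A raw pointer taken from ``Data(first)`` before the second ``Allocate`` could dangle after the vector reallocates, which is the same hazard the span contract describes for pointers cached before a new span is created.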


The ``data`` fed to the ``Put`` function is assumed to be allocated on the host (default mode). To use data allocated on the device, the memory space of the variable needs to be set to CUDA.

     .. code-block:: c++

         variable.SetMemorySpace(adios2::MemorySpace::CUDA);
         engine.Put(variable, gpuData, mode);

.. note::

   Only CUDA allocated buffers are supported for device data.
   Only the BP4 and BP5 engines are capable of receiving device allocated buffers.


PerformPuts
-----------

   Executes all pending ``Put`` calls in deferred mode and collects
   span data.  Specifically this call copies Put(Deferred) data into
   internal ADIOS buffers, as if Put(Sync) had been used instead.

.. note::

   This call allows the reuse of user buffers, but may negatively
   impact performance on some engines.


PerformDataWrite
----------------

   If supported by the engine, moves data from prior ``Put`` calls to disk.

.. note::

   Currently only supported by the BP5 file engine.



Get: modes and memory contracts
-------------------------------

``Get`` is the function for consuming data in ADIOS2.
It is available when an Engine is created using ``Read`` mode at ``IO::Open``.
ADIOS2 ``Put`` and ``Get`` semantics are as symmetric as possible considering that they are opposite operations (*e.g.* ``Put`` passes ``const T*``, while ``Get`` populates a non-const ``T*``).

The ``Get`` signatures are described below.

1. ``Deferred`` (default) or ``Sync`` mode, data is contiguous pre-allocated memory:

   .. code-block:: c++

      void Get(Variable<T> variable, T* data, const adios2::Mode = adios2::Mode::Deferred);


2. In this signature, ``dataV`` is automatically resized by ADIOS2 based on the ``Variable`` selection:

   .. code-block:: c++

      void Get(Variable<T> variable, std::vector<T>& dataV, const adios2::Mode = adios2::Mode::Deferred);


The following table summarizes the memory contracts required by ADIOS2 engines between ``Get`` signatures and the pre-allocated (except when using C++11 ``std::vector``) data memory coming from an application:

+----------+-------------+-----------------------------------------------+
| Get      | Data Memory | Contract                                      |
+----------+-------------+-----------------------------------------------+
|          | Pointer     | do not modify until PerformGets/EndStep/Close |
| Deferred |             |                                               |
|          | Contents    | populated at Get or PerformGets/EndStep/Close |
+----------+-------------+-----------------------------------------------+
|          | Pointer     | modify after Get                              |
| Sync     |             |                                               |
|          | Contents    | populated at Get                              |
+----------+-------------+-----------------------------------------------+
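
The difference between the pointer and ``std::vector`` signatures, and the point at which deferred contents become valid, can be sketched with a stand-alone model. ``ToyReader`` is a hypothetical class, not an ADIOS2 type, and it assumes the latest possible population point; a real engine may populate the contents at ``Get`` already.

.. code-block:: c++

   #include <algorithm>
   #include <cstddef>
   #include <utility>
   #include <vector>

   // Toy reader serving one stored array (hypothetical).
   class ToyReader
   {
   public:
       explicit ToyReader(std::vector<double> stored)
       : m_Stored(std::move(stored)) {}

       std::size_t Count() const { return m_Stored.size(); }

       // Pointer overload: the caller must already own Count() elements.
       void Get(double *destination) { m_Pending.push_back(destination); }

       // Vector overload: resized here, so the caller need not know the size.
       void Get(std::vector<double> &destination)
       {
           destination.resize(m_Stored.size());
           m_Pending.push_back(destination.data());
       }

       // Populates every destination registered by a deferred Get.
       void PerformGets()
       {
           for (double *destination : m_Pending)
               std::copy(m_Stored.begin(), m_Stored.end(), destination);
           m_Pending.clear();
       }

   private:
       std::vector<double> m_Stored;
       std::vector<double *> m_Pending;
   };

In this model a destination still holds its old values between ``Get`` and ``PerformGets``, and resizing a destination vector in that window would leave the reader holding a stale pointer. That is why the deferred contract forbids touching the data pointer until ``PerformGets``, ``EndStep`` or ``Close``.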


1. **Get in Deferred or lazy evaluation mode (default)**: this is the preferred mode as it allows ``Get`` calls to be "grouped" before potential data transport at the first encounter of ``PerformGets``, ``EndStep`` or ``Close``.
   
     .. code-block:: c++
         
         Get(variable, data);
         Get(variable, data, adios2::Mode::Deferred);
         

   Deferred memory contracts:

   - "data pointer": do not modify (e.g. resize) until the first call to ``PerformGets``, ``EndStep`` or ``Close``.

   - "data contents": populated at ``Get``, or at the first call to ``PerformGets``, ``EndStep`` or ``Close``.

   Usage:

      .. code-block:: c++

         std::vector<double> data;

         // resize memory to expected size 
         data.resize(varBlockSize);
         // valid if all memory is populated 
         // data.reserve(varBlockSize);

         // Gets data pointer to adios2 engine
         // associated with current variable metadata
         engine.Get(variable, data.data() );

         // optionally pass data std::vector 
         // leave resize to adios2
         //engine.Get(variable, data);

         // "data pointer" must be the same as in Get
         engine.EndStep();   
         // "data contents" are now ready
         //engine.PerformGets();
         //engine.Close();

         // now data pointer can be reused or modified



2. **Get in Sync mode**: this is the special case; the data pointer becomes reusable right after ``Get``.
   Only use it if absolutely necessary (*e.g.* memory-bound application or out-of-scope data, temporaries).
   
      .. code-block:: c++
         
         Get(variable, data, adios2::Mode::Sync);
         

   Sync memory contracts:
      
   - "data pointer" and "data contents" can be modified after this call.
   
   
   Usage:

      .. code-block:: c++
         
         std::vector<double> data;
         
         // resize memory to expected size 
         data.resize(varBlockSize);
         // valid if all memory is populated 
         // data.reserve(varBlockSize);
         
         // Gets data pointer to adios2 engine
         // associated with current variable metadata
         engine.Get(variable, data.data(), adios2::Mode::Sync);
         
         // "data contents" are ready
         // "data pointer" can be reused by the application

.. note::

   ``Get`` doesn't support returning spans.


PerformGets
-----------

   Executes all pending ``Get`` calls in deferred mode.


Engine usage example
--------------------

The following example illustrates the basic API usage in write mode for data generated at each application step:

.. code-block:: c++

   adios2::Engine engine = io.Open("file.bp", adios2::Mode::Write);

   for( size_t i = 0; i < steps; ++i )
   {
      // ... Application *data generation

      engine.BeginStep(); //next "logical" step for this application

      engine.Put(varT, dataT, adios2::Mode::Sync);
      // dataT memory already consumed by engine
      // Application can modify dataT address and contents
      
      // deferred functions return immediately (lazy evaluation),
      // dataU, dataV and dataW pointers and contents must not be modified
      // until PerformPuts, EndStep or Close.
      // 1st batch
      engine.Put(varU, dataU);
      engine.Put(varV, dataV);
      
      // in this case adios2::Mode::Deferred is redundant,
      // as this is the default option
      engine.Put(varW, dataW, adios2::Mode::Deferred);

      // effectively dataU, dataV, dataW are "deferred"
      // possibly until the first call to PerformPuts, EndStep or Close.
      // Application MUST NOT modify the data pointer (e.g. resize
      // memory) or change data contents.
      engine.PerformPuts();

      // dataU, dataV, dataW pointers/values can now be reused
      
      // ... Application modifies dataU, dataV, dataW 

      //2nd batch
      dataU[0] = 10;
      dataV[0] = 10;
      dataW[0] = 10;
      engine.Put(varU, dataU);
      engine.Put(varV, dataV);
      engine.Put(varW, dataW);
      // Application MUST NOT modify dataU, dataV and dataW pointers (e.g. resize),
      // Contents should also not be modified after Put() and before
      // PerformPuts() because ADIOS may access the data immediately
      // or not until PerformPuts(), depending upon the engine
      engine.PerformPuts();
      
      // dataU, dataV, dataW pointers/values can now be reused
      
      // Puts a varP block of zeros
      adios2::Variable<double>::Span spanP = engine.Put<double>(varP);

      // Mixing in raw pointers is not recommended:
      // a span follows the same pointer/iterator
      // invalidation rules as std::vector
      double* p = spanP.data();

      // Puts a varMu block of 1e-6
      adios2::Variable<double>::Span spanMu = engine.Put<double>(varMu, 0, 1e-6);

      // p might be invalidated
      // by a new span, use spanP.data() again
      foo(spanP.data());

      // Puts a varRho block with a constant value of 1.225
      engine.Put<double>(varRho, 0, 1.225);
      
      // it's preferable to start modifying spans 
      // after all of them are created
      foo(spanP.data());
      bar(spanMu.begin(), spanMu.end()); 
      
      
      engine.EndStep();
      // spanP, spanMu are consumed by the library
      // end of current logical step,
      // default behavior: transport data
   }

   engine.Close();
   // engine is unreachable and all data should be transported
   ...

.. tip::

   Prefer the default ``Deferred`` (lazy evaluation) functions, as they have the potential to group several variables, with the trade-off of not being able to reuse the pointer's memory space until ``EndStep``, ``PerformPuts``, ``PerformGets``, or ``Close``.
   Only use ``Sync`` if you really have to (*e.g.* reuse memory space from pointer).
   ADIOS2 prefers a step-based IO in which everything is known ahead of time when writing an entire step.


.. danger::
   The default behavior of ADIOS2 ``Put`` and ``Get`` calls IS NOT synchronized, but rather deferred.
   It's actually the opposite of ``MPI_Put`` and more like ``MPI_Rput``.
   Do not assume the data pointer is usable after a ``Put`` or ``Get`` before ``EndStep``, ``Close`` or the corresponding ``PerformPuts``/``PerformGets``.
   Avoid using temporaries, r-values, and out-of-scope variables in ``Deferred`` mode.
   Use ``adios2::Mode::Sync`` in these cases.


Available Engines
-----------------

A particular engine is set within the ``IO`` object that creates it with the ``IO::SetEngine`` function in a case insensitive manner.
If the ``SetEngine`` function is not invoked, the default engine is ``BPFile``.

+-------------------------+---------+---------------------------------------------+
| Application             | Engine  | Description                                 |
+-------------------------+---------+---------------------------------------------+
| File                    | BP5     | DEFAULT write/read ADIOS2 native bp files   |
|                         |         |                                             |
|                         | HDF5    | write/read interoperability with HDF5 files |
+-------------------------+---------+---------------------------------------------+
| Wide-Area-Network (WAN) | DataMan | write/read TCP/IP streams                   |
+-------------------------+---------+---------------------------------------------+
| Staging                 | SST     | write/read to a "staging" area: *e.g.* RDMA |
+-------------------------+---------+---------------------------------------------+


``Engine`` polymorphism has two goals:

1. Each ``Engine`` implements an orthogonal IO scenario targeting a use case (e.g. Files, WAN, InSitu MPI, etc) using a simple, unified API.

2. Allow developers to build their own custom system solution based on their particular requirements in their own playground space.
   Reusable toolkit objects are available inside ADIOS2 for common tasks: bp buffering, transport management, transports, etc.

A class that extends ``Engine`` must be thought of as a solution to a range of IO applications.
Each engine must provide a list of supported parameters, set in the IO object creating this engine using ``IO::SetParameters``, and supported transports (and their parameters) in ``IO::AddTransport``.
Each Engine's particular options are documented in :ref:`Supported Engines`.