File: ChangeLog

------------------ Released version 3.1.1 ------------------------------

- Fix build error in `otf2-print`.

------------------- Released version 3.1 -----------------------------

- Remove support for Python 2.x for binding and generator.

- Add support for Python packaging.  The package is available via PyPI
  and  can be  installed via  pip or  any  compatible packaging  tool.
  Building a package from source is also supported.

- Add paradigm for OpenMP target regions.

- Add region roles for OpenMP cancel directive and accelerator kernels.

- Add thumbnail types for communicators, RMA windows, and I/O handles.

- A project file for CMake was contributed.  See `contrib/Readme.txt`.

- Fix SIONlib support with Intel oneAPI compiler suite.

- Fix reader example in Python documentation.

------------------ Released version 3.0.3 ----------------------------

- Fix reading thumbnails.

- Restore ability to build on non-64-bit platforms.

- Allow building the locking headers with newer compilers.

- Python 2.x  is now  deprecated for  the bindings  and generator,  as
  Python 2 reached its end-of-life as of 1 Jan 2020.  Python 2 support
  is no longer tested and will be removed no later than OTF2 4.0.
  Please  migrate  to Python  3.x if  you are  using either  of  these
  features.

------------------ Released version 3.0.2 ----------------------------

- Add proper support for Intel oneAPI compilers to build system
  via `--with-nocross-compiler-suite=oneapi`.
- Add proper support for AMD ROCm compilers to build system via
  `--with-nocross-compiler-suite=amdclang`.

------------------ Released version 3.0.1 ----------------------------

- Add proper support for NVIDIA HPC SDK compilers to build system
  via `--with-nocross-compiler-suite=nvhpc`.

------------------- Released version 3.0 -----------------------------

- Add support for accelerator and network devices  in the system tree.

- Rename the location type for GPUs to accelerator streams.

- Add location group type for accelerator contexts.  Such a group must
  contain  at least  one accelerator stream  location,  any number  of
  metric locations,  and no CPU thread locations.  Similarly,  process
  location groups must  now contain at least  one CPU thread location,
  may  contain  any  number of  metric  locations, and  may no  longer
  contain accelerator stream locations.
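
  The membership rules above can be sketched as a small validity check.
  This is only an illustration: the string labels and the function name
  are invented for this sketch and are not OTF2 identifiers.

```python
from collections import Counter

# Labels invented for this sketch; they are not OTF2 enum names.
CPU_THREAD = "cpu_thread"
ACCELERATOR_STREAM = "accelerator_stream"
METRIC = "metric"

PROCESS = "process"
ACCELERATOR_CONTEXT = "accelerator_context"


def check_location_group(group_type, location_types):
    """Return the list of rule violations for one location group."""
    counts = Counter(location_types)
    problems = []
    if group_type == PROCESS:
        if counts[CPU_THREAD] < 1:
            problems.append("needs at least one CPU thread location")
        if counts[ACCELERATOR_STREAM] > 0:
            problems.append("must not contain accelerator stream locations")
    elif group_type == ACCELERATOR_CONTEXT:
        if counts[ACCELERATOR_STREAM] < 1:
            problems.append("needs at least one accelerator stream location")
        if counts[CPU_THREAD] > 0:
            problems.append("must not contain CPU thread locations")
    else:
        raise ValueError(f"unknown location group type: {group_type!r}")
    # Metric locations are allowed in any number, so they are never checked.
    return problems
```

  An empty result means the group satisfies the rules stated above.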

- Add creating location  group to the location  group definition.  For
  process location  groups,  this may be undefined or another  process
  location  group.  For  accelerator  location  groups, this must be a
  process location group.

- Add support for mapping of location groups.

- Add  event   records  for  non-blocking  collective   operations  on
  communicators.
  - NonBlockingCollectiveRequest
  - NonBlockingCollectiveComplete

- Add event records for communicator creation and destruction.
  - CommCreate
  - CommDestroy
  To indicate  whether these  are present in the  event stream,  a new
  flags attribute  `OTF2_CommFlag` was added to  the Comm definitions.
  The corresponding flag is not set when a pre-3.0 trace is read.  For
  symmetry reasons,  a similar flag  (`OTF2_RmaWinFlag`) was  added to
  the RmaWin  definitions, which is  automatically set  when a pre-3.0
  trace is read.

- Add support for inter-communicators.
  - The Comm definition is now a polymorphic definition.
  - Add new InterComm definition which is in the same namespace as the
    Comm definition.
  - Add enum  `OTF2_CollectiveRoot` for collective  root constants, to
    denote special values in collective operation events.

- Add a date/time  attribute to the  ClockProperties record,  denoting
  when the trace was recorded.

- The `otf2-config` tool can  now show  the  configuration summary via
  the `--config-summary` parameter.

- Using deprecated API functions will now issue warnings, if supported
  by the compiler used.  Disable by adding
      `-DOTF2_IGNORE_ATTRIBUTE_DEPRECATED`
  to your compiler flags.

- Remove zlib  code and deprecate  OTF2_COMPRESSION_ZLIB.  The API  is
  retained though, but will be removed in the next major release.

- Use more inclusive language.  Some functions  are available under
  a new name and the old ones are marked as deprecated.

- The archive  names used by the  auxiliary 'trace_gen' tools  dropped
  one `trace`, from `otf2_trace_gen_trace_` to `otf2_trace_gen_`,
  aligning these with the tool name.

------------------- Released version 2.3 -----------------------------

- Update Jinja template engine to 2.11.2 and  support Python 3 for the
  generator. Minimum version is now 2.7.

- Add paradigms for HIP accelerators and Kokkos.

------------------- Released version 2.2 -----------------------------

- Added a definition record to attach a parameter to a callpath.

- Removed the restriction,  that AttributeLists can  only have at most
  1024 entries.  If the list  does not fit  into the chunk,  the write
  routines return OTF2_ERROR_INVALID_SIZE_GIVEN.

- Reduce  loss of precision  when interpolating timestamps.  Thanks to
  Alexander Grund for the suggestion.

- Native build support for LLVM/Clang compiler.

------------------ Released version 2.1.1 ----------------------------

- Writing a SION-substrate trace  with the high level Python API works
  now.

- Fix  possible  deadlocks, when  closing  a global event  or snapshot
  reader,  before reading  all records,  or  in  case of  errors  when
  creating these global readers.

- Fix reading  SION-substrate traces,  if locations have  no events or
  snapshots.

- Improve handling of mappings to `UNDEFINED` ids.

------------------- Released version 2.1 -----------------------------

- A new set  of definition and event  records were added  to model and
  record I/O activities of applications.

- Add OTF2 python bindings.
  - Documentation is included in doc/python.tar.gz as html.
  - Two modules are provided: A low-level one similar to C: '_otf2',
    and a high level more pythonic API: 'otf2'.
  - Both python2 and python3 are fully supported.

- A  new set  of  event  records  were  added  to  denote the  program
  executed, the passed arguments, and the exit status.

- Added enum value OTF2_RMA_ATOMIC_TYPE_FETCH_AND_ACCUMULATE to model
  atomic operations that retrieve the initial value and perform a
  system/user-specified operation on the remote value.

- Fix scalability bottleneck in reading OTF2 traces stored in  SIONlib
  containers.

- A project file for Microsoft Visual Studio 2014 was contributed.
  See contrib-build-vs/Readme.txt.

- OTF2 now builds  position-independent object code  by default, also
  for static libraries.  Pass `--without-pic` to  configure to get the
  previous mode.

------------------- Released version 2.0 -----------------------------

- The  experimental  CallingContextSample event and  the  accompanying
  definition records have been redesigned and are now declared stable.
  However,  no conversion is  done for traces  with these records that
  were written with OTF2 1.5.
  The changes include:
  - The CallingContext definition  uses a SourceCodeLocation attribute
    now.  The previous line number,  stored as  an offset to the
    beginning of the referenced region, was very fragile.
  - The  IP address  attribute  was  removed  from the  CallingContext
    definition.    Though   with    the   new   CallingContextProperty
    definition,  the writer  is able to  pass arbitrary  attributes to
    each node.
  - The  InterruptGenerator  definition lost  its 'unit'  attribute in
    favor of  a mode/base/exponent tuple, similar  to the MetricMember
    definition.   The  mode  is  expressed  with  the  new  enum  type
    OTF2_InterruptGeneratorMode, which includes a time based interrupt
    generator  (OTF2_INTERRUPT_GENERATOR_MODE_TIME) and a  count based
    one (OTF2_INTERRUPT_GENERATOR_MODE_COUNT).  The unit is then given
    implicitly by the mode.
  - The  addition of  the new  CallingContextEnter/CallingContextLeave
    records.   These  complete  the  CallingContextSample  event  when
    instrumentation  and  sampling  is   used  in  conjunction.   OTF2
    includes  a fallback  conversion  for  old  readers which  do  not
    register for the  new events.  They are than  converted to the old
    Enter/Leave  events.  The  old  event pair  and  the  new  calling
    context based events must be used mutual exclusive in one trace.
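
  The fallback conversion for old readers can be pictured roughly as
  below.  This is a simplified sketch, not the library's code: the
  tuple layout, the `dispatch` function, and the assumption that a
  calling context resolves to exactly one region are all made up for
  illustration.

```python
# Hypothetical in-memory stand-ins; real OTF2 records carry more fields.
FALLBACK = {
    "CallingContextEnter": "Enter",
    "CallingContextLeave": "Leave",
}


def dispatch(event, callbacks, calling_contexts):
    """Deliver one event, falling back to Enter/Leave for old readers.

    event:            (kind, timestamp, ref) tuple
    callbacks:        dict mapping an event kind to callable(timestamp, ref)
    calling_contexts: dict mapping a calling-context id to a region id
    Returns the kind that was actually delivered, or None if dropped.
    """
    kind, timestamp, ref = event
    if kind in callbacks:
        # The reader registered for this event kind: deliver it unchanged.
        callbacks[kind](timestamp, ref)
        return kind
    old_kind = FALLBACK.get(kind)
    if old_kind in callbacks:
        # Old reader: convert to the old event and pass the region instead.
        callbacks[old_kind](timestamp, calling_contexts[ref])
        return old_kind
    return None
```

  A reader that registers for the new events never sees the converted
  ones, which matches the rule that both kinds are not mixed.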

- Specifying the  chunk size for the  definitions can now be postponed
  until just before any definition or marker writers are opened.
  See OTF2_Archive_SetDefChunkSize for more details.

- The estimator API and tool learned to estimate the chunk size needed
  for the  definitions.  The  result can  then  be used in  a call  to
  OTF2_Archive_SetDefChunkSize.

- Three new region roles for  functions which allocate, reallocate, or
  deallocate memory were added.

- A new OTF2_Paradigm entry was added which can be used to denote that
  the definition entity does not belong to any specific paradigm.

- The move to version 2.0 was used to clean up API inconsistencies and
  remove deprecated API functions.  Namely:
  - OTF2_MetricBase was  renamed to OTF2_Base.  The enum  entries were
    missing the 'METRIC' in their name anyway.
  - The following functions were removed (deprecated since 1.1):
    - OTF2_AttributeList_AddString
    - OTF2_AttributeList_GetString
  Additionally, the  following property definitions  were changed from
  a (String, String) tuple to  a (String, Type, AttributeValue) tuple.
  Conversion from the old record format is provided.
  - SystemTreeNodeProperty
  - LocationGroupProperty
  - LocationProperty

- The Callsite definition was marked as deprecated.

------------------ Released version 1.5.1 ----------------------------

- Fix build errors on AIX.

------------------- Released version 1.5 -----------------------------

- A new  set of  callbacks can  now be registered  to OTF2  to make it
  thread safe.  These  callbacks are  optional.  Predefined  callbacks
  are provided  for OpenMP and Pthread.  And  new usage  examples were
  added too.

- The new hint API  for OTF2 will be used to optimize  the writing and
  reading process.  The first hint,  OTF2_HINT_GLOBAL_READER,  should be
  set  by readers  which  intend  to use  only the  global event  and
  snapshot readers.  In this  case the SION  substrate won't  allocate
  additional file descriptors for each location to read.  On the other
  hand,  the local event and snapshot  readers are independent and can
  be used  concurrently.  Using  the  readers  concurrently  with  the
  OTF2_HINT_GLOBAL_READER hint  set then  requires  proper  locking
  callbacks.

- Added a new region role  OTF2_REGION_ROLE_TASK_UNTIED to distinguish
  tied from untied tasks.

- OTF2 learned to identify new paradigms.
  The list includes:
  - Windows threads
  - Qt threads
  - ACE threads
  - TBB threads
  - OpenACC directives
  - OpenCL API functions and kernels
  - Multicore Task API functions
  - Functions recorded by sampling

- The new Paradigm definition was introduced to attest that a certain
  parallel paradigm was available at  the time the trace was recorded,
  and  vice versa.  Additionally the  new ParadigmProperty  definition
  can  be used  to  further define  a specific  paradigm.  The overall
  intention is to help trace readers  to handle future  paradigms, not
  yet added to  the known  list of paradigms  in OTF2.  In conjunction
  with  these   new  definition   records,  the   new   ParadigmClass,
  ParadigmProperty, and Boolean enums were also introduced.

- The new SourceCodeLocation  definition can be  used to attach source
  code annotations to all events. To avoid additional record attributes
  for  the events,  the AttributeList is the  preferred way to use the
  new definition.  The used Attribute definition  should have the name
  "SOURCE_CODE_LOCATION" though.

- The build  process now ensures that only a  Python 2 version will be
  used for OTF2.

- OTF2 now needs at least version 1.5.3 of SIONlib, as it uses the new
  key-value  API to support  writing an arbitrary number  of locations
  per process.

- An  experimental  set  of  new  records  to  be  used  for  sampling
  measurements were added.  No stability guarantee is given.

- Added support for Intel Xeon Phi

------------------- Released version 1.4 -----------------------------

- Read-only buffer arguments  in the collective  callbacks got 'const'
  annotations.

- The Attribute definition was extended with a description key.

- Definition  records  are  using  the  available  buffer  space  more
  efficiently,  in trade-off  of a  small  performance  penalty.  This
  particularly results  in an increase of the  maximum size of records
  with array members.  Though no trace format change was done.

- OTF2  learned to identify new  paradigms.  The list  includes GASPI,
  Unified Parallel C, and SHMEM and its derivatives.

- OTF2  learned  new atomic  operations to  be used  in the  RmaAtomic
  record:
  - OTF2_RMA_ATOMIC_TYPE_SWAP
  - OTF2_RMA_ATOMIC_TYPE_FETCH_AND_ADD
  - OTF2_RMA_ATOMIC_TYPE_FETCH_AND_INCREMENT

- OTF2 now also provides an OTF2_Archive_CloseGlobalDefWriter API  for
  completeness.  The call itself is optional.

- The 'gethostid' function  is used  as an  additional  entropy source
  when generating the trace ID on systems which provide this function.

- The 'otf2-print' tool shows the  metric member name in Metric events
  in addition to the type and value.  Additionally when ranks are used
  in  conjunction   with  communicators,  RMA  windows,  or  cartesian
  topologies, they are resolved to the location.

- The reading and writing usage examples from the documentation are
  now provided as working C code and Makefile under:

    <prefix>/share/doc/otf2/examples

- OTF2 now provides example  collective callbacks to be used with MPI.
  To prevent compiling  and installing  these callbacks for  different
  MPI  implementation and  to keep  the build  system of  OTF2 simple,
  these callbacks are provided as a header.  They are usable for C and
  C++ and detailed usage examples for reading and writing are provided
  in the aforementioned installation location.

- The estimator API learned to estimate the size of an  AttributeList.

- Added  the  'otf2-estimator' tool  which  provides  a  command  line
  interface to the estimator API, introduced in 1.3.

- The  '--cuda'  option  from  the  'otf2-config'  tool  is  marked as
  deprecated and will issue a warning when specified.

------------------ Released version 1.3.1 ----------------------------

- The  'Future-proof reading and  writing of  not-yet-known attribute
  types'  changes done  in  1.1.1  now also  apply  when  reading and
  writing snapshot records,  which were introduced in 1.2.

- OTF2 now returns an error if the user does not specify a collective
  context.  This particularly helps when converting from the 1.2 API.

- OTF2 fixes several issues, when dealing with absent local definition
  files and the SIONlib substrate.  In particular  the open-file calls
  now explicitly  return an  error code indicating  that files of this
  type are missing.  As the local  definition files are optional,  the
  OTF2 user can catch this error and handle it gracefully.

- OTF2 was a little sloppy when operating in a collective context  and
  only one rank encountered an error, but the other ranks were waiting
  in a  collective operation.  Errors of this kind  are now broadcast
  to all ranks and all can then notify the caller about this error.
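
  The idea of sharing an error among all ranks can be simulated
  without MPI as follows.  This is a toy model under stated
  assumptions: `SUCCESS`, the integer error codes, and the choice to
  report the first failing status are illustrative, not OTF2 behavior.

```python
SUCCESS = 0  # stand-in for a success status code


def agree_on_status(local_statuses):
    """Simulate sharing the outcome of one collective step.

    local_statuses holds one status per rank.  The result holds the
    status each rank reports to its caller after the exchange.
    """
    # Comparable to an all-reduce that selects the first failure, if any.
    agreed = next((s for s in local_statuses if s != SUCCESS), SUCCESS)
    # Every rank receives the same answer, so no rank is left waiting
    # in a collective operation that a failed rank never enters.
    return [agreed] * len(local_statuses)
```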

------------------- Released version 1.3 -----------------------------

- OTF2  now integrates  SIONlib via  its new  generic interface.  This
  enables paradigm independent reading and writing of OTF2 traces with
  the  SION  substrate.  The SIONlib  configure  option  changed  from
  --with-sionconfig  to --with-sionlib.  SIONlib  is auto-detected  if
  'sionconfig' is in $PATH.

- OTF2  learned to  identify new  paradigms.  The list  includes POSIX
  threads, HMPP, OmpSs, and generic hardware.

- The OTF2 tools 'otf2-marker' and  'otf2-snapshots' were broken  when
  compiled with a PGI compiler.

- The two new definitions  LocationGroupProperty and  LocationProperty
  complete the arbitrary property annotation of the system tree.

- New events for create/wait based threading paradigms were added.  In
  conjunction with this,  two new region roles  were added to indicate
  functions which create and wait for a thread.

- A new API was added to estimate the  resulting size of a trace  file
  based on the number  of expected events,  also accounting for  the
  number of definitions, to accurately predict the online compression.

- New definition records to  specify cartesian topologies,  dimensions
  and coordinates were added.

- Native build support for Mac OS X and MinGW platforms.

------------------ Released version 1.2.1 ----------------------------

Maintenance release. Low upgrade urgency.

- Fix  build  when  the  user  has set  the  GREP_OPTIONS  environment
  variable.

- Fix output of the 'otf2-marker' tool.

------------------- Released version 1.2 -----------------------------

- This version  introduces a new set of event records  for generic RMA
  operations. It is described in the following paper:

  A. Knüpfer,  R. Dietrich,  J. Doleschal,  M. Geimer, M.-A. Hermanns,
  C. Rössel, R. Tschüter, B. Wesarg & F. Wolf:
  "Generic Support for Remote  Memory Access Operations in Score-P and
  OTF2", Parallel Tools Workshop 2012

  Which  also  serves  as  a whitepaper on  the  usage of  these event
  records.

- In conjunction with the new RMA event record set, there were changes
  to existing definitions and types. Namely:
  - The Group  definition was extended  to indicate in  which paradigm
    a group,  and therefore also the referencing communicators and RMA
    windows, operate;  the  corresponding OTF2_GroupType  entries were
    also renamed accordingly.
  - The OTF2_MpiCollectiveType and the corresponding enum entries were
    renamed to OTF2_CollectiveOp and OTF2_COLLECTIVE_OP_ respectively.
  - The MpiComm definition was  renamed to just Comm, to indicate that
    this definition is not restricted to MPI anymore.

- OTF2_Paradigm   learned  the  new   OTF2_PARADIGM_MEASUREMENT_SYSTEM
  paradigm  which is  intended  to be used by  the measurement  system
  which writes a trace.  Besides this the OTF2_RegionRole learned the
  new  OTF2_REGION_ROLE_ARTIFICIAL  role  which  can  be used  by  the
  measurement system too.

- The MetricClass  definition was  extended with the  information what
  kind of  location this  MetricClass  was  recorded by.  See the  new
  OTF2_RecorderKind  type.  This is  also  used  to  specify that  the
  MetricClass  will   only  be   recorded  via  MetricInstance's,  and
  MetricInstance's should not only  reference MetricClass's which have
  a recorder  kind  of  OTF2_RECORDER_KIND_ABSTRACT.  Additionally the
  new MetricClassRecorder definition was introduced  which narrows the
  set of recorders of a specific MetricClass further.

- There are  two new definitions to  more accurately define the system
  tree of the machine the trace was run on:
  - SystemTreeNodeProperty: Attach  tree-form  properties to one node.
  - SystemTreeNodeDomain: Attach  defined semantics to one  node.  See
    the new OTF2_SystemTreeDomain type.

- A new  set of  generic threading event  records for  fork-join based
  threading models is introduced. Because of technical constraints and
  the  enhanced  level of  detail of  the  new events,  they  were not
  implemented by extending the previous OpenMP specific event records,
  but  deprecate  them.  Nevertheless, for  some of  the  new  records
  backward compatibility is prepared.

- Snapshots are a new feature to support partial  loading of the trace
  data.  For that a snapshot  holds  all  information  describing  the
  current  state  of a location. Reading this snapshot and  afterwards
  continuing  reading  events  results in the same state,  as  reading
  from the beginning. A trace can contain many snapshots in increasing
  timestamp  order,  so  that  it  is possible to start  reading  from
  these  points  on.  The 'otf2-snapshots'  tool  is  provided  to add
  snapshots to an existing trace.
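
  The snapshot property described above (reading a snapshot and then
  continuing with the events yields the same state as reading from
  the beginning) can be illustrated with a toy model.  The sketch
  assumes the state of a location is just its call stack; real
  snapshots hold more, and all names below are invented.

```python
def apply_event(stack, event):
    """Apply one (timestamp, kind, region) event to a call stack."""
    _, kind, region = event
    if kind == "enter":
        return stack + [region]
    if kind == "leave":
        if not stack or stack[-1] != region:
            raise ValueError(f"unbalanced leave of {region!r}")
        return stack[:-1]
    raise ValueError(f"unknown event kind: {kind!r}")


def replay(events, start_stack=(), start_time=None):
    """Replay events, optionally starting from a snapshot."""
    stack = list(start_stack)
    for event in events:
        if start_time is not None and event[0] < start_time:
            continue  # already covered by the snapshot
        stack = apply_event(stack, event)
    return stack


def take_snapshot(events, time):
    """State of the location just before `time`."""
    return replay([e for e in events if e[0] < time])


events = [
    (1, "enter", "main"),
    (2, "enter", "foo"),
    (3, "leave", "foo"),
    (4, "enter", "bar"),
]
snapshot = take_snapshot(events, 3)
# Starting at the snapshot gives the same final state as a full replay.
same = replay(events, snapshot, start_time=3) == replay(events)
```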

- In conjunction  with  the snapshots the thumbnail feature provides a
  way to attach sampled time-series metrics  to the trace. A thumbnail
  can  sample  multiple  metrics of a trace, which are reflected as a
  stacked  graph  without unit.  Metrics  must be one of the currently
  supported classes: existing attributes, regions, or  metric members.
  The  already  mentioned new 'otf2-snapshots'  tool creates  one such
  thumbnail while generating the snapshots.

- Both snapshots and thumbnails can be generated for an existing trace
  without altering the original content of the trace.  Only the anchor
  file holds new meta data to  indicate the existence of snapshots and
  thumbnails.

- OTF2 traces  can now have so called markers attached.  Markers are a
  temporal and spatial annotation of the trace with a severity  and an
  arbitrary message. It can be a point in time or a time range.  These
  markers  can  be  generated by users as well as by tools to pinpoint
  analysis  results  at  the  time of trace generation or post-mortem.
  Markers  are  intended  for  human consumption and  therefore  their
  number  should  be kept small.  Markers can also be shared by users,
  because they are only  loosely coupled with  the trace itself.  This
  feature  is   currently  an   experimental  addition   and  will  be
  re-evaluated in the next release.

------------------- Released version 1.1.1 ---------------------------

- OTF2's library installation directory matches the system library
  directory (lib/lib64). Installation directory and the flags returned
  by otf2-config now match.

- Minor documentation, portability and style improvements.

- Harden reading and writing of the anchor file.

- Future-proof reading and writing of not-yet-known attribute types.

- The local definition reader now returns an error when it detects
  multiple definitions of the same mapping type. The user can ignore
  this error, however, and continue reading definitions.

------------------- Released version 1.1 -----------------------------

- A trace can now have arbitrary properties attached (which are stored
  in the anchor file), to help tools to decide whether the trace can
  be used or not.
- A trace now also gets a unique ID attached.
- The AttributeList learned to reference all definitions, and the IDs
  will then be mapped to the global definitions.
- The Region definition learned a new canonical name attribute. This
  could be used to also store the mangled name of a C++ function in
  the definition. It also splits the region type into the programming
  paradigm and the role of the region in this paradigm, plus a new
  flags field. This opens the definition for more paradigms without
  duplicating many of the old region types. Forward reading
  (i.e., reading an OTF2 1.1.x generated trace with OTF2 1.0.x) is
  ensured, as is backward reading.
- The new buffer rewind feature makes it possible to discard a
  preceding section of the event trace at user defined control points
  while writing an event trace.
- The return value of the record callbacks is now honored by OTF2. For
  this a new type was introduced. Returning something other than
  OTF2_CALLBACK_SUCCESS will stop the reading and return
  OTF2_ERROR_INTERRUPTED_BY_CALLBACK to the caller. Reading of records
  can still be resumed after this error.
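
  The interruptible read loop from the last entry can be sketched as
  follows.  This is a model of the described behavior, not the OTF2
  reader: only the two OTF2_* names come from the entry, their values
  are placeholders, and `RecordReader`, `READ_COMPLETE`, and the
  choice to count the interrupting record as consumed are
  assumptions.

```python
# The two OTF2_* names come from the entry above; values are placeholders.
OTF2_CALLBACK_SUCCESS = 0
OTF2_ERROR_INTERRUPTED_BY_CALLBACK = "interrupted"
READ_COMPLETE = "complete"  # hypothetical status for a finished read


class RecordReader:
    """Toy reader that remembers its position between read() calls."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def read(self, callback):
        """Feed records to callback until it asks to stop or none remain."""
        while self._pos < len(self._records):
            record = self._records[self._pos]
            # Assumption: the record handed to the callback is consumed
            # even if the callback interrupts the read.
            self._pos += 1
            if callback(record) != OTF2_CALLBACK_SUCCESS:
                return OTF2_ERROR_INTERRUPTED_BY_CALLBACK
        return READ_COMPLETE
```

  Calling read() again after an interruption continues with the next
  record, which mirrors the statement that reading can be resumed.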