------------------ Released version 3.1.1 ------------------------------
- Fix build error in `otf2-print`.
------------------- Released version 3.1 -----------------------------
- Remove support for Python 2.x for binding and generator.
- Add support for Python packaging. The package is available via PyPI
and can be installed via pip or any compatible packaging tool.
Building a package from source is also supported.
- Add paradigm for OpenMP target regions.
- Add region roles for OpenMP cancel directive and accelerator kernels.
- Add thumbnail types for communicators, RMA windows, and I/O handles.
- A project file for CMake was contributed. See `contrib/Readme.txt`.
- Fix SIONlib support with Intel oneAPI compiler suite.
- Fix reader example in Python documentation.
------------------ Released version 3.0.3 ----------------------------
- Fix reading thumbnails.
- Restore ability to build on non-64-bit platforms.
- Allow building the locking headers with newer compilers.
- Python 2.x is now deprecated for the bindings and generator, as
Python 2 reached its end-of-life as of 1 Jan 2020. Python 2 support
is no longer tested and will be removed no later than OTF2 4.0.
Please migrate to Python 3.x if you are using either of these
features.
------------------ Released version 3.0.2 ----------------------------
- Add proper support for Intel oneAPI compilers to build system
via `--with-nocross-compiler-suite=oneapi`.
- Add proper support for AMD ROCm compilers to build system via
`--with-nocross-compiler-suite=amdclang`.
------------------ Released version 3.0.1 ----------------------------
- Add proper support for NVIDIA HPC SDK compilers to build system
via `--with-nocross-compiler-suite=nvhpc`.
------------------- Released version 3.0 -----------------------------
- Add support for accelerator and network devices in the system tree.
- Rename the location type for GPUs to accelerator streams.
- Add location group type for accelerator contexts. Such a group must
contain at least one accelerator stream location, any number of
metric locations, and no CPU thread locations. Similarly, process
location groups must now contain at least one CPU thread location,
may contain any number of metric locations, and may no longer
contain accelerator stream locations.
- Add creating location group to the location group definition. For
process location groups, this may be undefined or another process
location group. For accelerator location groups, this must be a
process location group.
- Add support for mapping of location groups.
- Add event records for non-blocking collective operations on
communicators.
- NonBlockingCollectiveRequest
- NonBlockingCollectiveComplete
- Add event records for communicator creation and destruction.
- CommCreate
- CommDestroy
To indicate whether these are present in the event stream, a new
flags attribute `OTF2_CommFlag` was added to the Comm definitions.
The corresponding flag is not set when a pre-3.0 trace is read. For
symmetry reasons, a similar flag (`OTF2_RmaWinFlag`) was added to
the RmaWin definitions, which is automatically set when a pre-3.0
trace is read.
- Add support for inter-communicators.
- The Comm definition is now a polymorphic definition.
- Add new InterComm definition which is in the same namespace as the
Comm definition.
- Add enum `OTF2_CollectiveRoot` for collective root constants, to
denote special values in collective operation events.
- Add a date/time attribute to the ClockProperties record, denoting
when the trace was recorded.
- The `otf2-config` tool can now show the configuration summary via
the `--config-summary` parameter.
- Using deprecated API functions will now issue warnings, if supported
by the compiler used. Disable by adding
`-DOTF2_IGNORE_ATTRIBUTE_DEPRECATED`
to your compiler flags.
- Remove zlib code and deprecate OTF2_COMPRESSION_ZLIB. The API is
retained though, but will be removed in the next major release.
- Use more inclusive language. Some functions are available under
a new name and the old ones are marked as deprecated.
- The archive names used by the auxiliary 'trace_gen' tools dropped
one `trace`, from `otf2_trace_gen_trace_` to `otf2_trace_gen_`,
aligning these with the tool name.
------------------- Released version 2.3 -----------------------------
- Update Jinja template engine to 2.11.2 and support Python 3 for the
generator. Minimum version is now 2.7.
- Add paradigms for HIP accelerators and Kokkos.
------------------- Released version 2.2 -----------------------------
- Added a definition record to attach a parameter to a callpath.
- Removed the restriction that AttributeLists can have at most
1024 entries. If the list does not fit into the chunk, the write
routines return OTF2_ERROR_INVALID_SIZE_GIVEN.
- Reduce loss of precision when interpolating timestamps. Thanks to
Alexander Grund for the suggestion.
- Native build support for LLVM/Clang compiler.
------------------ Released version 2.1.1 ----------------------------
- Writing a SION-substrate trace with the high level Python API works
now.
- Fix possible deadlocks, when closing a global event or snapshot
reader, before reading all records, or in case of errors when
creating these global readers.
- Fix reading SION-substrate traces, if locations have no events or
snapshots.
- Improve handling of mappings to `UNDEFINED` ids.
------------------- Released version 2.1 -----------------------------
- A new set of definition and event records was added to model and
record I/O activities of applications.
- Add OTF2 python bindings.
- Documentation is included in doc/python.tar.gz as html.
- Two modules are provided: A low-level one similar to C: '_otf2',
and a high level more pythonic API: 'otf2'.
- Both python2 and python3 are fully supported.
- A new set of event records was added to denote the program
executed, the passed arguments, and the exit status.
- Added enum value OTF2_RMA_ATOMIC_TYPE_FETCH_AND_ACCUMULATE to model
atomic operations that retrieve the initial value and perform a
system/user-specified operation on the remote value.
- Fix scalability bottleneck in reading OTF2 traces stored in SIONlib
containers.
- A project file for Microsoft Visual Studio 2014 was contributed.
See contrib-build-vs/Readme.txt.
- OTF2 now builds position-independent object code by default, also
for static libraries. Pass `--without-pic` to configure to get the
previous mode.
------------------- Released version 2.0 -----------------------------
- The experimental CallingContextSample event and the accompanying
definition records have been redesigned and are now declared stable.
However, no conversion is done for traces with these records
written with OTF2 1.5.
The changes include:
- The CallingContext definition uses a SourceCodeLocation attribute
now. The previous line number offset relative to the beginning of
the referenced region was very fragile.
- The IP address attribute was removed from the CallingContext
definition. However, with the new CallingContextProperty
definition, the writer is able to pass arbitrary attributes to
each node.
- The InterruptGenerator definition lost its 'unit' attribute in
favor of a mode/base/exponent tuple, similar to the MetricMember
definition. The mode is expressed with the new enum type
OTF2_InterruptGeneratorMode, which includes a time based interrupt
generator (OTF2_INTERRUPT_GENERATOR_MODE_TIME) and a count based
one (OTF2_INTERRUPT_GENERATOR_MODE_COUNT). The unit is then
implicitly given by the mode.
- The addition of the new CallingContextEnter/CallingContextLeave
records. These complete the CallingContextSample event when
instrumentation and sampling are used in conjunction. OTF2
includes a fallback conversion for old readers which do not
register for the new events. They are then converted to the old
Enter/Leave events. The old event pair and the new calling
context based events are mutually exclusive within one trace.
- Specifying the chunk size for the definitions can now be postponed
until just before any definition or marker writer is opened.
See OTF2_Archive_SetDefChunkSize for more details.
- The estimator API and tool learned to estimate the chunk size needed
for the definitions. The result can then be used in a call to
OTF2_Archive_SetDefChunkSize.
- Three new region roles for functions which allocate, deallocate, or
reallocate memory were added.
- A new OTF2_Paradigm entry was added which can be used to denote that
the definition entity does not belong to any specific paradigm.
- The move to version 2.0 was used to clean up API inconsistencies and
remove deprecated API functions. Namely:
- OTF2_MetricBase was renamed to OTF2_Base. The enum entries were
missing the 'METRIC' in their name anyway.
- The following functions were removed (deprecated since 1.1):
- OTF2_AttributeList_AddString
- OTF2_AttributeList_GetString
Additionally, the following property definitions were changed from
a (String, String) tuple to a (String, Type, AttributeValue) tuple.
Conversion from the old record format is provided.
- SystemTreeNodeProperty
- LocationGroupProperty
- LocationProperty
- The Callsite definition was marked as deprecated.
------------------ Released version 1.5.1 ----------------------------
- Fix build errors on AIX.
------------------- Released version 1.5 -----------------------------
- A new set of callbacks can now be registered to OTF2 to make it
thread safe. These callbacks are optional. Predefined callbacks
are provided for OpenMP and Pthread, and new usage examples were
added.
- The new hint API for OTF2 will be used to optimize the writing and
reading process. The first hint, OTF2_HINT_GLOBAL_READER, should be
set by readers which intend to use only the global event and
snapshot readers. In this case the SION substrate won't allocate
additional file descriptors for each location to read. On the other
hand, the local event and snapshot readers are independent and can
be used concurrently. Using the readers concurrently with the
OTF2_HINT_GLOBAL_READER hint set then requires proper locking
callbacks.
- Added a new region role OTF2_REGION_ROLE_TASK_UNTIED to distinguish
tied from untied tasks.
- OTF2 learned to identify new paradigms.
The list includes:
- Windows threads
- Qt threads
- ACE threads
- TBB threads
- OpenACC directives
- OpenCL API functions and kernels
- Multicore Task API functions
- Functions recorded by sampling
- The new Paradigm definition was introduced to attest that a certain
parallel paradigm was available at the time the trace was recorded,
and vice versa. Additionally the new ParadigmProperty definition
can be used to further define a specific paradigm. The overall
intention is to help trace readers to handle future paradigms, not
yet added to the known list of paradigms in OTF2. In conjunction
with these new definition records, the new ParadigmClass,
ParadigmProperty, and Boolean enums were also introduced.
- The new SourceCodeLocation definition can be used to attach source
code annotations to all events. To avoid additional record attributes
for the events, the AttributeList is the preferred way to use the
new definition. The used Attribute definition should have the name
"SOURCE_CODE_LOCATION" though.
- The build process now ensures that only a Python 2 version will be
used for OTF2.
- OTF2 now needs at least version 1.5.3 of SIONlib, as it uses the new
key-value API to support writing an arbitrary number of locations
per process.
- An experimental set of new records to be used for sampling
measurements were added. No stability guarantee is given.
- Added support for Intel Xeon Phi.
------------------- Released version 1.4 -----------------------------
- Read-only buffer arguments in the collective callbacks got 'const'
annotations.
- The Attribute definition was extended with a description key.
- Definition records are using the available buffer space more
efficiently, at the cost of a small performance penalty. This
particularly results in an increase of the maximum size of records
with array members. The trace format itself did not change.
- OTF2 learned to identify new paradigms. The list includes GASPI,
Unified Parallel C, and SHMEM and its derivatives.
- OTF2 learned new atomic operations to be used in the RmaAtomic
record:
- OTF2_RMA_ATOMIC_TYPE_SWAP
- OTF2_RMA_ATOMIC_TYPE_FETCH_AND_ADD
- OTF2_RMA_ATOMIC_TYPE_FETCH_AND_INCREMENT
- OTF2 now also provides an OTF2_Archive_CloseGlobalDefWriter API for
completeness. The call itself is optional.
- The 'gethostid' function is used as an additional entropy source
when generating the trace ID on systems which provide this function.
- The 'otf2-print' tool shows the metric member name in Metric events
in addition to the type and value. Additionally, when ranks are used
in conjunction with communicators, RMA windows, or Cartesian
topologies, they are resolved to the location.
- The reading and writing usage examples from the documentation are
now provided as working C code and Makefile under:
<prefix>/share/doc/otf2/examples
- OTF2 now provides example collective callbacks to be used with MPI.
To prevent compiling and installing these callbacks for different
MPI implementations and to keep the build system of OTF2 simple,
these callbacks are provided as a header. They are usable for C and
C++ and detailed usage examples for reading and writing are provided
in the aforementioned installation location.
- The estimator API learned to estimate the size of an AttributeList.
- Added the 'otf2-estimator' tool which provides a command line
interface to the estimator API, introduced in 1.3.
- The '--cuda' option from the 'otf2-config' tool is marked as
deprecated and will issue a warning when specified.
------------------ Released version 1.3.1 ----------------------------
- The 'Future-proof reading and writing of not-yet-known attribute
types' changes done in 1.1.1 now also apply when reading and
writing snapshot records, which were introduced in 1.2.
- OTF2 now returns an error if the user does not specify a collective
context. This particularly helps when converting from the 1.2 API.
- OTF2 fixes several issues when dealing with absent local definition
files and the SIONlib substrate. In particular the open-file calls
now explicitly return an error code indicating that files of this
type are missing. As the local definition files are optional, the
OTF2 user can catch this error and handle it gracefully.
- OTF2 was a little sloppy when operating in a collective context and
only one rank encountered an error, but the other ranks were waiting
in a collective operation. This kind of error is now broadcast
to all ranks, and all can then notify the caller about this error.
------------------- Released version 1.3 -----------------------------
- OTF2 now integrates SIONlib via its new generic interface. This
enables paradigm independent reading and writing of OTF2 traces with
the SION substrate. The SIONlib configure option changed from
--with-sionconfig to --with-sionlib. SIONlib is auto-detected if
'sionconfig' is in $PATH.
- OTF2 learned to identify new paradigms. The list includes POSIX
threads, HMPP, OmpSs, and generic hardware.
- The OTF2 tools 'otf2-marker' and 'otf2-snapshots' were broken when
compiled with a PGI compiler.
- The two new definitions LocationGroupProperty and LocationProperty
complete the arbitrary property annotation of the system tree.
- New events for create/wait based threading paradigms were added. In
conjunction with this, two new region roles were added to
indicate functions which create and wait for a thread.
- A new API was added to estimate the resulting size of a trace file
based on the number of expected events, also accounting for the
number of definitions, to accurately predict the online compression.
- New definition records to specify Cartesian topologies, dimensions,
and coordinates were added.
- Native build support for Mac OS X and MinGW platforms.
------------------ Released version 1.2.1 ----------------------------
Maintenance release. Low upgrade urgency.
- Fix build when the user has set the GREP_OPTIONS environment
variable.
- Fix output of the 'otf2-marker' tool.
------------------- Released version 1.2 -----------------------------
- This version introduces a new set of event records for generic RMA
operations. It is described in the following paper, which also
serves as a white paper on the usage of these event records:
A. Knüpfer, R. Dietrich, J. Doleschal, M. Geimer, M.-A. Hermanns,
C. Rössel, R. Tschüter, B. Wesarg & F. Wolf:
"Generic Support for Remote Memory Access Operations in Score-P and
OTF2", Parallel Tools Workshop 2012
- In conjunction with the new RMA event record set, there were changes
to existing definitions and types. Namely:
- The Group definition was extended to indicate in which paradigm
a group, and therefore also the referencing communicators and RMA
windows, operate; the corresponding OTF2_GroupType entries were
also renamed accordingly.
- The OTF2_MpiCollectiveType and the corresponding enum entries were
renamed to OTF2_CollectiveOp and OTF2_COLLECTIVE_OP_ respectively.
- The MpiComm definition was renamed to just Comm, to indicate that
this definition is not restricted to MPI anymore.
- OTF2_Paradigm learned the new OTF2_PARADIGM_MEASUREMENT_SYSTEM
paradigm which is intended to be used by the measurement system
which writes a trace. Besides this the OTF2_RegionRole learned the
new OTF2_REGION_ROLE_ARTIFICIAL role which can be used by the
measurement system too.
- The MetricClass definition was extended with information about what
kind of location this MetricClass was recorded by. See the new
OTF2_RecorderKind type. This is also used to specify that the
MetricClass will only be recorded via MetricInstances, and
MetricInstances should not only reference MetricClasses which have
a recorder kind of OTF2_RECORDER_KIND_ABSTRACT. Additionally, the
new MetricClassRecorder definition was introduced, which narrows the
set of recorders of a specific MetricClass further.
- There are two new definitions to more accurately define the system
tree of the machine the trace was run on:
- SystemTreeNodeProperty: Attach tree-form properties to one node.
- SystemTreeNodeDomain: Attach defined semantics to one node. See
the new OTF2_SystemTreeDomain type.
- A new set of generic threading event records for fork-join based
threading models is introduced. Because of technical constraints and
the enhanced level of detail of the new events, they were not
implemented by extending the previous OpenMP specific event records,
but deprecate them. Nevertheless, for some of the new records
backward compatibility is prepared.
- Snapshots are a new feature to support partial loading of the trace
data. For that a snapshot holds all information describing the
current state of a location. Reading this snapshot and afterwards
continuing reading events results in the same state, as reading
from the beginning. A trace can contain many snapshots in increasing
timestamp order, so that it is possible to start reading from
these points on. The 'otf2-snapshots' tool is provided to add
snapshots to an existing trace.
- In conjunction with the snapshots the thumbnail feature provides a
way to attach sampled time-series metrics to the trace. A thumbnail
can sample multiple metrics of a trace, which are reflected as a
stacked graph without unit. Metrics must be one of the currently
supported classes: existing attributes, regions, or metric members.
The already mentioned new 'otf2-snapshots' tool creates one such
thumbnail while generating the snapshots.
- Both snapshots and thumbnails can be generated for an existing trace
without altering the original content of the trace. Only the anchor
file holds new meta data to indicate the existence of snapshots and
thumbnails.
- OTF2 traces can now have so-called markers attached. A marker is a
temporal and spatial annotation of the trace with a severity and an
arbitrary message. It can be a point in time or a time range. These
markers can be generated by users as well as by tools to pinpoint
analysis results at the time of trace generation or post-mortem.
Markers are intended for human consumption and therefore their
number should be kept small. Markers can also be shared by users,
because they are only loosely coupled with the trace itself. This
feature is currently an experimental addition and will be
re-evaluated in the next release.
------------------- Released version 1.1.1 ---------------------------
- OTF2's library installation directory matches the system library
directory (lib/lib64). Installation directory and the flags returned
by otf2-config now match.
- Minor documentation, portability and style improvements.
- Harden reading and writing of the anchor file.
- Future-proof reading and writing of not-yet-known attribute types.
- The local definition reader now returns an error when it detects
multiple definitions of the same mapping type. The user can
ignore this error and continue reading definitions.
------------------- Released version 1.1 -----------------------------
- A trace can now have arbitrary properties attached (which are stored
in the anchor file), to help tools to decide whether the trace can
be used or not.
- A trace now also gets a unique ID attached.
- The AttributeList learned to reference all definitions, and the IDs
will then be mapped to the global definitions.
- The Region definition learned a new canonical name attribute. This
could be used to also store the mangled name of a C++ function in
the definition. It also splits the region type into the programming
paradigm and the role of the region in this paradigm, plus a new
flags field. This opens the definition for more paradigms without
duplicating many of the old region types. Forward reading
(i.e., reading an OTF2 1.1.x generated trace with OTF2 1.0.x) is
ensured, as is backward reading.
- The new buffer rewind feature makes it possible to discard a
preceding section of the event trace at user-defined control points
while writing an event trace.
- The return value of the record callbacks is now honored by OTF2. For
this a new type was introduced. Returning something other than
OTF2_CALLBACK_SUCCESS will stop the reading and return
OTF2_ERROR_INTERRUPTED_BY_CALLBACK to the caller. Reading of records
can still be resumed after this error.