# Internal Instrumentation for MPICH
This text is out-of-date but is provided as a starting point for
discussions. The major update needed is to make this interface
compatible with the MPIT interface, which (currently) defines a handle
to be passed to the routines that access or update performance
information.

To understand and tune the performance of MPICH, there is a need for a
uniform way to instrument and report on the MPICH code. This section
suggests an approach similar to the one used for adding debug messages,
which is described in [Debug Event Logging](Debug_Event_Logging.md).
## Requirements
The design of the implementation is based on a clear set of
requirements.
1. Low to zero overhead for all operations that may be in a critical
   path.
    1. Compile-time selection for no overhead in the production
       version. That is, it must be possible to build MPICH with no
       instrumentation at all.
    2. Run-time selection with low overhead. This allows the inclusion
       of instrumentation in the "typical" builds. The run-time
       selection must also include turning the instrumentation on and
       off in response to a number of events, including explicit
       control and through automatic controls such as limits on the
       amount of data.
    3. Thread-safe as an option (see below).
2. Simple instrumentation of the common cases. This is to both
   encourage the inclusion of instrumentation and to ensure that the
   presence of instrumentation does not harm the readability or
   maintainability of the code.
3. Easy method for adding or changing instrumentation.
4. Modularity for the instrumentation (definitions must be local to the
   module that requires them).
5. Easy hook for adding performance callbacks (but without adding
   overhead when callbacks are not required).
6. Compatible with the proposed MPIT tool interface in MPI-3.
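
As a rough illustration of requirements 1.2 and 5, the following sketch shows one way a run-time switch, an automatic limit on the amount of data, and an optional callback could share a single cheap test on the fast path. Every name in it (`ex_instr_enabled`, `EX_INSTR_ADD`, and so on) is hypothetical and is not part of MPICH.

```c
#include <stddef.h>

/* All names here are hypothetical, for illustration only. */
typedef void (*ex_instr_callback_t)(const char *name, double amount);

static int  ex_instr_enabled      = 0;     /* run-time on/off switch */
static long ex_instr_updates      = 0;     /* updates recorded so far */
static long ex_instr_update_limit = 1000;  /* automatic cutoff */
static ex_instr_callback_t ex_instr_callback = NULL;

static double ex_bytes_sent = 0.0;         /* an example counter */

/* An example callback that only counts how often it was invoked. */
static int ex_callback_calls = 0;
static void ex_count_calls(const char *name, double amount)
{
    (void)name;
    (void)amount;
    ex_callback_calls++;
}

/* When instrumentation is off, the cost is a single test of a flag.
   The callback pointer is only examined when instrumentation is on. */
#define EX_INSTR_ADD(var, amount)                              \
    do {                                                       \
        if (ex_instr_enabled) {                                \
            (var) += (amount);                                 \
            if (ex_instr_callback)                             \
                ex_instr_callback(#var, (double)(amount));     \
            if (++ex_instr_updates >= ex_instr_update_limit)   \
                ex_instr_enabled = 0; /* limit reached */      \
        }                                                      \
    } while (0)
```

Explicit control then amounts to setting `ex_instr_enabled`, and the automatic control is the update limit that clears it again.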
The requirement for compile-time selection implies that macros be used
for any operations that may be in a performance-critical path.
The requirement for compatibility with MPIT suggests that the macros
take a handle that specifies the counter, which can be implemented as a
pointer to the variable to update, or a structure containing the
pointer.
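
The two points above can be combined in a small sketch: a macro that takes a handle, where the handle is a structure containing a pointer to the variable to update, and where the macro compiles to nothing when instrumentation is not selected. The names `EX_ENABLE_INSTR`, `ex_handle_t`, and `EX_INSTR_INCR` are assumptions made for this example only.

```c
/* Hypothetical sketch; a real build would set EX_ENABLE_INSTR from configure. */
#ifndef EX_ENABLE_INSTR
#define EX_ENABLE_INSTR 1
#endif

/* A handle implemented as a structure containing the pointer. */
typedef struct {
    double *var;   /* the variable that this handle updates */
} ex_handle_t;

#if EX_ENABLE_INSTR
#define EX_INSTR_INCR(handle, amount) ((void)(*(handle).var += (amount)))
#else
/* Production build: the macro expands to an empty expression. */
#define EX_INSTR_INCR(handle, amount) ((void)0)
#endif

static double      ex_msgs_sent_count = 0.0;
static ex_handle_t ex_msgs_sent = { &ex_msgs_sent_count };

/* A stand-in for a routine on a performance-critical path. */
static void ex_send_message(void)
{
    /* ... the real work would go here ... */
    EX_INSTR_INCR(ex_msgs_sent, 1);
}
```

Because the call site only names the handle, the representation behind it can change without touching the instrumented code.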
Thread safety can introduce significant overheads that may be
unnecessary for the purpose of the interface, which is tuning
MPICH. That is, in some cases, the extra overhead of ensuring thread
safety may make the data less valuable than data that may have some
errors (e.g., missing updates) due to thread races. Thus, the interface
should allow the developer to make that tradeoff.
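
One way to expose that tradeoff, sketched here under the assumption that C11 atomics are available, is a compile-time switch that selects either an atomic update or a plain update that may lose increments under races. `EX_INSTR_THREAD_SAFE`, `ex_counter_t`, and the `EX_COUNTER_*` macros are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical switch: 1 selects atomic updates, 0 selects plain updates. */
#ifndef EX_INSTR_THREAD_SAFE
#define EX_INSTR_THREAD_SAFE 1
#endif

#if EX_INSTR_THREAD_SAFE
typedef atomic_long ex_counter_t;
/* Relaxed ordering is enough for a statistic that is only summed. */
#define EX_COUNTER_INCR(c, amount) \
    ((void)atomic_fetch_add_explicit(&(c), (amount), memory_order_relaxed))
#define EX_COUNTER_READ(c) atomic_load_explicit(&(c), memory_order_relaxed)
#else
typedef long ex_counter_t;
/* Cheaper, but concurrent updates may occasionally be lost. */
#define EX_COUNTER_INCR(c, amount) ((void)((c) += (amount)))
#define EX_COUNTER_READ(c) (c)
#endif

static ex_counter_t ex_eager_sends;   /* static storage starts at zero */
```

The instrumented code is identical in both builds; only the definition of the macros changes.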
## Possible Design
- `MPIU_INSTR_DURATION_DECL(handle)` - Declare an instrumentation
handle
- `MPIU_INSTR_DURATION_INIT(handle,ncounter,description)` -
Initialize a named duration and provide a text description
- `MPIU_INSTR_DURATION_START(handle)` - Begin a timing "epoch" for
the duration identified by `handle`
- `MPIU_INSTR_DURATION_END(handle)` - End a timing "epoch" for `handle`
and increment the time in the duration by the time since the
corresponding start.
- `MPIU_INSTR_DURATION_INCR(handle,index,amount)` - Increment the
`index`'th counter in the named duration by `amount`
The description field is used to create the code that writes out the
summary. Combined with the `extractstrings` script, this allows
instrumentation to be added in a single location.
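
To make the intended usage concrete, here is a minimal sketch of macros with the same shape as the ones listed above, written with an `EX_` prefix to make clear that they are not the MPICH definitions. The structure layout, the fixed limit `EX_MAX_COUNTERS`, and the use of `timespec_get` as the timer are all assumptions.

```c
#include <time.h>

#define EX_MAX_COUNTERS 4   /* assumed upper bound on counters per duration */

typedef struct {
    const char *desc;                       /* text used in the summary */
    double      total;                      /* accumulated seconds */
    double      start;                      /* time stamp of the open epoch */
    long        count;                      /* number of completed epochs */
    int         ncounter;                   /* counters in use */
    long        counters[EX_MAX_COUNTERS];
} ex_duration_t;

/* C11 timer; this is wall-clock time and is not guaranteed to be monotonic. */
static double ex_now(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + 1.0e-9 * (double)ts.tv_nsec;
}

#define EX_DURATION_DECL(h) static ex_duration_t h
/* Unnamed fields of the compound literal are set to zero. */
#define EX_DURATION_INIT(h, n, description) \
    ((void)((h) = (ex_duration_t){ .desc = (description), .ncounter = (n) }))
#define EX_DURATION_START(h) ((void)((h).start = ex_now()))
#define EX_DURATION_END(h) \
    do { (h).total += ex_now() - (h).start; (h).count++; } while (0)
#define EX_DURATION_INCR(h, index, amount) \
    ((void)((h).counters[(index)] += (amount)))

/* Usage: time one call and record the bytes it handled in counter 0. */
EX_DURATION_DECL(ex_pack_time);

static void ex_pack(long nbytes)
{
    EX_DURATION_START(ex_pack_time);
    /* ... the work being timed would go here ... */
    EX_DURATION_END(ex_pack_time);
    EX_DURATION_INCR(ex_pack_time, 0, nbytes);
}
```

A call such as `EX_DURATION_INIT(ex_pack_time, 1, "time spent packing")` would be made once before the first use; in the design described here, that call is the kind of code a script could generate from the description.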
A sample implementation for the single-threaded case might be:
```
/* Note: this simple version ignores 'index'. */
#define MPIU_INSTR_DURATION_INCR(name,index,amount) \
    (MPIU_INSTRUM[MPIU_INSTRUM_##name].val += (amount))
```
A script, similar to the `extractstates` script, would determine the
size of the array and define the various `MPIU_INSTRUM_name` values. A
more complex version could be:
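```
/* Also tracks the number of updates and the smallest and largest amount.
   Note: 'amount' is evaluated more than once. */
#define MPIU_INSTR_DURATION_INCR(name,amount) \
    do { MPIU_Instrum_t *_p = MPIU_INSTRUM + MPIU_INSTRUM_##name; \
         _p->val += (amount); _p->count++; \
         if ((amount) > _p->max) _p->max = (amount); \
         if ((amount) < _p->min) _p->min = (amount); } while (0)
```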
Next steps: Determine whether these are adequate for the needed
instrumentation. Note that the code that handles initialization and
finalization is generated by reading the source code, in the same manner
as `extractstrings`.
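
As a sketch of what such generated initialization and finalization code might look like, the example below defines the array and index values and then writes one summary line per entry. The field types, the two entries (`sendwait`, `recvwait`), their descriptions, and the `ex_*` function names are invented for illustration; only `MPIU_Instrum_t`, `MPIU_INSTRUM`, and the `val`, `count`, `min`, and `max` fields come from the macros above.

```c
#include <float.h>
#include <stdio.h>

/* Field names follow the macros above; the types are assumptions. */
typedef struct {
    double val, min, max;
    long   count;
} MPIU_Instrum_t;

/* What a generator script might emit (hypothetical entries). */
enum { MPIU_INSTRUM_sendwait, MPIU_INSTRUM_recvwait, MPIU_INSTRUM_NUM };

static const char *const ex_instrum_desc[MPIU_INSTRUM_NUM] = {
    "time waiting in send",
    "time waiting in receive"
};

static MPIU_Instrum_t MPIU_INSTRUM[MPIU_INSTRUM_NUM];

/* Initialization: min starts high and max starts low so that the
   first update sets both. */
static void ex_instrum_init(void)
{
    for (int i = 0; i < MPIU_INSTRUM_NUM; i++) {
        MPIU_INSTRUM[i].val   = 0.0;
        MPIU_INSTRUM[i].count = 0;
        MPIU_INSTRUM[i].min   = DBL_MAX;
        MPIU_INSTRUM[i].max   = -DBL_MAX;
    }
}

/* Finalization: write a line for each entry that was updated at least
   once, and return the number of lines written. */
static int ex_instrum_finalize(FILE *fp)
{
    int nlines = 0;
    for (int i = 0; i < MPIU_INSTRUM_NUM; i++) {
        const MPIU_Instrum_t *p = &MPIU_INSTRUM[i];
        if (p->count == 0)
            continue;
        fprintf(fp, "%s: total=%g count=%ld min=%g max=%g\n",
                ex_instrum_desc[i], p->val, p->count, p->min, p->max);
        nlines++;
    }
    return nlines;
}
```

Keeping the descriptions in a generated table is what lets each piece of instrumentation be added at a single place in the source.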
## History
The original design document included mechanisms to instrument important
internal states. This information is in the file `stat.tex` in the
archived MPICH document (in `/home/MPI`). However, while designed and
documented, it was not used in the initial implementation.
The original design has limitations; since that original design, there
have been published papers on instrumentation of MPI, including one at
IEEE Cluster 2006.