'\" t
.\"___INFO__MARK_BEGIN__
.\"
.\" Copyright: 2004 by Sun Microsystems, Inc.
.\"
.\"___INFO__MARK_END__
.\" $RCSfile$ Last Update: $Date$ Revision: $Revision$
.\"
.\"
.\" Some handy macro definitions [from Tom Christensen's man(1) manual page].
.\"
.de SB \" small and bold
.if !"\\$1"" \\s-2\\fB\&\\$1\\s0\\fR\\$2 \\$3 \\$4 \\$5
..
.\"
.de T \" switch to typewriter font
.ft CW \" probably want CW if you don't have TA font
..
.\"
.de TY \" put $1 in typewriter font
.if t .T
.if n ``\c
\\$1\c
.if t .ft P
.if n \&''\c
\\$2
..
.\"
.de M \" man page reference
\\fI\\$1\\fR\\|(\\$2)\\$3
..
.TH REPORTING 5 "$Date$" "xxRELxx" "xxQS_NAMExx File Formats"
.\"
.SH NAME
reporting \- xxQS_NAMExx reporting file format
.\"
.SH DESCRIPTION
A xxQS_NAMExx system writes a reporting file to
$SGE_ROOT/default/common/reporting.
The reporting file contains data that can be used for accounting, monitoring and analysis purposes.
It contains information about the cluster (hosts, queues, load values, consumables, etc.), about the jobs running in the cluster and about sharetree configuration and usage.
All information is time related; events are written to the reporting file at a configurable interval.
This allows monitoring of the "real time" status of the cluster as well as historical analysis.
.\"
.\"
.SH FORMAT
The reporting file is an ASCII file.
Each line contains one record, and the fields of a record are separated by a delimiter (:).
The reporting file contains records of different types. Each record type has a specific record structure.
.PP
The first two fields are common to all reporting records:
.IP "\fBtime\fP"
Time (GMT unix timestamp) when the record was created.
.IP "\fBrecord type\fP"
Type of the reporting record.
The different types of records and their structure are described in the following text.
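.PP
As a rough illustration of the layout described above, the following Python
sketch splits one line at the delimiter and separates the two common fields
from the type-specific ones. The function name and the sample line are
hypothetical, and the sketch assumes that no field value itself contains the
delimiter.
.PP
.nf
```python
from datetime import datetime, timezone

DELIMITER = ":"  # field delimiter named in the FORMAT section


def split_record(line):
    """Split one reporting line into (created, record_type, fields)."""
    parts = line.strip().split(DELIMITER)
    if len(parts) < 2:
        raise ValueError("expected at least a time and a record type")
    # The first field is a GMT unix timestamp, the second the record type.
    created = datetime.fromtimestamp(int(parts[0]), tz=timezone.utc)
    return created, parts[1], parts[2:]


# Hypothetical sample line, made up for illustration only.
sample = "1100000000:queue:all.q:host1:1100000000:s"
created, record_type, fields = split_record(sample)
print(created.isoformat(), record_type, fields)
```
.fi
.PP
The type-specific fields are returned as strings, because their meaning
depends on the record type.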
.SS new_job
The new_job record is written whenever a new job enters the system (usually by a submitting command). It has the following fields:
.IP "\fBsubmission_time\fP"
Time (GMT unix time stamp) when the job was submitted.
.IP "\fBjob_number\fP"
The job number.
.IP "\fBtask_number\fP"
The array task id. Always has the value -1 for new_job records (as we don't have array tasks yet).
.IP "\fBpe_taskid\fP"
The task id of parallel tasks. Always has the value "none" for new_job records.
.IP "\fBjob_name\fP"
The job name (from -N submission option).
.IP "\fBowner\fP"
The job owner.
.IP "\fBgroup\fP"
The unix group of the job owner.
.IP "\fBproject\fP"
The project the job is running in.
.IP "\fBdepartment\fP"
The department the job owner is in.
.IP "\fBaccount\fP"
The account string specified for the job (from -A submission option).
.IP "\fBpriority\fP"
The job priority (from -p submission option).
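.PP
The field list above can be turned into named values with a short Python
sketch. It assumes that the fields follow the two common fields in exactly
the order listed; the function name and the sample line are hypothetical.
.PP
.nf
```python
NEW_JOB_FIELDS = (
    "submission_time", "job_number", "task_number", "pe_taskid",
    "job_name", "owner", "group", "project", "department",
    "account", "priority",
)


def name_new_job_fields(line):
    """Map the fields after time and record type to the listed names."""
    parts = line.strip().split(":")
    if len(parts) < 2 or parts[1] != "new_job":
        raise ValueError("not a new_job record")
    values = parts[2:]
    # Assumption: one value per documented field, in the documented order.
    if len(values) != len(NEW_JOB_FIELDS):
        raise ValueError("unexpected number of fields")
    return dict(zip(NEW_JOB_FIELDS, values))


# Hypothetical sample line, made up for illustration only.
sample = ("1100000000:new_job:1100000000:42:-1:none:sleeper:"
          "alice:staff:proj1:defaultdepartment:acct1:0")
job = name_new_job_fields(sample)
print(job["job_name"], job["owner"])
```
.fi
.PP
Rejecting lines with an unexpected field count is a deliberate choice here:
it avoids silently shifting values into the wrong names.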
.SS job_log
The job_log record is written whenever a job, an array task, or a PE task changes status. A status change can be the transition from pending to running, but can also be triggered by user actions like suspension of a job.
It has the following fields:
.IP "\fBevent_time\fP"
Time (GMT unix time stamp) when the event was generated.
.IP "\fBevent\fP"
A one word description of the event.
.IP "\fBjob_number\fP"
The job number.
.IP "\fBtask_number\fP"
The array task id. Always has the value -1 for new_job records (as we don't have array tasks yet).
.IP "\fBpe_taskid\fP"
The task id of parallel tasks. Always has the value "none" for new_job records.
.IP "\fBstate\fP"
The state of the job after the event was processed.
.IP "\fBuser\fP"
The user who initiated the event (or special usernames "qmaster", "scheduler"
and "execd" for actions of the system itself like scheduling jobs, executing jobs etc.).
.IP "\fBhost\fP"
The host from which the action was initiated (e.g. the submit host, the qmaster host, etc.).
.IP "\fBstate_time\fP"
Reserved field for later use.
.IP "\fBsubmission_time\fP"
Time (GMT unix time stamp) when the job was submitted.
.IP "\fBjob_name\fP"
The job name (from -N submission option).
.IP "\fBowner\fP"
The job owner.
.IP "\fBgroup\fP"
The unix group of the job owner.
.IP "\fBproject\fP"
The project the job is running in.
.IP "\fBdepartment\fP"
The department the job owner is in.
.IP "\fBaccount\fP"
The account string specified for the job (from -A submission option).
.IP "\fBpriority\fP"
The job priority (from -p submission option).
.IP "\fBmessage\fP"
A message describing the reported action.
.SS acct
Records of type acct are accounting records. Normally, they are written whenever a job, a task of an array job,
or a task of a parallel job terminates. However, for long running jobs an intermediate acct record is created once a
day after midnight. This results in multiple accounting records for a particular job and allows fine-grained
monitoring of resource usage over time.
Accounting records comprise the following fields:
.IP "\fBqname\fP"
Name of the cluster queue in which the job has run.
.IP "\fBhostname\fP"
Name of the execution host.
.IP "\fBgroup\fP"
The effective group id of the job owner when executing the job.
.IP "\fBowner\fP"
Owner of the xxQS_NAMExx job.
.IP "\fBjob_name\fP"
Job name.
.IP "\fBjob_number\fP"
Job identifier (job number).
.IP "\fBaccount\fP"
An account string as specified by the
.M qsub 1
or
.M qalter 1
\fB\-A\fP option.
.IP "\fBpriority\fP"
Priority value assigned to the job corresponding to the \fBpriority\fP
parameter in the queue configuration (see
.M queue_conf 5 ).
.IP "\fBsubmission_time\fP"
Submission time (GMT unix time stamp).
.IP "\fBstart_time\fP"
Start time (GMT unix time stamp).
.IP "\fBend_time\fP"
End time (GMT unix time stamp).
.IP "\fBfailed\fP"
Indicates the problem which occurred in case a job could not be started on
the execution host (e.g. because the owner of the job did not have a valid
account on that machine). If xxQS_NAMExx tries to start a job multiple times,
this may lead to multiple entries in the accounting file corresponding to
the same job ID.
.IP "\fBexit_status\fP"
Exit status of the job script (or xxQS_NAMExx specific status in case
of certain error conditions).
.IP "\fBru_wallclock\fP"
Difference between \fBend_time\fP and \fBstart_time\fP (see above).
.PP
The remainder of the accounting entries follows the contents of the
standard UNIX rusage structure as described in
.M getrusage 2 .
Depending on the operating system where the job was executed some of the
fields may be 0. The following entries are provided:
.PP
.nf
.RS
.B ru_utime
.B ru_stime
.B ru_maxrss
.B ru_ixrss
.B ru_ismrss
.B ru_idrss
.B ru_isrss
.B ru_minflt
.B ru_majflt
.B ru_nswap
.B ru_inblock
.B ru_oublock
.B ru_msgsnd
.B ru_msgrcv
.B ru_nsignals
.B ru_nvcsw
.B ru_nivcsw
.RE
.fi
.PP
.IP "\fBproject\fP"
The project which was assigned to the job.
.IP "\fBdepartment\fP"
The department which was assigned to the job.
.IP "\fBgranted_pe\fP"
The parallel environment which was selected for that job.
.IP "\fBslots\fP"
The number of slots which were dispatched to the job by the scheduler.
.IP "\fBtask_number\fP"
Array job task index number.
.IP "\fBcpu\fP"
The cpu time usage in seconds.
.IP "\fBmem\fP"
The integral memory usage in Gbyte seconds.
.IP "\fBio\fP"
The amount of data transferred in input/output operations.
.IP "\fBcategory\fP"
A string specifying the job category.
.IP "\fBiow\fP"
The io wait time in seconds.
.IP "\fBpe_taskid\fP"
If this identifier is set the task was part of a parallel job and was
passed to xxQS_NAMExx via the qrsh -inherit interface.
.IP "\fBmaxvmem\fP"
The maximum vmem size in bytes.
.IP "\fBarid\fP"
Advance reservation identifier. If the job used resources of an advance
reservation, this field contains a positive integer identifier; otherwise the
value is "\fB0\fP" .
.SS queue
Records of type queue contain state information for queues (queue instances).
A queue record has the following fields:
.IP "\fBqname\fP"
The cluster queue name.
.IP "\fBhostname\fP"
The hostname of a specific queue instance.
.IP "\fBreport_time\fP"
The time (GMT unix time stamp) when a state change was triggered.
.IP "\fBstate\fP"
The new queue state.
.SS queue_consumable
A queue_consumable record contains information about queue consumable values in addition to queue state information:
.IP "\fBqname\fP"
The cluster queue name.
.IP "\fBhostname\fP"
The hostname of a specific queue instance.
.IP "\fBreport_time\fP"
The time (GMT unix time stamp) when a state change was triggered.
.IP "\fBstate\fP"
The new queue state.
.IP "\fBconsumables\fP"
Description of consumable values. Information about multiple consumables is separated by spaces.
A consumable description has the format <name>=<actual_value>=<configured_value>.
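.PP
The consumables field described above (space-separated entries, each made of
a name, an actual value, and a configured value joined by equals signs) could
be decoded along these lines. This is a minimal Python sketch; the function
name and the sample value are hypothetical, and values are kept as strings
because their numeric format is not specified here.
.PP
.nf
```python
def parse_consumables(field):
    """Return {name: (actual_value, configured_value)} for one field."""
    result = {}
    for item in field.split():
        # Each item consists of name, actual value, and configured value.
        name, actual, configured = item.split("=", 2)
        result[name] = (actual, configured)
    return result


# Hypothetical field content, made up for illustration only.
print(parse_consumables("slots=2=4 license=1=10"))
```
.fi
.PP
An entry that lacks one of the three parts raises a ValueError in this
sketch rather than being skipped.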
.SS host
A host record contains information about hosts and host load values.
It contains the following information:
.IP "\fBhostname\fP"
The name of the host.
.IP "\fBreport_time\fP"
The time (GMT unix time stamp) when the reported information was generated.
.IP "\fBstate\fP"
The new host state.
Currently, xxQS_NAMExx does not track a host state; the field is reserved for
future use and always contains the value X.
.IP "\fBload values\fP"
Description of load values. Information about multiple load values is separated by spaces.
A load value description has the format <name>=<actual_value>.
.\"
.SS host_consumable
A host_consumable record contains information about hosts and host consumables.
Host consumables can, for example, be licenses.
It contains the following information:
.IP "\fBhostname\fP"
The name of the host.
.IP "\fBreport_time\fP"
The time (GMT unix time stamp) when the reported information was generated.
.IP "\fBstate\fP"
The new host state.
Currently, xxQS_NAMExx does not track a host state; the field is reserved for
future use and always contains the value X.
.IP "\fBconsumables\fP"
Description of consumable values. Information about multiple consumables is separated by spaces.
A consumable description has the format <name>=<actual_value>=<configured_value>.
.SS sharelog
The xxQS_NAMExx qmaster can dump information about sharetree configuration and usage to the reporting file.
The parameter \fIsharelog\fP sets the interval at which sharetree information is dumped.
It is set in the format HH:MM:SS. A value of 00:00:00 configures qmaster not to
dump sharetree information. Intervals from several minutes up to hours are sensible values for this parameter.
The record contains the following fields:
.IP "\fBcurrent time\fP"
The present time.
.IP "\fBusage time\fP"
The time used so far.
.IP "\fBnode name\fP"
The node name.
.IP "\fBuser name\fP"
The user name.
.IP "\fBproject name\fP"
The project name.
.IP "\fBshares\fP"
The total shares.
.IP "\fBjob count\fP"
The job count.
.IP "\fBlevel\fP"
The percentage of shares used.
.IP "\fBtotal\fP"
The adjusted percentage of shares used.
.IP "\fBlong target share\fP"
The long target percentage of resource shares used.
.IP "\fBshort target share\fP"
The short target percentage of resource shares used.
.IP "\fBactual share\fP"
The actual percentage of resource shares used.
.IP "\fBusage\fP"
The combined shares used.
.IP "\fBcpu\fP"
The cpu used.
.IP "\fBmem\fP"
The memory used.
.IP "\fBio\fP"
The IO used.
.IP "\fBlong target cpu\fP"
The long target cpu used.
.IP "\fBlong target mem\fP"
The long target memory used.
.IP "\fBlong target io\fP"
The long target IO used.
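.PP
The HH:MM:SS interval format used by the sharelog parameter can be converted
to seconds as in the following Python sketch. The function names are
hypothetical; the only rule taken from the text above is that 00:00:00
disables the sharetree dump.
.PP
.nf
```python
def interval_to_seconds(value):
    """Convert an HH:MM:SS interval string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def sharelog_enabled(value):
    """Return False for 00:00:00, which turns sharetree dumping off."""
    return interval_to_seconds(value) > 0


print(interval_to_seconds("00:10:00"), sharelog_enabled("00:00:00"))
```
.fi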
.\"
.SS new_ar
A new_ar record contains information about advance reservation objects. Entries of this
type will be added if an advance reservation is created.
It contains the following information:
.IP "\fBsubmission_time\fP"
The time (GMT unix time stamp) when the advance reservation was created.
.IP "\fBar_number\fP"
The advance reservation number identifying the reservation.
.IP "\fBar_owner\fP"
The owner of the advance reservation.
.\"
.SS ar_attribute
The ar_attribute record is written whenever a new advance reservation is added or an
attribute of an existing advance reservation changes. It has the following fields:
.IP "\fBevent_time\fP"
The time (GMT unix time stamp) when the event was generated.
.IP "\fBsubmission_time\fP"
The time (GMT unix time stamp) when the advance reservation was created.
.IP "\fBar_number\fP"
The advance reservation number identifying the reservation.
.IP "\fBar_name\fP"
Name of the advance reservation.
.IP "\fBar_account\fP"
An account string which was specified during the creation of the advance reservation.
.IP "\fBar_start_time\fP"
Start time.
.IP "\fBar_end_time\fP"
End time.
.IP "\fBar_granted_pe\fP"
The parallel environment which was selected for an advance reservation.
.IP "\fBar_granted_resources\fP"
The granted resources which were selected for an advance reservation.
.\"
.SS ar_log
The ar_log record is written whenever an advance reservation changes status. A status
change can be from pending to active, but can also be triggered by system events like a host
outage. It has the following fields:
.IP "\fBar_state_change_time\fP"
The time (GMT unix time stamp) when the event occurred which caused a state change.
.IP "\fBsubmission_time\fP"
The time (GMT unix time stamp) when the advance reservation was created.
.IP "\fBar_number\fP"
The advance reservation number identifying the reservation.
.IP "\fBar_state\fP"
The new state.
.IP "\fBar_event\fP"
An event id identifying the event which caused the state change.
.IP "\fBar_message\fP"
A message describing the event which caused the state change.
.\"
.SS ar_acct
The ar_acct records are accounting records which are written for every queue instance
whenever an advance reservation terminates. Advance reservation accounting records comprise
the following fields:
.IP "\fBar_termination_time\fP"
The time (GMT unix time stamp) when the advance reservation terminated.
.IP "\fBsubmission_time\fP"
The time (GMT unix time stamp) when the advance reservation was created.
.IP "\fBar_number\fP"
The advance reservation number identifying the reservation.
.IP "\fBar_qname\fP"
Name of the cluster queue reserved by the advance reservation.
.IP "\fBar_hostname\fP"
The name of the execution host.
.IP "\fBar_slots\fP"
The number of slots which were reserved.
.\"
.\"
.SH "SEE ALSO"
.M sge_conf 5 ,
.M host_conf 5 .
.\"
.SH "COPYRIGHT"
See
.M xxqs_name_sxx_intro 1
for a full statement of rights and permissions.