#!/usr/bin/env expect
############################################################################
# Purpose: Test of Slurm functionality
#          Confirm that the job time limit is enforced (srun -t option).
############################################################################
# Copyright (C) 2002-2007 The Regents of the University of California.
# Copyright (C) 2008 Lawrence Livermore National Security.
# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
# Written by Morris Jette <jette1@llnl.gov>
# CODE-OCEC-09-009. All rights reserved.
#
# This file is part of Slurm, a resource management program.
# For details, see <https://slurm.schedmd.com/>.
# Please also read the included file: DISCLAIMER.
#
# Slurm is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version.
#
# Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
# details.
#
# You should have received a copy of the GNU General Public License along
# with Slurm; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
############################################################################
source ./globals
set job_id 0
# NOTE: If you increase sleep_time, change job time limits as well
set sleep_time 180
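#
# Read InactiveLimit, KillWait and OverTimeLimit from the configuration.
# The defaults set below apply if scontrol does not report a value.
# Make sure sleep time is no larger than InactiveLimit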
set inactive_limit $sleep_time
set kill_wait $sleep_time
set over_time_limit 0
log_user 0
spawn $scontrol show config
expect {
	-re "InactiveLimit *= ($number)" {
		set inactive_limit $expect_out(1,string)
		exp_continue
	}
	-re "KillWait *= ($number)" {
		set kill_wait $expect_out(1,string)
		exp_continue
	}
	-re "OverTimeLimit *= UNLIMITED" {
		set over_time_limit 9999
		exp_continue
	}
	-re "OverTimeLimit *= ($number)" {
		set over_time_limit $expect_out(1,string)
		exp_continue
	}
	timeout {
		fail "scontrol not responding"
	}
	eof {
		wait
	}
}
log_user 1
if {$inactive_limit == 0} {
	set inactive_limit $sleep_time
}
if {$inactive_limit < 120} {
	skip "InactiveLimit ($inactive_limit) is too low for this test"
}
if {$kill_wait > 60} {
	skip "KillWait ($kill_wait) is too high for this test"
}
if {$over_time_limit > 0} {
	skip "OverTimeLimit too high for this test ($over_time_limit > 0)"
}
if {$inactive_limit < $sleep_time} {
	set sleep_time [expr {$inactive_limit + $kill_wait}]
	log_debug "Reset job sleep time to $sleep_time seconds"
}
#
# Collect diagnostic output if srun reports "Incompatible Slurm plugin"
# (see the first srun invocation below).
#
proc capture_debug_for_bug_10313 {} {
	global scontrol bin_cat bin_ls slurm_dir
	run_command "$scontrol show config"
	set testsuite_dir [file dirname [info script]]
	run_command "$bin_cat ${testsuite_dir}/globals.local"
	run_command "ls -l ${slurm_dir}/lib/slurm"
	run_command "stat ${slurm_dir}/lib/slurm/switch_generic.so"
}
#
# Execute two three-minute jobs: one with a one-minute time limit and
# the other with a four-minute time limit. Confirm that the first job
# is terminated on timeout and that the second completes normally.
# Note that Slurm time limit enforcement has a resolution of about
# one minute.
#
# Ideally the job gets a "job exceeded timelimit" followed by a
# "Terminated" message, but if the timing is bad only the "Terminated"
# message gets sent. This is due to srun recognizing job termination
# prior to the message from slurmd being processed.
#
set timeout [expr {$max_job_delay + $sleep_time}]
set timed_out 0
spawn $srun -t1 $bin_sleep $sleep_time
expect {
	-re "Incompatible Slurm plugin" {
		capture_debug_for_bug_10313
		exp_continue
	}
	-re "time limit" {
		set timed_out 1
		exp_continue
	}
	-re "TIME LIMIT" {
		set timed_out 1
		exp_continue
	}
	-re "Terminated" {
		set timed_out 1
		exp_continue
	}
	-re "Unable to contact" {
		fail "Slurm appears to be down"
	}
	timeout {
		fail "srun not responding"
	}
	eof {
		wait
	}
}
subtest {$timed_out == 1} "Time limit should be enforced"
set completions 0
spawn $srun -t4 $bin_sleep $sleep_time
expect {
	-re "time limit" {
		set completions -1
		exp_continue
	}
	-re "TIME LIMIT" {
		set completions -1
		exp_continue
	}
	-re "Terminated" {
		set completions -1
		exp_continue
	}
	-re "Unable to contact" {
		fail "Slurm appears to be down"
	}
	-re "error" {
		set completions -1
		exp_continue
	}
	timeout {
		fail "srun not responding"
	}
	eof {
		wait
		incr completions
	}
}
subtest {$completions == 1} "Job should complete properly"