#!/usr/bin/env expect
############################################################################
# Purpose: Test of Slurm functionality
# Test of batch job requeue.
############################################################################
# Copyright (C) 2006 The Regents of the University of California.
# Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
# Written by Morris Jette <jette1@llnl.gov>
# CODE-OCEC-09-009. All rights reserved.
#
# This file is part of Slurm, a resource management program.
# For details, see <https://slurm.schedmd.com/>.
# Please also read the included file: DISCLAIMER.
#
# Slurm is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 2 of the License, or (at your option)
# any later version.
#
# Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
# details.
#
# You should have received a copy of the GNU General Public License along
# with Slurm; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
############################################################################
source ./globals
set file_in "$test_dir/input"
set file_out "$test_dir/output"
set file_err "$test_dir/error"
set file_flag_1 "$test_dir/run1"
set file_flag_2 "$test_dir/run2"
set file_flag_3 "$test_dir/run3"
set file_flag_4 "$test_dir/run4"
set job_id 0
set node_cnt 1-4
if {![is_super_user]} {
	skip "This test can't be run except as SlurmUser"
}
proc cleanup {} {
	global job_id
	cancel_job $job_id
}
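#
# Build input script file. Each run of the script touches the first flag
# file that does not exist yet (run1, then run2, run3, run4), recording how
# many times the batch script has started, then launches one job step that
# sleeps for 60 seconds.
#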
make_bash_script $file_in "
if \[ -f $file_flag_3 \]
then
	$bin_touch $file_flag_4
elif \[ -f $file_flag_2 \]
then
	$bin_touch $file_flag_3
elif \[ -f $file_flag_1 \]
then
	$bin_touch $file_flag_2
else
	$bin_touch $file_flag_1
fi
$srun -n4 -O -l $bin_sleep 60
"
#
# Submit a batch job with requeue enabled (--requeue) that writes to the
# stdout/err files
#
set timeout $max_job_delay
set job_id [submit_job -fail "--requeue -N$node_cnt --output=$file_out --error=$file_err -t2 $file_in"]
#
# Wait for job to begin, then requeue it
#
wait_for_job -fail $job_id "RUNNING"
exec $bin_sleep 15
run_command -fail "$scontrol requeue $job_id"
#
# Wait for job to restart, then requeue it again
#
wait_for_job -fail $job_id "RUNNING"
exec $bin_sleep 15
run_command -fail "$scontrol requeue $job_id"
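#
# Wait for job to complete and check for files. The job has started three
# times (initial run plus two requeues), so flag files 1 through 3 must
# exist and flag file 4 must not.
#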
wait_for_job -fail $job_id "RUNNING"
wait_for_job -fail $job_id "DONE"
wait_for_file -fail $file_flag_1
wait_for_file -fail $file_flag_2
wait_for_file -fail $file_flag_3
if {[wait_for_file -timeout 5 $file_flag_4] == $::RETURN_SUCCESS} {
	fail "File flag 4 ($file_flag_4) is found"
}
#
# Now run the same test, but with job requeue disabled via the
# --no-requeue submit option
#
file delete $file_flag_1 $file_flag_2 $file_flag_3 $file_flag_4
set job_id [submit_job -fail "--no-requeue --output=$file_out --error=$file_err -t2 $file_in"]
#
# Wait for job to begin, then attempt to requeue it. The request is
# expected to be rejected because requeue is disabled for this job.
#
wait_for_job -fail $job_id "RUNNING"
set disabled 0
exec $bin_sleep 15
spawn $scontrol requeue $job_id
expect {
	-re "Requested operation is presently disabled" {
		set disabled 1
		log_debug "This error was expected, no worries"
		exp_continue
	}
	-re "error" {
		fail "Some scontrol error happened"
	}
	timeout {
		fail "scontrol not responding"
	}
	eof {
		wait
	}
}
subtest {$disabled != 0} "Verify --no-requeue option"
#
# Wait for job to complete and check for files. The job ran only once,
# so flag file 1 must exist and flag file 2 must not.
#
wait_for_job -fail $job_id "DONE"
wait_for_file -fail $file_flag_1
if {[wait_for_file -timeout 5 $file_flag_2] == $::RETURN_SUCCESS} {
	fail "File flag 2 ($file_flag_2) is found"
}