#!/bin/bash
# Copyright (C) 2020 Mate Soos
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; version 2
# of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
# 02110-1301, USA.
# This script runs ../cldata_predict.py over a grid of hyperparameters.
# Each combination gets its own output directory, which is archived
# as a .tar.gz once all tiers and tables have been processed.
#set -x
set -e
function generate() {
    dirname="${basename}-cut1-${cut1}-cut2-${cut2}-limit-${limit}-est${est}-xbmin${xgboostminchild}-xbmd${xboostmaxdepth}-reg${regressor}-subs${xgboostsubsample}"
    mkdir -p "${dirname}"
    # Remove outputs left over from a previous run with the same parameters
    rm -f "${dirname}"/out_*
    rm -f "${dirname}"/predictor*.json
    rm -f "${dirname}.tar.gz"

    # Record the code revision, this script, and the input data checksums
    git rev-parse HEAD > "${dirname}/out_git"
    cat learn.sh >> "${dirname}/out_git"
    md5sum *.dat >> "${dirname}/out_git"

    tiers=("short" "long" "forever")
    tables=("used_later" "used_later_anc")
    for tier in "${tiers[@]}"; do
        mypids=()
        # Both tables of a tier are processed in parallel
        for table in "${tables[@]}"; do
            # check if DAT file exists
            INFILE="comb-${table}-${tier}-cut1-${cut1}-cut2-${cut2}-limit-${limit}.dat"
            if test -f "$INFILE"; then
                echo "$INFILE exists, OK"
            else
                echo "ERROR: $INFILE does not exist!!"
                exit 1
            fi

            # The timing file is named per table so that the two parallel
            # jobs of a tier do not overwrite each other's output
            /usr/bin/time --verbose -o "${dirname}/out_${table}-${tier}.timeout" \
                ../cldata_predict.py \
                "$INFILE" \
                --tier "${tier}" --regressor "$regressor" \
                --xgboostest "${est}" \
                --xgboostminchild "$xgboostminchild" --xboostmaxdepth="${xboostmaxdepth}" \
                --basedir "${dirname}" \
                --features "best_only" \
                --table "${table}" \
                --xgboostsubsample "$xgboostsubsample" \
                --bestfeatfile "${bestf}" 2>&1 | tee "${dirname}/out-${table}-${tier}" &
            # $! is the PID of the last command in the pipeline, i.e. tee
            pid=$!
            echo "PID here is $pid"
            mypids+=("$pid")
        done

        # wait for PIDs now
        echo "PIDS to wait for are: ${mypids[*]}"
        for pid2 in "${mypids[@]}"
        do
            echo "Waiting for $pid2 ..."
            wait "$pid2"
        done
    done
    tar czvf "${dirname}.tar.gz" "${dirname}"
}
# best was: 8march-2020-3acd81dc55df3-cut1-5.0-cut2-30.0-limit-2000-est10-w0-xbmin300-xbmd4
#xboostmaxdepth=4
#xgboostminchild=300
#est=10
bestf="../../scripts/crystal/best_features-correlation2.txt"
w=0
xgboostsubsample="1.0"
basename="15-dec-b1bd8f74bc2b42"
#basename="14-april-2021-69bad529f962c"
#basename="8march-2020-3acd81dc55df3-36feats"
#basename="aes-30-march-2020-a1e0e19be0c1"
#basename="orig"
limit=1000
cut1="3.0"
cut2="25.0"
xboostmaxdepth=4
xgboostminchild=300
est=10
for xgboostsubsample in 1.0
do
    for limit in 10000 #1000
    do
        for regressor in "xgb" #"lgbm"
        do
            for xboostmaxdepth in 4 6 #8 10 12
            do
                for xgboostminchild in 10 #300
                do
                    for est in 10
                    do
                        generate
                    done
                done
            done
        done
    done
done
exit 0
# simple run:
# ./cryptominisat5 goldb-heqc-i10mul.cnf --simdrat 1 --printsol 0 --predloc ./data/15-dec-b1bd8f74bc2b42-cut1-3.0-cut2-25.0-limit-10000-est10-xbmin10-xbmd4-regxgb-subs1.0/ --predtype py --predbestfeats ../scripts/crystal/best_features-correlation2.txt --predtables 000