This directory contains sets of scripts for testing plain kprobes or
systemtap probes on a set of kernel functions. It tries to bisect the
list if a probe on a set of kernel functions crashes the system.
Here's a workflow:
1) First put a list of functions to probe in a file, by default called
'probes.all'. If you'd like a list of all kernel functions, you can
run 'gen_code_all.sh'.
2a) To test plain kprobes, run:
# ./kprobes_test.py
2b) To test systemtap probes, run:
# ./stap_probes_test.py
3) Wait for the script to finish. Feel free to watch kprobes_test.log
for progress. If the system crashes, restart it. Whichever script you
started ('kprobes_test.py' or 'stap_probes_test.py') should be run
automatically when the system comes back up.
If you'd like help monitoring the target system, you can use
'monitor_system.py', which will try to automatically reboot a remote
system if the system can't be reached.
4) When the script (either ./kprobes_test.py or ./stap_probes_test.py)
finishes, you'll see the following output files.
- probes.unregistered is a file that contains the list of kernel
functions where a probe couldn't be registered.
- probes.untriggered is a file that contains the list of kernel
functions where a probe could be registered, but the kernel
function wasn't actually called. So, we don't really know whether
these functions passed or failed.
- probes.passed is a file that contains the list of kernel functions
where a probe was registered and the kernel function was called
with no crash.
- probes.failed is a file that contains the list of kernel functions
where a probe was registered and calling the kernel function
caused a crash.
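As a quick way to review a finished run, the four output files above can
be tallied with a few lines of Python. This is only a sketch: it assumes
each file holds one function name per line (the format is not spelled
out here), and the helper names 'read_probe_list' and 'summarize' are
made up for illustration, not part of the test scripts.

```python
from pathlib import Path

# File names as listed above; the one-name-per-line layout is an assumption.
RESULT_FILES = ("probes.unregistered", "probes.untriggered",
                "probes.passed", "probes.failed")


def read_probe_list(path):
    """Return the non-empty, stripped lines of a probe list ([] if missing)."""
    p = Path(path)
    if not p.exists():
        return []
    return [line.strip() for line in p.read_text().splitlines() if line.strip()]


def summarize(directory="."):
    """Map each result file name to the number of functions it lists."""
    return {name: len(read_probe_list(Path(directory) / name))
            for name in RESULT_FILES}


if __name__ == "__main__":
    # Print one line per result file for the current directory.
    for name, count in summarize().items():
        print(f"{name:22s} {count}")
```

Run it from the directory that holds the probes.* files. A missing file
is counted as empty rather than treated as an error.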
Here are the original instructions for 'gen_whitelist', which tested
systemtap probe safety.
====
To start a kprobes test run:
runtest whitelist.exp
----
This is an implementation of whitelist generation for safe-mode probes,
based on the discussion in an earlier thread
(http://sourceware.org/ml/systemtap/2006-q3/msg00574.html).
The main idea is:
1) Fetch a group of probe points from probes.pending, probe them, and
run some workloads (e.g. runltp) in parallel.
2) If the probe test ends without a crash, the probe points that were
actually triggered are moved into probes.passed and the untriggered
ones are moved into probes.untriggered;
If the probe test crashes the system, it is resumed automatically
after the system reboots. The probe points that were triggered are
also moved into probes.passed, but the untriggered ones are moved
into probes.failed.
3) Repeat the above until probes.pending becomes empty, then:
Normally, probes.pending is reinitialized from probes.failed and
probes.untriggered, and the next iteration starts;
But if the maximum running level (currently 3) is reached, or
probes.pending, probes.failed and probes.untriggered are all empty,
the whole test stops.
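The bookkeeping in steps 1) to 3) can be sketched in Python as below.
This is an in-memory simulation, not the code in whitelist.exp: the real
test keeps its lists in the probes.* files so that it can resume after a
reboot. The names 'run_whitelist', 'probe_group' and
'group_size_for_level' are hypothetical, and 'probe_group' stands in for
an actual probe run plus workload.

```python
def run_whitelist(all_probes, group_size_for_level, probe_group, max_level=3):
    """Simulate the pending/passed/failed/untriggered bookkeeping.

    probe_group(group) is a hypothetical callback standing in for a real
    probe run. It returns (triggered, crashed), where triggered is the set
    of probe points that fired before the run ended or crashed.
    """
    pending = list(all_probes)
    passed, failed, untriggered = [], [], []
    for level in range(1, max_level + 1):
        failed, untriggered = [], []
        size = max(1, group_size_for_level(level, len(pending)))
        while pending:
            group, pending = pending[:size], pending[size:]
            triggered, crashed = probe_group(group)
            for point in group:
                if point in triggered:
                    passed.append(point)       # fired, crash or not
                elif crashed:
                    failed.append(point)       # did not fire, run crashed
                else:
                    untriggered.append(point)  # did not fire, no crash
        if not failed and not untriggered:
            break                              # nothing left to retry
        # Retry the suspects at the next level, usually in smaller groups.
        pending = failed + untriggered
    return passed, failed, untriggered
```

Because a crash marks every untriggered point in the group as failed,
retrying with smaller groups at each level is what narrows the failed
list down to the probe points that really cause the crash.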
To be able to resume after a crash, this test appends its startup
commands to the end of /etc/rc.d/rc.local when it begins and removes
them when it finishes.
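One way to make such an rc.local hook easy to remove is to wrap it in
marker comments. The sketch below illustrates that idea only. It is not
the code the test uses: the marker strings and the function names
'install_hook' and 'remove_hook' are assumptions, and the file path is a
parameter so the code can be tried on a scratch file instead of the real
/etc/rc.d/rc.local.

```python
from pathlib import Path

# Hypothetical marker comments; they let the hook be found and removed later.
BEGIN = "# BEGIN probe-test resume hook"
END = "# END probe-test resume hook"


def install_hook(rc_local, command):
    """Append a marked block that runs `command` at boot (no-op if present)."""
    path = Path(rc_local)
    text = path.read_text() if path.exists() else ""
    if BEGIN in text:
        return
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + f"{BEGIN}\n{command}\n{END}\n")


def remove_hook(rc_local):
    """Delete the marked block, leaving the rest of the file untouched."""
    path = Path(rc_local)
    kept, inside = [], False
    for line in path.read_text().splitlines(keepends=True):
        stripped = line.rstrip("\n")
        if stripped == BEGIN:
            inside = True
        elif stripped == END:
            inside = False
        elif not inside:
            kept.append(line)
    path.write_text("".join(kept))
```

Calling install_hook() twice leaves a single copy of the block, so a test
that is resumed after a crash does not pile up duplicate entries.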
The gen_tapset_all.sh script generates a "probes.all" file based on
the current systemtap tapset.
We suggest running a script on a remote server that restarts the test
machine automatically in case the probe test crashes it.
Instructions:
0) Please jump to 6) directly if you use all default settings
1) Define your list of probe points in a file named "probes.all";
otherwise, it will be automatically extracted from
stap -p2 -e 'probe kernel.function("*"){}'
2) Define your own workloads by changing the "benchs" list variable;
otherwise, several LTP testcases will be used
3) Define how many times you want the test to be iterated by changing
the MAX_RUNNING_LEVEL variable; otherwise, 3 is the default
4) Define the group size for each iteration level by changing the
proper_current_size() function; otherwise, the sizes will be set in
decreasing order based on the length of probes.all
5) Remove /stp_genwhitelist_running if you want to start a test from
the beginning after a system crash
6) Start a test:
runtest whitelist.exp