Cpuset (cset) Tutorial
######################
Alex Tsariounov <alext@novell.com> +
Copyright (c) 2009-2011 Novell Inc., cset v1.5.6 +
Verbatim copying and distribution of this entire article are permitted
worldwide, without royalty, in any medium, provided this notice is preserved.
This tutorial describes basic and advanced usage of the *cset* command to
manipulate cpusets on a Linux system. See also the manpages that come with
the *cset* command: cset(1), cset-shield(1), cset-set(1), and cset-proc(1) for
more details. Additionally, the *cset* command has online help.
Introduction
============
In the Linux kernel, the cpuset facility provides a mechanism for creating
logical entities called "cpusets" that encompass definitions of CPUs and NUMA
Memory Nodes (if NUMA is available). Cpusets constrain the CPU and Memory
placement of a task to only the resources defined within that cpuset. These
cpusets can then be arranged into a nested hierarchy visible in the "cpuset"
virtual filesystem. Sets of tasks can be assigned to these cpusets to
constrain the resources that they use. The tasks can be moved from one cpuset
to another to utilize other resources defined in those other cpusets.
The *cset* command is a Python application that provides a command line front
end for the Linux cpusets functionality. Working with cpusets directly can be
confusing and slightly complex. The *cset* tool hides that complexity behind
an easy-to-use command line interface.
There are two distinct use cases for *cset*: the basic shielding use case and
the "advanced" case of using raw +set+ and +proc+ subcommands. The basic
shielding function is accessed with the +shield+ subcommand and described in
the next section. Using the raw +set+ and +proc+ subcommands allows one to
set up arbitrarily complex cpusets and is described in the later sections.
Note that in general, one either uses the +shield+ subcommand 'or' a
combination of the +set+ and +proc+ subcommands. One rarely, if ever, uses
all of these subcommands together. Doing so will likely become too
confusing. Additionally, the +shield+ subcommand sets up its required cpusets
with exclusively marked CPUs. This can interfere with your cpuset
strategy. If you find that you need more functionality for your strategy than
+shield+ provides, go ahead and transition to using +set+ and +proc+
exclusively. It is straightforward to implement what +shield+ does with a few
extra +set+ and +proc+ subcommands.
Obtaining Online Help
---------------------
For a full list of *cset* subcommands::
+# cset help+
For in-depth help on individual subcommands::
+# cset help <subcommand>+
For options of individual subcommands::
+# cset <subcommand> (-h | --help)+
The Basic Shielding Model
=========================
Although any setup of cpusets can really be described as "shielding," there
is one prevalent shielding model in use that is so common that *cset* has a
subcommand that is dedicated to its use. This subcommand is called +shield+.
The concept behind this model is the use of three cpusets. The first is the
'root' cpuset, which is always present in all configurations and contains all
CPUs. The second is the 'system' cpuset, which contains the CPUs used for
system tasks. These are the normal tasks that are not "important," but which
need to run on the system. The third is the 'user' cpuset, which contains the
CPUs used for "important" tasks. The 'user' cpuset is the shield. Only those
tasks that are somehow important, usually tasks whose performance determines
the overall rating for the machine, are run in the 'user' cpuset.
The +shield+ subcommand manages all of these cpusets and lets you define the
CPUs and Memory Nodes that are in the 'shielded' and 'unshielded' sets. The
subcommand automatically moves all movable tasks on the system into the
'unshielded' cpuset on shield activation, and back into the 'root' cpuset on
shield tear down. The subcommand then lets you move tasks into and out of the
shield. Additionally, you can move special tasks (kernel threads) which
normally run in the 'root' cpuset into the 'unshielded' set so that your
shield will have even less disturbance.
The +shield+ subcommand abstracts the management of these cpusets away from
you and provides options that drive how the shield is set up, which tasks are
to be shielded and which tasks are not, and status of the shield. In fact,
you need not be bothered with the naming of the required cpusets or even where
the cpuset filesystem is mounted. *Cset* and the +shield+ subcommand take
care of all that.
If you find yourself needing to define more cpusets for your application, then
it is likely that this simple shielding is not a rich enough model for you.
In this case, you should transition to using the +set+ and +proc+ subcommands
described in a later section.
A Simple Shielding Example
--------------------------
Assume that we have a 4-way machine that is not NUMA. This means there are 4
CPUs at our disposal and there is only one Memory Node available. On such
machines, we do not need to specify any memory node parameters to *cset*; it
sets up the only available memory node by default.
Usually, one wants to dedicate as many CPUs to the shield as possible and
leave a minimal set of CPUs for normal system processing. The reasoning is
that the performance of the important tasks will rule the performance of the
installation as a whole, and these important tasks need as many resources
available to them as possible, exclusive of other, unimportant tasks that are
running on the system.
NOTE: I use the word "task" to represent either a process or a thread that is
running on the system.
Setup and Teardown of the Shield
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To set up a shield of 3 CPUs with 1 CPU left for low priority system
processing, issue the following command.
-------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield -c 1-3
cset: --> activating shielding:
cset: moving 176 tasks from root into system cpuset...
[==================================================]%
cset: "system" cpuset of CPUSPEC(0) with 176 tasks running
cset: "user" cpuset of CPUSPEC(1-3) with 0 tasks running
-------------------------------------------------------------------
This command does a number of things. First, a 'user' cpuset is created with
what's called a CPUSPEC (CPU specification) from the +-c/\--cpu+ option. This
CPUSPEC specifies to use CPUs 1 through 3 inclusively. Next, the command
creates a 'system' cpuset with a CPUSPEC that is the inverse of the +-c+
option for the current machine. On this machine that cpuset will only contain
the first CPU, CPU0. Next, all userspace processes running in the 'root'
cpuset are transferred to the 'system' cpuset. This makes all those processes
run only on CPU0. The effect of this is that the shield consists of CPUs 1
through 3 and they are now idling.
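The relationship between the +-c+ CPUSPEC and its inverse can be sketched in a
few lines of Python. This is only an illustration of the arithmetic described
above, not the code that *cset* itself uses; the function names
+parse_cpuspec+ and +invert_cpuspec+ are hypothetical, and the sketch assumes
a CPUSPEC made of comma-separated CPU numbers and inclusive dash ranges.

```python
def parse_cpuspec(spec):
    """Expand a CPUSPEC such as "1-3" or "1,3" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return cpus


def invert_cpuspec(spec, ncpus):
    """Return the CPUs of an ncpus-way machine that are NOT named in spec."""
    return set(range(ncpus)) - parse_cpuspec(spec)


# The 4-way machine from the example: CPUs 1-3 are shielded, CPU0 is left over.
shielded = parse_cpuspec("1-3")
unshielded = invert_cpuspec("1-3", ncpus=4)
print(sorted(shielded), sorted(unshielded))  # prints: [1, 2, 3] [0]
```

On the 4-way example machine, the inverse of +1-3+ is just CPU0, which matches
the CPUSPEC(0) reported for the 'system' cpuset.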
Note that the command did not move the kernel threads that are running in the
'root' cpuset to the 'system' cpuset. This is because you may want these
kernel threads to use all available CPUs. If you do not, then you can use the
+-k/\--kthread+ option as described below.
The shield setup command above outputs the information of which cpusets were
created and how many tasks are running on each. If you want to see the
current status of the shield again, issue this command:
-------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield
cset: --> shielding system active with
cset: "system" cpuset of CPUSPEC(0) with 176 tasks running
cset: "user" cpuset of CPUSPEC(1-3) with 0 tasks running
-------------------------------------------------------------------
This shows us that the shield is set up and that 176 tasks are running in the
'system' cpuset--the "unshielded" cpuset.
It is important to move all possible tasks from the 'root' cpuset to the
unshielded 'system' cpuset because a task's cpuset property is inherited by
its children. Since we've moved all running tasks (including init) to the
unshielded 'system' cpuset, that means that any new tasks that are spawned
will *also* run in the unshielded 'system' cpuset.
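If you want to check which cpuset a task currently belongs to, the Linux
kernel exposes that in the +/proc/<pid>/cpuset+ file. The following Python
sketch reads it; the helper name +cpuset_of+ and its +proc_root+ parameter are
my own for illustration, and the sketch assumes a Linux system with the proc
filesystem mounted (it returns +None+ otherwise).

```python
import os


def cpuset_of(pid="self", proc_root="/proc"):
    """Return the cpuset path of a task, or None if it cannot be read."""
    path = os.path.join(proc_root, str(pid), "cpuset")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        # No such task, no cpuset support, or not running on Linux.
        return None


# A child inherits its parent's cpuset, so a script started from a shell
# reports the same cpuset as that shell.
print(cpuset_of())
```

Run from a shell after the shield is activated, a script like this would be
expected to report the 'system' cpuset, since the shell and all of its
children were moved there.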
Some kernel threads can be moved into the unshielded 'system' cpuset as well.
These are the threads that are not bound to specific CPUs. If a kernel thread
is bound to a specific CPU, then it is generally not a good idea to move that
thread to the 'system' set because at worst it may hang the system and at best
it will slow the system down significantly. These threads are usually the IRQ
threads on a real time Linux kernel, for example, and you may want to not move
these kernel threads into 'system'. If you leave them in the 'root' cpuset,
then they will have access to all CPUs.
However, if your application demands an even "quieter" shield, then you can
move all movable kernel threads into the unshielded 'system' set with the
following command.
--------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield -k on
cset: --> activating kthread shielding
cset: kthread shield activated, moving 70 tasks into system cpuset...
[==================================================]%
cset: done
--------------------------------------------------------------------
You can see that this moved an additional 70 tasks to the unshielded 'system'
cpuset. Note that the +-k/\--kthread on+ parameter can be given at the shield
creation time as well and you do not need to perform these two steps
separately if you know that you will want kernel thread shielding as well.
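If you drive *cset* from a script, the one-step and two-step forms can be
built as argument lists. The Python sketch below only constructs and prints
the command lines shown in this section; the helper name
+shield_setup_commands+ is hypothetical, and actually running the commands
(for example with +subprocess.run+) requires root and an installed *cset*.

```python
import shlex


def shield_setup_commands(cpuspec, kthread=False, one_step=True):
    """Build the cset command lines that activate a shield."""
    setup = ["cset", "shield", "-c", cpuspec]
    if not kthread:
        return [setup]
    if one_step:
        # Kernel thread shielding requested together with shield creation.
        return [setup + ["-k", "on"]]
    # Otherwise activate the shield first, then kernel thread shielding.
    return [setup, ["cset", "shield", "-k", "on"]]


for argv in shield_setup_commands("1-3", kthread=True):
    print(shlex.join(argv))  # prints: cset shield -c 1-3 -k on
```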
Executing *cset shield* again shows us the current state of the shield.
--------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield
cset: --> shielding system active with
cset: "system" cpuset of CPUSPEC(0) with 246 tasks running
cset: "user" cpuset of CPUSPEC(1-3) with 0 tasks running
--------------------------------------------------------------------
You can get a detailed listing of what is running in the shield by specifying
either +-s/\--shield+ or +-u/\--unshield+ to the +shield+ subcommand and using
the verbose flag. You will get output similar to the following.
--------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield --unshield -v
cset: "system" cpuset of CPUSPEC(0) with 251 tasks running
USER       PID  PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root         1     0 Soth init [5]
root         2     0 Soth [kthreadd]
root        84     2 Sf50 [IRQ-9]
...
alext    31796 31789 Soth less
root     32653 25222 Roth python ./cset shield --unshield -v
--------------------------------------------------------------------
Note that I abbreviated the listing; we do have 251 tasks running in the
'system' set. The output is self-explanatory; however, the "SPPr" field may
need a little explanation. "SPPr" stands for State, Policy and Priority. You
can see that the initial two tasks are sleeping (state "S") and running under
the timeshare policy, marked as "oth" (for "other"). The [IRQ-9] task is also
sleeping, but is marked with the real time FIFO policy at a priority of 50.
The last task in the listing is the *cset* command itself and is marked as
running (state "R"). Also note that adding a second +-v/\--verbose+ option
will not restrict the output to fit into an 80 character screen.
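To make the layout of the "SPPr" field concrete, here is a small Python sketch
that splits it into its three parts. It is based only on the values that
appear in the listing above ("Soth", "Sf50", "Roth"); the function name
+decode_sppr+ is hypothetical, and any other policy code is treated as
unknown rather than guessed at.

```python
import re

# One state letter, then a policy code, then an optional numeric priority.
# Only the two policy codes seen in this tutorial are recognized.
_SPPR = re.compile(r"^(?P<state>[A-Za-z])(?P<policy>oth|f)(?P<prio>\d*)$")
_POLICY_NAMES = {"oth": "other", "f": "fifo"}


def decode_sppr(field):
    """Split an SPPr value into (state letter, policy name, priority)."""
    m = _SPPR.match(field)
    if m is None:
        raise ValueError("unrecognized SPPr value: %r" % field)
    prio = int(m.group("prio")) if m.group("prio") else None
    return m.group("state"), _POLICY_NAMES[m.group("policy")], prio


print(decode_sppr("Soth"))  # prints: ('S', 'other', None)
print(decode_sppr("Sf50"))  # prints: ('S', 'fifo', 50)
```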
Tear down of the shield, stopping the shield in other words, is done with the
+-r/\--reset+ option to the +shield+ subcommand. When this command is issued,
both the 'system' and 'user' cpusets are deleted and any tasks that are
running in either of those cpusets are moved to the 'root' cpuset. Once so
moved, all tasks will have access to all resources on the system. For
example:
--------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield --reset
cset: --> deactivating/reseting shielding
cset: moving 0 tasks from "/user" user set to root set...
cset: moving 250 tasks from "/system" system set to root set...
[==================================================]%
cset: deleting "/user" and "/system" sets
cset: done
--------------------------------------------------------------------
Moving Interesting Tasks Into and Out of the Shield
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Now that we have a shield running, the objective is to run our "important"
processes in that shield. These processes can be anything, but usually they
are directly related to the purpose of the machine. There are two ways to run
tasks in the shield:
. Exec a process into the shield
. Move an already running task into the shield
Execing a Process into the Shield
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Running a new process in the shield can be done with the +-e/\--exec+ option
to the +shield+ subcommand. This is the simplest way to get a task to run in
the shield. For this example, let's exec a new bash shell into the shield
with the following commands.
--------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield -s
cset: "user" cpuset of CPUSPEC(1-3) with 0 tasks running
cset: done
[zuul:cpuset-trunk]# cset shield -e bash
cset: --> last message, executed args into cpuset "/user", new pid is: 13300
[zuul:cpuset-trunk]# cset shield -s -v
cset: "user" cpuset of CPUSPEC(1-3) with 2 tasks running
USER       PID  PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root     13300  8583 Soth bash
root     13329 13300 Roth python ./cset shield -s -v
[zuul:cpuset-trunk]# exit
[zuul:cpuset-trunk]# cset shield -s
cset: "user" cpuset of CPUSPEC(1-3) with 0 tasks running
cset: done
--------------------------------------------------------------------
The first command above lists the status of the shield. We see that the
shield is defined as CPUs 1 through 3 inclusive and currently there are no
tasks running in it.
The second command execs the bash shell into the shield with the +-e+
option. The last message of cset lists the PID of the new process.
NOTE: *cset* follows the tradition of separating the tool options from the
command to be execed options with a double dash (+\--+). This is not shown in
this simple example, but if the command you want to exec also takes options,
separate them with the double dash like so: +# cset shield -e mycommand \-- -v+
The +-v+ will be passed to +mycommand+, and not to *cset*.
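The double dash convention can be illustrated with a short Python sketch that
splits an argument list at the first +\--+. This shows the convention only
and is not how *cset* parses its command line; the function name
+split_at_double_dash+ is hypothetical.

```python
def split_at_double_dash(argv):
    """Split argv into (tool arguments, arguments for the execed command)."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return list(argv), []


tool_args, command_args = split_at_double_dash(
    ["cset", "shield", "-e", "mycommand", "--", "-v"]
)
print(tool_args)     # prints: ['cset', 'shield', '-e', 'mycommand']
print(command_args)  # prints: ['-v']
```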
The next command lists the status of the shield again. You will note that
there are actually two tasks running shielded: our new shell and the *cset*
status command itself. Remember that the cpuset property of a task is
inherited by its children. Since we ran the new shell in the shield, its
child, which is the status command, also ran in the shield.
TIP: Execing a shell into the shield is a useful way to experiment with
running tasks in the shield since all children of the shell will also run in
the shield.
The last command exits the shell after which we request a shield status again
and see that once again, it does not contain any tasks.
You may have noticed in the output above that both the new shell and the
status command are running as the root user. This is because *cset* needs to
run as root and so all its children will also run as root. If you need to
run a process under a different user and/or group, you may use the +\--user+
and +\--group+ options for +exec+ as follows.
--------------------------------------------------------------------
[zuul:cpuset-trunk]# cset shield --user=alext --group=users -e bash
cset: --> last message, executed args into cpuset "/user", new pid is: 14212
alext@zuul> cset shield -s -v
cset: "user" cpuset of CPUSPEC(1-3) with 2 tasks running
USER       PID  PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
alext    14212  8583 Soth bash
alext    14241 14212 Roth python ./cset shield -s -v
--------------------------------------------------------------------
Moving a Running Task into and out of the Shield
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
While execing a process into the shield is undoubtedly useful, most of the
time, you'll want to move already running tasks into and out of the shield.
The *cset* shield subcommand includes two options for doing this:
+-s/\--shield+ and +-u/\--unshield+. These options require what's called a
PIDSPEC (process specification) to also be specified with the +-p/\--pid+
option. The PIDSPEC defines which tasks get operated on. The PIDSPEC can be
a single process ID, a comma-separated list of process IDs, or a
comma-separated list that also contains ranges of process IDs, each range
written as two IDs joined by a dash. For example:
+\--shield --pid 1234+::
This PIDSPEC argument specifies that PID 1234 be shielded.
+\--shield --pid 1234,42,1934,15000,15001,15002+::
This PIDSPEC argument specifies that this list of PIDs only be moved into the
shield.
+\--unshield -p 5000,5100,6010-7000,9232+::
This PIDSPEC argument specifies that PIDs 5000, 5100 and 9232 be unshielded
(moved out of the shield) along with any existing PID that is in the range
6010 through 7000 inclusive.
NOTE: A range in a PIDSPEC does not have to have tasks running for every
number in that range. In fact, it is not even an error if there are no tasks
running in that range; none will be moved in that case. The range simply
specifies to act on any tasks that have a PID or TID that is within that
range.
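The PIDSPEC rules above can be sketched in Python as follows. This is a
minimal illustration of the matching semantics, not *cset*'s own parser; the
names +parse_pidspec+ and +select_pids+ are hypothetical, and the list of
existing PIDs is made up for the example.

```python
def parse_pidspec(spec):
    """Turn a PIDSPEC string into a list of inclusive (low, high) ranges."""
    ranges = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        # A single PID is treated as a range of length one.
        ranges.append((int(lo), int(hi or lo)))
    return ranges


def select_pids(spec, existing):
    """Return the existing PIDs that fall inside the PIDSPEC."""
    ranges = parse_pidspec(spec)
    return sorted(p for p in existing if any(lo <= p <= hi for lo, hi in ranges))


existing = [1, 5000, 6010, 6500, 7000, 7001, 9232]  # made-up PIDs
print(select_pids("5000,5100,6010-7000,9232", existing))
# prints: [5000, 6010, 6500, 7000, 9232]

# A range with no running tasks is not an error; it simply selects nothing.
print(select_pids("22010-22020", existing))  # prints: []
```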
Use of the appropriate PIDSPEC can thus be handy to move tasks and groups of
tasks into and out of the shield. Additionally, there is one more option that
can help with multi-threaded processes, and that is the +\--threads+ flag. If
this flag is present in a +shield+ or +unshield+ command with a PIDSPEC and if
any of the task IDs in the PIDSPEC belong to a thread in a process container,
then *all* the sibling threads in that process container will get shielded or
unshielded as well. This flag provides an easy mechanism to shield/unshield
all threads of a process by simply specifying one thread in that process.
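To see which thread IDs belong to a process, and would therefore travel
together under +\--threads+, you can list the +/proc/<pid>/task+ directory on
Linux. The Python sketch below does just that; the helper name
+sibling_threads+ and its +proc_root+ parameter are hypothetical, and it
assumes you pass the process ID of the container.

```python
import os


def sibling_threads(pid, proc_root="/proc"):
    """Return the sorted thread IDs of a process, or [] if unavailable."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    try:
        return sorted(int(tid) for tid in os.listdir(task_dir))
    except OSError:
        # The process has exited, or /proc is not available on this system.
        return []


# For a single-threaded process, the only entry is the process itself.
print(sibling_threads(os.getpid()))
```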
In the following example, we move the current shell into the shield with a
range PIDSPEC and back out with the bash variable for the current PID.
-----------------------------------------------------------------------
[zuul:cpuset-trunk]# echo $$
22018
[zuul:cpuset-trunk]# cset shield -s -p 22010-22020
cset: --> shielding following pidspec: 22010-22020
cset: done
[zuul:cpuset-trunk]# cset shield -s -v
cset: "user" cpuset of CPUSPEC(1-3) with 2 tasks running
USER       PID  PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root      3770 22018 Roth python ./cset shield -s -v
root     22018  5034 Soth bash
cset: done
[zuul:cpuset-trunk]# cset shield -u -p $$
cset: --> unshielding following pidspec: 22018
cset: done
[zuul:cpuset-trunk]# cset shield -s
cset: "user" cpuset of CPUSPEC(1-3) with 0 tasks running
cset: done
-----------------------------------------------------------------------
NOTE: Ordinarily, the +shield+ option will shield a PIDSPEC only if it is
currently running in the 'system' set--the unshielded set. The +unshield+
option will unshield a PIDSPEC only if it is currently running in the 'user'
set--the shielded set. If you want to +shield/unshield+ a process that
happens to be running in the 'root' set (not common), then use the +\--force+
option for these commands.
Full Featured Cpuset Manipulation Commands
==========================================
While basic shielding as described above is useful and a common use model for
*cset*, there comes a time when more functionality will be desired to
implement your strategy. To implement this, *cset* provides two subcommands:
+set+, which allows you to manipulate cpusets; and +proc+, which allows you to
manipulate processes within those cpusets.
The Set Subcommand
------------------
In order to do anything with cpusets, you must be able to create, adjust,
rename, move and destroy them. The +set+ subcommand allows the management of
cpusets in such a manner.
Creating and Destroying Cpusets with Set
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The basic syntax of +set+ for cpuset creation is:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -c 1-3 -s my_cpuset1
cset: --> created cpuset "my_cpuset1"
----------------------------------------------------------
This creates a cpuset named "my_cpuset1" with a CPUSPEC of CPU1, CPU2 and
CPU3. The CPUSPEC is the same concept as described in the "Setup and
Teardown of the Shield" section above. The +set+ subcommand also takes a
+-m/\--mem+ option that lets you specify the memory nodes the set will use as
well as flags to make the CPUs and MEMs exclusive to the cpuset. If you are
on a non-NUMA machine, just leave the +-m+ option out and the default memory
node 0 will be used.
Just like with +shield+, you can adjust the CPUs and MEMs with subsequent
calls to +set+. If, for example, you wish to adjust the "my_cpuset1" cpuset
to only use CPUs 1 and 3 (and omit CPU2), then issue the following command.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -c 1,3 -s my_cpuset1
cset: --> modified cpuset "my_cpuset
----------------------------------------------------------
*cset* will then adjust the CPUs that are assigned to the "my_cpuset1" set to
only use CPU1 and CPU3.
To rename a cpuset, use the +-n/\--newname+ option. For example:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -s my_cpuset1 -n super_set
cset: --> renaming "/cpusets/my_cpuset1" to "super_set"
----------------------------------------------------------
This renames the cpuset called "my_cpuset1" to "super_set".
To destroy a cpuset, use the +-d/\--destroy+ option as follows.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -d super_set
cset: --> processing cpuset "super_set", moving 0 tasks to parent "/"...
cset: --> deleting cpuset "/super_set"
cset: done
----------------------------------------------------------
This command destroys the newly created cpuset called "super_set". When a
cpuset is destroyed, all the tasks running in it are moved to the parent
cpuset. The 'root' cpuset, which always exists and always contains all CPUs,
cannot be destroyed. You may also give the +\--destroy+ option a list of
cpusets to destroy.
NOTE: The *cset* command creates the cpusets based on a mounted cpuset
filesystem. You do not need to know where that filesystem is mounted,
although it is easy to figure out (by default it's on '/cpusets'). When you
give the +set+ subcommand a name for a new cpuset, it is created wherever the
cpuset filesystem is mounted.
If you want to create a cpuset hierarchy, then you must give a path to the
*cset* +set+ subcommand. This path will always begin with the 'root' cpuset,
for which the path is '/'. For example:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -c 1,3 -s top_set
cset: --> created cpuset "top_set"
[zuul:cpuset-trunk]# cset set -c 3 -s /top_set/sub_set
cset: --> created cpuset "/top_set/sub_set"
----------------------------------------------------------
These commands created two cpusets: 'top_set' and 'sub_set'. The 'top_set'
uses CPU1 and CPU3. It has a subset of 'sub_set' which only uses CPU3. Once
you have created a subset with a path, then if the name is unique, you do not
have to specify the path in order to affect it. If the name is not unique,
then *cset* will complain and ask you to use the path. For example:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -c 1,3 -s sub_set
cset: --> modified cpuset "sub_set
----------------------------------------------------------
This command adds CPU1 to the 'sub_set' cpuset for its use. Note that using
the path in this case is optional.
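The name-versus-path rule can be modeled with a short Python sketch. It
resolves a cpuset given either a full path or a bare name, and refuses a bare
name that matches more than one cpuset. This is my own illustration of the
behavior described above, not *cset*'s implementation; +resolve_cpuset+ and
the +/other/sub_set+ path are hypothetical.

```python
def resolve_cpuset(name, paths):
    """Resolve a cpuset name or path against the known cpuset paths."""
    if name.startswith("/"):
        matches = [p for p in paths if p == name]
    else:
        # A bare name matches the last component of any path.
        matches = [p for p in paths if p.rsplit("/", 1)[-1] == name]
    if not matches:
        raise LookupError("no such cpuset: %s" % name)
    if len(matches) > 1:
        raise LookupError("ambiguous name %r, use a path: %s" % (name, matches))
    return matches[0]


paths = ["/top_set", "/top_set/sub_set"]
print(resolve_cpuset("sub_set", paths))  # prints: /top_set/sub_set

# With a second cpuset of the same name, the bare name is no longer enough.
try:
    resolve_cpuset("sub_set", paths + ["/other/sub_set"])
except LookupError as err:
    print(err)
```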
If you attempt to destroy a cpuset which has sub-cpusets, *cset* will complain
and not do it unless you use the +-r/\--recurse+ and the +\--force+ options.
If you do use +\--force+, then all the tasks running in all subsets of the
deletion target cpuset will be moved to the target's parent cpuset and all
of those cpusets will be destroyed.
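A recursive destroy has to work from the bottom of the hierarchy upward,
because the kernel will not remove a cpuset that still has child cpusets. The
Python sketch below computes such an order for a list of cpuset paths. It is
an assumption about how a recursive removal must proceed in general, not a
description of *cset*'s code; +destroy_order+ and the example paths are
hypothetical.

```python
def destroy_order(target, paths):
    """Return target and its subsets, ordered deepest first."""
    prefix = target.rstrip("/") + "/"
    subsets = [p for p in paths if p.startswith(prefix)]
    # More path separators means deeper in the hierarchy.
    subsets.sort(key=lambda p: p.count("/"), reverse=True)
    return subsets + [target]


paths = ["/one", "/one/two", "/one/two/three", "/other"]
print(destroy_order("/one", paths))
# prints: ['/one/two/three', '/one/two', '/one']
```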
Moving a cpuset from under a certain cpuset to a different location is
currently not implemented and is slated for a later release of *cset*.
Listing Cpusets with Set
~~~~~~~~~~~~~~~~~~~~~~~~
To list cpusets, use the +set+ subcommand with the +-l/\--list+ option. For
example:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -l
cset:
        Name       CPUs-X    MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
        root        0-3 y       0 y   320    1 /
         one          3 n       0 n     0    1 /one
----------------------------------------------------------
This shows that there is currently one cpuset present called 'one'. (Of
course, there is also the 'root' set, which is always present.) The output shows
that the 'one' cpuset has no tasks running in it. The 'root' cpuset has 320
tasks running. The "-X" for "CPUs" and "MEMs" fields denotes whether the CPUs
and MEMs in the cpusets are marked exclusive to those cpusets. Note that the
'one' cpuset has subsets as indicated by a 1 in the 'Subs' field. You can
specify a cpuset to list with the +set+ subcommand as follows.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -l -s one
cset:
        Name       CPUs-X    MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
         one          3 n       0 n     0    1 /one
         two          3 n       0 n     0    1 /one/two
----------------------------------------------------------
This output shows that there is a cpuset called 'two' in cpuset 'one' and it
also has a subset. You can also ask for a recursive listing as follows.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -l -r
cset:
        Name       CPUs-X    MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
        root        0-3 y       0 y   320    1 /
         one          3 n       0 n     0    1 /one
         two          3 n       0 n     0    1 /one/two
       three          3 n       0 n     0    0 /one/two/three
----------------------------------------------------------
This command lists all cpusets existing on the system since it asks for a
recursive listing beginning at the 'root' cpuset. Incidentally, should you
need to specify the 'root' cpuset you can use either +root+ or +/+ to specify it
explicitly--just remember that the 'root' cpuset cannot be deleted or modified.
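If you need the listing in a script, the rows can be split on whitespace. The
Python sketch below parses rows in the layout shown in this section; it
assumes exactly the eight columns that appear here, and the field names and
the function name +parse_set_listing+ are my own, not part of *cset*.

```python
FIELDS = ("name", "cpus", "cpu_exclusive", "mems", "mem_exclusive",
          "tasks", "subs", "path")

SAMPLE = """\
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 320 1 /
one 3 n 0 n 0 1 /one
"""


def parse_set_listing(text):
    """Parse the rows of a cpuset listing into a list of dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the banner, the header and the dashed separator line.
        if len(parts) != len(FIELDS) or parts[0].startswith("-"):
            continue
        row = dict(zip(FIELDS, parts))
        row["cpu_exclusive"] = row["cpu_exclusive"] == "y"
        row["mem_exclusive"] = row["mem_exclusive"] == "y"
        row["tasks"] = int(row["tasks"])
        row["subs"] = int(row["subs"])
        rows.append(row)
    return rows


for row in parse_set_listing(SAMPLE):
    print(row["path"], row["cpus"], row["tasks"])
```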
The Proc Subcommand
-------------------
Now that we know how to create, rename and destroy cpusets with the +set+
subcommand, the next step is to manage threads and processes in those
cpusets. The subcommand to do this is called +proc+ and it allows you to exec
processes into a cpuset, move existing tasks around existing cpusets, and list
tasks running in specified cpusets. For the following examples, let us
assume a cpuset setup of two sets as follows:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -l
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 309 2 /
two 2 n 0 n 3 0 /two
three 3 n 0 n 10 0 /three
----------------------------------------------------------
Listing Tasks with Proc
~~~~~~~~~~~~~~~~~~~~~~~
Operation of the +proc+ subcommand follows the same model as the +set+
subcommand. For example, to list tasks in a cpuset, you need to use the
+-l/\--list+ option and specify the cpuset by name or, if the name exists
multiple times in the cpuset hierarchy, by path. For example:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l -s two
cset: "two" cpuset of CPUSPEC(2) with 3 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root 16141 4300 Soth bash
root 16171 16141 Soth bash
root 16703 16171 Roth python ./cset proc -l two
----------------------------------------------------------
This output shows us that the cpuset called 'two' has CPU2 only attached to it
and is running three tasks: two shells and the python command to list it.
Note that cpusets are inherited so that if a process is contained in a cpuset,
then any children it spawns also run within that set. In this case, the
python command to list set 'two' was run from a shell already running in set
'two'. This can be seen by the PPID (parent process ID) of the python
command matching the PID of the shell.
Additionally, the "SPPr" field needs explanation. "SPPr" stands for State,
Policy and Priority. You can see that the initial two tasks are sleeping ("S")
and running at timeshare priority, marked as "oth" (for "other"). The last
task is marked as running ("R") and is also at timeshare priority ("oth"). If
any of these tasks had been at real time priority, then the policy would be
shown as "f" for FIFO or "r" for round robin, and the priority would be a
number from 1 to 99. See below for an example.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l -s root | head -7
cset: "root" cpuset of CPUSPEC(0-3) with 309 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root 1 0 Soth init [5]
root 2 0 Soth [kthreadd]
root 3 2 Sf99 [migration/0]
root 4 2 Sf99 [posix_cpu_timer]
----------------------------------------------------------
This output shows the first few tasks in the 'root' cpuset. Note that both
'init' and '[kthreadd]' are running at timeshare; however, the '[migration/0]'
and '[posix_cpu_timer]' kernel threads are running at real time policy of FIFO
and priority of 99. Incidentally, this output is from a system running the
real time Linux kernel which runs some kernel threads at real time
priorities. And finally, note that you can of course use *cset* as any other
Linux tool and include it in pipelines as in the example above.
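If you post-process such listings in a script, the "SPPr" column can be split into its three parts. The following Python sketch is only an illustration: the field layout (one state character followed by either "oth" or a policy letter and a priority number) is inferred from the listings above and is not taken from the *cset* source, and the function name +decode_sppr+ is hypothetical.

```python
# Sketch only: decodes the SPPr column as it appears in the listings above.
# The layout is an assumption inferred from samples such as "Soth" and "Sf99".

STATES = {"R": "running", "S": "sleeping"}     # only the states seen above
POLICIES = {"f": "FIFO", "r": "round robin"}   # real time policies


def decode_sppr(field):
    """Return (state, policy, priority) for a field such as 'Soth' or 'Sf99'."""
    state = STATES.get(field[0], field[0])     # unknown codes pass through
    rest = field[1:].strip()
    if rest == "oth":
        return state, "other (timeshare)", None
    policy = POLICIES.get(rest[0], rest[0])
    return state, policy, int(rest[1:])        # priority is 1 to 99


for sample in ("Soth", "Roth", "Sf99"):
    print(sample, "->", decode_sppr(sample))
```

Timeshare tasks carry no real time priority, so the sketch reports +None+ for them.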
Taking a peek into the third cpuset called 'three', we see:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l -s three
cset: "three" cpuset of CPUSPEC(3) with 10 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
alext 16165 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16169 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16170 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16237 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16491 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16492 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16493 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 17243 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 17244 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 17265 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
----------------------------------------------------------
This output shows that a lot of 'beagled' tasks are running in this cpuset and
it also shows an ellipsis (...) at the end of their listings. If you see this
ellipsis, it means that the command was too long to fit onto an 80 character
screen. To see the entire command line, use the +-v/\--verbose+ flag, as
follows.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l -s three -v | head -4
cset: "three" cpuset of CPUSPEC(3) with 10 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
alext 16165 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg --autostarted --indexing-delay 300
----------------------------------------------------------
Execing Tasks with Proc
~~~~~~~~~~~~~~~~~~~~~~~
To exec a task into a cpuset, the +proc+ subcommand needs to be employed with
the +-e/\--exec+ option. Let's exec a shell into the cpuset named 'two' in
our set. First we check to see what is running in that set:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l -s two
cset: "two" cpuset of CPUSPEC(2) with 0 tasks running
[zuul:cpuset-trunk]# cset proc -s two -e bash
cset: --> last message, executed args into cpuset "/two", new pid is: 20955
[zuul:cpuset-trunk]# cset proc -l -s two
cset: "two" cpuset of CPUSPEC(2) with 2 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root 20955 19253 Soth bash
root 20981 20955 Roth python ./cset proc -l two
----------------------------------------------------------
You can see that initially, 'two' had nothing running in it. After the
completion of the second command, we list 'two' again and see that there are
two tasks running: the shell which we execed and the python *cset* command
that is listing the cpuset. The reason for the second task is that the cpuset
property of a running task is inherited by all its children. Since we
executed the listing command from the new shell which was bound to cpuset
'two', the resulting process for the listing is also bound to cpuset 'two'.
Let's test that by just running a new shell with no prefixed *cset* command.
----------------------------------------------------------
[zuul:cpuset-trunk]# bash
[zuul:cpuset-trunk]# cset proc -l -s two
cset: "two" cpuset of CPUSPEC(2) with 3 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root 20955 19253 Soth bash
root 21118 20955 Soth bash
root 21147 21118 Roth python ./cset proc -l two
----------------------------------------------------------
Here again we see that the second shell, PID 21118, has a parent PID of 20955
which is the first shell. Both shells as well as the listing command are
running in the 'two' cpuset.
NOTE: *cset* follows the tradition of separating the tool options from the
command to be execed options with a double dash (+\--+). This is not shown in
this simple example, but if the command you want to exec also takes options,
separate them with the double dash like so: +# cset proc -s myset -e mycommand
\-- -v+ The +-v+ will be passed to +mycommand+, and not to *cset*.
TIP: Execing a shell into a cpuset is a useful way to experiment with running
tasks in that cpuset since all children of the shell will also run in the same
cpuset.
Finally, if you misspell the command to be execed, the result may be
puzzling. For example:
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -s two -e blah-blah
cset: --> last message, executed args into cpuset "/two", new pid is: 21655
cset: **> [Errno 2] No such file or directory
----------------------------------------------------------
The result is no new process even though a new PID is output. The reason for
the message is of course that the *cset* process forked in preparation for
exec, but the command +blah-blah+ was not found in order to exec it.
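The fork-then-exec sequence that produces this puzzling output can be reproduced in a few lines of Python. This is a generic sketch of the pattern and not the actual *cset* code; the function name +fork_and_exec+ is hypothetical. It shows why a new PID already exists by the time the exec fails.

```python
import os


def fork_and_exec(argv):
    """Fork, then try to exec argv in the child; return (child_pid, exit_code)."""
    pid = os.fork()
    if pid == 0:                      # child: the new PID already exists here
        try:
            os.execvp(argv[0], argv)  # only comes back by raising on failure
        except OSError as err:
            os.write(2, ("exec failed: %s\n" % err).encode())
        os._exit(127)                 # conventional "command not found" code
    _, status = os.waitpid(pid, 0)    # parent: reap the short-lived child
    return pid, os.WEXITSTATUS(status)


pid, code = fork_and_exec(["blah-blah"])
print("new pid was", pid, "but the child exited with code", code)
```

The parent learns the child's PID from the fork, before the child has had a chance to discover that the command does not exist.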
Moving Tasks with Proc
~~~~~~~~~~~~~~~~~~~~~~
Although the ability to exec a task into a cpuset is fundamental, you will
most likely be moving tasks between cpusets more often. Moving tasks is
accomplished with the +-m/\--move+ and +-p/\--pid+ options to the +proc+
subcommand of *cset*. The move option tells the +proc+ subcommand that a task
move is requested. The +-p/\--pid+ option takes an argument called a PIDSPEC
(PID Specification). The PIDSPEC defines which tasks get operated on.
The PIDSPEC can be a single process ID, a list of process IDs separated by
commas, or a list of process ID ranges also separated by commas. For
example:
+\--move --pid 1234+::
This PIDSPEC argument specifies that task 1234 be moved.
+\--move --pid 1234,42,1934,15000,15001,15002+::
This PIDSPEC argument specifies that only the listed tasks be moved.
+\--move --pid 5000,5100,6010-7000,9232+::
This PIDSPEC argument specifies that tasks 5000, 5100 and 9232 be moved, along
with any existing task in the range 6010 through 7000 inclusive.
NOTE: A range in a PIDSPEC does not have to have running tasks for every
number in that range. In fact, it is not even an error if there are no tasks
running in that range; none will be moved in that case. The range simply
specifies to act on any tasks that have a PID or TID that is within that
range.
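The PIDSPEC rules described above can be sketched in Python as follows. This is an illustrative parser written for this tutorial, not the one *cset* uses; the function names and the sample task IDs are hypothetical.

```python
def parse_pidspec(pidspec):
    """Expand '5000,5100,6010-7000,9232' into a set of candidate task IDs."""
    pids = set()
    for part in pidspec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            pids.update(range(int(low), int(high) + 1))  # ranges are inclusive
        else:
            pids.add(int(part))
    return pids


def select_tasks(pidspec, existing):
    """Keep only the candidates that correspond to existing task IDs."""
    return sorted(parse_pidspec(pidspec) & set(existing))


running = [19253, 16447, 29456]               # hypothetical task IDs
print(select_tasks("19250-19260", running))   # prints [19253]
```

A range that matches no existing task simply yields an empty list, which mirrors the statement that an empty range is not an error.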
In the following example, we move the current shell into the cpuset named
'two' with a range PIDSPEC and back out to the 'root' cpuset with the bash
variable for the current PID.
-----------------------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l -s two
cset: "two" cpuset of CPUSPEC(2) with 0 tasks running
[zuul:cpuset-trunk]# echo $$
19253
[zuul:cpuset-trunk]# cset proc -m -p 19250-19260 -t two
cset: moving following pidspec: 19253
cset: moving 1 userspace tasks to /two
cset: done
[zuul:cpuset-trunk]# cset proc -l -s two
cset: "two" cpuset of CPUSPEC(2) with 2 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root 19253 16447 Roth bash
root 29456 19253 Roth python ./cset proc -l -s two
[zuul:cpuset-trunk]# cset proc -m -p $$ -t root
cset: moving following pidspec: 19253
cset: moving 1 userspace tasks to /
cset: done
[zuul:cpuset-trunk]# cset proc -l -s two
cset: "two" cpuset of CPUSPEC(2) with 0 tasks running
-----------------------------------------------------------------------
Use of the appropriate PIDSPEC can thus be handy to move tasks and groups of
tasks. Additionally, there is one more option that can help with
multi-threaded processes, and that is the +\--threads+ flag. If this flag is
present in a +proc+ move command with a PIDSPEC and if any of the task IDs in
the PIDSPEC belongs to a thread in a process container, then *all* the sibling
threads in that process container will also get moved. This flag provides an
easy mechanism to move all threads of a process by simply specifying one
thread in that process. In the following example, we move all the threads
running in cpuset 'three' to cpuset 'two' by using the +\--threads+ flag.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set two three
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
two 2 n 0 n 0 0 /two
three 3 n 0 n 10 0 /three
[zuul:cpuset-trunk]# cset proc -l -s three
cset: "three" cpuset of CPUSPEC(3) with 10 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
alext 16165 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16169 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16170 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16237 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16491 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16492 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16493 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 17243 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 17244 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 27133 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
[zuul:cpuset-trunk]# cset proc -m -p 16165 --threads -t two
cset: moving following pidspec: 16491,16493,16492,16170,16165,16169,27133,17244,17243,16237
cset: moving 10 userspace tasks to /two
[==================================================]%
cset: done
[zuul:cpuset-trunk]# cset set two three
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
two 2 n 0 n 10 0 /two
three 3 n 0 n 0 0 /three
----------------------------------------------------------
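To see which sibling threads a +\--threads+ move would pick up, you can inspect the +task+ directory that Linux exposes under +/proc+ for each process. The Python sketch below assumes a Linux +/proc+ layout; how *cset* itself finds the siblings is not described here, and the +sibling_threads+ helper and its +proc_root+ parameter are hypothetical.

```python
import os


def sibling_threads(tid, proc_root="/proc"):
    """Return all thread IDs of the process that task `tid` belongs to."""
    task_dir = os.path.join(proc_root, str(tid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


if os.path.isdir("/proc/self/task"):      # only meaningful on Linux
    print(sibling_threads(os.getpid()))
```

The +proc_root+ parameter exists only so that the function can be tried against a copy or a mock of the directory tree.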
Moving All Tasks from one Cpuset to Another
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There is a special case for moving all tasks currently running in one cpuset
to another. This can be a common use case, and when you need to do it,
specifying a PIDSPEC with +-p+ is not necessary 'so long as' you use the
+-f/\--fromset+ *and* the +-t/\--toset+ options.
In the following example, we move all 10 +beagled+ threads back to cpuset
'three' with this method.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l two three
cset: "two" cpuset of CPUSPEC(2) with 10 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
alext 16165 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16169 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16170 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16237 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16491 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16492 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 16493 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 17243 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 17244 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
alext 27133 1 Soth beagled /usr/lib64/beagle/BeagleDaemon.exe --bg -...
cset: "three" cpuset of CPUSPEC(3) with 0 tasks running
[zuul:cpuset-trunk]# cset proc -m -f two -t three
cset: moving all tasks from two to /three
cset: moving 10 userspace tasks to /three
[==================================================]%
cset: done
[zuul:cpuset-trunk]# cset set two three
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
two 2 n 0 n 0 0 /two
three 3 n 0 n 10 0 /three
----------------------------------------------------------
Moving Kernel Threads with Proc
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Kernel threads are special: *cset* detects tasks that are kernel threads and
will refuse to move them unless you also add the +-k/\--kthread+ option to
your +proc+ move command. Even if you include +-k+, *cset* will 'still'
refuse to move kernel threads that are bound to specific CPUs. The reason for
this is system protection.
A number of kernel threads, especially on the real time Linux kernel, are
bound to specific CPUs and depend on per-CPU kernel variables. If you move
these threads to a different CPU than the one they are bound to, you risk at
best that the system will become horribly slow, and at worst a system hang.
If you still insist on moving those threads (after all, *cset* needs to give
the knowledgeable user access to the keys), then you need to add the
+\--force+ option as well.
WARNING: Overriding a task move command with +\--force+ can have dire
consequences for the system. Please be sure of the command before you force
it.
In the following example, we move all unbound kernel threads running in the
'root' cpuset to the cpuset named 'two' by using the +-k+ option.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -k -f root -t two
cset: moving all kernel threads from / to /two
cset: moving 70 kernel threads to: /two
cset: --> not moving 76 threads (not unbound, use --force)
[==================================================]%
cset: done
----------------------------------------------------------
You will note that we used the fromset->toset facility of the +proc+
subcommand and we only specified the +-k+ option (not the +-m+ option). This
has the effect of moving all kernel threads only.
Note that only 70 kernel threads were actually moved and 76 were not. The
76 kernel threads were not moved because they are bound to specific CPUs.
Now, let's move those kernel threads back to 'root'.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -k -f two -t root
cset: moving all kernel threads from /two to /
cset: ** no task matched move criteria
cset: **> kernel tasks are bound, use --force if ok
[zuul:cpuset-trunk]# cset set -l -s two
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
two 2 n 0 n 70 0 /two
----------------------------------------------------------
Ah! What's this? *Cset* refused to move the kernel threads back to 'root'
because it says that they are "bound." Let's check this with the Linux
taskset command.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -l -s two | head -5
cset: "two" cpuset of CPUSPEC(2) with 70 tasks running
USER PID PPID SPPr TASK NAME
-------- ----- ----- ---- ---------
root 2 0 Soth [kthreadd]
root 55 2 Soth [khelper]
[zuul:cpuset-trunk]# taskset -p 2
pid 2's current affinity mask: 4
[zuul:cpuset-trunk]# cset set -l -s two
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
two 2 n 0 n 70 0 /two
----------------------------------------------------------
Of course, since the cpuset named 'two' only has CPU2 assigned to it, once we
moved the unbound kernel threads from 'root' to 'two', their affinity masks
were automatically changed to use only CPU2. This is evident from the
+taskset+ output, which is a hexadecimal bit mask: the value 4 is binary 100,
so only bit 2 (CPU2) is set. To really move these threads back to 'root', we
need to force the move as follows.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -k -f two -t root --force
cset: moving all kernel threads from /two to /
cset: moving 70 kernel threads to: /
[==================================================]%
cset: done
----------------------------------------------------------
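The affinity mask printed by +taskset+ in the example above is a hexadecimal bit mask in which bit N stands for CPU N. If you want to check such masks in a script, a small Python helper along the following lines is enough; the name +mask_to_cpus+ is hypothetical.

```python
def mask_to_cpus(hex_mask):
    """Convert a hexadecimal affinity mask such as '4' or 'f' to CPU numbers."""
    value = int(hex_mask, 16)
    # Bit N of the mask is set when the task may run on CPU N.
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


print(mask_to_cpus("4"))   # [2]
print(mask_to_cpus("f"))   # [0, 1, 2, 3]
```

A mask of 4 therefore corresponds to a task restricted to CPU2, and a mask of f to a task that may run on all four CPUs of the example machine.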
Destroying Tasks
~~~~~~~~~~~~~~~~
There actually is no *cset* subcommand or option to destroy tasks--it's not
really needed. Tasks exist and are accessible on the system as normal, even
if they happen to be running in one cpuset or another. To destroy tasks, use
the usual ^C method or the *kill(1)* command.
Implementing "Shielding" with Set and Proc
------------------------------------------
With the preceding material on the +set+ and +proc+ subcommands, we now have
the background to implement the basic shielding model, just like the +shield+
subcommand.
One may ask why we would want to do this, especially since +shield+ already
does it. The answer is that sometimes one needs more functionality than
+shield+ offers to implement one's shielding strategy. In those cases you
need to first stop using +shield+, since that subcommand will interfere with
the further application of +set+ and +proc+; however, you will still need to
implement the functionality of +shield+ in order to implement successful
shielding.
Remember from the above sections describing +shield+, that shielding has at
minimum three cpusets: 'root', which is always present and contains all CPUs;
'system' which is the "non-shielded" set of CPUs and runs unimportant system
tasks; and 'user', which is the "shielded" set of CPUs and runs your important
tasks. Remember also that +shield+ moves all movable tasks into 'system'
and, optionally, moves unbound kernel threads into 'system' as well.
We start first by creating the 'system' and 'user' cpusets as follows. We
assume that the machine is a four-CPU machine without NUMA memory features.
The 'system' cpuset should hold only CPU0 while the 'user' cpuset should hold
the rest of the CPUs.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -c 0 -s system
cset: --> created cpuset "system"
[zuul:cpuset-trunk]# cset set -c 1-3 -s user
cset: --> created cpuset "user"
[zuul:cpuset-trunk]# cset set -l
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 333 2 /
user 1-3 n 0 n 0 0 /user
system 0 n 0 n 0 0 /system
----------------------------------------------------------
Now, we need to move all running user processes into the 'system' cpuset.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -m -f root -t system
cset: moving all tasks from root to /system
cset: moving 188 userspace tasks to /system
[==================================================]%
cset: done
[zuul:cpuset-trunk]# cset set -l
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 146 2 /
user 1-3 n 0 n 0 0 /user
system 0 n 0 n 187 0 /system
----------------------------------------------------------
We now have the basic shielding set up. Since all userspace tasks are running
in 'system', anything that is spawned from them will also run in 'system'.
The 'user' cpuset has nothing running in it unless you put tasks there with
the +proc+ subcommand as described above. If you also want to move movable
kernel threads from 'root' to 'system' (in order to achieve a form of
"interrupt shielding" on a real time Linux kernel for example), you would
execute the following command as well.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -k -f root -t system
cset: moving all kernel threads from / to /system
cset: moving 70 kernel threads to: /system
cset: --> not moving 76 threads (not unbound, use --force)
[==================================================]%
cset: done
[zuul:cpuset-trunk]# cset set -l
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 76 2 /
user 1-3 n 0 n 0 0 /user
system 0 n 0 n 257 0 /system
----------------------------------------------------------
At this point, you have achieved the simple shielding model that the +shield+
subcommand provides. You can now add other cpuset definitions to expand your
shielding strategy beyond that simple model.
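If you automate these steps, it can help to generate the command lines in one place. The Python sketch below only builds the four *cset* invocations used in this section and prints them; it does not run anything. The function name +shield_commands+ and its parameters are hypothetical, and the cpuset names 'system' and 'user' are the ones assumed throughout this section.

```python
def shield_commands(system_cpus, user_cpus, move_kthreads=False):
    """Return the cset command lines used above to build a basic shield."""
    cmds = [
        ["cset", "set", "-c", system_cpus, "-s", "system"],
        ["cset", "set", "-c", user_cpus, "-s", "user"],
        ["cset", "proc", "-m", "-f", "root", "-t", "system"],
    ]
    if move_kthreads:
        # Optional step: also move the unbound kernel threads.
        cmds.append(["cset", "proc", "-k", "-f", "root", "-t", "system"])
    return cmds


for argv in shield_commands("0", "1-3", move_kthreads=True):
    print(" ".join(argv))
```

Each list could be handed to +subprocess.run+ on a machine where *cset* is installed and the script runs with sufficient privileges; that part is left out here on purpose.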
Implementing Hierarchy with Set and Proc
----------------------------------------
One popular extended "shielding" model is based on hierarchical cpusets, each
with diminishing numbers of CPUs. This model is used to create "priority
cpusets" that allow assignment of CPU resources to tasks based on some
arbitrary priority definition. The idea is that a higher priority task
will get access to more CPU resources than a lower priority task.
The example provided here once again assumes a machine with four CPUs and no
NUMA memory features. This base serves to illustrate the point well; however,
note that if your machine has (many) more CPUs, then strategies such as this
and others get more interesting.
We define a shielding set up as in the previous section where we have a
'system' cpuset with just CPU0 that takes care of "unimportant" system tasks.
One usually requires this type of cpuset since it forms the basis of
shielding. We modify the strategy to not use a 'user' cpuset; instead, we
create a number of new cpusets, each holding one more CPU than the previous.
These cpusets will be called 'prio_low' with one CPU, 'prio_med' with two
CPUs, 'prio_high' with three CPUs, and 'prio_all' with all CPUs.
NOTE: One may ask, why create a 'prio_all' with all CPUs when that is
substantially the definition of the 'root' cpuset? The answer is that it is
best to keep a separation between the 'root' cpuset and everything else, even
if a particular cpuset duplicates 'root' exactly. Usually, one builds
automation on top of a cpuset strategy. In these cases, it is best to avoid
using invariant names of cpusets, such as 'root' for example, in this
automation.
All of these 'prio_*' cpusets can be created under root, in a flat way;
however, it is advantageous to create them as a hierarchy. The reasoning for
this is twofold: first, if a cpuset is destroyed, all its tasks are moved to
its parent; second, one can use exclusive CPUs in a hierarchy.
There is a planned addition to the +proc+ subcommand that will allow moving a
specified PIDSPEC of tasks running in a specified cpuset to its parent. This
addition will ease the automation burden.
If a cpuset has CPUs that are exclusive to it, then other cpusets may not make
use of those CPUs unless they are children of that cpuset. This has more
relevance to machines with many CPUs and more complex strategies.
Now, we start with a clean slate and build the appropriate cpusets as
follows.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -r
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 344 0 /
[zuul:cpuset-trunk]# cset set -c 0-3 prio_all
cset: --> created cpuset "prio_all"
[zuul:cpuset-trunk]# cset set -c 1-3 /prio_all/prio_high
cset: --> created cpuset "/prio_all/prio_high"
[zuul:cpuset-trunk]# cset set -c 2-3 /prio_all/prio_high/prio_med
cset: --> created cpuset "/prio_all/prio_high/prio_med"
[zuul:cpuset-trunk]# cset set -c 3 /prio_all/prio_high/prio_med/prio_low
cset: --> created cpuset "/prio_all/prio_high/prio_med/prio_low"
[zuul:cpuset-trunk]# cset set -c 0 system
cset: --> created cpuset "system"
[zuul:cpuset-trunk]# cset set -l -r
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 344 2 /
system 0 n 0 n 0 0 /system
prio_all 0-3 n 0 n 0 1 /prio_all
prio_high 1-3 n 0 n 0 1 /prio_all/prio_high
prio_med 2-3 n 0 n 0 1 /prio_all/prio_high/prio_med
prio_low 3 n 0 n 0 0 /prio_all/pr...rio_med/prio_low
----------------------------------------------------------
NOTE: We used the +-r/\--recurse+ switch to list all the sets in the last
command above. If we had not, then the nested 'prio_high', 'prio_med' and
'prio_low' cpusets would not have been listed.
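In a hierarchy like this one, the Linux kernel requires the CPUs of each child cpuset to be a subset of the CPUs of its parent. Before creating a larger hierarchy, a planned layout can be checked with a short Python sketch such as the following. The helper names are hypothetical, and the layout dictionary simply restates the cpusets created above.

```python
def parse_cpuspec(spec):
    """Expand a CPUSPEC such as '0-3' or '1,3' into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_nesting(cpusets):
    """Return the paths whose CPUs are not a subset of their parent's CPUs."""
    bad = []
    for path, spec in cpusets.items():
        parent = path.rsplit("/", 1)[0] or "/"
        if path != "/" and not parse_cpuspec(spec) <= parse_cpuspec(cpusets[parent]):
            bad.append(path)
    return bad


layout = {
    "/": "0-3",
    "/system": "0",
    "/prio_all": "0-3",
    "/prio_all/prio_high": "1-3",
    "/prio_all/prio_high/prio_med": "2-3",
    "/prio_all/prio_high/prio_med/prio_low": "3",
}
print(check_nesting(layout))   # prints [] because every child fits its parent
```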
The strategy is now implemented and we now move all userspace tasks as well as
all movable kernel threads into the 'system' cpuset to activate the shield.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset proc -m -k -f root -t system
cset: moving all tasks from root to /system
cset: moving 198 userspace tasks to /system
cset: moving 70 kernel threads to: /system
cset: --> not moving 76 threads (not unbound, use --force)
[==================================================]%
cset: done
[zuul:cpuset-trunk]# cset set -l -r
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 76 2 /
system 0 n 0 n 268 0 /system
prio_all 0-3 n 0 n 0 1 /prio_all
prio_high 1-3 n 0 n 0 1 /prio_all/prio_high
prio_med 2-3 n 0 n 0 1 /prio_all/prio_high/prio_med
prio_low 3 n 0 n 0 0 /prio_all/pr...rio_med/prio_low
----------------------------------------------------------
The shield is now active. Since the 'prio_*' cpuset names are unique, one can
assign tasks to them via either their simple name or their full path
(as described in the +proc+ section above).
You may have noted that there is an ellipsis in the path of the 'prio_low'
cpuset in the listing above. This is done in order to fit the output onto an
80 character screen. If you want to see the entire line, then you need to use
the +-v/\--verbose+ flag as follows.
----------------------------------------------------------
[zuul:cpuset-trunk]# cset set -l -r -v
cset:
Name CPUs-X MEMs-X Tasks Subs Path
------------ ---------- - ------- - ----- ---- ----------
root 0-3 y 0 y 76 2 /
system 0 n 0 n 268 0 /system
prio_all 0-3 n 0 n 0 1 /prio_all
prio_high 1-3 n 0 n 0 1 /prio_all/prio_high
prio_med 2-3 n 0 n 0 1 /prio_all/prio_high/prio_med
prio_low 3 n 0 n 0 0 /prio_all/prio_high/prio_med/prio_low
----------------------------------------------------------
Using Shortcuts
===============
The commands listed in the previous sections always used all the required
options. *Cset* however does have a shortcut facility that will execute
certain commands without specifying all options. An effort has been made to
do this with the "principle of least surprise." This means that if you do not
specify options, but do specify parameters, then the outcome of the command
should be as intuitive as possible.
Using shortcuts is of course not necessary. In fact, you can not only not use
shortcuts, but you can use long options instead of short, in case you really
enjoy typing... All kidding aside, using long options and not using shortcuts
does have a use case: when you write a script intended to be self-documenting,
or perhaps when you generate *cset* documentation.
To begin, the subcommands +shield+, +set+ and +proc+ can themselves be
shortened to the fewest number of characters that are unambiguous. For
example, the following commands are identical:
----------------------------------------------------------
# cset shield -s -p 1234 <--> # cset sh -s -p 1234
# cset set -c 1,3 -s newset <--> # cset se -c 1,3 -s newset
# cset proc -s newset -e bash <--> # cset p -s newset -e bash
----------------------------------------------------------
Note that +proc+ can be shortened to just +p+, while +shield+ and +set+ need
two letters to disambiguate.
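The rule of "fewest unambiguous characters" amounts to unique-prefix matching. The following Python sketch illustrates the idea; it is not the *cset* implementation, the function name +resolve_subcommand+ is hypothetical, and the list is limited to the three subcommands named in the text.

```python
SUBCOMMANDS = ("shield", "set", "proc")   # only the subcommands named above


def resolve_subcommand(abbrev):
    """Return the single subcommand that starts with abbrev, or raise."""
    matches = [name for name in SUBCOMMANDS if name.startswith(abbrev)]
    if len(matches) != 1:
        raise ValueError("%r is ambiguous or unknown: %s" % (abbrev, matches))
    return matches[0]


for abbrev in ("p", "se", "sh"):
    print(abbrev, "->", resolve_subcommand(abbrev))
```

Passing a lone +s+ raises an error in this sketch, because both +shield+ and +set+ start with that letter.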
Shield Subcommand Shortcuts
---------------------------
The +shield+ subcommand supports two areas with shortcuts: the case when no
options are given, where 'shielding' is assumed as the common use case, and
making the +-p/\--pid+ option 'optional' for the +-s/\--shield+ and
+-u/\--unshield+ options.
For the common use case of actually 'shielding' either a PIDSPEC or execing a
command into the shield, the following *cset* commands are equivalent.
-----------------------------------------------------------
# cset shield -s -p 1234,500-649 <--> # cset sh 1234,500-649
# cset shield -s -e bash <--> # cset sh bash
-----------------------------------------------------------
When using the +-s+ or +-u+ shield/unshield options, it is optional to use the
+-p+ option to specify a PIDSPEC. For example:
-----------------------------------------------------------
# cset shield -s -p 1234 <--> # cset sh -s 1234
# cset shield -u -p 1234 <--> # cset sh -u 1234
-----------------------------------------------------------
Set Subcommand Shortcuts
------------------------
The +set+ subcommand has a limited number of shortcuts. Basically, the
+-s/\--set+ option is optional in most cases and the +-l/\--list+ option is
also optional if you want to list sets. For example, these commands are
equivalent.
-----------------------------------------------------------
# cset set -l -s myset <--> # cset se -l myset
# cset se -l myset <--> # cset se myset
# cset set -c 1,2,3 -s newset <--> # cset se -c 1,2,3 newset
# cset set -d -s newset <--> # cset se -d newset
# cset set -n newname -s oldname <--> # cset se -n newname oldname
-----------------------------------------------------------
In fact, if you want to apply either the list or the destroy options to
multiple cpusets with one *cset* command, you'll need to not use the +-s+
option. For example:
-----------------------------------------------------------
# cset se -d myset yourset ourset
--> destroys cpusets: myset, yourset and ourset
# cset se -l prio_high prio_med prio_low
--> lists only cpusets prio_high, prio_med and prio_low
--> the -l is optional in this case since list is default
-----------------------------------------------------------
Proc Subcommand Shortcuts
-------------------------
For the +proc+ subcommand, the +-s+, +-f+ and +-t+ options, which specify the
cpuset, the origination cpuset and the destination cpuset respectively, can
sometimes be omitted. For example, the following commands are equivalent.
-----------------------------------------------------------
To list tasks in cpusets:
# cset proc -l -s myset \
# cset proc -l -f myset --> # cset p -l myset
# cset proc -l -t myset /
# cset p -l myset <--> # cset p myset
# cset proc -l -s one two <--> # cset p -l one two
# cset p -l one two <--> # cset p one two
To exec a process into a cpuset:
# cset proc -s myset -e bash <--> # cset p myset -e bash
-----------------------------------------------------------
Movement of tasks into and out of cpusets has the following shortcuts.
-----------------------------------------------------------
To move a PIDSPEC into a cpuset:
# cset proc -m -p 4242,4243 -s myset <--> # cset p -m 4242,4243 myset
# cset proc -m -p 12 -t myset <--> # cset p -m 12 myset
To move all tasks from one cpuset to another:
# cset proc -m -f set1 -t set2 \
# cset proc -m -s set1 -t set2 --> # cset p -m set1 set2
# cset proc -m -f set1 -s set2 /
-----------------------------------------------------------
What To Do if There are Problems
================================
If you encounter problems with the *cset* application, the best option is to
log a bug with the *cset* bugzilla instance found here:
http://code.google.com/p/cpuset/issues/list
If you are using *cset* on a supported operating system such as SLES or SLERT
from Novell, then please use that bugzilla instead at:
http://bugzilla.novell.com
If the problem is repeatable, there is an excellent chance that it will get
fixed quickly. Also, *cset* contains a logging facility that is invaluable
for the developers to diagnose problems. To create a log of a run, use the
+-l/\--log+ option with a filename as an argument to the main *cset*
application. For example:
-------------------------------------------------------------
# cset -l logfile.txt set -n newname oldname
-------------------------------------------------------------
That command saves a lot of debugging information in the 'logfile.txt' file.
Please attach this file to the bug.
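When a problem is intermittent, it can help to log every run to its own file
so the failing run can be attached later. The wrapper below is a minimal
sketch of that idea and relies only on the +-l+ option described above. The
+cset_logged+ function, the log file naming scheme and the +CSET_BIN+
override are assumptions made for illustration, not features of *cset*.

```shell
# Sketch: run any cset command with a fresh, timestamped log file.
# cset_logged and CSET_BIN are hypothetical names, not part of cset.
# CSET_BIN defaults to the real cset binary when it is unset.
cset_logged() {
    logfile="cset-$(date +%Y%m%d-%H%M%S).log"
    # -l must precede the subcommand, as in the example above.
    "${CSET_BIN:-cset}" -l "$logfile" "$@"
    rc=$?
    # Report the log name on stderr so normal output stays clean.
    echo "log file: $logfile" >&2
    return $rc
}
```

For example, `cset_logged set -n newname oldname` would run the rename shown
earlier and report which log file to attach to the bug.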