Since OAR 2.5, the cpuset feature is configured by default (most Linux
distributions are compiled with cpuset and/or cgroups support).

What are "oarsh" and "oarsh_shell" scripts ?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

"oarsh" and "oarsh_shell" are two scripts that can restrict user processes to
stay in the same cpuset on all nodes.

This feature is very useful to limit processor consumption on multiprocessor
computers and to kill all the processes of a given OAR job on several nodes.

Moreover "oarsh" replaces the ssh or rsh commands in the user scripts (they
don't have to configure ssh keys, "oarsh" works on all the nodes of a job). 
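
For example, inside a job script, a command that would otherwise rely on ssh
can simply use oarsh (the node and program names below are purely
illustrative)::

    # instead of: ssh node23 ./my_program
    oarsh node23 ./my_program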

CPUSET definition
~~~~~~~~~~~~~~~~~

CPUSET is a feature built into the Linux kernel since the 2.6 series.
In the kernel documentation, you can read::

    Cpusets provide a mechanism for assigning a set of CPUs and Memory
    Nodes to a set of tasks.

    Cpusets constrain the CPU and Memory placement of tasks to only
    the resources within a tasks current cpuset.  They form a nested
    hierarchy visible in a virtual file system.  These are the essential
    hooks, beyond what is already present, required to manage dynamic
    job placement on large systems.

    Each task has a pointer to a cpuset.  Multiple tasks may reference
    the same cpuset.  Requests by a task, using the sched_setaffinity(2)
    system call to include CPUs in its CPU affinity mask, and using the
    mbind(2) and set_mempolicy(2) system calls to include Memory Nodes
    in its memory policy, are both filtered through that tasks cpuset,
    filtering out any CPUs or Memory Nodes not in that cpuset.  The
    scheduler will not schedule a task on a CPU that is not allowed in
    its cpus_allowed vector, and the kernel page allocator will not
    allocate a page on a node that is not allowed in the requesting tasks
    mems_allowed vector.
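
As an illustration only, here is roughly how a cpuset can be created and a task
attached to it through the legacy cpuset virtual file system (the mount point
and the cpuset name below are arbitrary; on cgroup-based systems the hierarchy
usually lives under /sys/fs/cgroup/cpuset and the files are prefixed with
"cpuset.")::

    # mount the cpuset virtual file system (if not already mounted)
    mkdir -p /dev/cpuset
    mount -t cpuset none /dev/cpuset

    # create a cpuset limited to CPUs 0-1 and memory node 0
    mkdir /dev/cpuset/oar_job_42
    echo 0-1 > /dev/cpuset/oar_job_42/cpus
    echo 0   > /dev/cpuset/oar_job_42/mems

    # attach the current shell (and its future children) to this cpuset
    echo $$ > /dev/cpuset/oar_job_42/tasks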


OARSH
~~~~~

"oarsh" is a wrapper around the "ssh" command (tested with openSSH).
Its goal is to propagate two environment variables:

    - OAR_CPUSET: the name of the OAR job cpuset
    - OAR_JOB_USER: the name of the user corresponding to the job

"oarsh" uses the oar ssh keys to do the ssh connection. That's why the ssh
server used for OAR is configured to propagete the environment variables
OAR_CPUSET and OAR_JOB_USER and only accepts the oar user.
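
For reference, environment propagation in OpenSSH is usually achieved with the
SendEnv/AcceptEnv options; the snippet below is only a sketch of that
mechanism, not the exact configuration shipped with OAR::

    # server side: the dedicated sshd used by OAR
    AllowUsers oar
    AcceptEnv OAR_CPUSET OAR_JOB_USER

    # client side: the ssh invocation performed by oarsh
    SendEnv OAR_CPUSET OAR_JOB_USER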

OARSH_SHELL
~~~~~~~~~~~

"oarsh_shell" must be the shell of the oar user on each nodes where you want
"oarsh" to work.
This script takes "OAR_CPUSET" and "OAR_JOB_USER" environment variables and
adds its PID in OAR_CPUSET cpuset. Then it searches user shell and home and
executes the right command (like ssh).
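
The following is only a simplified sketch of that behaviour (the real
oarsh_shell handles many more cases, and the cpuset path depends on the
system)::

    # join the cpuset of the job
    echo $$ > /dev/cpuset/"$OAR_CPUSET"/tasks

    # look up the real user's home directory and shell
    USER_HOME=$(getent passwd "$OAR_JOB_USER" | cut -d: -f6)
    USER_SHELL=$(getent passwd "$OAR_JOB_USER" | cut -d: -f7)
    cd "$USER_HOME"

    # run the requested command, or an interactive shell if there is none
    if [ -n "$SSH_ORIGINAL_COMMAND" ]; then
        exec "$USER_SHELL" -c "$SSH_ORIGINAL_COMMAND"
    else
        exec "$USER_SHELL"
    fi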

Important notes
~~~~~~~~~~~~~~~

    - the command "scp" can be used with oarsh. The syntax is::

        scp -S /usr/bin/oarsh ...

      See scp man page.

      You can also use the "oarcp" command, which does this for you.

      See the oarsh man page.
    
    - If you want to use oarsh from the user frontend node, you can: define the
      OAR_JOB_ID environment variable and then launch oarsh targeting a node
      used by your OAR job. This feature only works where the oarstat command
      is configured::

        OAR_JOB_ID=42 oarsh node12

      or::

        export OAR_JOB_ID=42
        oarsh node12

      This command gives you a shell on "node12" within OAR job 42.

    - You can disable the cpuset security mechanism by setting the
      OARSH_BYPASS_WHOLE_SECURITY field to 1 in your oar.conf file (see the
      example after this list).
      WARNING: this is a critical functionality (it is only useful if users
      want a command that connects to any node, without caring about their ssh
      configuration, and behaves like a plain ssh).
      DO NOT USE THIS IF YOU ARE UNSURE.
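
As mentioned in the last note above, the bypass is a single setting in oar.conf
(the path below is its usual location, but it may differ on your
installation)::

    # /etc/oar/oar.conf -- disables the cpuset security mechanism
    OARSH_BYPASS_WHOLE_SECURITY="1"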

If you want more tips on how to use oarsh with, for example, different MPI
implementations, take a look at http://oar.imag.fr/user-usecases/ .