File: config.rst

Configuration
=============

.. highlight:: ini

clush
-----

.. _clush-config:

clush.conf
^^^^^^^^^^

The following configuration file defines system-wide default values for
several ``clush`` tool parameters::

    /etc/clustershell/clush.conf

``clush`` settings might then be overridden (globally, or per user) if one of
the following files is found, in priority order::

    $XDG_CONFIG_HOME/clustershell/clush.conf
    $HOME/.config/clustershell/clush.conf (only if $XDG_CONFIG_HOME is not defined)
    {sys.prefix}/etc/clustershell/clush.conf
    $HOME/.local/etc/clustershell/clush.conf
    $HOME/.clush.conf (deprecated, for 1.6 compatibility only)

.. note:: The path using `sys.prefix`_ was added in version 1.9.1 and is
   useful for Python virtual environments.

In addition, if the environment variable ``$CLUSTERSHELL_CFGDIR`` is defined and
valid, it will be used instead. In that case, the following configuration file
will be tried first for ``clush``::

    $CLUSTERSHELL_CFGDIR/clush.conf

The following table describes available ``clush`` config file settings.

+-----------------+----------------------------------------------------+
| Key             | Value                                              |
+=================+====================================================+
| fanout          | Size of the sliding window of connectors (e.g. max |
|                 | number of *ssh(1)* allowed to run at the same      |
|                 | time).                                             |
+-----------------+----------------------------------------------------+
| confdir         | Optional list of directory paths where ``clush``   |
|                 | should look for **.conf** files which define       |
|                 | :ref:`run modes <clushmode-config>` that can then  |
|                 | be activated with `--mode`. All other ``clush``    |
|                 | config file settings defined in this table might   |
|                 | be overridden in a run mode. Each mode section     |
|                 | should have a name prefixed by "mode:" to clearly  |
|                 | identify a section defining a mode. Duplicate      |
|                 | modes are not allowed in those files.              |
|                 | Configuration files that are not readable by the   |
|                 | current user are ignored. The variable `$CFGDIR`   |
|                 | is replaced by the path of the highest priority    |
|                 | configuration directory found (where *clush.conf*  |
|                 | resides). The default *confdir* value enables both |
|                 | system-wide and any installed user configuration   |
|                 | (thanks to `$CFGDIR`). Duplicate directory paths   |
|                 | are ignored.                                       |
+-----------------+----------------------------------------------------+
| connect_timeout | Timeout in seconds to allow a connection to        |
|                 | establish. This parameter is passed to *ssh(1)*.   |
|                 | If set to 0, no timeout occurs.                    |
+-----------------+----------------------------------------------------+
| command_prefix  | Command prefix. Generally used for specific        |
|                 | :ref:`run modes <clush-modes>`, for example to     |
|                 | implement *sudo(8)* support.                       |
+-----------------+----------------------------------------------------+
| command_timeout | Timeout in seconds to allow a command to complete  |
|                 | since the connection has been established. This    |
|                 | parameter is passed to *ssh(1)*. In addition, the  |
|                 | ClusterShell library ensures that any commands     |
|                 | complete in less than (connect_timeout \+          |
|                 | command_timeout). If set to 0, no timeout occurs.  |
+-----------------+----------------------------------------------------+
| color           | Whether to use ANSI colors to surround the node or |
|                 | nodeset prefix/header with escape sequences to     |
|                 | display them in color on the terminal. Valid       |
|                 | arguments are *never*, *always* or *auto* (which   |
|                 | uses color if standard output/error refer to a     |
|                 | terminal).                                         |
|                 | Colors are set to ``[34m`` (blue foreground text)  |
|                 | for stdout and ``[31m`` (red foreground text) for  |
|                 | stderr, and cannot be modified.                    |
+-----------------+----------------------------------------------------+
| fd_max          | Maximum number of open file descriptors            |
|                 | permitted per ``clush`` process (soft resource     |
|                 | limit for open files). This limit can never exceed |
|                 | the system (hard) limit. The *fd_max* (soft) and   |
|                 | system (hard) limits should be high enough to      |
|                 | run ``clush``, although their values depend on     |
|                 | your fanout value.                                 |
+-----------------+----------------------------------------------------+
| history_size    | Set the maximum number of history entries saved in |
|                 | the GNU readline history list. Negative values     |
|                 | imply an unlimited history size.                   |
+-----------------+----------------------------------------------------+
| node_count      | Should ``clush`` display additional (node count)   |
|                 | information in buffer header? (yes/no)             |
+-----------------+----------------------------------------------------+
| maxrc           | Should ``clush`` return the largest of command     |
|                 | return codes? (yes/no)                             |
|                 | If set to no (the default), ``clush`` exit status  |
|                 | gives no information about command return codes,   |
|                 | but rather reports on ``clush`` execution itself   |
|                 | (zero indicating a successful run).                |
+-----------------+----------------------------------------------------+
| password_prompt | Enable password prompt and password forwarding to  |
|                 | stdin? (yes/no)                                    |
|                 | Generally used for specific                        |
|                 | :ref:`run modes <clush-modes>`, for example to     |
|                 | implement interactive *sudo(8)* support.           |
+-----------------+----------------------------------------------------+
| verbosity       | Set the verbosity level: 0 (quiet), 1 (default),   |
|                 | 2 (verbose) or more (debug).                       |
+-----------------+----------------------------------------------------+
| ssh_user        | Set the *ssh(1)* user to use for remote connection |
|                 | (default is to not specify).                       |
+-----------------+----------------------------------------------------+
| ssh_path        | Set the *ssh(1)* binary path to use for remote     |
|                 | connection (default is *ssh*).                     |
+-----------------+----------------------------------------------------+
| ssh_options     | Set additional (raw) options to pass to the        |
|                 | underlying *ssh(1)* command.                       |
+-----------------+----------------------------------------------------+
| scp_path        | Set the *scp(1)* binary path to use for remote     |
|                 | copy (default is *scp*).                           |
+-----------------+----------------------------------------------------+
| scp_options     | Set additional options to pass to the underlying   |
|                 | *scp(1)* command. If not specified, *ssh_options*  |
|                 | are used instead.                                  |
+-----------------+----------------------------------------------------+
| rsh_path        | Set the *rsh(1)* binary path to use for remote     |
|                 | connection (default is *rsh*). You could easily    |
|                 | use *mrsh* or *krsh* by simply changing this       |
|                 | value.                                             |
+-----------------+----------------------------------------------------+
| rcp_path        | Same as *rsh_path* but for rcp command (default is |
|                 | *rcp*).                                            |
+-----------------+----------------------------------------------------+
| rsh_options     | Set additional options to pass to the underlying   |
|                 | rsh/rcp command.                                   |
+-----------------+----------------------------------------------------+
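
For reference, here is a small *clush.conf* sketch using some of the settings
above (the values are illustrative only, not recommended defaults)::

    [Main]
    fanout: 64
    connect_timeout: 15
    command_timeout: 0
    color: auto
    fd_max: 8192
    maxrc: no
    node_count: yes
    verbosity: 1
    ssh_options: -oStrictHostKeyChecking=no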

.. _clushmode-config:

Run modes
^^^^^^^^^

Since version 1.9, ``clush`` has supported run modes, which are named sets of
:ref:`clush-config` settings. Two run modes are provided as example
configuration files that can be copied and modified; they implement
password-based authentication with *sshpass(1)* and interactive *sudo(8)*
with password support.

To use a run mode with ``clush --mode``, install a configuration file in one
of :ref:`clush-config`'s ``confdir`` directories (usually ``clush.conf.d``).
Only configuration files ending in **.conf** are scanned. If the user running
``clush`` doesn't have read access to a configuration file, it is ignored.
When ``--mode`` is specified, you can display all run modes available to the
current user by enabling debug mode (``-d``).

Example of a run mode configuration file (e.g.
``/etc/clustershell/clush.conf.d/sudo.conf``) adding support for interactive
sudo::

    [mode:sudo]
    password_prompt: yes
    command_prefix: /usr/bin/sudo -S -p "''"

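.. highlight:: console

Assuming the ``sudo`` run mode above is installed in a *confdir*, it can then
be activated with ``--mode``. A quick sketch (node names and command are
hypothetical; with *password_prompt* enabled, ``clush`` asks for the password
once and forwards it to the remote commands)::

    $ clush --mode sudo -w node[1-4] -b uptime
    Password:

.. highlight:: ini
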
System administrators or users can easily create additional run modes by
adding configuration files to :ref:`clush-config`'s ``confdir``.

More details about using run modes can be found :ref:`here <clush-modes>`.

.. _groups-config:

Node groups
-----------

ClusterShell defines a *node group* syntax to represent a collection of nodes.
This is a convenient way to manipulate node sets, especially in HPC (High
Performance Computing) or with large server farms. This section explains how
to configure node group **sources**. Please see also :ref:`nodeset node groups
<nodeset-groups>` for specific usage examples.
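
.. highlight:: console

For example, once a group source is configured, a node group can be used
wherever a node set is expected (a sketch based on the YAML *roles* source
shown later in this section)::

    $ nodeset -f @compute
    node[0001-0288]
    $ nodeset -e @adm
    mgmt1 mgmt2

.. highlight:: ini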

.. _groups_config_conf:

groups.conf
^^^^^^^^^^^

ClusterShell loads *groups.conf* configuration files that define how to
obtain node group configurations, i.e. the way the library should access
file-based or external node group **sources**.

The following configuration file defines system-wide default values for
*groups.conf*::

    /etc/clustershell/groups.conf

*groups.conf* settings might then be overridden (globally, or per user) if one
of the following files is found, in priority order::

    $XDG_CONFIG_HOME/clustershell/groups.conf
    $HOME/.config/clustershell/groups.conf (only if $XDG_CONFIG_HOME is not defined)
    {sys.prefix}/etc/clustershell/groups.conf
    $HOME/.local/etc/clustershell/groups.conf

.. note:: The path using `sys.prefix`_ was added in version 1.9.1 and is
   useful for Python virtual environments.

In addition, if the environment variable ``$CLUSTERSHELL_CFGDIR`` is defined and
valid, it will be used instead. In that case, the following configuration file
will be tried first for *groups.conf*::

    $CLUSTERSHELL_CFGDIR/groups.conf

This makes it possible for a user to have their own *node groups*
configuration. If no readable configuration file is found, group support is
disabled, but other node set operations still work.

*groups.conf* defines configuration sub-directories, but may also define
group sources itself. These **sources** provide external calls that are
detailed in :ref:`group-external-sources`.

The following example shows the content of a *groups.conf* file where node
groups are bound to the source named *genders* by default::

    [Main]
    default: genders
    confdir: /etc/clustershell/groups.conf.d $CFGDIR/groups.conf.d
    autodir: /etc/clustershell/groups.d $CFGDIR/groups.d

    [genders]
    map: nodeattr -n $GROUP
    all: nodeattr -n ALL
    list: nodeattr -l

    [slurm]
    map: sinfo -h -o "%N" -p $GROUP
    all: sinfo -h -o "%N"
    list: sinfo -h -o "%P"
    reverse: sinfo -h -N -o "%P" -n $NODE

The *groups.conf* files are parsed with Python's `ConfigParser`_:

* The first section whose name is *Main* accepts the following keywords:

  * *default* defines a **default node group source** (e.g. by referencing a
    valid section header)
  * *confdir* defines an optional list of directory paths where the
    ClusterShell library should look for **.conf** files which define group
    sources to use.  Each file in these directories with the .conf suffix
    should contain one or more node group source sections as documented below.
    These will be merged with the group sources defined in the main
    *groups.conf* to form the complete set of group sources to use. Duplicate
    group source sections are not allowed in those files. Configuration files
    that are not readable by the current user are ignored (except the one that
    defines the default group source). The variable `$CFGDIR` is replaced by
    the path of the highest priority configuration directory found (where
    *groups.conf* resides). The default *confdir* value enables both
    system-wide and any installed user configuration (thanks to `$CFGDIR`).
    Duplicate directory paths are ignored.
  * *autodir* defines an optional list of directories where the ClusterShell
    library should look for **.yaml** files that define in-file group
    dictionaries. There is no need to call external commands for these files:
    they are parsed by the ClusterShell library itself. Multiple group source
    definitions in the same file are supported. The variable `$CFGDIR` is
    replaced by the path of the highest priority configuration directory found
    (where *groups.conf* resides). The default *autodir* value enables both
    system-wide and any installed user configuration (thanks to `$CFGDIR`).
    Duplicate directory paths are ignored.

* Each following section (`genders`, `slurm`) defines a group source. The
  *map*, *all*, *list* and *reverse* upcalls are explained below in
  :ref:`group-sources-upcalls`.
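
.. highlight:: console

Once sources are configured, the ``--groupsources`` option of
:ref:`nodeset-tool` can be used to check which group sources the library
actually sees; with the example *groups.conf* above, the output would look
like this::

    $ nodeset --groupsources
    genders (default)
    slurm

.. highlight:: ini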

.. _group-file-based:

File-based group sources
^^^^^^^^^^^^^^^^^^^^^^^^

Version 1.7 introduced native support for flat files defining different group
sources, which avoids the need for external upcalls for such static
configuration. This is achieved through the *autodir* feature and the YAML
files described below.

YAML group files
""""""""""""""""

Cluster node groups can be defined in straightforward YAML files. In such a
file, each top-level YAML dictionary defines a group-to-nodes mapping.
**Different dictionaries** are handled as **different group sources**.

For compatibility with previous versions of ClusterShell, this is not yet the
default way to define node groups. The steps below show how to try it out:

Rename the following file::

    /etc/clustershell/groups.d/cluster.yaml.example

to a file having the **.yaml** extension, for example::

  /etc/clustershell/groups.d/cluster.yaml


Ensure that *autodir* is set in :ref:`groups_config_conf`::

  autodir: /etc/clustershell/groups.d $CFGDIR/groups.d

In the following example, we also changed the default group source to
**roles** in :ref:`groups_config_conf` (the first dictionary defined in the
example), so that *@roles:groupname* can simply be shortened to *@groupname*.

.. highlight:: yaml

Here is an example of **/etc/clustershell/groups.d/cluster.yaml**::

    roles:
        adm: 'mgmt[1-2]'                 # define groups @roles:adm and @adm
        login: 'login[1-2]'
        compute: 'node[0001-0288]'
        gpu: 'node[0001-0008]'

        servers:                         # example of yaml list syntax for nodes
            - 'server001'                # in a group
            - 'server002,server101'                
            - 'server[003-006]'

        cpu_only: '@compute!@gpu'        # example of inline set operation
                                         # define group @cpu_only with node[0009-0288]

        storage: '@lustre:mds,@lustre:oss' # example of external source reference

        all: '@login,@compute,@storage'  # special group used for clush/nodeset -a
                                         # only needed if not including all groups

    lustre:
        mds: 'mds[1-4]'
        oss: 'oss[0-15]'
        rbh: 'rbh[1-2]'


If you wish to define an empty group (with no nodes), you can either use an
empty string ``''`` or any valid YAML null value (``null`` or ``~``).
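
For instance (a sketch extending the *roles* source above)::

    roles:
        retired: ''        # empty group, no nodes
        spare: ~           # also empty (YAML null)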

.. highlight:: console

Testing the syntax of your group file can be quickly performed with the
``-L`` or ``--list-all`` option of :ref:`nodeset-tool` (doubled, as in
``-LL``, to also display group contents)::

    $ nodeset -LL
    @adm mgmt[1-2]
    @all login[1-2],mds[1-4],node[0001-0288],oss[0-15],rbh[1-2]
    @compute node[0001-0288]
    @cpu_only node[0009-0288]
    @gpu node[0001-0008]
    @login login[1-2]
    @servers server[001-006],server101
    @storage mds[1-4],oss[0-15],rbh[1-2]
    @lustre:mds mds[1-4]
    @lustre:oss oss[0-15]
    @lustre:rbh rbh[1-2]

.. _group-external-sources:

External group sources
^^^^^^^^^^^^^^^^^^^^^^

.. _group-sources-upcalls:

Group source upcalls
""""""""""""""""""""

Each node group source is defined by a section name (*source* name) and up to
four upcalls:

* **map**: External shell command used to resolve a group name into a node
  set, list of nodes or list of node sets (separated by space characters or by
  carriage returns). The variable *$GROUP* is replaced before executing the command.
* **all**: Optional external shell command that should return a node set, list
  of nodes or list of node sets of all nodes for this group source. If not
  specified, the library will try to resolve all nodes by using the **list**
  external command in the same group source followed by **map** for each
  available group. The notion of *all nodes* is used by ``clush -a`` and also
  by the special group name ``@*`` (or ``@source:*``).
* **list**: Optional external shell command that should return the list of all
  groups for this group source (separated by space characters or by carriage
  returns). If this upcall is not specified, ClusterShell won't be able to
  list any available groups (e.g. with ``nodeset -l``), so it is highly
  recommended to set it.
* **reverse**: Optional external shell command used to find the group(s) of a
  single node. The variable *$NODE* is replaced before executing the command.
  If this external call is not specified, the reverse operation is computed in
  memory by the
  library from the **list** and **map** external calls, if available. Also, if
  the number of nodes to reverse is greater than the number of available
  groups, the reverse external command is avoided automatically to reduce
  resolution time.

In addition to context-dependent *$GROUP* and *$NODE* variables described
above, the two following variables are always available and also replaced
before executing shell commands:

* *$CFGDIR* is replaced by *groups.conf* base directory path
* *$SOURCE* is replaced by the current source name (see the usage examples
  below)
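
.. highlight:: ini

As an illustration, here is a sketch of a group source whose upcalls parse a
hypothetical colon-separated flat file ``$CFGDIR/local_groups.txt`` (lines of
the form ``groupname:nodeset``)::

    [byfile]
    map: sed -n "s/^$GROUP:\(.*\)/\1/p" $CFGDIR/local_groups.txt
    all: sed -n 's/^[^:]*://p' $CFGDIR/local_groups.txt
    list: cut -d: -f1 $CFGDIR/local_groups.txt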

.. _group-external-caching:

Caching considerations
""""""""""""""""""""""

External command results are cached in memory for a limited amount of time to
avoid repeating similar calls.

The optional parameter **cache_time**, when specified within a group source
section, defines the number of seconds each upcall result is kept in cache,
in memory only. Please note that caching is actually only useful for
long-running programs (like daemons) that are using node groups, not for
one-shot commands like :ref:`clush <clush-tool>` or
:ref:`cluset <cluset-tool>`/:ref:`nodeset <nodeset-tool>`.

The default value of **cache_time** is 3600 seconds.
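
For example, to keep upcall results for five minutes in a given source section
(a sketch based on the *genders* source shown earlier)::

    [genders]
    map: nodeattr -n $GROUP
    all: nodeattr -n ALL
    list: nodeattr -l
    cache_time: 300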

Multiple sources section
""""""""""""""""""""""""

.. highlight:: ini

Use a comma-separated list of source names in the section header if you want
to define multiple group sources with similar upcall commands. The special
variable `$SOURCE` is always replaced by the source name before command
execution (here `cluster`, `racks` and `cpu`), for example::

    [cluster,racks,cpu]
    map: get_nodes_from_source.sh $SOURCE $GROUP
    all: get_all_nodes_from_source.sh $SOURCE
    list: list_nodes_from_source.sh $SOURCE

is equivalent to::

    [cluster]
    map: get_nodes_from_source.sh cluster $GROUP
    all: get_all_nodes_from_source.sh cluster
    list: list_nodes_from_source.sh cluster

    [racks]
    map: get_nodes_from_source.sh racks $GROUP
    all: get_all_nodes_from_source.sh racks
    list: list_nodes_from_source.sh racks

    [cpu]
    map: get_nodes_from_source.sh cpu $GROUP
    all: get_all_nodes_from_source.sh cpu
    list: list_nodes_from_source.sh cpu

Return code of external calls
"""""""""""""""""""""""""""""

Each external command may return a non-zero exit code when the operation
cannot be performed. However, if the call returns zero, for instance for a
non-existing group, the user will not receive any error when trying to
resolve such an unknown group. The desired behavior is up to the system
administrator.
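
For instance, piping a *map* upcall through ``grep .`` is one way to force a
non-zero exit code when a group resolves to nothing (building on the
hypothetical file-based source sketched earlier)::

    [byfile]
    map: sed -n "s/^$GROUP:\(.*\)/\1/p" $CFGDIR/local_groups.txt | grep .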

.. _group-slurm-bindings:

Slurm group bindings
""""""""""""""""""""

Enable Slurm node group bindings by renaming the example configuration file
usually installed as ``/etc/clustershell/groups.conf.d/slurm.conf.example`` to
``slurm.conf``. Three group sources are defined in this file and are detailed
below. Each section comes with a long and a short name (for convenience), but
both names define the same group source.

While the examples below are based on the :ref:`nodeset-tool` tool, all Python
tools using ClusterShell and the :class:`.NodeSet` class will automatically
benefit from these additional node groups.

.. highlight:: ini

The first section **slurmpart,sp** defines a group source based on Slurm
partitions. Each group is named after the partition name and contains the
partition's nodes::

    [slurmpart,sp]
    map: sinfo -h -o "%N" -p $GROUP
    all: sinfo -h -o "%N"
    list: sinfo -h -o "%R"
    reverse: sinfo -h -N -o "%R" -n $NODE

.. highlight:: console

Example of use with :ref:`nodeset <nodeset-tool>` on a cluster having two Slurm
partitions named *kepler* and *pascal*::

    $ nodeset -s sp -ll
    @sp:kepler cluster-[0001-0065]
    @sp:pascal cluster-[0066-0068]

.. highlight:: ini

The second section **slurmresv,sr** defines a group source based on Slurm
reservations. Each group is based on a different reservation and contains
the nodes currently in that reservation::

    [slurmresv,sr]
    map: scontrol -o show reservation $GROUP | grep -Po 'Nodes=\K[^ ]+'
    all: scontrol -o show reservation | grep -Po 'Nodes=\K[^ ]+'
    list: scontrol -o show reservation | grep -Po 'ReservationName=\K[^ ]+'
    cache_time: 60

.. highlight:: console

Example of use on a cluster having a reservation in place for an upcoming
system maintenance::

    $ nodeset -s slurmresv -l
    @slurmresv:Maintenance_2025-02-04
    $ clush -w @slurmresv:Maintenance_2025-02-04 uptime

.. highlight:: ini

The next section **slurmstate,st** defines a group source based on Slurm
node states. Each group is based on a different state name and contains the
nodes currently in that state::

    [slurmstate,st]
    map: sinfo -h -o "%N" -t $GROUP
    all: sinfo -h -o "%N"
    list: sinfo -h -o "%T" | tr -d '*~#$@+'
    reverse: sinfo -h -N -o "%T" -n $NODE | tr -d '*~#$@+'
    cache_time: 60

Here, :ref:`cache_time <group-external-caching>` is set to 60 seconds instead
of the default (3600s) to avoid caching results in memory for too long, in
case of state change (this is only useful for long-running processes, not
one-shot commands).

.. highlight:: console

Example of use with :ref:`nodeset <nodeset-tool>` to get the current nodes that
are in the Slurm state *drained*::

    $ nodeset -f @st:drained
    cluster-[0058,0067]

.. highlight:: ini

The next section **slurmjob,sj** defines a group source based on Slurm jobs.
Each group is based on a running job ID and contains the nodes currently
allocated for this job::

    [slurmjob,sj]
    map: squeue -h -j $GROUP -o "%N"
    list: squeue -h -o "%i" -t R
    reverse: squeue -h -w $NODE -o "%i"
    cache_time: 60
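
.. highlight:: console

Example of use to get the nodes currently allocated to a running job
(hypothetical job ID and output)::

    $ cluset -f @sj:1234567
    cluster-[0100-0103]

.. highlight:: ini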

The next section **slurmuser,su** defines a group source based on Slurm users.
Each group is based on a username and contains the nodes currently allocated
to jobs belonging to that user::

    [slurmuser,su]
    map: squeue -h -u $GROUP -o "%N" -t R
    list: squeue -h -o "%u" -t R
    reverse: squeue -h -w $NODE -o "%u"
    cache_time: 60

.. highlight:: console

Example of use with :ref:`clush <clush-tool>` to execute a command on all
nodes with running jobs belonging to user *username*::

    $ clush -bw@su:username 'df -Ph /scratch'
    $ clush -bw@su:username 'du -s /scratch/username'

:ref:`cache_time <group-external-caching>` is also set to 60 seconds instead
of the default (3600s) to avoid caching results in memory for too long, because
this group source is likely very dynamic (this is only useful for long-running
processes, not one-shot commands).

.. highlight:: ini

The next section **slurmaccount,sa** defines a group source based on Slurm
accounts. Each group is based on an account and contains the nodes with
running jobs under this account::

    [slurmaccount,sa]
    map: squeue -h -A $GROUP -o "%N" -t R
    list: squeue -h -o "%a" -t R
    reverse: squeue -h -w $NODE -o "%a" 2>/dev/null || true
    cache_time: 60

.. highlight:: console

For example, to find all nodes that have running jobs from the account ``ruthm``::

    $ cluset -f @sa:ruthm
    sh02-01n57,sh03-09n51,sh03-11n10

.. highlight:: ini

The next section **slurmqos,sq** defines a group source based on Slurm QoS.
Each group is based on a QoS and contains the nodes with running jobs under
this QoS::

    [slurmqos,sq]
    map: squeue -h -q $GROUP -o "%N" -t R
    list: squeue -h -o "%q" -t R
    reverse: squeue -h -w $NODE -o "%q" 2>/dev/null || true
    cache_time: 60

.. highlight:: console

It is then easy to find the nodes currently running jobs in a given QoS, here
``long`` for example::

    $ cluset -f @slurmqos:long
    sh02-01n[01-02,16-17,45,51,56],sh03-01n[02,29,61]

.. _group-xcat-bindings:

xCAT group bindings
"""""""""""""""""""

Enable xCAT node group bindings by renaming the example configuration file
usually installed as ``/etc/clustershell/groups.conf.d/xcat.conf.example`` to
``xcat.conf``. A single group source is defined in this file and is detailed
below.

.. warning:: xCAT installs its own `nodeset`_ command which
   usually takes precedence over ClusterShell's :ref:`nodeset-tool` command.
   In that case, simply use :ref:`cluset <cluset-tool>` instead.

While the examples below are based on the :ref:`cluset-tool` tool, all Python
tools using ClusterShell and the :class:`.NodeSet` class will automatically
benefit from these additional node groups.

.. highlight:: ini

The section **xcat** defines a group source based on xCAT static node groups::

    [xcat]

    # list the nodes in the specified node group
    map: lsdef -s -t node $GROUP | cut -d' ' -f1
    
    # list all the nodes defined in the xCAT tables
    all: lsdef -s -t node | cut -d' ' -f1
    
    # list all groups
    list: lsdef -t group | cut -d' ' -f1

.. highlight:: console

Example of use with :ref:`cluset-tool`::

    $ lsdef -s -t node dtn
    sh-dtn01  (node)
    sh-dtn02  (node)
    
    $ cluset -s xcat -f @dtn
    sh-dtn[01-02]

.. highlight:: text

.. _defaults-config:

Library Defaults
----------------

.. warning:: Modifying library defaults is for advanced users only as that
   could change the behavior of tools using ClusterShell. Moreover, tools are
   free to enforce their own defaults, so changing library defaults may not
   change a global behavior as expected.

Since version 1.7, most defaults of the ClusterShell library may be overridden
in *defaults.conf*.

The following configuration file defines ClusterShell system-wide defaults::

    /etc/clustershell/defaults.conf

*defaults.conf* settings might then be overridden (globally, or per user) if
one of the following files is found, in priority order::

    $XDG_CONFIG_HOME/clustershell/defaults.conf
    $HOME/.config/clustershell/defaults.conf (only if $XDG_CONFIG_HOME is not defined)
    {sys.prefix}/etc/clustershell/defaults.conf
    $HOME/.local/etc/clustershell/defaults.conf

In addition, if the environment variable ``$CLUSTERSHELL_CFGDIR`` is defined and
valid, it will be used instead. In that case, the following configuration file
will be tried first for ClusterShell defaults::

    $CLUSTERSHELL_CFGDIR/defaults.conf

Use case: rsh
^^^^^^^^^^^^^^

If your cluster uses an rsh variant like ``mrsh`` or ``krsh``, you may want to
change it in the library defaults.

An example file is usually available in
``/usr/share/doc/clustershell-*/examples/defaults.conf-rsh`` and could be
copied to ``/etc/clustershell/defaults.conf`` or to an alternate path
described above. Basically, the change consists in defining an alternate
distant worker by Python module name, as follows::

    [task.default]
    distant_workername: Rsh


.. _defaults-config-slurm:

Use case: Slurm
^^^^^^^^^^^^^^^

If your cluster naming scheme has multiple dimensions, as in ``node-93-02``,
we recommend partially disabling nD folding when using Slurm, which is
currently unable to parse some multidimensional node indexes when they are
not explicitly enclosed in square brackets.

To do so, set ``fold_axis`` to -1 in the :ref:`defaults-config` so that nD
folding is only computed on the last axis (this seems to work best with
Slurm)::

    [nodeset]
    fold_axis: -1

That way, node sets computed by ClusterShell tools can be passed to Slurm
without error.
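
.. highlight:: console

With ``fold_axis`` set to -1 as above, multidimensional node sets are folded
on the last axis only (hypothetical node names)::

    $ cluset -f node-93-02,node-93-03,node-94-02,node-94-03
    node-93-[02-03],node-94-[02-03]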

.. _ConfigParser: http://docs.python.org/library/configparser.html
.. _nodeset: https://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/references/man8/nodeset.8.html
.. _sys.prefix: https://docs.python.org/3/library/sys.html#sys.prefix