+-------------+
| Description |
+-------------+
Pdsh is a multithreaded remote shell client which executes commands
on multiple remote hosts in parallel. Pdsh can use several different remote
shell services, including standard "rsh", Kerberos IV, and ssh.
See the man page in this directory for usage information.
+---------------+
| Configuration |
+---------------+
As of version 1.6, pdsh uses GNU autoconf for configuration. To configure:
./configure
--with-ssh[=/path/to/ssh/program]
Use SSH as the primary remote shell type. This is not recommended
unless you cannot enable rsh root equivalence between nodes.
SSH has not been tested extensively. There is no configurable
connect timeout. SSH runs with no controlling tty, so it will fail
if it needs to prompt for a password or display a "man in the middle"
warning. By default, configure will search for a program named
"ssh". If you prefer ssh2, specify its full path as the argument
to --with-ssh, e.g. --with-ssh=/usr/local/bin/ssh2.
--with-krb4[=/path/to/krb4/root]
Use Kerberos IV as the primary remote shell type. This is not
recommended unless you are running on an SP and cannot enable rsh
root equivalence between nodes. Kerberos IV has not been extensively
tested. Refreshing TGTs is very slow under pdsh.
--with-machines=/path/to/machines
Use a flat file list of machine names for -a instead of nodeattr
or SDRGetObjects. You must specify this option if you do not have
nodeattr or SDRGetObjects (see below).
--with-elan
Enable support for running parallel jobs on the Quadrics Elan
interconnect via the -E option and qshell daemon.
--enable-debug
Turns on assertion checking and compiles with -Wall.
--with-fanout=N
Specify default fanout (default is 32)
--with-connect-timeout=N
Set default connect timeout (default is 10 seconds)
--with-readline
Use the GNU readline library to parse input in interactive mode.
A word about the -a (target "all" nodes) and -i (use alternate hostnames) options:
On an SP, configure will find SDRGetObjects and use it to generate the
list of "all" nodes for -a (reliable_hostname), to translate to alternate
hostnames with -i (initial_hostname), etc.
If SDRGetObjects is not found and if the LLNL Genders software is available,
configure will find the nodeattr command and use that to generate the list
of "all" nodes for -a ("all" attribute).
If neither of the above is found, a flat list of hostnames can be used for
the list of "all" nodes. Specify the location of this file using the
--with-machines configure option.
+------------+
| INSTALLING |
+------------+
make
make install
+---------+
| GOTCHAS |
+---------+
Watch out for the following gotchas:
1) Pdsh uses one reserved socket for each active connection, two if it is
maintaining a separate connection for stderr. It obtains these sockets
by calling rresvport(), which normally draws from a pool of 256 sockets.
You may exhaust these if multiple pdsh instances are running simultaneously
on a machine, or if the fanout is set too high.
2) When pdsh is using a remote shell service that is wrapped with TCP wrappers,
there are three areas where bottlenecks can be created: IDENT, DNS, and SYSLOG.
If your hosts.allow includes "user@", e.g. "in.rshd : ALL@ALL : ALLOW"
and TCP wrappers is configured to support IDENT, each simultaneous remote shell
connection will result in an IDENT query back to the source. For large fanouts
this can quickly overwhelm the source. Similarly, if TCP wrappers is
configured to query the DNS on every connection, pdsh may overwhelm the
DNS server. Finally, if every remote shell connection results in a remote
syslog entry, syslogd on your loghost may be overwhelmed and logs may grow
excessively long.
If local security policy permits, consider configuring TCP wrappers to avoid
calling IDENT, DNS, or SYSLOG on every remote shell connection. Configuring
without the "PARANOID" option (which requires all connections to be
registered in the DNS), permitting a simple list of IP addresses or a
subnet (no names, and no user@ prefix), and setting the SYSLOG severity for
the remote shell service to a level that is not remotely logged will avoid
these pitfalls. If these actions are not possible, you may wish to
reduce pdsh's default fanout (configure --with-fanout=N).
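The arithmetic behind gotcha 1 can be sketched as follows. This is only an
illustration: the pool size of 256 is the figure quoted above and may differ
on your platform, and the function names are hypothetical, not part of pdsh.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption taken from gotcha 1: rresvport() draws from a pool of
 * 256 reserved sockets. The real pool size is platform dependent. */
#define RESERVED_POOL 256

/* Reserved sockets one pdsh instance holds when its fanout is full:
 * one per connection, two if stderr has its own connection. */
int ports_per_instance(int fanout, bool separate_stderr)
{
    assert(fanout >= 0);
    return fanout * (separate_stderr ? 2 : 1);
}

/* Largest number of simultaneous pdsh instances that fit in the pool. */
int max_instances(int fanout, bool separate_stderr)
{
    int per = ports_per_instance(fanout, separate_stderr);
    return per > 0 ? RESERVED_POOL / per : 0;
}
```

With the default fanout of 32 and a separate stderr connection, each
instance holds 64 reserved sockets, so under these assumptions about four
instances can run at full fanout on one machine.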
+---------------------+
| THEORY OF OPERATION |
+---------------------+
A thread is created for each rsh connection to a node. Each thread opens
a connection using an MT-safe rcmd-like function, then copies the remote
stdout/stderr and terminates.
The mainline starts fanout number of rsh threads and waits on a condition
variable that is signalled by the rsh threads as they terminate. When
the condition variable is signalled, the main thread starts a new rsh thread
to maintain the fanout, until all remote commands have been executed.
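The fanout loop described above might look roughly like the sketch below.
It is not pdsh's actual source: the names (run_with_fanout, worker, active,
finished) are hypothetical, and the worker does no real network I/O. It only
shows the main thread waiting on a condition variable and starting a new
thread each time one terminates.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Shared state guarded by lock; all names here are hypothetical. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done = PTHREAD_COND_INITIALIZER;
static int active;    /* worker threads currently running */
static int finished;  /* worker threads that have terminated */

static void *worker(void *arg)
{
    (void)arg;  /* a real worker would connect and copy output here */
    pthread_mutex_lock(&lock);
    active--;
    finished++;
    pthread_cond_signal(&done);  /* wake the main thread */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Run nhosts workers, never more than fanout at once.
 * Returns the highest number of workers that were active together. */
int run_with_fanout(int nhosts, int fanout)
{
    pthread_attr_t attr;
    int peak = 0;

    assert(fanout > 0);
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    pthread_mutex_lock(&lock);
    active = finished = 0;
    for (int started = 0; started < nhosts; started++) {
        while (active >= fanout)          /* fanout is full: wait */
            pthread_cond_wait(&done, &lock);
        pthread_t tid;
        if (pthread_create(&tid, &attr, worker, NULL) != 0)
            abort();
        active++;
        if (active > peak)
            peak = active;
    }
    while (finished < nhosts)             /* drain remaining workers */
        pthread_cond_wait(&done, &lock);
    pthread_mutex_unlock(&lock);

    pthread_attr_destroy(&attr);
    return peak;
}
```

Because the main thread holds the lock while it creates a thread and
increments the counter, a worker cannot report completion until the main
thread is back in pthread_cond_wait(), so no wakeup is lost.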
A timeout thread is created that monitors the state of the threads and
terminates any that take too much time connecting or, if requested on the
command line, take too long to complete.
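The decision the timeout thread makes for each connection could be sketched
like this. The struct, the state names, and the convention that a command
timeout of 0 means "no limit" are assumptions made for illustration, not
pdsh's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-thread record; pdsh's real bookkeeping may differ. */
enum conn_state { ST_CONNECTING, ST_CONNECTED, ST_DONE };

struct conn {
    enum conn_state state;
    time_t started;    /* when the connect attempt began */
    time_t connected;  /* when the connection was established */
};

/* Decide whether the timeout thread should terminate this connection.
 * connect_timeout applies while connecting; command_timeout applies
 * once connected, and 0 (an assumed convention) disables it. */
bool timed_out(const struct conn *c, time_t now,
               int connect_timeout, int command_timeout)
{
    assert(c != NULL);
    switch (c->state) {
    case ST_CONNECTING:
        return now - c->started > connect_timeout;
    case ST_CONNECTED:
        return command_timeout > 0 &&
               now - c->connected > command_timeout;
    default:
        return false;  /* finished threads are never timed out */
    }
}
```

The timeout thread would call a check like this periodically for every
thread, passing the connect timeout (10 seconds by default) and whatever
command timeout was requested on the command line.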
Typing ^C causes pdsh to list threads that are in the connected state.
Another ^C immediately following the first one terminates the program.
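The two-stage ^C behavior can be sketched as a small decision function.
The text says only that the second ^C must follow "immediately"; the
one-second window below is an illustrative guess, and the names are
hypothetical.

```c
#include <assert.h>
#include <time.h>

enum sigint_action { LIST_CONNECTED, TERMINATE };

/* Assumed window, in seconds, within which a second ^C counts as
 * "immediately following" the first. pdsh's real value may differ. */
#define SIGINT_WINDOW 1

/* Classify a ^C arriving at time now. *last holds the time of the
 * previous ^C (0 if there was none) and is updated on every call. */
enum sigint_action on_sigint(time_t now, time_t *last)
{
    enum sigint_action act;

    assert(last != NULL);
    if (*last != 0 && now - *last <= SIGINT_WINDOW)
        act = TERMINATE;        /* second ^C right after the first */
    else
        act = LIST_CONNECTED;   /* first ^C, or too long since the last */
    *last = now;
    return act;
}
```

A SIGINT handler would call this with time(NULL) and then either list the
threads in the connected state or exit.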
+--------+
| AUTHOR |
+--------+
Jim Garlick <garlick@llnl.gov>
Please send suggestions, bug reports, or just a note letting me know that you
are using pdsh (it would be interesting to hear how many nodes are in your
cluster).
+------+
| NOTE |
+------+
This product includes software developed by the University of California,
Berkeley and its contributors (xrcmd.c, k4cmd.c, qcmd.c). Modifications
have been made and bugs are probably mine.
The PDSH software package has no affiliation with the Democratic Party of
Albania (www.pdsh.org).