.TH "hmmpgmd_shard" 1 "@HMMER_DATE@" "HMMER @HMMER_VERSION@" "HMMER Manual"
.SH NAME
hmmpgmd_shard \- sharded daemon for database search web services
.SH SYNOPSIS
.B hmmpgmd_shard
[\fIoptions\fR]
.SH DESCRIPTION
.PP
The
.B hmmpgmd_shard
program provides a sharded version of the
.B hmmpgmd
program that we use internally to implement high-performance HMMER services that can be accessed via the internet. See the
.B hmmpgmd
man page for a discussion of how the base
.B hmmpgmd
program is used. This man page discusses differences between
.B hmmpgmd_shard
and
.BR hmmpgmd .
The base
.B hmmpgmd
program loads its entire database file into RAM on every worker node, even though each worker node searches only a predictable fraction of the database(s) contained in that file. This wastes RAM, particularly when many worker nodes are used to accelerate searches of large databases.
.PP
.B Hmmpgmd_shard
addresses this waste by dividing protein sequence database files into shards. Each worker node loads only 1/Nth of the database file, where N is the number of worker nodes attached to the master. HMM database files are not sharded, so every worker node loads the entire HMM database file into RAM. Current HMM databases are much smaller than current protein sequence databases, and easily fit into the RAM of modern servers even without sharding.
.PP
.B Hmmpgmd_shard
is used in the same manner as
.BR hmmpgmd ,
except that it takes one additional argument,
\fB\-\-num_shards\fR \fI<n>\fR,
which specifies the number of shards that protein databases will be divided into and defaults to 1 if unspecified. This argument is only valid for the master node of a
.B hmmpgmd_shard
system (i.e., when
.B \-\-master
is passed to the
.B hmmpgmd_shard
program), and must be equal to the number of worker nodes that will connect to the master node.
.B Hmmpgmd_shard
will signal an error if more than
.BI num_shards
worker nodes attempt to connect to the master node or if a search is started when fewer than
.BI num_shards
workers are connected to the master.
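.PP
As an illustrative sketch only, a two-shard system might be started as shown here. The file name
.I seqs.hmmpgmd
and the address 192.0.2.10 are placeholders chosen for this example, not defaults. The first command runs on the master host, and the second runs on each of the two worker hosts, so that the number of workers matches the number of shards.
.PP
.nf
  hmmpgmd_shard \-\-master \-\-seqdb seqs.hmmpgmd \-\-num_shards 2
  hmmpgmd_shard \-\-worker 192.0.2.10 \-\-cpu 4
.fi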
.SH OPTIONS
.TP
.B \-h
Help; print a brief reminder of command line usage and all available
options.
.TP
.BI \-\-master
Run as the master server.
.TP
.BI \-\-worker " <s>"
Run as a worker, connecting to the master server that is running on IP
address
.IR <s> .
.TP
.BI \-\-cport " <n>"
Port to use for communication between clients and the master server.
The default is 51371.
.TP
.BI \-\-wport " <n>"
Port to use for communication between workers and the master server.
The default is 51372.
.TP
.BI \-\-ccncts " <n>"
Maximum number of client connections to accept. The default is 16.
.TP
.BI \-\-wcncts " <n>"
Maximum number of worker connections to accept. The default is 32.
.TP
.BI \-\-pid " <f>"
Name of file into which the process id will be written.
.TP
.BI \-\-seqdb " <f>"
Name of the file (in
.B hmmpgmd
format) containing protein sequences.
The contents of this file will be cached for searches.
.TP
.BI \-\-hmmdb " <f>"
Name of the file containing protein HMMs. The contents of this file
will be cached for searches.
.TP
.BI \-\-cpu " <n>"
Number of parallel threads to use (for
.BR \-\-worker ).
.TP
.BI \-\-num_shards " <n>"
Number of shards to divide cached sequence database(s) into. HMM databases are not sharded, due to their small size.
This option is only valid when the
.B \-\-master
option is present, and defaults to 1 if not specified.
.B Hmmpgmd_shard
requires that the number of shards be equal to the number of worker nodes, and will give errors if more than
.BI num_shards
workers attempt to connect to the master node or if a search is started with fewer than
.BI num_shards
workers connected to the master.
.SH SEE ALSO
See
.BR hmmpgmd (1)
for a description of the base
.B hmmpgmd
command and how the daemon should be used. See
.BR hmmer (1)
for a master man page with a list of all the individual man pages
for programs in the HMMER package.
.PP
For complete documentation, see the user guide that came with your
HMMER distribution (Userguide.pdf); or see the HMMER web page
(@HMMER_URL@).
.SH COPYRIGHT
.nf
@HMMER_COPYRIGHT@
@HMMER_LICENSE@
.fi
For additional information on copyright and licensing, see the file
called COPYRIGHT in your HMMER source distribution, or see the HMMER
web page
(@HMMER_URL@).
.SH AUTHOR
.nf
http://eddylab.org
.fi