File: sge_pe.html

gridengine 6.2-4
<HTML>
<BODY BGCOLOR=white>
<PRE>
<!-- Manpage converted by man2html 3.0.1 -->
NAME
     sge_pe - Grid Engine parallel environment configuration file
     format

DESCRIPTION
     Parallel environments are parallel programming and runtime
     environments that allow the execution of shared memory or
     distributed memory parallelized applications. A parallel
     environment usually requires some kind of setup before
     parallel applications can be started. Examples of common
     parallel environments are shared memory parallel operating
     systems and the distributed memory environments Parallel
     Virtual Machine (PVM) and Message Passing Interface (MPI).

     <I>sge</I>_<I>pe</I> allows for the definition of interfaces to arbitrary
     parallel environments. Once a parallel environment has been
     defined or modified with the -ap or -mp options to <B><A HREF="../htmlman1/qconf.html">qconf(1)</A></B>,
     it can be requested for a job via the -pe switch to <B><A HREF="../htmlman1/qsub.html">qsub(1)</A></B>,
     together with a range for the number of parallel processes
     to be allocated to the job. Additional -l options may be
     used to specify the job requirements in further detail.
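
     As an illustration of such a request, the following Python
     sketch assembles the -pe arguments for a qsub call. The
     helper name pe_request_args, the PE name "mpi" and the script
     name "job.sh" are made up for this example, and only the
     simple forms "n" and "n-m" of a slot range are covered.

```python
def pe_request_args(pe_name, min_slots, max_slots=None):
    """Build the -pe arguments for qsub as a list of strings.

    A single slot count is written as "n", a range as "n-m".
    """
    if max_slots is None or max_slots == min_slots:
        slot_range = str(min_slots)
    else:
        if min_slots > max_slots:
            raise ValueError("lower bound exceeds upper bound")
        slot_range = f"{min_slots}-{max_slots}"
    return ["-pe", pe_name, slot_range]


# Hypothetical PE name "mpi" and job script "job.sh".
argv = ["qsub"] + pe_request_args("mpi", 4, 8) + ["job.sh"]
print(" ".join(argv))
```

     The resulting list can be handed to subprocess.run() on a
     host where qsub is installed.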

FORMAT
     The format of a <I>sge</I>_<I>pe</I> file is defined as follows:

  pe_name
     The name of the parallel environment, as used with the
     <B><A HREF="../htmlman1/qsub.html">qsub(1)</A></B> -pe switch.

  slots
     The total number of parallel processes allowed to run
     concurrently under the parallel environment.

  user_lists
     A comma-separated list of user access list names (see
     <B><A HREF="../htmlman5/access_list.html">access_list(5)</A></B>). Each user contained in at least one of the
     listed access lists has access to the parallel environment.
     If the user_lists parameter is set to NONE (the default), any
     user who is not explicitly excluded via the xuser_lists
     parameter described below has access. If a user is contained
     both in an access list named in xuser_lists and in one named
     in user_lists, the user is denied access to the parallel
     environment.

  xuser_lists
     A comma-separated list of user access list names (see
     <B><A HREF="../htmlman5/access_list.html">access_list(5)</A></B>). Each user contained in at least one of the
     listed access lists is denied access to the parallel
     environment. If the xuser_lists parameter is set to NONE (the
     default), no user is excluded by this parameter. If a user is
     contained both in an access list named in xuser_lists and in
     one named in user_lists, the user is denied access to the
     parallel environment.
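
     The interplay of user_lists and xuser_lists can be sketched
     in Python as follows. This is only a model of the rules
     stated above, not Grid Engine code: the function name
     pe_access_allowed and the acl_members dictionary (a stand-in
     for the contents of the access lists) are assumptions of this
     example, and an empty list stands for NONE.

```python
def pe_access_allowed(user, user_lists, xuser_lists, acl_members):
    """Return True if `user` may use the parallel environment.

    user_lists, xuser_lists: lists of access list names; an empty
    list stands for NONE.
    acl_members: dict mapping an access list name to a set of user
    names (hypothetical stand-in for access_list(5) objects).
    """
    def in_any(names):
        return any(user in acl_members.get(name, set()) for name in names)

    if in_any(xuser_lists):
        return False      # exclusion wins, even if also in user_lists
    if not user_lists:
        return True       # NONE: everyone not excluded has access
    return in_any(user_lists)


# Made-up access lists for illustration.
acls = {"devs": {"alice", "bob"}, "banned": {"bob"}}
print(pe_access_allowed("alice", ["devs"], ["banned"], acls))
print(pe_access_allowed("bob", ["devs"], ["banned"], acls))
```

     Checking the exclusion list first is what makes a user who
     appears on both sides end up without access.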

  start_proc_args
     The invocation command line of a start-up procedure for the
     parallel environment. The start-up procedure is invoked by
     <B><A HREF="../htmlman8/sge_shepherd.html">sge_shepherd(8)</A></B> before the job script is executed. Its
     purpose is to set up the parallel environment as required.
     An optional prefix "user@" specifies the user under which
     this procedure is to be started. The standard output of the
     start-up procedure is redirected to the file <I>REQNAME</I>.po<I>JID</I>
     in the job's working directory (see <B><A HREF="../htmlman1/qsub.html">qsub(1)</A></B>), with <I>REQNAME</I>
     being the name of the job as displayed by <B><A HREF="../htmlman1/qstat.html">qstat(1)</A></B> and <I>JID</I>
     being the job's identification number. Likewise, the standard
     error output is redirected to <I>REQNAME</I>.pe<I>JID</I>.
     The following special variables, which are expanded at
     runtime, can be used (besides any other strings which have to
     be interpreted by the start and stop procedures) to build the
     command line:

     $<I>pe</I>_<I>hostfile</I>
          The pathname of a file containing a detailed description
          of the layout of the parallel environment to be set up
          by the start-up procedure. Each line of the file refers
          to a host on which parallel processes are to be run. The
          first entry of each line denotes the hostname, the second
          entry the number of parallel processes to be run on the
          host, the third entry the name of the queue, and the
          fourth entry a processor range to be used in case of a
          multiprocessor machine.

     $<I>host</I>
          The name of the host on which the start-up or stop pro-
          cedures are started.

     $<I>job</I>_<I>owner</I>
          The user name of the job owner.

     $<I>job</I>_<I>id</I>
          Grid Engine's unique job identification number.

     $<I>job</I>_<I>name</I>
          The name of the job.

     $<I>pe</I>  The name of the parallel environment in use.

     $<I>pe</I>_<I>slots</I>
          Number of slots granted for the job.

     $<I>processors</I>
          The processors string as contained in the queue  confi-
          guration  (see  <B><A HREF="../htmlman5/queue_conf.html">queue_conf(5)</A></B>) of the master queue (the
          queue in which the start-up  and  stop  procedures  are
          started).

     $<I>queue</I>
          The cluster queue of the master queue instance.
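
     A start-up procedure typically begins by reading the file
     named by $pe_hostfile. The Python sketch below parses the
     four whitespace-separated entries per line described above.
     The names HostEntry and parse_pe_hostfile, as well as the
     sample host, queue and processor values, are made up for
     illustration; real files are written by Grid Engine.

```python
from collections import namedtuple

HostEntry = namedtuple("HostEntry", "host slots queue processors")


def parse_pe_hostfile(text):
    """Parse pe_hostfile content into a list of HostEntry tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 entries, got: {line!r}")
        host, slots, queue, processors = fields
        entries.append(HostEntry(host, int(slots), queue, processors))
    return entries


# Hypothetical sample content.
sample = """\
node01 4 all.q@node01 UNDEFINED
node02 2 all.q@node02 UNDEFINED
"""
entries = parse_pe_hostfile(sample)
total = sum(entry.slots for entry in entries)
print(total)
```

     In an actual start-up script the text would come from the
     path passed on the command line, for example via
     open(sys.argv[1]).read().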

  stop_proc_args
     The invocation command line of a shutdown procedure for the
     parallel environment. The shutdown procedure is invoked by
     <B><A HREF="../htmlman8/sge_shepherd.html">sge_shepherd(8)</A></B> after the job script has finished. Its
     purpose is to stop the parallel environment and to remove it
     from all participating systems. An optional prefix "user@"
     specifies the user under which this procedure is to be
     started. The standard output of the shutdown procedure is
     also redirected to the file <I>REQNAME</I>.po<I>JID</I> in the job's
     working directory (see <B><A HREF="../htmlman1/qsub.html">qsub(1)</A></B>), with <I>REQNAME</I> being the name
     of the job as displayed by <B><A HREF="../htmlman1/qstat.html">qstat(1)</A></B> and <I>JID</I> being the job's
     identification number. Likewise, the standard error output is
     redirected to <I>REQNAME</I>.pe<I>JID</I>.
     The same special variables as for start_proc_args can be
     used to build the command line.
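
     To make the effect of the special variables concrete, the
     following Python sketch imitates their expansion in a
     start_proc_args or stop_proc_args command line. Grid Engine
     performs the real substitution itself; the function name
     expand_proc_args, the script path and the sample values are
     assumptions of this example.

```python
import re

# Variable names documented for start_proc_args and stop_proc_args.
PE_VARIABLES = ("pe_hostfile", "host", "job_owner", "job_id",
                "job_name", "pe", "pe_slots", "processors", "queue")


def expand_proc_args(command_line, values):
    """Replace $name for known variables; leave other text as is."""
    def replace(match):
        name = match.group(1)
        if name in PE_VARIABLES and name in values:
            return str(values[name])
        return match.group(0)         # unknown or unset: keep verbatim

    # Match the whole identifier so that $pe_slots is not read as $pe.
    return re.sub(r"\$([A-Za-z_]+)", replace, command_line)


# Hypothetical script path and runtime values.
cmd = expand_proc_args(
    "/opt/pe/startmpi.sh $pe_hostfile $pe_slots $job_id",
    {"pe_hostfile": "/tmp/1234.1.all.q/pe_hostfile",
     "pe_slots": 6, "job_id": 1234, "pe": "mpi"},
)
print(cmd)
```

     Strings that are not special variables pass through
     unchanged, matching the note that other strings are left for
     the start and stop procedures to interpret.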

  allocation_rule
     The allocation rule is interpreted by <B><A HREF="../htmlman8/sge_schedd.html">sge_schedd(8)</A></B> and
     helps the scheduler decide how to distribute parallel
     processes among the available machines. If, for instance, a
     parallel environment is built for shared memory applications
     only, all parallel processes have to be assigned to a single
     machine, no matter how many suitable machines are available.
     If, however, the parallel environment follows the distributed
     memory paradigm, an even distribution of processes among
     machines may be preferable.
     The current version of the scheduler understands only the
     following allocation rules:

     &lt;int&gt;:    An integer fixing the number of processes per
               host. If the number is 1, all processes have to
               reside on different hosts. If the special value
               $pe_slots is used instead, the full range of
               processes specified with the <B><A HREF="../htmlman1/qsub.html">qsub(1)</A></B> -pe switch
               has to be allocated on a single host (no matter
               which value from the range is finally chosen for
               the job).

     $fill_up: Starting from the most suitable host/queue, all
               available slots are allocated. Further hosts and
               queues are "filled up" as long as the job still
               requires slots for parallel tasks.

     $round_robin:
               A single slot is allocated from each suitable host
               in turn until all tasks requested by the parallel
               job are dispatched. If more tasks are requested
               than suitable hosts are found, allocation starts
               again from the first host. The allocation scheme
               walks through suitable hosts in a
               most-suitable-first order.
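
     The differences between the rules are easiest to see on a
     small example. The following Python sketch is a simplified
     model of the four rules, not the scheduler's implementation.
     It assumes that the hosts are already sorted with the most
     suitable first, that each host reports a number of free
     slots, and (for the integer rule) that a request which is not
     a multiple of the per-host count cannot be satisfied. The
     function name allocate and the host names are made up.

```python
def allocate(rule, needed, hosts):
    """Distribute `needed` slots over `hosts` according to `rule`.

    hosts: list of (hostname, free_slots), most suitable first.
    Returns a dict hostname to slot count, or None if the request
    cannot be satisfied in this simplified model.
    """
    result = {}
    if rule == "$pe_slots":
        for host, free in hosts:
            if free >= needed:
                return {host: needed}     # everything on one host
        return None
    if rule == "$fill_up":
        remaining = needed
        for host, free in hosts:
            take = min(free, remaining)
            if take > 0:
                result[host] = take
                remaining -= take
            if remaining == 0:
                return result
        return None
    if rule == "$round_robin":
        free = dict(hosts)
        remaining = needed
        while remaining > 0:
            progressed = False
            for host, _ in hosts:         # one slot per host per pass
                if remaining > 0 and free[host] > 0:
                    free[host] -= 1
                    result[host] = result.get(host, 0) + 1
                    remaining -= 1
                    progressed = True
            if not progressed:
                return None               # no free slots left anywhere
        return result
    per_host = int(rule)                  # fixed number per host
    if per_host >= 1 and needed % per_host == 0:
        for host, free in hosts:
            if free >= per_host:
                result[host] = per_host
            if sum(result.values()) == needed:
                return result
    return None


hosts = [("a", 4), ("b", 2), ("c", 4)]    # hypothetical hosts
for rule in ("$fill_up", "$round_robin", "2"):
    print(rule, allocate(rule, 6, hosts))
```

     With these three hosts, a request for five slots lands on two
     hosts under $fill_up but on all three under $round_robin,
     while $pe_slots cannot place it at all because no single host
     has five free slots.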

  control_slaves
     This parameter can be set to TRUE or FALSE (the default). It
     indicates whether Grid Engine is the creator of the slave
     tasks of a parallel application via <B><A HREF="../htmlman8/sge_execd.html">sge_execd(8)</A></B> and
     <B><A HREF="../htmlman8/sge_shepherd.html">sge_shepherd(8)</A></B> and thus has full control over all processes
     in a parallel application, which enables capabilities such
     as resource limitation and correct accounting. However, to
     gain control over the slave tasks of a parallel application,
     a sophisticated PE interface is required that works closely
     together with Grid Engine facilities. Such PE interfaces are
     available through your local Grid Engine support office.

     Set the control_slaves parameter to FALSE for all other PE
     interfaces.

  job_is_first_task
     This parameter is only checked if control_slaves (see above)
     is set to TRUE and thus Grid Engine is the creator of the
     slave tasks of a parallel application via <B><A HREF="../htmlman8/sge_execd.html">sge_execd(8)</A></B> and
     <B><A HREF="../htmlman8/sge_shepherd.html">sge_shepherd(8)</A></B>. In this case, a sophisticated PE interface
     is required that closely couples the parallel environment
     and Grid Engine. The documentation accompanying such PE
     interfaces will recommend the setting for job_is_first_task.

     The job_is_first_task parameter can be set to TRUE or FALSE.
     A value of TRUE indicates that the Grid Engine job script
     already contains one of the tasks of the parallel
     application, while a value of FALSE indicates that the job
     script (and its child processes) is not part of the parallel
     program.
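
     The dependency between the two boolean parameters can be
     checked mechanically. The Python sketch below reads a PE
     definition and reports whether job_is_first_task is taken
     into account at all. It assumes a layout of one "name value"
     pair per line, as printed by qconf -sp, and treats TRUE and
     FALSE case-insensitively; the function names, script paths
     and sample values are made up for illustration.

```python
def parse_pe_config(text):
    """Parse 'name value' lines into a dict of strings."""
    config = {}
    for line in text.splitlines():
        parts = line.split(None, 1)   # name, then the rest of the line
        if len(parts) == 2:
            config[parts[0]] = parts[1].strip()
    return config


def as_bool(value):
    """Convert TRUE or FALSE (any letter case) to a Python bool."""
    value = value.upper()
    if value not in ("TRUE", "FALSE"):
        raise ValueError(f"expected TRUE or FALSE, got {value!r}")
    return value == "TRUE"


def job_is_first_task_checked(config):
    """job_is_first_task only matters when control_slaves is TRUE."""
    return as_bool(config.get("control_slaves", "FALSE"))


# Hypothetical PE definition.
sample = """\
pe_name            mpi
slots              16
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/pe/startmpi.sh $pe_hostfile
stop_proc_args     /opt/pe/stopmpi.sh
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
"""
config = parse_pe_config(sample)
print(job_is_first_task_checked(config))
```

     Splitting each line only once keeps command lines such as
     the start_proc_args value intact, including embedded spaces.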


  urgency_slots
     For pending jobs with a slot range PE request, the number of
     slots is not yet determined. This setting specifies the
     method used by Grid Engine to estimate the number of slots
     such jobs might finally get.

     The assumed slot allocation is used when determining the
     resource-request-based priority contribution for numeric
     resources as described in <B><A HREF="../htmlman5/sge_priority.html">sge_priority(5)</A></B>, and it is
     displayed when <B><A HREF="../htmlman1/qstat.html">qstat(1)</A></B> is run without the -g t option.

     The following methods are supported:

     &lt;int&gt;:    The specified integer is used directly as the
               prospective slot amount.

     min:      The slot range minimum is used as the prospective
               slot amount. If no lower bound is specified with
               the range, 1 is assumed.

     max:      The slot range maximum is used as the prospective
               slot amount. If no upper bound is specified with
               the range, the absolute maximum possible under the
               PE's slots setting is assumed.

     avg:      The average of all numbers occurring within the
               job's PE range request is assumed.
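
     The four methods can be written down as a short Python
     function. This sketch covers a single contiguous range only;
     the function name prospective_slots is made up, and applying
     the min and max defaults to an open-ended range under avg is
     an assumption of this example rather than documented
     behavior.

```python
def prospective_slots(method, lower, upper, pe_slots):
    """Estimate the slot amount of a pending job with a slot range.

    lower, upper: bounds of the requested range, None if omitted.
    pe_slots: the slots setting of the parallel environment.
    """
    lo = 1 if lower is None else lower         # default lower bound
    hi = pe_slots if upper is None else upper  # default upper bound
    if method == "min":
        return lo
    if method == "max":
        return hi
    if method == "avg":
        return (lo + hi) / 2    # mean of the integers lo..hi
    return int(method)          # fixed amount given as an integer


# A job requesting the range 2-8 in a PE with 16 slots.
for method in ("min", "max", "avg", "4"):
    print(method, prospective_slots(method, 2, 8, 16))
```

     For a contiguous range the average of all numbers in it
     equals the midpoint of its bounds, which is why no loop over
     the range is needed.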

RESTRICTIONS
     Note that the functionality of the start-up, shutdown and
     signalling procedures remains the full responsibility of the
     administrator configuring the parallel environment. Grid
     Engine merely invokes these procedures and evaluates their
     exit status. If the procedures do not perform their tasks
     properly, or if the parallel environment or the parallel
     application behaves unexpectedly, Grid Engine has no means
     of detecting this.

SEE ALSO
     <B><A HREF="../htmlman1/sge_intro.html">sge_intro(1)</A></B>,   <B><A HREF="../htmlman1/qconf.html">qconf(1)</A></B>,   <B><A HREF="../htmlman1/qdel.html">qdel(1)</A></B>,    <B><A HREF="../htmlman1/qmod.html">qmod(1)</A></B>,    <B><A HREF="../htmlman1/qsub.html">qsub(1)</A></B>,
     <B><A HREF="../htmlman5/access_list.html">access_list(5)</A></B>,        <B><A HREF="../htmlman8/sge_qmaster.html">sge_qmaster(8)</A></B>,        <B><A HREF="../htmlman8/sge_schedd.html">sge_schedd(8)</A></B>,
     <B><A HREF="../htmlman8/sge_shepherd.html">sge_shepherd(8)</A></B>.

COPYRIGHT
     See <B><A HREF="../htmlman1/sge_intro.html">sge_intro(1)</A></B> for a full statement of rights and  permis-
     sions.

</PRE>
<HR>
<ADDRESS>
Man(1) output converted with
<a href="http://www.oac.uci.edu/indiv/ehood/man2html.html">man2html</a>
</ADDRESS>
</BODY>
</HTML>