File: execution_systems_guide.tex

\documentclass[11pt,a4paper]{article} 
\usepackage[latin1]{inputenc} 
\usepackage[american]{babel}
\usepackage{verbatim}
\usepackage{url}
%% to create links to web pages
\usepackage[pdftex,colorlinks=true, urlcolor=cyan,pdfstartview=FitH]{hyperref}
%% to insert code
\usepackage{listings}
\lstset{ %
         language=Python,
         showspaces=false,
         showstringspaces=false,
         showtabs=false,
         basicstyle=\footnotesize,
         }

%% to display images with pdflatex, use the pdftex package
%\usepackage{graphicx}

%% MARGINS
\oddsidemargin -1cm
\marginparwidth 0cm \textwidth 18.5cm
\topmargin -0.5cm
\headheight -0.8cm \headsep 0cm
%%\footskip 0cm
\textheight 26.8cm
\pagestyle{plain}

\date{ Mobyle 1.0 }

\title{ The Mobyle Execution System  }

\begin{document}
\maketitle

\tableofcontents

\section{concepts}
There are 3 main actors in the Mobyle execution system:
\begin{itemize}
\item The ExecutionSystem is the interface between Mobyle and the low-level
execution system, such as your local system or your favorite Distributed Resource
Management System (DRMS),
\item The ExecutionConfig is an object which provides the basic information
needed by the ExecutionSystem to be used in your environment,
\item The Dispatcher chooses which execution system to use for a job.
\end{itemize}
Each actor is modeled as a class, and several implementations are provided in
the Mobyle distribution.

\subsection{The ExecutionSystem}
All ExecutionSystem classes inherit from the abstract class ExecutionSystem
and are located in the MOBYLEHOME/Src/Mobyle/Execution package. Five ExecutionSystem
classes are available:
\begin{itemize}
  \item SYS which is the interface between Mobyle and your local system.
  \item SGE which is the interface between Mobyle and the Sun Grid Engine DRMS.
  \item SgeDRMAA which is the interface between Mobyle and the Sun Grid Engine
  DRMS using the drmaa library.
  \item PbsDRMAA which is the interface between Mobyle and the PBS/torque
  DRMS using the drmaa library. 
  \item LsfDRMAA which is the interface between Mobyle and the LSF
  DRMS using the drmaa library.
\end{itemize}

For SGE and PBS, two execution systems are proposed: one which wraps
the shell commands and another which relies on the Distributed Resource Management
Application API (DRMAA) library. The benefit of using the DRMAA library is that
there is no need for an intermediate shell to run a job, get its status or
kill it.
For those who cannot or do not want to install libdrmaa, the legacy
SGE execution system is kept.

\subsection{The ExecutionConfig}
Mobyle is highly flexible regarding the execution of jobs: you can use several
execution systems in parallel. For instance, you can use SGE for most jobs and
SYS for some very small jobs, or one cluster managed by SGE for a set of jobs and
another cluster managed by PBS for the other jobs, and so on.
For each execution system you want to use, you must define an ExecutionConfig.
This instantiation must be done in EXECUTION\_SYSTEM\_ALIAS.
Each ExecutionSystem class has an associated ExecutionConfig class, so
the choice of an ExecutionConfig determines which ExecutionSystem will be
used.

\subsubsection{SgeDRMAAConfig}
is associated with the SgeDRMAA ExecutionSystem, which is the interface with SGE
using libdrmaa. This class takes 3 mandatory parameters:
\begin{enumerate}
  \item the path of the drmaa library (e.g. /usr/local/sge/lib/lx26-amd64/libdrmaa.so)
  \item root: the content of the SGE\_ROOT variable.
  \item cell: the content of the SGE\_CELL variable.
\end{enumerate}

\subsubsection{PbsDRMAAConfig}
is associated with the PbsDRMAA ExecutionSystem, which is the interface with PBS/torque
using libdrmaa. This class takes 2 mandatory parameters:
\begin{enumerate}
  \item the path of the drmaa library (e.g. /usr/local/lib64/libdrmaa.so)
  \item the fully qualified name of the host where the PBS/torque
  server daemon is located
\end{enumerate}

\subsubsection{LsfDRMAAConfig}
is associated with the LsfDRMAA ExecutionSystem, which is the interface with
LSF using libdrmaa. This class takes 3 mandatory parameters:
\begin{enumerate}
  \item the path of the drmaa library (e.g. /usr/local/lib64/libdrmaa.so)
  \item lsf\_envdir: the content of the variable ENVDIR of LSF
  \item lsf\_serverdir: the content of the variable SERVERDIR of LSF
\end{enumerate}

\subsubsection{SGEConfig}
is associated with the SGE ExecutionSystem, which is the interface with SGE that
wraps the shell commands.
This class takes 2 mandatory parameters:
\begin{enumerate}
  \item root: the content of the SGE\_ROOT variable
  \item cell: the content of the SGE\_CELL variable
\end{enumerate}

\subsubsection{SYSConfig}
is associated with the SYS ExecutionSystem. It is used to launch jobs without any DRMS.
No argument is needed to run a job locally.

\subsection{The Dispatcher}
After having defined all your execution systems, you have to specify which system
must be used for a given program. This is the role of the dispatcher.\\
A DefaultDispatcher is provided.

\subsubsection{The DefaultDispatcher}
statically associates one program with an ExecutionSystem and a queue.
The DefaultDispatcher takes as argument a dictionary where the
program names are the keys and the values are tuples with 2 elements:
the first one is an EXECUTION\_SYSTEM\_ALIAS entry and the second a queue name.
There is a wildcard program name, ``DEFAULT'', to define a system for all programs
which are not listed in the keys.
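
The lookup described above can be sketched in a few lines of Python. This is
only an illustration and not the code of the real DefaultDispatcher: the class
name SketchDispatcher is hypothetical, plain strings stand in for the
ExecutionConfig instances, and the methods take a program name where Mobyle
itself works with a job description.
\begin{verbatim}
class SketchDispatcher(object):
    """Hypothetical stand-in, not the real Mobyle DefaultDispatcher."""

    def __init__(self, routes):
        # routes maps a program name to (execution config, queue name)
        self.routes = routes

    def _lookup(self, program_name):
        # programs which are not listed fall back on the 'DEFAULT' entry
        try:
            return self.routes[program_name]
        except KeyError:
            return self.routes['DEFAULT']

    def getExecutionConfig(self, program_name):
        return self._lookup(program_name)[0]

    def getQueue(self, program_name):
        return self._lookup(program_name)[1]


# plain strings replace the ExecutionConfig instances in this sketch
routes = {
    'blast2': ('sge_config', 'long'),
    'DEFAULT': ('sys_config', ''),
}
dispatcher = SketchDispatcher(routes)
\end{verbatim}
With these routes, blast2 is sent to the ``long'' queue of the first system,
and any other program gets the ``DEFAULT'' entry.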


\section{how to configure}

\subsection{First step: define the execution system you want to use.}
The key is a symbolic name you give to an execution system.\\
The value is an instance of ExecutionConfig.
The following is an example of EXECUTION\_SYSTEM\_ALIAS using all the ExecutionConfig
classes we provide.
\begin{verbatim}
from Execution import  *
EXECUTION_SYSTEM_ALIAS = {
          'DRMAA_sge'  : SgeDRMAAConfig( '/usr/local/sge/lib/lx26-amd64/libdrmaa.so' ,
                                         root ='/usr/local/sge',
                                         cell = 'default' ) ,
          'DRMAA_torque': PbsDRMAAConfig('/usr/local/lib64/libdrmaa.so' ,
                                         'marygay.sis.pasteur.fr'),

          'SGE'         : SGEConfig( root = '/usr/local/sge',
                                     cell= 'default' ) ,
          'SYS'        : SYSConfig()  ,
          'LSF'         : LsfDRMAAConfig( '/home/bneron/Sys/lib/libdrmaa.so' ,
                                         lsf_envdir = '/home/bneron/Sys/share/lsf/conf' ,
                                         lsf_serverdir = '/home/bneron/Sys/share/lsf/7.0/linux2.6-glibc2.3-x86_64/etc' )
          }
\end{verbatim}

Here is a second example to illustrate that you can use the same ExecutionSystem with different configurations.

\begin{verbatim}
from Execution import  *
EXECUTION_SYSTEM_ALIAS = {
          'cluster1'  : SgeDRMAAConfig( '/usr/local/sge/lib/lx26-amd64/libdrmaa.so' ,
                                         root ='/usr/local/sge',
                                         cell = 'cluster1' ) ,
          'cluster2' : SgeDRMAAConfig( '/usr/local/sge/lib/lx26-amd64/libdrmaa.so' ,
                                         root ='/usr/local/sge',
                                         cell = 'cluster2' ) ,       
                           }
\end{verbatim}
In this example, cluster1 and cluster2 have different features
(number of nodes, memory, \ldots), and some jobs must run on cluster1 and the others on cluster2.
 
\subsection{Second step: define which ExecutionSystem will be used for a given program.}
 
After having defined your execution systems, you must specify under which conditions each one is used.
We illustrate here the configuration of the DefaultDispatcher, which links a program
name to an ExecutionConfig and a queue.
\begin{verbatim}
from Mobyle.Dispatcher import DefaultDispatcher

DISPATCHER = DefaultDispatcher( {
       'blast2'    : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_sge' ] , 'mobyle' ),
       'fastdnaml' : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_torque' ] , 'mobyle' ),
       'toppred'   : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_torque' ] , 'short' ),
       'dnapars'   : ( EXECUTION_SYSTEM_ALIAS[ 'SGE' ]    , 'long' ),
       'golden'    : ( EXECUTION_SYSTEM_ALIAS[ 'SYS' ]   ,   ''    ),
       'kitch'     : ( EXECUTION_SYSTEM_ALIAS[ 'LSF' ] , 'mobyle' ),
       'DEFAULT'   : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_sge' ]    , 'mobyle' )
                             } )
\end{verbatim}
or
\begin{verbatim}
from Mobyle.Dispatcher import DefaultDispatcher

DISPATCHER = DefaultDispatcher( {
       'job1' : ( EXECUTION_SYSTEM_ALIAS[ 'cluster1' ] , 'short' ),
       'job2' : ( EXECUTION_SYSTEM_ALIAS[ 'cluster1' ] , 'long' ),
       'DEFAULT'   : ( EXECUTION_SYSTEM_ALIAS[ 'cluster2' ] , 'mobyle' )
                           } )
\end{verbatim}

Do not be put off by this configuration: once it is done, you will rarely have to
change it, and most of you will have only one ExecutionConfig.
For instance, if you use SGE and one queue 'mobyle' for all jobs, the
configuration will be:
\begin{verbatim}
EXECUTION_SYSTEM_ALIAS = { 'SGE'  : SGEConfig( root = '/usr/local/sge', cell='default' ) } 
DISPATCHER = DefaultDispatcher( { 'DEFAULT' : ( EXECUTION_SYSTEM_ALIAS[ 'SGE'] , 'mobyle' ) } )
\end{verbatim}

\section{add new execution system}
\label{sec:new_execution_system}
If you have another execution system which is not supported by Mobyle,
you can develop your own ExecutionSystem. This class must inherit from the abstract ExecutionSystem class.
The module must contain a class with the same name as the module; only this class can be used by Mobyle.
Your module must be located in the MOBYLEHOME/Src/Mobyle/Execution package.
The new class must implement 4 methods: \_\_init\_\_, \_run, getStatus and kill.

\subsection{\_\_init\_\_}
The \_\_init\_\_ method has one argument, the ExecutionConfig, and is used to set up
everything required to communicate with the DRMS,
for instance setting some variables in the environment. Usually this information is
contained in the ExecutionConfig. The \_\_init\_\_ method is not necessary if
you do not need any configuration, as is the case for the SYS class.
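
As an illustration, an \_\_init\_\_ method for an SGE-like system could export
the variables held by its config. This is a minimal sketch under stated
assumptions: SketchConfig and SketchSystem are hypothetical stand-ins for an
ExecutionConfig and an ExecutionSystem subclass, and the attribute names root
and cell simply mirror the parameter names of SGEConfig.
\begin{verbatim}
import os


class SketchConfig(object):
    """Hypothetical stand-in for an ExecutionConfig with root and cell."""

    def __init__(self, root, cell):
        self.root = root
        self.cell = cell


class SketchSystem(object):
    """Hypothetical stand-in for an ExecutionSystem subclass."""

    def __init__(self, execution_config):
        # export what the DRMS client is expected to read from
        # the environment
        os.environ['SGE_ROOT'] = execution_config.root
        os.environ['SGE_CELL'] = execution_config.cell


system = SketchSystem(SketchConfig('/usr/local/sge', 'default'))
\end{verbatim}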

\subsection{\_run}
This method is the most complex one you have to implement, and it has several responsibilities.
\begin{itemize}
  \item This method is responsible for submitting the job to the DRMS.
  \item This method must be synchronous with the job execution.
  \item As we do not use a server which keeps a record of all jobs, we must store some information to retrieve a
given job from another process (a CGI) in order to get the status of the job or to kill it.
This information is stored in a '.admin' file located in each job directory.
The \_run method is responsible for recording in the Admin object the name of the execution system used and the key
needed to retrieve the job on this execution system. The name is always accessible through the self.execution\_config\_alias attribute,
and the key is the pid of the job for SYS, the job identifier for SGE, \ldots
  \item The ADMINDIR directory is a kind of table of all jobs currently in execution in Mobyle.
It contains a symbolic link to each job currently running.
The \_run method must create this link when it submits the job to the DRMS and remove it when the job is finished.
  \item Finally, when the job is finished, the \_run method must map the job status to a Mobyle.Status and return it.
\end{itemize}
There is a Dummy class in the Execution package to help developers write their own Execution class.
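
The responsibilities listed above can be sketched for a job run as a local
process. This is not the code of the SYS class: SketchLocalSystem is a
hypothetical stand-in which does not inherit from the real ExecutionSystem, a
plain dictionary replaces the Admin object, a temporary directory replaces
ADMINDIR, and the status strings ``finished'' and ``error'' are assumptions.
\begin{verbatim}
import os
import subprocess
import sys
import tempfile


class SketchLocalSystem(object):
    """Hypothetical stand-in for an ExecutionSystem subclass."""

    def __init__(self, alias, admin_dir):
        self.execution_config_alias = alias
        self.admin_dir = admin_dir  # plays the role of ADMINDIR
        self.admin = {}             # plays the role of the Admin object

    def _run(self, command, job_dir):
        # submit the job (here: start a local process)
        process = subprocess.Popen(command, cwd=job_dir)
        # record the system used and the key to find the job again
        self.admin['execution_alias'] = self.execution_config_alias
        self.admin['job_key'] = str(process.pid)
        # register the job as currently running
        link = os.path.join(self.admin_dir, os.path.basename(job_dir))
        os.symlink(job_dir, link)
        try:
            # stay synchronous with the job execution
            return_code = process.wait()
        finally:
            # the job is over: unregister it
            os.remove(link)
        # map the native result to a status name
        if return_code == 0:
            return 'finished'
        return 'error'


admin_dir = tempfile.mkdtemp()
job_dir = tempfile.mkdtemp()
system = SketchLocalSystem('SYS', admin_dir)
status = system._run([sys.executable, '-c', 'pass'], job_dir)
\end{verbatim}
The try/finally block is a design choice of this sketch: it removes the link
even if waiting for the job raises an exception, so that the table of running
jobs does not keep stale entries.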

\subsection{getStatus}
Has one argument: the identifier of the job for this DRMS (the one that you stored in the .admin file in the \_run method).
This method queries the DRMS about the status of this job and maps the DRMS status to a Mobyle Status.
If the job cannot be found in the DRMS, the Status ``unknown'' must be returned.
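
The mapping step can be as simple as a dictionary lookup with ``unknown'' as
the fallback. In this sketch the native state codes are hypothetical, loosely
modelled on qstat-style output, and the status names other than ``unknown''
are assumptions; a real implementation would query the DRMS and build a Mobyle
Status.
\begin{verbatim}
# hypothetical native states, loosely modelled on qstat-style codes
NATIVE_TO_MOBYLE = {
    'qw': 'pending',
    'r': 'running',
    'h': 'hold',
}


def map_status(native_state):
    # a job the DRMS does not report (None) or a state we do not
    # recognise is reported as 'unknown'
    return NATIVE_TO_MOBYLE.get(native_state, 'unknown')
\end{verbatim}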

\subsection{kill}
Has one argument: the identifier of the job for this DRMS (the one that you stored in the .admin file in the \_run method).
This method asks the DRMS to kill the job and returns None.
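
For a job run as a local process, where the stored key is a pid, kill could
look like the following sketch. It is written as a plain function for brevity
and is an assumption about one possible implementation, not the code of the
SYS class; a DRMS-backed system would call its own cancel command instead.
\begin{verbatim}
import os
import signal
import subprocess
import sys


def kill(key):
    # key is the pid stored as a string by _run
    os.kill(int(key), signal.SIGTERM)
    return None


# start a long-running job, then kill it through its stored key
process = subprocess.Popen(
    [sys.executable, '-c', 'import time; time.sleep(60)'])
result = kill(str(process.pid))
return_code = process.wait()
\end{verbatim}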

\subsection{ExecutionConfig}
You must also implement the ExecutionConfig which will be associated with this class.
The ExecutionConfig class must be named after the new ExecutionSystem you develop, with ``Config''
at the end. The module containing this ExecutionConfig
class must have the same name as the class and be located in the MOBYLEHOME/Local/Config/Execution package.\\

Your ExecutionConfig class must inherit from ExecutionConfig and provide
everything needed by the ExecutionSystem you have coded.
At this point you can use your new ExecutionSystem from the general Config like any
other provided class.
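
The naming convention makes it possible to derive the name of the
ExecutionSystem from the class of a config object. The helper below only
illustrates the convention; the function execution\_system\_name and the class
MyClusterConfig are hypothetical, and this guide does not describe how Mobyle
itself performs the lookup.
\begin{verbatim}
def execution_system_name(config):
    # 'MyClusterConfig' -> 'MyCluster'
    class_name = config.__class__.__name__
    suffix = 'Config'
    if not class_name.endswith(suffix):
        raise ValueError('%s does not end with Config' % class_name)
    return class_name[:-len(suffix)]


class MyClusterConfig(object):
    """Hypothetical config class for a MyCluster execution system."""
    pass


name = execution_system_name(MyClusterConfig())
\end{verbatim}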
   
\section{add new dispatcher}
The DefaultDispatcher associates one program with one ExecutionSystem
and one queue. This queue is determined statically in the config. If you
need something more dynamic, like computing the queue from the email of the
user for instance, you must develop a new Dispatcher. This new dispatcher must
inherit from the Dispatcher class and implement 2 methods:
\begin{itemize}
  \item getQueue
  \item getExecutionConfig
\end{itemize}

\subsection{getQueue}
Has one argument, which is a JobState instance. You can easily access all the
job characteristics (the name of the job, the email of the user\ldots)
to compute the queue name and return it.

\subsection{getExecutionConfig}
returns the ExecutionConfig which will be used to execute a job.
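
A dispatcher which chooses the queue from the email of the user could be
sketched as follows. All names are hypothetical: SketchJobState stands in for
the JobState of Mobyle, EmailDispatcher does not inherit from the real
Dispatcher class, a plain string replaces the ExecutionConfig, and passing the
job state to getExecutionConfig is an assumption.
\begin{verbatim}
class SketchJobState(object):
    """Hypothetical stand-in for the JobState of Mobyle."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


class EmailDispatcher(object):
    """Sketch only; a real one would inherit from Dispatcher."""

    def __init__(self, execution_config, local_domain):
        self.execution_config = execution_config
        self.local_domain = local_domain

    def getQueue(self, job_state):
        # in-house users get the long queue, all others the short one
        if job_state.email.endswith('@' + self.local_domain):
            return 'long'
        return 'short'

    def getExecutionConfig(self, job_state):
        # a single execution system for every job in this sketch
        return self.execution_config


dispatcher = EmailDispatcher('sge_config', 'example.org')
local_job = SketchJobState('blast2', 'ann@example.org')
outside_job = SketchJobState('blast2', 'bob@elsewhere.net')
\end{verbatim}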

\section{DRMS requirement if DRMAA is used}

\subsection{python-drmaa}

\subsection{torque}
To work with DRMAA and Mobyle (to run a synchronous job), torque must be able
to report the status of completed jobs. This feature is enabled by setting the keep\_completed attribute on the job execution
queue or in the server configuration.

\subsection{LSF}
We need lsf-drmaa from the FedStage DRMAA for LSF project:
\url{http://sourceforge.net/projects/lsf-drmaa/}
\end{document}