\documentclass[11pt,a4paper]{article}
\usepackage[latin1]{inputenc}
\usepackage[american]{babel}
\usepackage{verbatim}
\usepackage{url}
% % to create links to web pages
\usepackage[pdftex,colorlinks=true, urlcolor=cyan,pdfstartview=FitH]{hyperref}
%% to insert code
\usepackage{listings}
\lstset{ %
language=Python,
showspaces=false,
showstringspaces=false,
showtabs=false,
basicstyle=\footnotesize,
}
%% to display images with pdflatex, use the pdftex package
%\usepackage{graphicx}
%% MARGINS
\oddsidemargin -1cm
\marginparwidth 0cm \textwidth 18.5cm
\topmargin -0.5cm
\headheight -0.8cm \headsep 0cm
%%\footskip 0cm
\textheight 26.8cm
\pagestyle{plain}
\date{ Mobyle 1.0 }
\title{ The Mobyle Execution System }
\begin{document}
\maketitle
\tableofcontents
\section{concepts}
There are three main actors in the Mobyle execution system:
\begin{itemize}
\item The ExecutionSystem is the interface between Mobyle and the low-level
execution system, such as your local system or your favorite Distributed Resource
Management System (DRMS).
\item The ExecutionConfig is an object that provides the basic information
the ExecutionSystem needs to work in your environment.
\item The Dispatcher chooses which execution system to use for a job.
\end{itemize}
Each actor is modeled as a class, and several implementations are provided in
the Mobyle distribution.
\subsection{The ExecutionSystem}
All ExecutionSystem classes inherit from the abstract class ExecutionSystem
and are located in the MOBYLEHOME/Src/Mobyle/Execution package. Five ExecutionSystem
classes are available:
\begin{itemize}
\item SYS which is the interface between Mobyle and your local system.
\item SGE which is the interface between Mobyle and the Sun Grid Engine DRMS.
\item SgeDRMAA which is the interface between Mobyle and the Sun Grid Engine
DRMS using the drmaa library.
\item PbsDRMAA which is the interface between Mobyle and the PBS/torque
DRMS using the drmaa library.
\item LsfDRMAA which is the interface between Mobyle and the LSF
DRMS using the drmaa library.
\end{itemize}
For SGE, two execution systems are provided: one that wraps
the shell commands and another that uses the Distributed Resource Management
Application API (DRMAA) library. The benefit of using the DRMAA library is that
no intermediate shell is needed to run a job, query its status or
kill it.
For those who cannot or do not want to install libdrmaa, the legacy
SGE execution system is kept.
\subsection{The ExecutionConfig}
Mobyle is highly flexible regarding job execution: you can use several
execution systems in parallel. For instance, you can use SGE for most jobs and
SYS for some very small jobs, or one cluster managed by SGE for one set of jobs and
another cluster managed by PBS for the remaining jobs, and so on.
For each execution system you want to use, you must define an ExecutionConfig.
These instances must be created in EXECUTION\_SYSTEM\_ALIAS.
Each ExecutionSystem class has an associated ExecutionConfig class, so
the choice of an ExecutionConfig determines which ExecutionSystem will be
used.
\subsubsection{SgeDRMAAConfig}
SgeDRMAAConfig is associated with the SgeDRMAA ExecutionSystem, which is the interface with SGE
using libdrmaa. This class takes three mandatory parameters:
\begin{enumerate}
\item the path of the drmaa library (e.g. /usr/local/sge/lib/lx26-amd64/libdrmaa.so)
\item root: the content of the SGE\_ROOT variable
\item cell: the content of the SGE\_CELL variable
\end{enumerate}
\subsubsection{PbsDRMAAConfig}
PbsDRMAAConfig is associated with the PbsDRMAA ExecutionSystem, which is the interface with PBS/torque
using libdrmaa. This class takes two mandatory parameters:
\begin{enumerate}
\item the path of the drmaa library (e.g. /usr/local/lib64/libdrmaa.so)
\item the fully qualified name of the host where the PBS/torque
server daemon is located
\end{enumerate}
\subsubsection{LsfDRMAAConfig}
LsfDRMAAConfig is associated with the LsfDRMAA ExecutionSystem, which is the interface with
LSF using libdrmaa. This class takes three mandatory parameters:
\begin{enumerate}
\item the path of the drmaa library (e.g. /usr/local/lib64/libdrmaa.so)
\item lsf\_envdir: the content of the ENVDIR variable of LSF
\item lsf\_serverdir: the content of the SERVERDIR variable of LSF
\end{enumerate}
\subsubsection{SGEConfig}
SGEConfig is associated with the SGE ExecutionSystem, which is the interface with SGE
that wraps the shell commands.
This class takes two mandatory parameters:
\begin{enumerate}
\item root: the content of the SGE\_ROOT variable
\item cell: the content of the SGE\_CELL variable
\end{enumerate}
\subsubsection{SYSConfig}
SYSConfig is associated with the SYS ExecutionSystem. It is used to launch jobs without any DRMS.
No argument is needed to run a job locally.
\subsection{The Dispatcher}
After defining all your execution systems, you have to specify which system
must be used for a given program. This is the role of the dispatcher.\\
A DefaultDispatcher is provided.
\subsubsection{The DefaultDispatcher}
The DefaultDispatcher statically associates each program with an ExecutionSystem and a queue.
It takes as argument a dictionary whose keys are program names and
whose values are tuples of two elements:
the first is an EXECUTION\_SYSTEM\_ALIAS entry and the second is a queue name.
There is a wildcard program name, ``DEFAULT'', which defines the system for all programs
that are not listed in the keys.
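The lookup described above can be sketched in plain Python. This is only an
illustration of a static mapping with a ``DEFAULT'' fallback, not Mobyle's own
implementation: the function name resolve, the unlisted program name and the strings
standing in for ExecutionConfig instances are assumptions made for this example.

```python
# Hypothetical sketch of a static program -> (execution system, queue) lookup.
# Plain strings stand in for the ExecutionConfig instances that a real
# EXECUTION_SYSTEM_ALIAS dictionary would hold.

ROUTES = {
    'blast2': ('DRMAA_sge', 'mobyle'),
    'golden': ('SYS', ''),
    'DEFAULT': ('DRMAA_sge', 'long'),
}


def resolve(routes, program_name):
    """Return the (alias, queue) pair for a program, falling back to 'DEFAULT'."""
    if program_name in routes:
        return routes[program_name]
    # Programs that are not listed use the wildcard entry.
    return routes['DEFAULT']


# A listed program gets its own entry.
alias, queue = resolve(ROUTES, 'golden')
# 'clustalw' is not a key of ROUTES, so the 'DEFAULT' entry applies.
fallback_alias, fallback_queue = resolve(ROUTES, 'clustalw')
```

Keeping the fallback under an ordinary dictionary key means the whole routing
table stays a single literal in the configuration file.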
\section{how to configure}
\subsection{First step: define the execution system you want to use.}
Execution systems are declared in the EXECUTION\_SYSTEM\_ALIAS dictionary.
Each key is a symbolic name you give to an execution system.\\
Each value is an instance of ExecutionConfig.
The following example of EXECUTION\_SYSTEM\_ALIAS uses all the ExecutionConfig
classes we provide.
\begin{verbatim}
from Execution import *
EXECUTION_SYSTEM_ALIAS = {
    'DRMAA_sge' : SgeDRMAAConfig( '/usr/local/sge/lib/lx26-amd64/libdrmaa.so' ,
        root = '/usr/local/sge',
        cell = 'default' ) ,
    'DRMAA_torque': PbsDRMAAConfig( '/usr/local/lib64/libdrmaa.so' ,
        'marygay.sis.pasteur.fr' ),
    'SGE' : SGEConfig( root = '/usr/local/sge',
        cell = 'default' ) ,
    'SYS' : SYSConfig() ,
    'LSF' : LsfDRMAAConfig( '/home/bneron/Sys/lib/libdrmaa.so' ,
        lsf_envdir = '/home/bneron/Sys/share/lsf/conf' ,
        lsf_serverdir = '/home/bneron/Sys/share/lsf/7.0/linux2.6-glibc2.3-x86_64/etc' )
}
\end{verbatim}
Here is a second example, illustrating that the same ExecutionSystem can be used with different configurations.
\begin{verbatim}
from Execution import *
EXECUTION_SYSTEM_ALIAS = {
    'cluster1' : SgeDRMAAConfig( '/usr/local/sge/lib/lx26-amd64/libdrmaa.so' ,
        root = '/usr/local/sge',
        cell = 'cluster1' ) ,
    'cluster2' : SgeDRMAAConfig( '/usr/local/sge/lib/lx26-amd64/libdrmaa.so' ,
        root = '/usr/local/sge',
        cell = 'cluster2' ) ,
}
\end{verbatim}
In this example, cluster1 and cluster2 have different characteristics
(number of nodes, memory, \ldots), so some jobs must run on cluster1 and the others on cluster2.
\subsection{Second step: define which ExecutionSystem will be used for a given program.}
After defining your execution systems, you must specify under which conditions each one is used.
We illustrate here the configuration of the DefaultDispatcher, which links a program
name to an ExecutionConfig and a queue.
\begin{verbatim}
from Mobyle.Dispatcher import DefaultDispatcher
DISPATCHER = DefaultDispatcher( {
    'blast2' : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_sge' ] , 'mobyle' ),
    'fastdnaml' : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_torque' ] , 'mobyle' ),
    'toppred' : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_torque' ] , 'short' ),
    'dnapars' : ( EXECUTION_SYSTEM_ALIAS[ 'SGE' ] , 'long' ),
    'golden' : ( EXECUTION_SYSTEM_ALIAS[ 'SYS' ] , '' ),
    'kitch' : ( EXECUTION_SYSTEM_ALIAS[ 'LSF' ] , 'mobyle' ),
    'DEFAULT' : ( EXECUTION_SYSTEM_ALIAS[ 'DRMAA_sge' ] , 'mobyle' )
} )
\end{verbatim}
or
\begin{verbatim}
from Mobyle.Dispatcher import DefaultDispatcher
DISPATCHER = DefaultDispatcher( {
    'job1' : ( EXECUTION_SYSTEM_ALIAS[ 'cluster1' ] , 'short' ),
    'job2' : ( EXECUTION_SYSTEM_ALIAS[ 'cluster1' ] , 'long' ),
    'DEFAULT' : ( EXECUTION_SYSTEM_ALIAS[ 'cluster2' ] , 'mobyle' )
} )
\end{verbatim}
Do not be intimidated by this configuration: once it is done, you will rarely have to
change it, and most installations need only one ExecutionConfig.
For instance, if you use SGE and a single queue 'mobyle' for all jobs, the
configuration is:
\begin{verbatim}
from Execution import *
from Mobyle.Dispatcher import DefaultDispatcher
EXECUTION_SYSTEM_ALIAS = { 'SGE' : SGEConfig( root = '/usr/local/sge', cell = 'default' ) }
DISPATCHER = DefaultDispatcher( { 'DEFAULT' : ( EXECUTION_SYSTEM_ALIAS[ 'SGE' ] , 'mobyle' ) } )
\end{verbatim}
\section{add new execution system}
\label{sec:new_execution_system}
If you have another execution system that is not supported by Mobyle,
you can develop your own ExecutionSystem. This class must inherit from the abstract ExecutionSystem class.
The module must contain a class with the same name as the module; only this class can be used by Mobyle.
Your module must be located in the MOBYLEHOME/Src/Execution package.
The new class must implement four methods: \_\_init\_\_, \_run, getStatus and kill.
\subsection{\_\_init\_\_}
The \_\_init\_\_ method takes one argument, the ExecutionConfig, and performs
whatever setup is required to communicate with the DRMS,
for instance setting environment variables. This information is usually
contained in the ExecutionConfig. The \_\_init\_\_ method is not necessary if
you do not need any configuration, as in the SYS class.
\subsection{\_run}
This is the most complex method you have to implement, and it has several responsibilities.
\begin{itemize}
\item It is responsible for submitting the job to the DRMS.
\item It must be synchronous with the job execution.
\item As Mobyle does not use a server that keeps a record of all jobs, some information must be stored so that
another process (a CGI) can retrieve a given job to get its status or to kill it.
This information is stored in a `.admin' file located in each job directory.
The \_run method is responsible for recording in the Admin object the name of the execution system used and the key
that identifies the job on this execution system. The name is always accessible through the self.execution\_config\_alias attribute,
and the key is the pid of the job for SYS, the job identifier for SGE, \ldots
\item The ADMINDIR directory acts as a table of all jobs currently running in Mobyle.
It contains a symbolic link to each job currently running.
The \_run method must create this link when it submits the job to the DRMS and remove it when the job is finished.
\item Finally, when the job is finished, \_run must map the job status to a Mobyle.Status and return it.
\end{itemize}
The Execution package contains a Dummy class to help developers write their own execution class.
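The responsibilities listed above can be sketched with the standard library
alone. This is not Mobyle code: the classes ExecutionSystemSketch and LocalSketch,
the dictionary used as configuration, the layout of the `.admin' file and the
status strings are all assumptions made for illustration. A real implementation
would inherit from ExecutionSystem, go through the Admin object and return a
Mobyle.Status; getStatus and kill are omitted here.

```python
# Hypothetical, self-contained sketch: these classes are stand-ins, not Mobyle code.
import os
import subprocess
import sys
import tempfile


class ExecutionSystemSketch(object):
    """Stand-in for the abstract ExecutionSystem base class."""

    def __init__(self, execution_config):
        self.execution_config = execution_config


class LocalSketch(ExecutionSystemSketch):
    """Runs a command as a local child process."""

    def _run(self, command, job_dir):
        # Submit the job (here: start a local process).
        process = subprocess.Popen(command, cwd=job_dir)
        # Record the system name and the job key so that another process
        # could find the job again (assumed file layout, for illustration).
        with open(os.path.join(job_dir, '.admin'), 'w') as admin_file:
            admin_file.write('system=%s\n' % self.execution_config['alias'])
            admin_file.write('key=%d\n' % process.pid)
        # Advertise the running job with a symbolic link.
        link = os.path.join(self.execution_config['admin_dir'],
                            os.path.basename(job_dir))
        os.symlink(job_dir, link)
        try:
            # Stay synchronous: block until the job is finished.
            return_code = process.wait()
        finally:
            # The link must disappear once the job is over.
            os.remove(link)
        # Map the outcome to a status (placeholder status names).
        return 'finished' if return_code == 0 else 'error'


# Usage: run a trivial job in temporary directories.
job_dir = tempfile.mkdtemp()
admin_dir = tempfile.mkdtemp()
system = LocalSketch({'alias': 'local', 'admin_dir': admin_dir})
status = system._run([sys.executable, '-c', 'pass'], job_dir)
```

Removing the link in a finally clause keeps the admin directory consistent even
if waiting for the job raises an exception.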
\subsection{getStatus}
This method takes one argument, the identifier of the job in this DRMS (the one stored in the .admin file by the \_run method).
It queries the DRMS about the status of this job and maps the DRMS status to a Mobyle Status.
If the job cannot be found in the DRMS, the status ``unknown'' must be returned.
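A minimal sketch of this mapping is given here, assuming the DRMS answer is
available as a dictionary. The DRMS state names, the job identifiers and the
status strings other than ``unknown'' are placeholders invented for the example,
and get\_status is a free function rather than the getStatus method itself.

```python
# Hypothetical sketch: the DRMS state names and status strings are placeholders.
DRMS_TO_STATUS = {
    'queued': 'pending',
    'running': 'running',
    'done': 'finished',
    'failed': 'error',
}


def get_status(drms_jobs, job_key):
    """Map the DRMS state of job_key to a status, or 'unknown' if it is absent."""
    drms_state = drms_jobs.get(job_key)  # None when the DRMS has no such job
    return DRMS_TO_STATUS.get(drms_state, 'unknown')


# Stand-in for what the DRMS would report when queried.
jobs_known_to_drms = {'1042': 'running', '1043': 'done'}
current = get_status(jobs_known_to_drms, '1042')
missing = get_status(jobs_known_to_drms, '9999')
```

Using a default value in the second lookup covers both a job that the DRMS does
not know and a DRMS state that has no mapping.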
\subsection{kill}
This method takes one argument, the identifier of the job in this DRMS (the one stored in the .admin file by the \_run method).
It asks the DRMS to kill the job and returns None.
\subsection{ExecutionConfig}
You must also implement the ExecutionConfig associated with this class.
The ExecutionConfig class must have the same name as the new ExecutionSystem, with ``Config''
appended. The module containing this ExecutionConfig
class must have the same name as the class and be located in the MOBYLEHOME/Local/Config/Execution package.\\
Your ExecutionConfig class must inherit from ExecutionConfig and provide
everything your ExecutionSystem requires.
At this point you can use your new execution system from the general configuration, like any of the
provided classes.
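The naming convention can be illustrated as follows. The classes are stand-ins
defined only for this example: MyCluster, its submit\_host parameter and the helper
execution\_system\_name are hypothetical, and the helper merely shows how a system
name can be derived from a configuration class name, not how Mobyle resolves it.

```python
# Hypothetical sketch of the naming convention; these are not Mobyle classes.

class ExecutionConfig(object):
    """Stand-in for the ExecutionConfig base class."""


class MyClusterConfig(ExecutionConfig):
    """Configuration for a hypothetical MyCluster execution system."""

    def __init__(self, submit_host):
        self.submit_host = submit_host


def execution_system_name(config):
    """Derive the execution system class name from the config class name."""
    class_name = type(config).__name__
    suffix = 'Config'
    if not class_name.endswith(suffix):
        raise ValueError('%s does not follow the <System>Config convention'
                         % class_name)
    return class_name[:-len(suffix)]


config = MyClusterConfig('cluster.example.org')
system_name = execution_system_name(config)
```

With such a convention, choosing a configuration class is enough to identify the
execution system it belongs to.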
\section{add new dispatcher}
The DefaultDispatcher associates one program with one ExecutionSystem
and one queue. This queue is determined statically in the configuration. If you
need something more dynamic, for instance computing the queue from the email of the
user, you must develop a new Dispatcher. This new dispatcher must
inherit from the Dispatcher class and implement two methods:
\begin{itemize}
\item getQueue
\item getExecutionConfig
\end{itemize}
\subsection{getQueue}
This method takes one argument, a JobState instance, which gives easy access to all
the job characteristics (the name of the job, the email of the user, \ldots).
Use them to compute the queue name and return it.
\subsection{getExecutionConfig}
This method returns the ExecutionConfig that will be used to execute a job.
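A dispatcher that picks the queue from the email of the user could look like
the following sketch. Everything here is an assumption made for illustration:
the Dispatcher base class is a minimal stand-in, a plain dictionary plays the
role of the JobState instance, getExecutionConfig is assumed to receive the same
argument as getQueue, and the queue names and domain are invented.

```python
# Hypothetical sketch: Dispatcher is a minimal stand-in, and a dict with an
# 'email' key replaces the JobState instance.

class Dispatcher(object):
    """Stand-in for the Dispatcher base class."""

    def getQueue(self, job_state):
        raise NotImplementedError

    def getExecutionConfig(self, job_state):
        raise NotImplementedError


class EmailDispatcher(Dispatcher):
    """Sends jobs from in-house users to a dedicated queue."""

    def __init__(self, execution_config, local_domain):
        self.execution_config = execution_config
        self.local_domain = local_domain

    def getQueue(self, job_state):
        # Compare case-insensitively, since addresses are often typed in mixed case.
        email = job_state['email'].lower()
        if email.endswith('@' + self.local_domain):
            return 'internal'
        return 'external'

    def getExecutionConfig(self, job_state):
        # A single execution system is used for every job in this sketch.
        return self.execution_config


# The string 'SGE' stands in for an ExecutionConfig instance.
dispatcher = EmailDispatcher('SGE', 'example.org')
queue = dispatcher.getQueue({'email': 'Alice@Example.org'})
```

Only the queue is dynamic in this sketch; a dispatcher could equally select the
ExecutionConfig from the job characteristics.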
\section{DRMS requirement if DRMAA is used}
\subsection{python-drmaa}
\subsection{torque}
To work with DRMAA and Mobyle (to run a synchronous job), torque must be able
to report the status of completed jobs. This feature is enabled by setting the keep\_completed attribute on the job execution
queue or in the server configuration.
\subsection{LSF}
We need lsf-drmaa from the FedStage DRMAA for LSF project:
\url{http://sourceforge.net/projects/lsf-drmaa/}
\end{document}