File: plugin.tex

package info (click to toggle)
diet 2.8.0-1
  • links: PTS, VCS
  • area: main
  • in suites: wheezy
  • size: 38,140 kB
  • sloc: ansic: 65,575; cpp: 58,570; xml: 365; sh: 83; makefile: 29
file content (438 lines) | stat: -rw-r--r-- 20,160 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
%**
%*  @file  plugin.tex
%*  @brief   DIET User's Manual plugin scheduler chapter file 
%*  @author  - Alan SU (Alan.SU@ens-lyon.fr) 
%*  @section Licence 
%*    |LICENCE|

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% \documentclass{article}

\newenvironment{code}
{\begin{list}{}{\setlength{\leftmargin}{1em}}\item\bfseries\tt}
{\end{list}}

\newenvironment{tinycode}
{\begin{list}{}{\setlength{\leftmargin}{1em}}\item\tiny\bfseries\tt}
{\end{list}}

%% \begin{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\chapter{Scheduling in \diet}
\label{ch:plugin}

\section{Introduction}

We introduce a \emph{plugin scheduling} facility, designed to allow \diet
service developers to define application-specific performance measures and to
implement corresponding scheduling strategies. This section describes the
default scheduling policy in \diet and the interface to the plugin scheduling
facility.

\section{Default Scheduling Strategy}\label{sect:default_sched}

The \diet scheduling subsystem is based on the notion that, for the sake of
system efficacy and scalability, the work of determining the appropriate
schedule for a parallel workload should be distributed across the computational
platform.  When a task in such a parallel workload is submitted to the system
for processing, each Server Daemon (\sed) provides a \emph{performance
  estimate}~-- a collection of data pertaining to the capabilities of a
particular server in the context of a particular client request~-- for that
task.  These estimates are passed to the server's parent agent; agents then
sort these responses in a manner that optimizes certain performance criteria.
Effectively, candidate {\sed}s are identified through a distributed scheduling
algorithm based on pairwise comparisons between these performance estimations;
upon receiving server responses from its children, each agent performs a local
scheduling operation called \emph{server response aggregation}.  The end result
of the agent's aggregation phase is a list of server responses (from servers in
the subtree rooted at said agent), sorted according to the aggregation method
in effect. By default, the aggregation phase implements the following ordered
sequence of tests:

\begin{enumerate}
\item \textbf{Least recently used}: In the absence of application- and
  platform-specific performance data, the \diet scheduler attempts to
  probabilistically achieve load balance by assigning client requests based on
  the time they last finished to compute.  Essentially each server records a
  timestamp indicating the last time at which it was assigned a job for
  execution.  Each time a request is received, the \sed computes the time elapsed
  since its last execution, and among the responses it receives, \diet agents
  select {\sed}s with a longer elapsed time.
\item \textbf{Random}: If the {\sed} is unable to store timestamps, the \diet
  scheduler will chose randomly when comparing two otherwise equivalent {\sed}
  performance estimations.
\end{enumerate}

In principle, this scheduling policy prioritizes servers that are able to
provide useful performance prediction information. 
In general, this approach works well when all servers in a
given \diet hierarchy are capable of making such estimations. However, in
platforms composed of {\sed}s with varying capabilities, load imbalances may
occur: since \diet systematically prioritizes server responses containing scheduling data,
servers that do not respond with such performance data will never be chosen.

We have designed a plugin scheduler facility to enable the application
developer to tailor the \diet scheduling to the targeted application. This
functionality provides the application developer the means to extend the notion
of a performance estimation to include metrics that are application-specific,
and to instruct \diet how to treat those data in the aggregation phase.  We
describe these interfaces in the following sections.


\section{Plugin Scheduler Interface}

Distributed applications are varied and often exhibit performance behavior
specific to the domain from which they arise. Consequently,
application-specific scheduling approaches are often necessary to achieve
high-performance execution. We propose an extensible framework to build
\emph{plugin schedulers}, enabling application developers to specify
performance estimation metrics that are tailored to their individual needs.

%% This section introduces the principal components of the basic plugin
%% scheduler framework.

\subsection{Estimation Metric Vector}\label{sect:estvector}

The type \texttt{estVector\_t} represents an \emph{estimation vector},
logically a structure that can manage a dynamic collection of performance
estimation values. It contains values that represent the performance profile
provided by a {\sed} in response to a \diet service request. This collection of
values may include either standard performance measures that are available
through \diet, or developer-defined values that are meaningful solely in the
context of the application being developed.

\subsection{Standard Estimation Tags}\label{sect:estTags}

To access to the different fields of the \texttt{estVector\_t}, it is necessary
to specify the tag that correspond to a specific information type.
Table~\ref{t:tags} describes this correspondence. Some tags represent a list of
values, one has to use the \texttt{diet\_est\_array\_*} functions to have
access to them. In Table~\ref{t:tags},  the second column marks these
multi-value tags.

The tag \textit{ALLINFOS} is a special: his field is always empty, but it
allows to fill the vector with all known tags  by the particular collector.
 
\begin{table}[h]
 \tiny
 \centering
 \begin{tabular}[c]{|l|c|l|}\hline
%TAGS lines
 \textit{TCOMP        }&& the predicted time  to solve a problem \\[5pt]
  \hline
  \textit{TIMESINCELASTSOLVE} &   & time since last solve has been made (sec) \\[5pt]
  \hline
  \textit{FREECPU      }&& amount of free CPU power between 0 and 1 \\[5pt]
  \hline
  \textit{FREEMEM      }&& amount of free  memory (Mb) \\[5pt]
  \hline
  \textit{NBCPU        }&& number of available processors  \\[5pt]
  \hline
  \textit{CPUSPEED     }&x& frequency of CPUs (MHz) \\[5pt]
  \hline
  \textit{TOTALMEM     }&& total memory size (Mb)  \\[5pt]
  \hline
  \textit{AVGFREECPU   }&& average amount of free CPU power in [0..1] \\[5pt]
  \hline
  \textit{BOGOMIPS     }&x& CPUs' bogomips \\[5pt]
  \hline
  \textit{CACHECPU     }&x& cache size CPUs (Kb) \\[5pt]
  \hline
  \textit{TOTALSIZEDISK}&& size of the partition (Mb)\\[5pt]
  \hline
  \textit{FREESIZEDISK }&& amount of free place on partition (Mb)\\[5pt]
  \hline
  \textit{DISKACCESREAD}&& average time to read on disk (Mb/sec) \\[5pt]
  \hline
  \textit{DISKACCESWRITE}&& average time to write to disk (sec) \\[5pt]
  \hline
  \textit{ALLINFOS     }&x& [empty] fill all possible fields \\[5pt]
  \hline
  \textit{PARAL\_NB\_FREE\_RESOURCES\_IN\_DEFAULT\_QUEUE} & & number
  of idle resources \\[5pt] 
  \hline
 \end{tabular}
 \caption{Estimation tags}
 \label{t:tags}
\end{table}

\subsubsection{Standard Performance Metrics}

To access to the existing default performance estimation routines (as described
in Chapter~\ref{chapter:performance}), the following functions are available to
facilitate the construction of custom performance estimation functions:
\begin{itemize}
\item The time elapsed since the last execution (to enable the ``least recently used''
  scheduler) is stored in an estimation metric vector by calling
  \begin{tabbing}
    \texttt{int diet\_estimate\_lastexec(}\=\texttt{estVector\_t ev,} \\
    \> \texttt{const diet\_profile\_t* const profilePtr)};
  \end{tabbing}
  with an appropriate value for \texttt{ev} and the \texttt{profilePtr}
  corresponding to the current \diet request.
\item The number of waiting jobs when using the maximum concurrent jobs limit
  is stored in an estimation metric vector by calling
  \begin{tabbing}
    \texttt{int diet\_estimate\_waiting\_jobs(}\=\texttt{estVector\_t ev)};
  \end{tabbing}
\item CoRI allows to access in an easy way to basic performance
  prediction. See Chapter~\ref{sec:CORI} to know more about the use of it.
\end{itemize}

In the future, we plan to expand the suite of default estimation metrics to
include dynamic internal \diet system state information (\eg queue lengths).

\subsubsection{Developer-defined Performance Metrics}

Application developers may also define performance values to be included in a
\sed response to a client request.  For example, a \diet \sed that provides a
service to query particular databases may need to include information about
which databases are currently resident in its disk cache, in order that an
appropriate server be identified for each client request.  To store such
values, the \sed developer should first choose a unique integer identifier,
referred to as the \emph{tag} to denote each logical datum to be stored.
Values are associated with tags using the following interface:
\begin{code}
int diet\_est\_set(estVector\_t ev, int userTag, double value);
\end{code}
The \texttt{ev} parameter is the estimation vector where the value will be
stored, the \texttt{userTag} parameter denotes the chosen tag, and
\texttt{value} indicates the value to be associated with the tag.  Tagged data
are used to effect scheduling policies by defining custom server response
aggregation methods, described in Section~\ref{sect:agg_methods}.

\subsection{Estimation Function}\label{sect:est_fn}

The default behavior of a {\sed} when a service request arrives from its parent
agent is to store the following information in the request profile:
\begin{enumerate}
\item \textbf{Elapsed time since last execution}: To implement the default
  round-robin behavior in absence of scheduling facilities, each {\sed}
  stores a timestamp of its last execution.  When a service request arrives,
  the difference between that timestamp and the current time is added to the
  performance estimate.
\end{enumerate}
This is accomplished by using the \texttt{diet\_estimate\_lastexec} functions
described in Section~\ref{sect:estvector}.

To implement a plugin scheduler, we define an interface that admits
customizable performance estimation routines:
\begin{tabbing}
  \texttt{typedef void (* diet\_perfmetric\_t)(}
    \=\texttt{diet\_profile\_t*,} \\
    \>\texttt{estVector\_t);} \\
\end{tabbing}
\begin{tabbing}
  \texttt{diet\_perfmetric\_t} \\
  \texttt{diet\_service\_use\_perfmetric(diet\_perfmetric\_t perfmetric\_fn);}\\
\end{tabbing}
%% \begin{code}
%%   diet\_perfmetric\_t\\
%%   diet\_service\_use\_perfmetric(diet\_perfmetric\_t perfmetric\_fn);\\
%% \end{code}
Thus, the type \texttt{diet\_perfmetric\_t} is a function pointer that takes as
arguments a performance estimation (represented by the \texttt{estVector\_t}
object) and a \diet service request profile.  The application developer can
associate such a function, or \emph{performance estimation routine}, with \diet
services via the \texttt{diet\_service\_use\_perfmetric} interface.  This
interface returns the previously registered performance estimation routine, if
one was defined (and \texttt{NULL} otherwise).  At this point, a service added
using the \texttt{diet\_service\_table\_add} function will be associated with
the declared performance estimation routine.  Additionally, a performance
estimation routine so specified will be associated with \emph{all} services
added into the service table until another call to the
\texttt{diet\_service\_use\_perfmetric} interface is made.  In the performance
estimation routine, the {\sed} developer should store in the provided
estimation vector any performance data used in the server response aggregation
methods (described in the next section).

\subsection{Aggregation Methods}\label{sect:agg_methods}

At the time a \diet service is defined, an \emph{aggregation method}~-- the
logical mechanism by which {\sed} responses are sorted~-- is associated with
the service; the default behavior was described in
Section~\ref{sect:default_sched}.

If application-specific data \emph{are} supplied (\ie the estimation function
has been redefined), an alternative method for aggregation is needed.
Currently, a basic \emph{priority scheduler} has been implemented, enabling an
application developer to specify a series of performance values that are to be
optimized in succession.  A developer may implement a priority scheduler using
the following interface:
\begin{code}
\begin{tabbing}
diet\_aggregator\_desc\_t* \\
diet\_profile\_desc\_aggregator(diet\_profile\_desc\_t* profile); \\
\\
int diet\_aggregator\_set\_type(\=diet\_aggregator\_desc\_t* agg, \\
\> diet\_aggregator\_type\_t atype); \\
\\
int diet\_aggregator\_priority\_max(\=diet\_aggregator\_desc\_t* agg, \\
\> diet\_est\_tag\_t tag); \\
\\
int diet\_aggregator\_priority\_min(\=diet\_aggregator\_desc\_t* agg, \\
\> diet\_est\_tag\_t tag); \\
\\
int diet\_aggregator\_priority\_maxuser(\=diet\_aggregator\_desc\_t* agg, \\
\> int val); \\
\\
int diet\_aggregator\_priority\_minuser(\=diet\_aggregator\_desc\_t* agg, \\
\> int val); \\
\end{tabbing}
\end{code}
The \texttt{diet\_profile\_desc\_aggregator} and
\texttt{diet\_aggregator\_set\_type} functions fetch and configure the
aggregator corresponding to a \diet service profile, respectively.  In
particular, a priority scheduler is declared by invoking the latter function
with \texttt{DIET\_AGG\_PRIORITY} as the \texttt{agg} parameter.  Recall that
from the point of view of an agent, the aggregation phase is essentially a
sorting of the server responses from its children.  A priority scheduler
logically uses a series of user-specified tags to perform the pairwise server
comparisons needed to construct the sorted list of server responses.

To define the tags and the order in which they should be compared, four
functions are introduced.  These functions, of the form
\texttt{diet\_aggregator\_priority\_*}, serve to identify the estimation values
to be optimized during the aggregation phase.  The \texttt{\_min} and
\texttt{\_max} forms indicate that a standard performance metric (\eg time
elapsed since last execution, from the \texttt{diet\_estimate\_lastexec}
function) is to be either minimized or maximized, respectively.  Similarly, the
\texttt{\_minuser} and \texttt{\_maxuser} forms indicate the analogous
operations on user-supplied estimation values.  Calls to these functions
indicate the order of \textbf{precedence} of the tags.

Each time two server responses need to be compared, the values associated with
the tags specified in the priority aggregator are retrieved. In the specified
order, pairs of corresponding values are successively compared, passing to the
next tag only if the values for the current tag are identical. If one server
response contains a value for the metric currently being compared, and another
does not, the response with a valid value will be selected. If at any point
during the treatment of tags \emph{both} responses lack the necessary tag, the
comparison is declared indeterminate.  This process continues until one
response is declared superior to the other, or all tags in the priority
aggregator are exhausted (and the responses are judged equivalent).


\section{Example}

An example is present in the \diet distribution to illustrate the usage
of the plugin scheduler functionality; this code is available in the directory
\begin{code}
src/examples/plugin\_example/
\end{code}
A \diet server and client corresponding to a simulation of a database research
application are provided. If the construction of examples was enabled during
\diet configuration, two binaries \texttt{server} and \texttt{client} will be
built in this directory. Having deployed a \diet agent hierarchy, the server
may be instantiated:
\begin{code}
  \$ server <SeD\_config> <DB> [ <DB> ... ]
\end{code}
where \texttt{<DB>} are string(s) that represent the existence of a particular
database at the {\sed}'s site.  A client would pose a query against a set of
databases:
\begin{code}
  \$ client <client\_config> <DB> [ <DB> ... ]
\end{code}
The application uses the plugin scheduling facility to prioritize the existence
of databases in selecting a server, and thus, the expected result is that one
of the {\sed}s with the fewest number of database mismatches will be selected.

In the \texttt{main} function of the \texttt{server.c} file, the following
block of code (a)~specifies the use of the priority aggregator for this
service, (b)~declares a performance estimation function to supply the necessary
data at request-time, and (c)~defines the order of precedence of the
performance values (\ie minimizing the number of database mismatches, and then
maximizing the elapsed execution time).
\begin{verbatim}
  {
    /* new section of the profile: aggregator */
    diet_aggregator_desc_t *agg;
    agg = diet_profile_desc_aggregator(profile);

    /* for this service, use a priority scheduler */
    diet_aggregator_set_type(agg, DIET_AGG_PRIORITY);          /* (a) */

    /* install our custom performance function */
    diet_service_use_perfmetric(performanceFn);                /* (b) */

    /* define the precedence order */
    diet_aggregator_priority_minuser(agg, 0);                  /* (c) */
    diet_aggregator_priority_max(agg, EST_TIMESINCELASTSOLVE); /* (c) */
  }
\end{verbatim}
The performance function \texttt{performanceFn} is defined as follows:
\begin{verbatim}
static void performanceFn(diet_profile_t* pb, estVector_t perfValues);

[...]

/*
** performanceFn: the performance function to use in the DIET
**   plugin scheduling facility
*/
static void
performanceFn(diet_profile_t* pb, estVector_t perfValues)
{
  const char *target;
  int numMismatch;

  /* string value must be fetched from description; value is NULL */
  target = (diet_paramstring_get_desc(diet_parameter(pb, 0)))->param;
  numMismatch = computeMismatches(target);

  /*
  ** store the mismatch value in the user estimate space,
  ** using tag value 0
  */
  diet_est_set(perfValues, 0, numMismatch);

  /* also store the timestamp since last execution */
  diet_estimate_lastexec(perfValues, pb);
}
\end{verbatim}
The function \texttt{computeMismatches} (defined earlier in \texttt{server.c})
calculates the number of requested databases that are not present on the {\sed}
making the evaluation. Together, these two code segments serve to customize the
generation of performance information and the treatment of these data in the
context of the simulated database search. Finally, it should be noted that the
existence of a plugin scheduler is completely transparent to the client, and
thus client code need not be changed.

%
% Scheduling at agents level
%
\input{agent_scheduler.tex}

\section{Future Work}

We have two primary efforts planned for extensions to the plugin scheduler.
\begin{itemize}
\item \textbf{Additional information services}: We plan to add functionalities
  to enable the application developer to access and use data concerning the
  internal state of the \diet server (\eg the current length of request
  queues).  As other performance measurement and evaluation tools are developed
  both within and external to the \diet project (see
  Chapter~\ref{chapter:performance}), some tools are already available to
  enable such  information to be incorporated in the context of the plugin
  scheduler.
\item \textbf{Enhanced aggregation methods}: The plugin scheduler implemented
  in the current release enables the \diet system to account for user-defined
  factors in the server selection process.  However, the priority aggregation
  method is fairly rudimentary and lacks the power to express many imaginable
  comparison mechanisms.  We plan to investigate methods to embed code into
  \diet agents (\eg a simple expression interpreter) in a manner that is secure
  and that preserves performance.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% \end{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%% Local Variables:
%%% mode: latex
%%% ispell-local-dictionary: "american"
%%% mode: flyspell
%%% fill-column: 79
%%% End: