File: architecture.html

package info (click to toggle)
boost 1.27.0-3
  • links: PTS
  • area: main
  • in suites: woody
  • size: 19,908 kB
  • ctags: 26,546
  • sloc: cpp: 122,225; ansic: 10,956; python: 4,412; sh: 855; yacc: 803; makefile: 257; perl: 165; lex: 90; csh: 6
file content (563 lines) | stat: -rw-r--r-- 21,043 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<head>

    <meta name="generator" content="HTML Tidy, see www.w3.org">
    <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">

    <title>Boost.Build V2 Architecture</title>
</head>

<body bgcolor="#FFFFFF" text="#000000">

    <img src="../../c++boost.gif" alt="c++boost.gif (8819 bytes)" align= "center" width="277" height="86">  

    <h1>Boost.Build V2 Architecture</h1>

<h2>Jobs dictated by Jam</h2>
    <p>First, let's outline the constraints that come from the build
tool. The main tasks of any build system using Jam are:

<ol>
<li>Establish targets
<li>Generate build actions for the targets
<li>Attach variables to the targets which will be substituted into build actions
<li>Establish dependency relationships between the targets
<li>Establish target locations.
</ol>

I'll refer to steps 1-4 above as "building the dependency graph".
There are also some details to handle header scanning, but the list
above covers most of it.

<p>

<h2>Build Procedure</h2>

    <h3>Preamble</h3>

    <blockquote>
      We will implement a variation of Rene Rivera's ideas for
      allowing Boost.Build to work "out-of-the-box", with no
      environment variable settings. The idea involves searching
      upward from the invocation directory for "<tt>boost-build.jam</tt>"
      and "<tt>project-root.jam</tt>",
      which can then be loaded to get the location of the build system
      installation and project-specific settings, respectively.
      There are two core Jam extensions I expect to rely on for this functionality:

     <ol>
     <li> The <tt>PWD</tt> builtin rule, which returns the absolute path name
          of the directory from which Jam was invoked. <tt>PWD</tt> can be
          stolen from Matt Armstrong's guest branch at Perforce.

     <li> The <tt>ARGV</tt> builtin rule, which returns the arguments with
          which Jam was invoked. <tt>$(ARGV[1])</tt> can be used to find the
          name that was used to invoke Jam, which can be used as a key
          to whether to implement stock Perforce Jam behavior or
          Boost.Build behavior.
     </ol>
 
     I note that this combined functionality would obviate the need
     for "<tt>subproject</tt>" and "<tt>project-root</tt>" rules in Jamfiles, except
     where the user wants to declare a project-id.
 </blockquote>

<h3>Initialization</h3>

   <p>
   We check the name used to invoke Jam, and if the name is not the recognized
   Boost.Jam invocation we continue with the execution of the builtin <tt>Jambase</tt>.
   </p>
   
   <p>
   Otherwise, when we recognize a Boost.Jam invocation, we:
   
   <ol>
   <li> We attempt to load "boost-build.jam" by searching from the current
        invocation directory up to the root of the file-system and in the paths
        specified by BOOST_BUILD_PATH. Within this, one invokes the "<tt>boost-build</tt>"
        rule to indicate where the Boost.Build system files are, and to load them.
   </li>
   <li> If the Boost.Build system isn't loaded yet we error and exit, giving
        brief instructions on possible errors.
   </li>
   <li> We attempt to load "<tt>project-root.jam</tt>" by searching from the current
        invocation directory up to the root of the file-system. When found its
        location is noted as the <tt>PROJECT_ROOT</tt>.
   <li> Finaly we load the "<tt>Jamfile</tt>" in the invocation directory. There are
        various spellings considered when finding the "<tt>Jamfile</tt>", the pattern
        for the name must fit: <tt>(J|j)amfile[.jam]</tt>.
   </li>
   </ol>
   </p>

   <p>
   ** the current Jambase gives an error about FTJam toolset
   definitions etc., if <tt>BOOST_BUILD_PATH</tt> is not set and the
   toolset definition is not set either. Probably that message should
   be extended to say something about setting
   <tt>BOOST_BUILD_PATH</tt> so that people are not confused. The
   FTJam toolset definitions aren't needed unless you're building Jam
   itself. <i>Shall we dump FTJam functionality that we don't
   absolutely need?</i>

   <p>

   <h3>Configuration</h3>

   boost-build.jam first loads two modules, site-config.jam and
   user-config.jam. We can put empty versions in the Boost.Build
   directory so that there is no error if users don't put their own
   somewhere earlier in <tt>BOOST_BUILD_PATH</tt> (we'll re-use that trick a
   lot). The Actually, these should probably contain <i>lots</i> of
   commented-out code which does basic configuration jobs. What do
   these things look like? For one thing, I'd like to avoid having the
   user set a bunch of obscure global variables. Let's do this through
   rule invocations:

<blockquote><pre>
     <font color="blue"># ---- Sample site-config.jam file ----
     # loads the msvc module, which registers it as an available
     # toolset. No special toolset location/configuration information
     # is given, so it is assumed that the toolset is already set up
     # (e.g. VCVARS32.BAT has been called), or that it's installed in
     # its standard location.</font>
     using msvc ;

     <font color="blue"># As above, but tells the system that we have two versions of the
     # gcc toolset installed, in the specified locations, with 2.95.3
     # being the default. The 'using' rule loads the given module and
     # calls its "configure" method with the rest of its
     # arguments. How a module treats configuration information, of
     # course, is up to the module.</font>
     using gcc : 2.95.3 /usr/local/gcc-2.95.3
                 3.0.2 /usr/local/gcc-3.0.2
               ;

     <font color="blue"># Same idea as above.</font>
     using stlport : 4.5 ~/stlport-4.5
                     4.6b2 ~/stlport-4.6b2
                   ;

     <font color="blue"># does what ALL_LOCATE_TARGET currently does</font>
     locate-built-targets bigdrive:/dave/builds ;
</pre></blockquote>

<h3>Abstract Target Specification</h3>

   The system reads the <tt>Jamfile</tt> in the invocation directory. <tt>Jamfiles</tt>
   contain mostly <i>declarative</i> rule invocations. Declarative rules
   build data structures describing targets (and other things,
   e.g. toolsets in toolset description files). Of course a <tt>Jamfile</tt>
   may also import modules and invoke rules that do more heavy-duty
   work. There are several important reason to use declarative rules
   in <tt>Jamfile</tt>s. First, the current system which constructs a
   dependency/action graph as each descriptive rule is read is quite
   inefficient for large projects. Secondly, we are unable to make any
   delayed decisions based on the entire contents of the jamfile. For
   example, we might want to do something completely different with
   the target descriptions, e.g. generate makefiles. Finally, it is
   too easy for the rules invoked and variables set by the user in a
   <tt>Jamfile</tt> to interfere with the build process.

<p>

   The system traverses the set of top-level targets and generates
   the dependency graph based on the expanded build description (the
   algorithm for expanding build descriptions is given at the bottom
   of <a href=
"http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Boost.Build_Design_Proposal"
>this document</a>).
   OK, so I've tossed off most of the work of the build system in one
   sentence. The rest of this document deals with that in more detail.

   <p>

   The basic algorithm is as follows:

<ul>
<li>      Determine the subproject requirements
<li>      For each main target in the Jamfile:
          <ul>
          
          <li>Generate the targets which correspond to source files
          
          <li>Generate the subvariant targets of the main target
          
          <li>For each subvariant of the main target
              <ul>
              <li>build the dependency graph between the source targets and
              the subvariant target
              </ul>
</ul>
</ul>

<h3>Determining the subproject requirements</h3>


   This part is simple. The user tells us about the subproject
   using the following rules:
<blockquote><pre>
project.project ( project-id : requirements * : default-build * )
</pre></blockquote>
      Declares a project or subproject. A subproject's id is a path,
      starting with the project id of which it is a subproject. The
      requirements and default build apply to any targets described in
      the Jamfile which do not explicitly declare others. A project
      rule invocation is mandatory in any <tt>Jamfile</tt> in a project which
      includes subprojects or uses other projects.

<blockquote><pre>
project.jamfile-location( root-to-jamfile )
</pre></blockquote>

       Declares the location of this <tt>Jamfile</tt> with respect to the
       project root, in case the path given in the project rule does
       not describe the location of the <tt>Jamfile</tt>.

<blockquote><pre>
project.source-location( root-to-source )
</pre></blockquote>

       Declares that relative paths in this <tt>Jamfile</tt> are all specified
       relative to the specified directory. Thus, a project with this
       structure:

<blockquote><pre>
root
+- build
|  `- Jamfile
`- src
   +- foo.c
   `- bar.c
</pre></blockquote>

       might have the following <tt>Jamfile</tt>:

<blockquote><pre>
project.project foobar ;
project.jamfile-location build
project.source-location src ;

exe foobar : foo.c bar.c ;
</pre></blockquote>

<h2>Generating Targets</h2>

   Target names given by users in a <tt>Jamfile</tt> aren't, in
   general, the names of targets that will actually be used by the
   build system, since elements such as subvariant grist come into
   play. In order to keep the system usable, however, ungristed names
   are "claimed" by the system for <tt>NOTFILE</tt> targets which
   correspond to the user's notion of what should be built. When
   multiple subvariant builds have been requested these
   <tt>NOTFILE</tt> targets will depend on several built target files.

<h3>Building A Target From Sources</h3>

Target declaration rules can of course be written to take the crude
route of directly building the dependency graph, but that's makes for
a limited, closed system. This describes how rules can be written
which allow complex interactions between orthogonal build properties
(e.g. STLPort support, Python support, toolset selection). The scheme
used by Boost.Build relies on objects called "Generators" to do the
work.

<h3>Finding the Generator Set</h3>

Generators will be chosen based on a set of build properties and the
generator's matching criteria. A generator's matching criteria are composed
of 3 lists:

<ol>
<li> required-properties
<li> optional-properties
<li> rules
 </ol>

and an optional "category". If no category is supplied, the generator has
the empty category.

<p>

A generator doesn't match the build request unless all of its
required-properties are contained in the build request.

<p>
The matching process for a generator looks like this:

<blockquote><pre>
local match ;
if $(required-properties) in $(build-properties)
{
    match = $(required-properties)
        [ set.intersection
          $(optional-properties)
          : $(build-properties)
          $(build-properties:G) # valueless properties match any value
          ] ;
    for local r in $(rules)
    {
        match = [ $(r) $(match) ] ; # maybe some other arguments, too
    }
}
return match ;
</pre></blockquote>

<p>
The specificity of a match is given by the length of its match list.
Basically, generators that match more properties will be more likely to be
chosen. In each category with a matching generator is found, we select the
generators with the longest match for the generator set.

<blockquote>
Notes: These criteria handles things like target-type
specificity. Properties in an inheritance/refinement hierarchy can
be composite properties which expand to add properties for all of
their bases. So for example,

<blockquote><pre>
&lt;target=type&gt;PYD
</pre></blockquote>

 might expand to

<blockquote><pre>
&lt;target-type&gt;PYD &lt;target-type-base&gt;PYD
&lt;target-type-base&gt;DLL &lt;target-type-base&gt;executable
</pre></blockquote>

(see the bottom of this document if you need
to refer to the target-type refinement hierarchy).  A generator
which wanted to match all executables might specify

<blockquote><pre>
&lt;target-type-base&gt;executable
</pre></blockquote>

as a required property. More-specific generators would still match
if available. More-specific generators match better because they
list /all/ of the base properties to which they apply:

<blockquote><pre>
pyd-generator requires:
    &lt;target-type-base&gt;executable
    &lt;target-type-base&gt;DLL
    &lt;target-type-base&gt;PYD

dll-generator requires:
    &lt;target-type-base&gt;executable
    &lt;target-type-base&gt;DLL
</pre></blockquote>

</blockquote>


<h3>Build Property Expansion</h3>

In this step, all selected generators get an opportunity to modify the
build properties associated with the build request:

<p>

For each generator in the generator set, call its "expand" rule to alter the
properties in the build request. Most generators that can actually build
targets will not want to implement expand; usually expand will only be used
by generators that need to modify the build somehow, e.g. by adding #include
paths. Note that the build property set is still one big wad available to
all competing/interacting generators, so this would  be an inappropriate
place for a toolset generator to remove irrelevant properties.

<h3>Virtual Target Generation</h3>

In this step, each generator has an opportunity to build a
representation of the virtual dependency graph for the requested
target. By "virtual", I mean that the targets in the graph are objects
with a record of their parents, children, build properties, etc., but
that no <tt>DEPENDS</tt> calls or action rules have yet been invoked for them.

<p>

For each generator in the generator set, call its "execute" rule. The
"execute" rule should return a list of the virtual targets generated
in its dependency graph. Generators that don't succeed in producing
targets will return the empty list.

<p>

At this point, the generator may collect a set of properties relevant
to its target construction method into a subvariant identifier. A
database of already-generated subvariant identifiers and their related
targets can be queried to see if the subvariant already exists. If it
does, the generator may use the cached data to return to its caller
immediately.

<p>

To produce the dependency graph, a generator may well invoke the
matching and target calculation process again on some or all of the
source files, with the build property set changed to reflect the
generator's input target types. For example, a generator for
executables comes across a <tt>CPP</tt> file in a list of sources. It then
replaces target types in the build request with its list of input
target types (<tt>OBJ</tt>, <tt>LIB</tt>,...). The matching process finds a generator
which matches
<tt>&lt;target-type&gt;OBJ</tt>
 with optional
<tt>&lt;source-type&gt;CPP</tt>. This
generator, if eventually selected, is the one that invokes the C++
compiler.

<p>

<blockquote>
Why is CPP an <tt>&lt;source-type&gt;CPP</tt> an optional property for the C++
compiler?  Consider what happens when a a YPP source file appears
in the list of sources: the C++ generator should still be matched
- it will want to invoke the matching process itself once again,
hopefully finding a generator which matches
<tt>&lt;target-type&gt;CPP/&lt;source-type&gt;YPP</tt>.

<p>

Why do we need <tt>&lt;source-type&gt;CPP</tt> at all? We don't: it's an
optimization to prevent less-specific generators from being
invoked for build-property expansion or virtual target generation.
</blockquote>

Generators are typically matched based on a single desired
target-type, but some generators produce more than one target. When
returning its list of targets, a generator distinguishes the
intermediate from final targets by dividing the list of targets into
two sections, separated by a special symbol (say "<tt>@</tt>", since we seem to
be using it for everythign). When an intemediate generator produces
multiple targets, its parent transforms the targets it can cope with,
and passes back any others to its parent as final targets. A
multi-source generator like <tt>EXE&lt;-{OBJ,LIB}</tt> will add any final targets
that it can't cope with to its sources at that point in its list of
sources.

<blockquote>
Vladimir's favorite example has a dependency graph like this one:

<blockquote><pre>
targets   foo
   ^      / \
   |   a1.o a2.o
   |    |     |
   | a1.cpp a2.cpp
   |    |     |
   |  a.whl  a.dlp &lt;--- one build action generates both WHL and DLP
   |     \   /
sources  asm.wd
</pre></blockquote>

To make the problem more interesting, let's reformulate the example:

<blockquote><pre>
targets   foo
          / \
       a1.o a2.o
        |     |
     a1.cpp   |
        |   a2.cpp
     a1.lex   |
        |     |
      a.whl  a.dlp &lt;--- one build action generates both WHL and DLP
         \   /
         asm.wd
</pre></blockquote>

  In the case of

<blockquote><pre>
EXE&lt;-OBJ*, OBJ&lt;-CPP, CPP&lt;-LEX, LEX&lt;-WHL, {WHL,DLP}&lt;-WD,
--------------------------------------^
</pre></blockquote>

  The arrow indicates a place where, as we're unwinding, we find that
  the generator can't cope with <tt>DLP</tt>, so with final targets enclosed in
  {}:

<blockquote><pre>
EXE&lt;-OBJ*, OBJ&lt;-CPP, CPP&lt;-LEX, {LEX,DLP}&lt;-WHL
EXE&lt;-OBJ*, OBJ&lt;-CPP, {CPP,DLP}&lt;-LEX,
EXE&lt;-OBJ*, {OBJ,DLP}&lt;-CPP
</pre></blockquote>

Now find that <tt>OBJ*</tt> doesn't match <tt>DLP</tt>, so we search
for <tt>DLP</tt> just like a source.
</blockquote>

Before returning from its execute rule, each generator may collect
"synthesized" build properties from the sources and/or intermediate
targets generated by the matching process it invoked.

<blockquote>
Note: A generator does not /have/ to build a new target. It may
just modify properties of targets at the next level and pass the
targets up to their caller.

<p>

For example, a Windows <tt>PYD</tt> generator might replace the target-type
with <tt>DLL</tt> and reinvoke the process, then go back and add "<tt>_d</tt>" to the
name of the generated file in the case of a Python debug build.
</blockquote>

<h3>Virtual Target Selection</h3>

In this step, we select one of the dependency graphs generated by a
generator in the generator set. The criterion is simple: we choose the
graph whose generator returned the shortest list of targets. In other
words, we choose the shortest path from sources to targets. The root
target(s) of the dependency graph are labelled with the generator, so
that the generator for any virtual target can be retrieved.

<h3>Actual Target Generation</h3>

In this step, we recursively descend the dependency graph, invoking
the "finalize" rule of the generator associated with each target. In
general, this will invoke action rules and DEPENDS to create the
internal Jam dependency graph.

<hr>

<h2>Sample Target Type Refinement Hierarchy Diagram</h2>

<blockquote><pre>

           +--------+                  +--------+ Cmd +-----+ Link +------------+
           | source |       'Archive'..+ object +....&gt;| RSP +.....&gt;| executable |
           +----+---+                : +---+----+     +-----+ :    +------+-----+
                |                    :     |                  :           |
        +-------+----+          +----------+-+-------------+  :      +----+----+
        |            |          |    :       |             |  V      |         |
    +----------+  +--+--+Asm +--+--+ :    +--+--+     +----+---+  +--+--+   +--+--+
    | compiled |  | ASM |...&gt;| OBJ | :...&gt;| LIB |     | IMPLIB |  | DLL |   | EXE |
    +----+-----+  +-----+    +-----+      +-----+     +--------+  +--+--+   +-----+
         |         ^  ^       ^  ^                                   |
     +---+--+      :  :       :  :                                +--+--+
     |      |      :  :       :  :                                | PYD |
  +--+--+ +-+-+    :  :       :  :                                +-----+
  | CPP | | C +....:..........:  :
  +--+--+ +---+ 'C'   :          :
     :                :          :
     :................:..........:
          'C++'

</pre></blockquote>

<hr>
    <p>&copy; Copyright David Abrahams 2002. Permission to copy, use, modify,
    sell and distribute this document is granted provided this copyright notice
    appears in all copies. This document is provided "as is" without express or
    implied warranty, and with no claim as to its suitability for any purpose.

    <p>Revised 
    <!--webbot bot="Timestamp" s-type="EDITED" s-format="%d %B, %Y" startspan
                        -->21 January, 2002 
    <!--webbot bot="Timestamp" endspan i-checksum="21080"
                        -->
</body>