=head1 NAME

Cons - Cons: A Software Construction System

=head1 DESCRIPTION

The original document was automatically derived from the F<cons/cons.html>
by B<html2pod>, thanks to Ulrich Pfeifer. Later revisions were created from
the original.

=head1 B<Cons: A Software Construction System>

by Bob Sidebotham F<rns@fore.com>

A guide and reference for version 1.3.1

Copyright (c) 1996-1998 FORE Systems, Inc. All rights reserved.

Permission to use, copy, modify and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of FORE Systems, Inc. (``FORE Systems'') not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.

FORE SYSTEMS DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ANY WARRANTIES REGARDING INTELLECTUAL PROPERTY RIGHTS AND ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL FORE SYSTEMS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


=head1 B<Introduction>

B<Cons> is a system for constructing, primarily, software, but is quite different from previous software construction systems. Cons was designed from the ground up to deal easily with the construction of software spread over multiple source directories. Cons makes it easy to create build scripts that are simple, understandable and maintainable. Cons ensures that complex software is easily and accurately reproducible.

Cons uses a number of techniques to accomplish all of this. Construction scripts are just Perl scripts, making them both easy to comprehend and very flexible. Global scoping of variables is replaced with an import/export mechanism for sharing information between scripts, significantly improving the readability and maintainability of each script. B<Construction environments> are introduced: these are Perl objects that capture the information required for controlling the build process. Multiple environments are used when different semantics are required for generating products in the build tree. Cons implements automatic dependency analysis and uses this to globally sequence the entire build. Variant builds are easily produced from a single source tree. Intelligent build subsetting is possible when working on localized changes. Overrides can be set up to change build instructions without modifying any scripts. MD5 cryptographic B<signatures> are associated with derived files, and are used to accurately determine whether a given file needs to be rebuilt.

While offering all of the above, and more, Cons remains simple and easy to use. This will, hopefully, become clear as you read the remainder of this document.


=head1 B<Why Cons? Why not Make?>

Cons is a B<make> replacement. In the following paragraphs, we look at a few of the undesirable characteristics of make--and typical build environments based on make--that motivated the development of Cons.


=item B<Build complexity>

Traditional make-based systems of any size tend to become quite complex. The original make utility and its derivatives have contributed to this tendency in a number of ways. Make is not good at dealing with systems that are spread over multiple directories. Various work-arounds are used to overcome this difficulty; the usual choice is for make to invoke itself recursively for each sub-directory of a build. This leads to complicated code, in which it is often unclear how a variable is set, or what effect the setting of a variable will have on the build as a whole. The make scripting language has gradually been extended to provide more possibilities, but these have largely served to clutter an already over extended language. Often, builds are done in multiple passes in order to provide appropriate products from one directory to another directory. This represents a further increase in build complexity.


=item B<Build reproducibility>

The bane of all makes has always been the correct handling of dependencies. Most often, an attempt is made to do a reasonable job of dependencies within a single directory, but no serious attempt is made to do the job between directories. Even when dependencies are working correctly, make's reliance on a simple time stamp comparison to determine whether a file is out of date with respect to its dependencies is not, in general, adequate for determining when a file should be rederived. If an external library, for example, is rebuilt and then ``snapped'' into place, the timestamps on its newly created files may well be earlier than the last local build, since it was built before it became visible.


=item B<Variant builds>

Make provides only limited facilities for handling variant builds. With the proliferation of hardware platforms and the need for debuggable vs. optimized code, the ability to easily create these variants is essential. More importantly, if variants are created, it is important to either be able to separate the variants or to be able to reproduce the original or variant at will. With make it is very difficult to separate the builds into multiple build directories, separate from the source. And if this technique isn't used, it's also virtually impossible to guarantee at any given time which variant is present in the tree, without resorting to a complete rebuild.


=item B<Repositories>

Make provides only limited support for building software from code
that exists in a central repository directory structure.  The VPATH
feature of GNU make (and some other make implementations) is intended
to provide this, but doesn't work as expected:  it changes the path
of the target file to the VPATH name too early in its analysis, and
therefore searches for all dependencies in the VPATH directory.
To ensure correct development builds, it is important to be able
to create a file in a local build directory and have any files in
a code repository (a VPATH directory, in make terms) that depend
on the local file get rebuilt properly.  This isn't possible with
VPATH, without coding a lot of complex repository knowledge directly
into the makefiles.


=head1 B<Keeping it simple>

A few of the difficulties with make have been cited above. In this and subsequent sections, we shall introduce Cons and show how these issues are addressed.


=item B<Perl scripts>

Cons is Perl-based. That is, Cons scripts--C<Conscript> and C<Construct> files, the equivalent of C<makefiles>--are all written in Perl. This provides an immediate benefit: the language for writing scripts is a familiar one. Even if you don't happen to be a Perl programmer, it helps to know that Perl is basically just a simple procedural language, with a well-defined flow of control and familiar semantics. It has variables that behave basically the way you would expect them to, subroutines, conditionals, loops, and so on. There is no special syntax introduced for Cons. The use of Perl as a scripting language simplifies the task of expressing the appropriate solution to the often complex requirements of a build.


=item B<Hello, World!>

To ground the following discussion, here's how you could build the B<Hello, World!> C application with Cons:



  $env = new cons();
  Program $env 'hello', 'hello.c';

If you install this script in a directory, naming the script C<Construct>, and create the C<hello.c> source file in the same directory, then you can type C<cons hello> to build the application:



  % cons hello
  cc -c hello.c -o hello.o
  cc -o hello hello.o


=item B<Construction environments>

A key simplification of Cons is the idea of a B<construction environment>. A construction environment is an B<object> characterized by a set of key/value pairs and a set of B<methods>. In order to tell Cons how to build something, you invoke the appropriate method via an appropriate construction environment. Consider the following example:



  $env = new cons(
  	CC	=>	'gcc',
  	LIBS	=>	'libworld.a'
  );
  
  Program $env 'hello', 'hello.c';

In this case, rather than using the default construction environment as is, we have overridden the value of C<CC> so that the GNU C compiler, C<gcc>, is used instead. Since this version of B<Hello, World!> requires a library, C<libworld.a>, we have specified that any program linked in this environment should be linked with that library. If the library exists already, well and good, but if not, then we'll also have to include the statement:



  Library $env 'libworld', 'world.c';

Now if you type C<cons hello>, the library will be built before the program is linked, and, of course, C<gcc> will be used to compile both modules:



  % cons hello
  gcc -c hello.c -o hello.o
  gcc -c world.c -o world.o
  ar r libworld.a world.o
  ar: creating libworld.a
  ranlib libworld.a
  gcc -o hello hello.o libworld.a


=item B<Automatic and complete dependency analysis>

With Cons, dependencies are handled automatically. Continuing the previous example, note that when we modify C<world.c>, C<world.o> is recompiled, C<libworld.a> recreated, and C<hello> relinked:



  % touch world.c
  % cons hello
  gcc -c world.c -o world.o
  ar r libworld.a world.o
  ar: creating libworld.a
  ranlib libworld.a
  gcc -o hello hello.o libworld.a

This is a relatively simple example: Cons ``knows'' that C<world.o> depends upon C<world.c>, because the dependency is explicitly set up by the C<Library> method. It also knows that C<libworld.a> depends upon C<world.o> and that C<hello> depends upon C<libworld.a>, all for similar reasons.

Now it turns out that C<hello.c> also includes the interface definition file, C<world.h>:



  % touch world.h
  % cons hello
  gcc -c hello.c -o hello.o
  gcc -o hello hello.o libworld.a

How does Cons know that C<hello.c> includes C<world.h>, and that C<hello.o> must therefore be recompiled? For now, suffice it to say that when considering whether or not C<hello.o> is up-to-date, Cons invokes a scanner for its dependency, C<hello.c>. This scanner enumerates the files included by C<hello.c> to come up with a list of further dependencies, beyond those made explicit by the Cons script. This process is recursive: any files included by included files will also be scanned.
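
The recursive scan described above can be pictured with a short, self-contained Perl sketch. It is an illustration only, not Cons's own scanner: the name C<scan_includes>, its search-path argument, and the restriction to quoted C<#include> lines are all assumptions made for the example.

  use strict;
  use warnings;
  use File::Spec;
  # Sketch only: scan_includes() is a hypothetical helper, not a Cons API.
  # Returns the files reachable from $file through quoted #include lines.
  sub scan_includes {
      my ($file, $path, $seen) = @_;
      $seen ||= {};
      return () if $seen->{$file}++;            # guard against include cycles
      open(my $fh, '<', $file) or return ();    # unreadable: nothing to add
      my @names;
      while (my $line = <$fh>) {
          push @names, $1 if $line =~ /^\s*#\s*include\s*"([^"]+)"/;
      }
      close($fh);
      my @deps;
      for my $name (@names) {
          # Use the first directory on the search path that holds the file.
          my ($found) = grep { -f $_ }
                        map  { File::Spec->catfile($_, $name) } @$path;
          next unless defined $found;
          next if $seen->{$found};              # already reported
          push @deps, $found, scan_includes($found, $path, $seen);
      }
      return @deps;
  }
  # Example call; prints nothing if hello.c is absent.
  print "$_\n" for scan_includes('hello.c', ['.']);

The C<$seen> hash is what keeps the recursion finite when headers include one another.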

Isn't this expensive? The answer is--it depends. If you do a full build of a large system, the scanning time is insignificant. If you do a rebuild of a large system, then Cons will spend a fair amount of time thinking about it before it decides that nothing has to be done (although not necessarily more time than make!). The good news is that Cons makes it very easy to intelligently subset your build, when you are working on localized changes.


=item B<Automatic global build sequencing>

Because Cons does full and accurate dependency analysis, and does this globally, for the entire build, Cons is able to use this information to take full control of the B<sequencing> of the build. This sequencing is evident in the above examples, and is equivalent to what you would expect for make, given a full set of dependencies. With Cons, this extends trivially to larger, multi-directory builds. As a result, all of the complexity involved in making sure that a build is organized correctly--including multi-pass hierarchical builds--is eliminated. We'll discuss this further in the next sections.


=head1 B<Building large trees--still just as simple>


=item B<A hierarchy of build scripts>

A larger build, in Cons, is organized by creating a hierarchy of B<build scripts>. At the top of the tree is a script called C<Construct>. The rest of the scripts, by convention, are each called C<Conscript>. These scripts are connected together, very simply, by the C<Build>, C<Export>, and C<Import> commands.


=item B<The C<Build> command>

The C<Build> command takes a list of C<Conscript> file names, and arranges for them to be included in the build. For example:



  Build qw(
  	drivers/display/Conscript
  	drivers/mouse/Conscript
  	parser/Conscript
  	utilities/Conscript
  );

This is a simple two-level hierarchy of build scripts: all the subsidiary C<Conscript> files are mentioned in the top-level C<Construct> file. Notice that not all directories in the tree necessarily have build scripts associated with them.

This could also be written as a multi-level script. For example, the C<Construct> file might contain this command:



  Build qw(
  	parser/Conscript
  	drivers/Conscript
  	utilities/Conscript
  );

and the C<Conscript> file in the C<drivers> directory might contain this:



  
  Build qw(
  	display/Conscript
  	mouse/Conscript
  );

Experience has shown that the former model is a little easier to understand, since the whole construction tree is laid out in front of you, at the top-level. Hybrid schemes are also possible. A separately maintained component that needs to be incorporated into a build tree, for example, might hook into the build tree in one place, but define its own construction hierarchy.


=item B<Relative, top-relative, and absolute file names>

You may have noticed that the file names specified to the C<Build> command are relative to the location of the script it is invoked from. This is generally true for filename arguments to other commands, too. If you begin a file name with a hash mark, ``#'', then that file is interpreted relative to the top-level directory (where the C<Construct> file resides). And, not surprisingly, if you begin it with ``/'', then it is considered to be an absolute pathname. This is true even on systems which use a backslash rather than a forward slash to name absolute paths.
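
As a rough illustration of these three cases, here is a small stand-alone Perl sketch. The helper C<resolve_name> and the C</work> paths are hypothetical; the sketch only mirrors the rules stated above and is not how Cons resolves names internally.

  use strict;
  use warnings;
  # Sketch only: resolve_name() is a hypothetical helper, not a Cons function.
  # $dir is the calling script's directory; $top holds the Construct file.
  sub resolve_name {
      my ($name, $dir, $top) = @_;
      return $name      if $name =~ m{^/};       # absolute pathname
      return "$top/$1"  if $name =~ m{^#(.*)};   # top-relative name
      return "$dir/$name";                       # relative to the script
  }
  print resolve_name('mouse/Conscript', '/work/drivers', '/work'), "\n";
  print resolve_name('#export/include', '/work/drivers', '/work'), "\n";
  print resolve_name('/usr/include',    '/work/drivers', '/work'), "\n";

Run as is, this prints C</work/drivers/mouse/Conscript>, C</work/export/include>, and C</usr/include>, one per line.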


=item B<Scope of variables>

Each C<Conscript> file, and also the top-level C<Construct> file, begins life in a separate Perl package. Except for the C<Construct> file, which gets some of the command line arguments, the symbol table for each script is empty. All of the variables that are set, therefore, are set by the script itself--not by some external script. Variables can be explicitly B<imported> by a script from its parent script. To import a variable, it must have been B<exported> by the parent and initialized (otherwise an error will occur). It is therefore possible to determine, from looking at a single script, exactly where each variable in that script is set.


=item B<The C<Export> command>

The C<Export> command is used as in the following example:



  $ENV = new cons();
  $INCLUDE = "#export/include";
  $LIB = "#export/lib";
  Export qw( ENV INCLUDE LIB );
  Build qw( util/Conscript );

The values of the simple variables mentioned in the C<Export> list will be squirreled away by any subsequent C<Build> commands. The C<Export> command will only export Perl B<scalar> variables, that is, variables whose name begins with C<$>. Other variables, objects, etc. can be exported by reference--but all scripts will refer to the same object, and this object should be considered to be read-only by the subsidiary scripts and by the original exporting script. It's acceptable, however, to assign a new value to the exported scalar variable--that won't change the underlying variable referenced. This sequence, for example, is OK:



  $ENV = new cons();
  Export qw( ENV INCLUDE LIB );
  Build qw( util/Conscript );
  $ENV = new cons(CFLAGS => '-O');
  Build qw( other/Conscript );

It doesn't matter whether the variable is set before or after the C<Export> command. The important thing is the value of the variable at the time the C<Build> command is executed. This is what gets squirreled away. Any subsequent C<Export> commands, by the way, invalidate the first: you must mention all the variables you wish to export on each C<Export> command.
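
Exporting a non-scalar by reference, as described above, might look like the following sketch. The variable names C<@DEFINES> and C<$DEFINES> and the flag values are illustrative assumptions. The parent script stores a reference in a scalar and exports the scalar:

  # Parent script (sketch): export an array through a scalar reference.
  @DEFINES = qw( -DTRACE -DVERBOSE );
  $DEFINES = \@DEFINES;
  $ENV = new cons();
  Export qw( ENV DEFINES );
  Build qw( util/Conscript );

The subsidiary script then reads the shared array through the reference, without modifying it:

  # util/Conscript (sketch): treat the shared array as read-only.
  Import qw( ENV DEFINES );
  $OPT = new cons(CFLAGS => join(' ', @$DEFINES));
  Program $OPT 'util', 'util.c';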


=item B<The C<Import> command>

Variables exported by the C<Export> command can be imported into subsidiary scripts by the C<Import> command. The subsidiary script always imports variables directly from the superior script. Consider this example:



  Import qw( ENV INCLUDE );

This is only legal if the parent script exported both C<$ENV> and C<$INCLUDE>. It also must have given each of these variables values. It is OK for the subsidiary script to only import a subset of the exported variables (in this example, C<$LIB>, which was exported by the previous example, is not imported).

All the imported variables are automatically re-exported, so the sequence:



  Import qw ( ENV INCLUDE );
  Build qw ( beneath-me/Conscript );

will supply both C<$ENV> and C<$INCLUDE> to the subsidiary file. If only C<$ENV> is to be exported, then the following will suffice:



  Import qw ( ENV INCLUDE );
  Export qw ( ENV );
  Build qw ( beneath-me/Conscript );

Needless to say, the variables may be modified locally before invoking C<Build> on the subsidiary script.


=item B<Build script evaluation order>

The only constraint on the ordering of build scripts is that superior scripts are evaluated before their inferior scripts. The top-level C<Construct> file, for instance, is evaluated first, followed by any inferior scripts. This is all you really need to know about the evaluation order, since order is generally irrelevant. Consider the following C<Build> command:



  Build qw(
  	drivers/display/Conscript
  	drivers/mouse/Conscript
  	parser/Conscript
  	utilities/Conscript
  );

We've chosen to put the script names in alphabetical order, simply because that's the most convenient for maintenance purposes. Changing the order will make no difference to the build.


=head1 B<A Model for sharing files>


=item B<Some simple conventions>

In any complex software system, a method for sharing build products needs to be established. We propose a simple set of conventions which are trivial to implement with Cons, but very effective.

The basic rule is to require that all build products which need to be shared between directories are shared via an intermediate directory. We have typically called this C<export>, and, in a C environment, provided conventional sub-directories of this directory, such as C<include>, C<lib>, C<bin>, etc.

These directories are defined by the top-level C<Construct> file. A simple C<Construct> file for a B<Hello, World!> application, organized using multiple directories, might look like this:



  # Construct file for Hello, World!
  
  # Where to put all our shared products.
  $EXPORT = '#export';
  
  Export qw( CONS INCLUDE LIB BIN );



  # Standard directories for sharing products.
  $INCLUDE = "$EXPORT/include";
  $LIB = "$EXPORT/lib";
  $BIN = "$EXPORT/bin";
  
  # A standard construction environment.
  $CONS = new cons (
  	CPPPATH => $INCLUDE,    # Include path for C compilations
  	LIBPATH => $LIB,        # Library path for linking programs
  	LIBS    => '-lworld',   # List of standard libraries
  );



  Build qw(
  	hello/Conscript 
  	world/Conscript
  );

The C<world> directory's C<Conscript> file looks like this:



  # Conscript file for directory world
  Import qw( CONS INCLUDE LIB );



  # Install the products of this directory
  Install $CONS $LIB, 'libworld.a';
  Install $CONS $INCLUDE, 'world.h';



  # Internal products
  Library $CONS 'libworld.a', 'world.c';

and the C<hello> directory's C<Conscript> file looks like this:



  # Conscript file for directory hello
  Import qw( CONS BIN );
  
  # Exported products
  Install $CONS $BIN, 'hello';
  
  # Internal products
  Program $CONS 'hello', 'hello.c';

To construct a B<Hello, World!> program with this directory structure, go to the top-level directory, and invoke C<cons> with the appropriate arguments. In the following example, we tell Cons to build the directory C<export>. To build a directory, Cons recursively builds all known products within that directory (only if they need rebuilding, of course). If any of those products depend upon other products in other directories, then those will be built, too.



  % cons export
  Install world/world.h as export/include/world.h
  cc -Iexport/include -c hello/hello.c -o hello/hello.o
  cc -Iexport/include -c world/world.c -o world/world.o
  ar r world/libworld.a world/world.o
  ar: creating world/libworld.a
  ranlib world/libworld.a
  Install world/libworld.a as export/lib/libworld.a
  cc -o hello/hello hello/hello.o -Lexport/lib -lworld
  Install hello/hello as export/bin/hello


=item B<Clean, understandable, location-independent scripts>

You'll note that the two C<Conscript> files are very clean and to-the-point. They simply specify products of the directory and how to build those products. The build instructions are minimal: they specify which construction environment to use, the name of the product, and the names of the inputs. Note also that the scripts are location-independent. If you wish to reorganize your source tree, you are free to do so: you only have to change the C<Construct> file (in this example) to specify the new locations of the C<Conscript> files. The use of an export tree makes this goal easy.

Note, too, how Cons takes care of little details for you. All the C<export> directories, for example, were made automatically. And the installed files were really hard-linked into the respective export directories, to save space and time. This attention to detail saves considerable work, and makes it even easier to produce simple, maintainable scripts.


=head1 B<Separating source and build trees>

It's often desirable to keep any derived files from the build completely separate from the source files. This makes it much easier to keep track of just what is a source file, and also makes it simpler to handle B<variant> builds, especially if you want the variant builds to co-exist.


=item B<Separating build and source directories using the C<Link> command>

Cons provides a simple mechanism that handles all of these requirements. The C<Link> command is invoked as in this example:



  Link 'build' => 'src';

The specified directories are ``linked'' to the specified source directory. Let's suppose that you set up a source directory, C<src>, with the sub-directories C<world> and C<hello> below it, as in the previous example. You could then substitute for the original build lines the following:



  Build qw(
  	build/world/Conscript
  	build/hello/Conscript
  );

Notice that you treat the C<Conscript> file as if it existed in the build directory. Now if you type the same command as before, you will get the following results:



  % cons export
  Install build/world/world.h as export/include/world.h
  cc -Iexport/include -c build/hello/hello.c -o build/hello/hello.o
  cc -Iexport/include -c build/world/world.c -o build/world/world.o
  ar r build/world/libworld.a build/world/world.o
  ar: creating build/world/libworld.a
  ranlib build/world/libworld.a
  Install build/world/libworld.a as export/lib/libworld.a
  cc -o build/hello/hello build/hello/hello.o -Lexport/lib -lworld
  Install build/hello/hello as export/bin/hello

Again, Cons has taken care of the details for you. In particular, you will notice that all the builds are done using source files and object files from the build directory. For example, C<build/world/world.o> is compiled from C<build/world/world.c>, and C<export/include/world.h> is installed from C<build/world/world.h>. This is accomplished on most systems by the simple expedient of ``hard'' linking the required files from each source directory into the appropriate build directory.

The links are maintained correctly by Cons, no matter what you do to the source directory. If you modify a source file, your editor may do this ``in place'' or it may rename it first and create a new file. In the latter case, any hard link will be lost. Cons will detect this condition the next time the source file is needed, and will relink it appropriately.
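
One way to picture the check is to compare the device and inode numbers of the two names, since hard links to the same file share both. The Perl sketch below does this. It is an assumption about the general technique, not a description of Cons's internals, and C<same_file> and C<relink> are hypothetical helper names.

  use strict;
  use warnings;
  # Sketch only: same_file() and relink() are hypothetical helpers.
  # Two names are hard links to one file when device and inode agree.
  sub same_file {
      my ($x, $y) = @_;
      my @sx = stat($x) or return 0;
      my @sy = stat($y) or return 0;
      return $sx[0] == $sy[0] && $sx[1] == $sy[1];
  }
  # Re-create the link from $src to $dst if it has been lost.
  # Returns 1 if a new link was made, 0 if the old one was intact.
  sub relink {
      my ($src, $dst) = @_;
      return 0 if same_file($src, $dst);
      unlink($dst) if -e $dst;                 # drop the stale copy
      link($src, $dst) or die "link $src -> $dst: $!";
      return 1;
  }

A call such as C<relink('src/world/world.c', 'build/world/world.c')> would then repair the link after an editor has replaced the source file with a new one.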

You'll also notice, by the way, that B<no> changes were required to the underlying C<Conscript> files. And we can go further, as we shall see in the next section.


=head1 B<Variant builds>


=item B<Hello, World! for baNaNa and peAcH OS's>

Variant builds require just another simple extension. Let's take as an example a requirement to allow builds for both the baNaNa and peAcH operating systems. In this case, we are using a distributed file system, such as NFS, to access the particular system, and only one or the other of the systems has to be compiled for any given invocation of C<cons>. Here's one way we could set up the C<Construct> file for our B<Hello, World!> application:



  # Construct file for Hello, World!
  
  die qq(OS must be specified) unless $OS = $ARG{OS};
  die qq(OS must be "peach" or "banana")
  	if $OS ne "peach" && $OS ne "banana";



  # Where to put all our shared products.
  $EXPORT = "#export/$OS";
  
  Export qw( CONS INCLUDE LIB BIN );
  
  # Standard directories for sharing products.
  $INCLUDE = "$EXPORT/include";
  $LIB = "$EXPORT/lib";
  $BIN = "$EXPORT/bin";
  
  # A standard construction environment.
  $CONS = new cons (
  	CPPPATH => $INCLUDE,    # Include path for C compilations
  	LIBPATH => $LIB,        # Library path for linking programs
  	LIBS    => '-lworld',   # List of standard libraries
  );



  # $BUILD is where we will derive everything.
  $BUILD = "#build/$OS";



  # Tell cons where the source files for $BUILD are.
  Link $BUILD => 'src';
  
  Build (
  	"$BUILD/hello/Conscript",
  	"$BUILD/world/Conscript",
  );

Now if we log in to a peAcH system, we can build our B<Hello, World!> application for that platform:



  % cons export OS=peach
  Install build/peach/world/world.h as export/peach/include/world.h
  cc -Iexport/peach/include -c build/peach/hello/hello.c -o build/peach/hello/hello.o
  cc -Iexport/peach/include -c build/peach/world/world.c -o build/peach/world/world.o
  ar r build/peach/world/libworld.a build/peach/world/world.o
  ar: creating build/peach/world/libworld.a
  ranlib build/peach/world/libworld.a
  Install build/peach/world/libworld.a as export/peach/lib/libworld.a
  cc -o build/peach/hello/hello build/peach/hello/hello.o -Lexport/peach/lib -lworld
  Install build/peach/hello/hello as export/peach/bin/hello


=item B<Variations on a theme>

Other variations of this model are possible. For example, you might decide that you want to separate out your include files into platform dependent and platform independent files. In this case, you'd have to define an alternative to C<$INCLUDE> for platform-dependent files. Most C<Conscript> files, generating purely platform-independent include files, would not have to change.

You might also want to be able to compile your whole system with debugging or profiling, for example, enabled. You could do this with appropriate command line options, such as C<DEBUG=on>. This would then be translated into the appropriate platform-specific requirements to enable debugging (this might include turning off optimization, for example). You could optionally vary the name space for these different types of systems, but, as we'll see in the next section, it's not B<essential> to do this, since Cons is pretty smart about rebuilding things when you change options.
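
A C<Construct> fragment along these lines could give debug builds their own name space. It is a sketch under stated assumptions: the C<-g> and C<-O> flags and the C<-debug> directory suffix are illustrative choices, and C<$OS> is taken to be set as in the earlier variant example.

  # Sketch of a Construct fragment; flags and directory names are assumptions.
  $DEBUG  = ($ARG{DEBUG} || '') eq 'on';
  $CFLAGS = $DEBUG ? '-g' : '-O';
  # Optionally keep debug products in their own build tree.
  $BUILD  = $DEBUG ? "#build/$OS-debug" : "#build/$OS";
  $CONS   = new cons(CFLAGS => $CFLAGS);
  Link $BUILD => 'src';

Invoking C<cons export OS=peach DEBUG=on> would then derive files under C<build/peach-debug> and leave the optimized tree untouched.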


=head1 B<Signatures>


=item B<MD5 cryptographic signatures>

Whenever Cons creates a derived file, it stores a B<signature> for that file. The signature is stored in a separate file, one per directory. After the previous example was compiled, the C<.consign> file in the C<build/peach/world> directory looked like this:



  world.o:834179303 23844c0b102ecdc0b4548d1cd1cbd8c6
  libworld.a:834179304 9bf6587fa06ec49d864811a105222c00

The first number is a timestamp--on UNIX systems, this is typically the number of seconds since January 1st, 1970. The second value is an MD5 checksum. The B<Message Digest Algorithm> is an algorithm that, given an input string, computes a strong cryptographic signature for that string. The MD5 checksum stored in the C<.consign> file is, in effect, a digest of all the dependency information for the specified file. So, for example, for the file C<world.o>, this includes at least the file C<world.c>, and also any header files that Cons knows about that are included, directly or indirectly, by C<world.c>. Not only that, but the actual command line that was used to generate C<world.o> is also fed into the computation of the signature. Similarly, C<libworld.a> gets a signature which ``includes'' all the signatures of its constituents (and hence, transitively, the signatures of B<their> constituents), as well as the command line that created the file.

The signature of a non-derived file is computed, by default, by taking the current modification time of the file and the file's entry name (unless there happens to be a current C<.consign> entry for that file, in which case that signature is used).

Notice that there is no need for a derived file to depend upon any particular C<Construct> or C<Conscript> file--if changes to these files affect the file in question, then this will be automatically reflected in its signature, since relevant parts of the command line are included in the signature. Unrelated changes will have no effect.

When Cons considers whether to derive a particular file, then, it first computes the expected signature of the file. It then compares the file's last modification time with the time recorded in the C<.consign> entry, if one exists. If these times match, then the signature stored in the C<.consign> file is considered to be accurate. If the file's previous signature does not match the new, expected signature, then the file must be rederived.

Notice that a file will be rederived whenever anything about a file it depends on changes. In particular, notice that B<any> change to the modification time of a dependency (forward or backwards in time) will force recompilation of the derived file.
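
The scheme can be sketched in a few lines of stand-alone Perl using the core C<Digest::MD5> module. The helper names and the exact strings fed to MD5 are assumptions made for illustration, as is treating a timestamp mismatch as a reason to rebuild; the format Cons uses internally may differ.

  use strict;
  use warnings;
  use Digest::MD5 qw(md5_hex);
  # Sketch only: helper names and digest inputs are assumptions.
  # Signature of a source file: digest of its name and modification time.
  sub source_sig {
      my ($name, $mtime) = @_;
      return md5_hex("$name $mtime");
  }
  # Signature of a derived file: digest of the command line plus the
  # signatures of everything it depends on.
  sub derived_sig {
      my ($command, @dep_sigs) = @_;
      return md5_hex(join("\n", $command, @dep_sigs));
  }
  # Rebuild unless the stored entry can be trusted (its recorded time
  # matches the file) and its signature equals the expected one.
  sub needs_rebuild {
      my ($expected, $file_mtime, $entry) = @_;
      return 1 unless defined $entry;                  # no .consign entry
      return 1 unless $entry->{time} == $file_mtime;   # entry is stale
      return $entry->{sig} ne $expected ? 1 : 0;
  }
  my $c     = source_sig('world.c', 834179300);
  my $sig   = derived_sig('cc -c world.c -o world.o', $c);
  my $entry = { time => 834179303, sig => $sig };
  print needs_rebuild($sig, 834179303, $entry), "\n";  # prints 0: unchanged
  my $dbg   = derived_sig('cc -g -c world.c -o world.o', $c);
  print needs_rebuild($dbg, 834179303, $entry), "\n";  # prints 1: flags changed

Because the command line is part of the digest, changing a compiler flag changes the expected signature even though no source file was touched.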

The use of these signatures is an extremely simple, efficient, and effective method of improving--dramatically--the reproducibility of a system.

We'll demonstrate this with a simple example:



  # Simple "Hello, World!" Construct file
  $CFLAGS = '-g' if $ARG{DEBUG} eq 'on';
  $CONS = new cons(CFLAGS => $CFLAGS);
  Program $CONS 'hello', 'hello.c';

Notice how Cons recompiles at the appropriate times:



  % cons hello
  cc -c hello.c -o hello.o
  cc -o hello hello.o
  % cons hello
  cons: "hello" is up-to-date.
  % cons DEBUG=on hello
  cc -g -c hello.c -o hello.o
  cc -o hello hello.o
  % cons DEBUG=on hello
  cons: "hello" is up-to-date.
  % cons hello
  cc -c hello.c -o hello.o
  cc -o hello hello.o


=head1 B<Code Repositories>

Many software development organizations
will have one or more central repository directory trees
containing the current source code for one or more projects,
as well as the derived object files, libraries, and executables.
In order to reduce unnecessary recompilation,
it is useful to use files from the repository
to build development software--assuming, of course,
that no newer dependency file
exists in the local build tree.


=item B<Repository>

Cons provides a mechanism to specify a list of
code repositories that will be searched, in-order,
for source files and derived files not
found in the local build directory tree.

The following lines in a C<Construct> file
will instruct Cons to look first under the
C</usr/experiment/repository> directory
and then under the C</usr/product/repository> directory:

  Repository qw (
  	/usr/experiment/repository
  	/usr/product/repository
  );

The repository directories specified may contain
source files, derived files
(objects, libraries and executables), or both.
If there is no local file (source or derived)
under the directory in which Cons is executed,
then the first copy of a same-named file
found under a repository directory
will be used to build any local derived files.

Cons maintains one global list of repository directories.
Cons will eliminate the current directory,
and any non-existent directories, from the list.


=item B<Finding the Construct file in a Repository>

Cons will also search for C<Construct> and C<Conscript> files in
the repository tree or trees.
This leads to a chicken-and-egg situation, though:
how do you look in a repository tree for a C<Construct> file
if the C<Construct> file tells you where the repository is?
To get around this,
repositories may be specified via
C<-R> options on the command line:

  % cons -R /usr/experiment/repository -R /usr/product/repository .

Any repository directories specified in the C<Construct>
or C<Conscript> files will be appended to the
repository directories specified by command-line C<-R> options.

=item B<Repository source files>

If the source code (including the C<Conscript> file)
for the library version of the I<Hello, World!> C application
is in a repository (with no derived files),
Cons will use the repository source files
to create the local object files
and executable file:

  % cons -R /usr/src_only/repository hello
  gcc -c /usr/src_only/repository/hello.c -o hello.o
  gcc -c /usr/src_only/repository/world.c -o world.o
  ar r libworld.a world.o
  ar: creating libworld.a
  ranlib libworld.a
  gcc -o hello hello.o libworld.a

Creating a local source file will cause Cons
to rebuild the appropriate derived file or files:

  % touch world.c
  % cons -R /usr/src_only/repository hello
  gcc -c world.c -o world.o
  ar r libworld.a world.o
  ar: creating libworld.a
  ranlib libworld.a
  gcc -o hello hello.o libworld.a

And removing the local source file
will cause Cons to revert back to
building the derived files
from the repository source:

  % rm world.c
  % cons -R /usr/src_only/repository hello
  gcc -c /usr/src_only/repository/world.c -o world.o
  ar r libworld.a world.o
  ar: creating libworld.a
  ranlib libworld.a
  gcc -o hello hello.o libworld.a


=item B<Repository derived files>

If a repository tree contains derived files
(usually object files, libraries, or executables),
Cons will perform its normal signature calculation
to decide whether the repository file is up-to-date
or a derived file must be built locally.
This means that,
in order to ensure correct signature calculation,
a repository tree must also contain the C<.consign> files
that were created by Cons when generating the derived files.

This would usually be accomplished by
building the software in the repository
(or, alternatively, in a build directory,
and then copying the result to the repository):

  % cd /usr/all/repository
  % cons hello
  gcc -c hello.c -o hello.o
  gcc -c world.c -o world.o
  ar r libworld.a world.o
  ar: creating libworld.a
  ranlib libworld.a
  gcc -o hello hello.o libworld.a

(This is safe even if the C<Construct> file
lists the C</usr/all/repository> directory
in a C<Repository> command because
Cons will remove the current directory
from the repository list.)

Now if we want to build a copy of the application
with our own C<hello.c> file,
we only need to create the one necessary source file,
and use the C<-R> option to have Cons
use other files from the repository:

  % mkdir $HOME/build1
  % cd $HOME/build1
  % touch hello.c
  % cons -R /usr/all/repository hello
  gcc -c hello.c -o hello.o
  gcc -o hello hello.o /usr/all/repository/libworld.a

Notice that Cons has not bothered
to recreate a local C<libworld.a> library
(or recompile the C<world.o> module),
but instead uses the already-compiled version
from the repository.

Because the MD5 signatures that Cons puts in the C<.consign> file
contain timestamps for the derived files,
the signature timestamps must match the file timestamps
for a signature to be considered valid.

Some software systems may alter the timestamps on
repository files (by copying them, e.g.),
in which case Cons will,
by default,
assume the repository signatures are invalid
and rebuild files unnecessarily.
This behavior may be altered by specifying:

  Repository_Sig_Times_OK 0;

This tells Cons to ignore timestamps
when deciding whether a signature is valid.
(Note that avoiding this sanity check
means there must be proper control
over the repository tree to ensure
that the derived files cannot be modified
without updating the C<.consign> signature.)


=item B<Local copies of files>

If the repository tree contains
the complete results of a build,
and we try to build from the repository without
any files in our local tree,
something moderately surprising happens:

  % mkdir $HOME/build2
  % cd $HOME/build2
  % cons -R /usr/all/repository hello
  cons: "hello" is up-to-date.

Why does Cons say that the C<hello> program is up-to-date
when there is no C<hello> program in the local build directory?
Because the repository
(not the local directory)
contains the up-to-date C<hello> program,
and Cons correctly determines that nothing needs to be done
to rebuild this up-to-date copy of the file.

There are, however, many times in which it is
appropriate to ensure that a local copy of a file always exists.
A packaging or testing script, for example,
may assume that certain generated files exist locally.
Instead of making these subsidiary scripts
aware of the repository directory,
the C<Local> command
may be added to a C<Construct> or C<Conscript> file
to specify that a certain file or files
must appear in the local build directory:

  Local qw(
  	hello
  );

Then, if we re-run the same command,
Cons will make a local copy of the program
from the repository copy
(telling you that it is doing so):

  % cons -R /usr/all/repository hello
  Local copy of hello from /usr/all/repository/hello
  cons: "hello" is up-to-date.

Notice that, because the act of making the local copy
is not considered a "build" of the C<hello> file,
Cons still reports that it is up-to-date.

Creating local copies is most useful
for files that are being installed into an
intermediate directory
(for sharing with other directories)
via the C<Install> command.
Accompanying the C<Install> command
for a file with a companion C<Local> command
is so common that Cons provides
an C<Install_Local> command
as a convenient way to do both:

  Install_Local $env '#export', 'hello';

is exactly equivalent to:

  Install $env '#export', 'hello';
  Local '#export/hello';

Both the C<Local> and C<Install_Local> commands update the local
C<.consign> file with the appropriate file signatures, so that
future builds are performed correctly.


=item B<Repository dependency analysis>

Due to its built-in scanning,
Cons will search the specified repository trees
for included C<.h> files.
Unless the compiler also knows about the repository trees,
though, it will be unable to find C<.h> files
that only exist in a repository.
If, for example, the C<hello.c> file
includes the C<hello.h> file
in its current directory:

  % cons -R /usr/all/repository hello
  gcc -c /usr/all/repository/hello.c -o hello.o
  /usr/all/repository/hello.c:1: hello.h: No such file or directory

Solving this problem forces some requirements onto
the way construction environments are defined
and onto the way the C C<#include> preprocessor directive
is used to include files.

In order to inform the compiler about the repository trees,
Cons will add appropriate C<-I> flags to the compilation commands.
This means that the C<CPPPATH> variable in
the construction environment must explicitly specify
all subdirectories which are to be searched
for included files,
including the current directory.
Consequently, we can fix the above example
by changing the environment creation
in the C<Construct> file as follows:

  $env = new cons(
	CC	=> 'gcc',
	CPPPATH	=> '.',
	LIBS	=> 'libworld.a',
  );

Due to the definition of the C<CPPPATH> variable,
this yields, when we re-execute the command:

  % cons -R /usr/all/repository hello
  gcc -c -I. -I/usr/all/repository /usr/all/repository/hello.c -o hello.o
  gcc -o hello hello.o /usr/all/repository/libworld.a

The order of the C<-I> flags replicates,
for the C preprocessor,
the same repository-directory search path that Cons
uses for its own dependency analysis.
If there are multiple repositories
and multiple C<CPPPATH> directories,
Cons will follow each C<CPPPATH> directory
with a copy of that directory under each repository in turn,
rapidly multiplying the number of C<-I> flags.
As an extreme example, a C<Construct> file containing:

  Repository qw(
  	/u1
  	/u2
  );

  $env = new cons(
  	CPPPATH	=> 'a:b:c',
  );

would yield a compilation command of:

	cc -Ia -I/u1/a -I/u2/a -Ib -I/u1/b -I/u2/b -Ic -I/u1/c -I/u2/c -c hello.c -o hello.o

Because Cons relies on the compiler's C<-I> flags
to communicate the order in which repository directories
must be searched,
Cons' handling of repository directories
is fundamentally incompatible with using double-quotes
on the C<#include> directives in your C source code:

  #include "file.h"	/* DON'T USE DOUBLE-QUOTES LIKE THIS */

This is because most C preprocessors,
when faced with such a directive,
will always first search the directory containing the source file.
This undermines the elaborate C<-I> options
that Cons constructs to make the
preprocessor conform to its preferred search path.

Consequently, when using repository trees in Cons,
B<always> use angle-brackets for included files:

  #include <file.h>	/* USE ANGLE-BRACKETS INSTEAD */


=item B<Repository_List>

Cons provides a C<Repository_List> command
to return a list of all repository directories
in their current search order.
This can be used for debugging,
or for more complex processing in Perl:

  @list = Repository_List;
  print join(' ', @list), "\n";


=item B<Repository interaction with other Cons features>

Cons' handling of repository trees interacts correctly
with other Cons features--which is to say,
it generally does what you would expect.

Most notably, repository trees interact correctly,
and rather powerfully, with the C<Link> command.
A repository tree may contain one or more subdirectories
for version builds
established via C<Link> to a source subdirectory.
Cons will search for derived files
in the appropriate build subdirectories
under the repository tree.


=head1 B<Selective builds>

Cons provides two methods for reducing the size of a given build. The first is by specifying targets on the command line, and the second is a method for pruning the build tree. We'll consider target specification first.


=item B<Selective targeting>

Like make, Cons allows the specification of ``targets'' on the command line. Cons targets may be either files or directories. When a directory is specified, this is simply a short-hand notation for every derivable product--that Cons knows about--in the specified directory and below. For example:



  cons build/hello/hello.o

means build C<hello.o> and everything that C<hello.o> might need. This is from a previous version of the B<Hello, World!> program in which C<hello.o> depended upon C<export/include/world.h>. If that file is not up-to-date (because someone modified C<src/world/world.h>), then it will be rebuilt, even though it is in a directory remote from C<build/hello>.

In this example:



  cons build

Everything in the C<build> directory is built, if necessary. Again, this may cause more files to be built. In particular, both C<export/include/world.h> and C<export/lib/libworld.a> are required by the C<build/hello> directory, and so they will be built if they are out-of-date.

If we do, instead:



  cons export

then only the files that should be installed in the export directory will be rebuilt, if necessary, and then installed there. Note that C<cons build> might build files that C<cons export> doesn't build, and B<vice-versa>.


=item B<No ``special'' targets>

With Cons, make-style ``special'' targets are not required. The simplest analog with Cons is to use special C<export> directories, instead. Let's suppose, for example, that you have a whole series of unit tests that are associated with your code. The tests live in the source directory near the code. Normally, however, you don't want to build these tests. One solution is to provide all the build instructions for creating the tests, and then to install the tests into a separate part of the tree. If we install the tests in a top-level directory called C<tests>, then:



  cons tests

will build all the tests.



  cons export

will build the production version of the system (but not the tests), and:



  cons build

should probably be avoided (since it will compile tests unnecessarily).

If you want to build just a single test, then you could explicitly name the test (in either the C<tests> directory or the C<build> directory). You could also aggregate the tests into a convenient hierarchy within the tests directory. This hierarchy need not necessarily match the source hierarchy, in much the same manner that the include hierarchy probably doesn't match the source hierarchy (the include hierarchy is unlikely to be more than two levels deep, for C programs).

If you want to build absolutely everything in the tree (subject to whatever options you select), you can use:



  cons .

This is not particularly efficient, since it will redundantly walk all the trees, including the source tree. The source tree, of course, may have buildable objects in it--nothing stops you from doing this, even if you normally build in a separate build tree.


=head1 B<Build Pruning>

In conjunction with target selection, B<build pruning> can be used to reduce the scope of the build. In the previous peAcH and baNaNa example, we have already seen how script-driven build pruning can be used to make only half of the potential build available for any given invocation of C<cons>. Cons also provides, as a convenience, a command line convention that allows you to specify which C<Conscript> files actually get ``built''--that is, incorporated into the build tree. For example:



  cons build +world

The C<+> argument introduces a Perl regular expression. This must, of course, be quoted at the shell level if there are any shell meta-characters within the expression. The expression is matched against each C<Conscript> file which has been mentioned in a C<Build> statement, and only those scripts with matching names are actually incorporated into the build tree. Multiple such arguments are allowed, in which case a match against any of them is sufficient to cause a script to be included.

In the example above, the C<hello> program will not be built, since Cons will have no knowledge of the script C<hello/Conscript>. The C<libworld.a> archive will be built, however, if need be.

There are a couple of uses for build pruning via the command line. Perhaps the most useful is the ability to make local changes, and then, with sufficient knowledge of the consequences of those changes, restrict the size of the build tree in order to speed up the rebuild time. A second use for build pruning is to actively prevent the recompilation of certain files that you know will recompile due to, for example, a modified header file. You may know that either the changes to the header file are immaterial, or that the changes may be safely ignored for most of the tree, for testing purposes. With Cons, the view is that it is pragmatic to admit this type of behavior, with the understanding that on the next full build everything that needs to be rebuilt will be. There is no equivalent to a ``make touch'' command, to mark files as permanently up-to-date. So any risk that is incurred by build pruning is mitigated. For release quality work, obviously, we recommend that you do not use build pruning (it's perfectly OK to use during integration, however, for checking compilation, etc. Just be sure to do an unconstrained build before committing the integration).


=head1 B<Backing builds>

T.B.S.


=head1 B<Temporary overrides>

Cons provides a very simple mechanism for overriding aspects of a build. The essence is that you write an override file containing one or more C<Override> commands, and you specify this on the command line, when you run C<cons>:



  cons -o over export

will build the C<export> directory, with all derived files subject to the overrides present in the file C<over>. If you leave out the C<-o> option, then everything necessary to remove all overrides will be rebuilt.


=item B<Overriding environment variables>

The override file can contain two types of overrides. The first is incoming environment variables. These are normally accessible by the C<Construct> file from the C<%ENV> hash variable. These can trivially be overridden in the override file by setting the appropriate elements of C<%ENV> (these could also be overridden in the user's environment, of course).
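
As a minimal sketch, an override file of this first kind might contain nothing more than an ordinary Perl assignment to C<%ENV>. The variable name and value here are hypothetical, chosen only for illustration; any variable that your C<Construct> file reads from C<%ENV> could be set the same way:

  # Hypothetical override file: pretend TMPDIR was set in the
  # user's environment before cons was run.
  $ENV{TMPDIR} = '/var/tmp';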


=item B<The>B< Override>B< command>

The second type of override is accomplished with the C<Override> command, which looks like this:



  Override <regexp>, <var1> => <value1>, <var2> => <value2>, ...;

The regular expression I<regexp> is matched against every derived file that is a candidate for the build. If the derived file matches, then the variable/value pairs are used to override the values in the construction environment associated with the derived file.

Let's suppose that we have a construction environment like this:



  $CONS = new cons(
  	COPT => '',
  	CDBG => '-g',
  	CFLAGS => '%COPT %CDBG',
  );

Then if we have an override file C<over> containing this command:



  Override '\.o$', COPT => '-O', CDBG => '';

then any C<cons> invocation with C<-o over> that creates C<.o> files via this environment will cause them to be compiled with C<-O> and no C<-g>. The override could, of course, be restricted to a single directory by the appropriate selection of a regular expression.

Here's the original version of the Hello, World! program, built with this environment. Note that Cons rebuilds the appropriate pieces when the override is applied or removed:



  % cons hello
  cc -g -c hello.c -o hello.o
  cc -o hello hello.o
  % cons -o over hello
  cc -O -c hello.c -o hello.o
  cc -o hello hello.o
  % cons -o over hello
  cons: "hello" is up-to-date.
  % cons hello
  cc -g -c hello.c -o hello.o
  cc -o hello hello.o
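
To restrict an override to a single directory, the regular expression can include a directory prefix. The following is only a sketch: it assumes that derived files are matched by a top-relative path such as C<build/hello/hello.o>, and the directory name is hypothetical:

  # Hypothetical override file: optimize only the objects
  # derived under build/hello (assumes top-relative path matching).
  Override '^build/hello/.*\.o$', COPT => '-O', CDBG => '';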

It's important that the C<Override> command only be used for temporary, on-the-fly overrides necessary for development because the overrides are not platform independent and because they rely too much on intimate knowledge of the workings of the scripts. For temporary use, however, they are exactly what you want.

Note that it is still useful to provide, say, the ability to create a fully optimized version of a system for production use--from the C<Construct> and C<Conscript> files. This way you can tailor the optimized system to the platform. Where optimizer trade-offs need to be made (particular files may not be compiled with full optimization, for example), then these can be recorded for posterity (and reproducibility) directly in the scripts.


=head1 B<More on construction environments>


=item B<Default construction variables>

We have mentioned, and used, the concept of a B<construction environment> many times in the preceding pages. Now it's time to make this a little more concrete. With the following statement:



  $env = new cons();

a reference to a new, default construction environment is created. This contains a number of construction variables and some methods. At the present writing, the default list of construction variables is defined as follows:



  CC	=> 'cc',
  CFLAGS	=> '',
  CCCOM	=> '%CC %CFLAGS %_IFLAGS -c %< -o %>',
  CPPPATH	=> '',
  LINK	=> '%CC',		    
  LINKCOM	=> '%LINK %LDFLAGS -o %> %< %_LDIRS %LIBS',
  LIBPATH	=> '',
  LIBS	=> '',
  AR	=> 'ar',
  ARCOM	=> "%AR %ARFLAGS %> %<\n%RANLIB %>",
  ARFLAGS	=> 'r',
  RANLIB	=> 'ranlib',		
  AS	=> 'as',
  ASFLAGS	=> '',
  ASCOM	=> '%AS %ASFLAGS %< -o %>',
  LD	=> 'ld',	
  LDFLAGS	=> '',
  SUFLIB	=> '.a',
  SUFOBJ	=> '.o',
  ENV	=> { PATH => '/bin:/usr/bin' },

These variables are used by the various methods associated with the environment; in particular, any method that ultimately invokes an external command will substitute these variables into the final command, as appropriate. The C<Objects> method, for instance, takes a number of source files and arranges to derive, if necessary, the corresponding object files. For example:



  Objects $env 'foo.c', 'bar.c';

This will arrange to produce, if necessary, C<foo.o> and C<bar.o>. The command invoked is simply C<%CCCOM>, which expands, through substitution, to the appropriate external command required to build each object. We will explore the substitution rules further under the C<Command> method, below.

The construction variables are also used for other purposes. For example, C<CPPPATH> is used to specify a colon-separated path of include directories. These are intended to be passed to the C preprocessor and are also used by the C-file scanning machinery to determine the dependencies involved in a C compilation. Variables beginning with an underscore are created by various methods, and should normally be considered ``internal'' variables. For example, when a method is called which calls for the creation of an object from a C source, the variable C<_IFLAGS> is created: this corresponds to the C<-I> switches required by the C compiler to represent the directories specified by C<CPPPATH>.

Note that, for any particular environment, the value of a variable is set once, and then never reset (to change a variable, you must create a new environment. Methods are provided for copying existing environments for this purpose). Some internal variables, such as C<_IFLAGS> are created on demand, but once set, they remain fixed for the life of the environment.

Another variable, C<ENV>, is used to determine the system environment during the execution of an external command. By default, the only environment variable that is set is C<PATH>, which is the execution path for a UNIX command. For the utmost reproducibility, you should really arrange to set your own execution path, in your top-level C<Construct> file (or perhaps by importing an appropriate construction package with the Perl C<use> command). The default variables are intended to get you off the ground.


=head1 B<Default construction methods>

The list of default construction methods includes the following:


=item B<The >B<new>B< constructor>

The C<new> method is a Perl object constructor. That is, it is not invoked via a B<reference> to an existing construction environment but, rather, statically, using the name of the Perl B<package> where the constructor is defined. The method is invoked like this:



  $env = new cons(<overrides>);

The environment you get back is blessed into the package C<cons>, which means that it will have associated with it the default methods described below. Individual construction variables can be overridden by providing name/value pairs in an override list. Note that to override any command environment variable (i.e. anything under C<ENV>), you will have to override all of them. You can get around this difficulty by using the C<copy> method on an existing construction environment.
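
As an illustrative sketch (the values are hypothetical; only the variable names come from the default list above), an override list might look like the following. Note that supplying C<ENV> replaces the whole default C<ENV> hash, which is why C<PATH> is restated in full:

  $env = new cons(
        CC     => 'gcc',
        CFLAGS => '-O',
        # Overriding ENV replaces every command environment
        # variable, so PATH must be given again in full.
        ENV    => { PATH => '/bin:/usr/bin:/usr/local/bin' },
  );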


=item B<The >B<clone>B< method>

The C<clone> method creates a clone of an existing construction environment, and can be called as in the following example:



  $env2 = $env1->clone(<overrides>);

You can provide overrides in the usual manner to create a different environment from the original. If you just want a new name for the same environment (which may be helpful when exporting environments to existing components), you can just use simple assignment.
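
For instance, a debugging variant of an environment could be sketched as follows. The program and source names are hypothetical, and the two environments are deliberately applied to different source files so that no object file is derived twice:

  $env = new cons(CC => 'gcc', CFLAGS => '-O');

  # Same settings as $env, except for CFLAGS.
  $dbg = $env->clone(CFLAGS => '-g');

  Program $env 'hello', 'hello.c';
  Program $dbg 'tester', 'tester.c';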


=item B<The >B<copy>B< method>

The C<copy> method extracts the externally defined construction variables from an environment and returns them as a list of name/value pairs. Overrides can also be provided, in which case, the overridden values will be returned, as appropriate. The returned list can be assigned to a hash, as shown in the prototype, below, but it can also be manipulated in other ways: 



  %env = $env1->copy(<overrides>);

The value of C<ENV>, which is itself a hash, is also copied to a new hash, so this may be changed without fear of affecting the original environment. So, for example, if you really want to override just the C<PATH> variable in the default environment, you could do the following:



  %cons = new cons()->copy();
  $cons{ENV}{PATH} = "<your path here>";
  $cons = new cons(%cons);

This will leave anything else that might be in the default execution environment undisturbed.


=item B<The >B<Install>B< method>

The C<Install> method arranges for the specified files to be installed in the specified directory. The installation is optimized: the file is not copied if it can be linked. If this is not the desired behavior, you will need to use a different method to install the file. It is called as follows:



  Install $env <directory>, <names>;

Note that, while the files to be installed may be arbitrarily named, only the last component of each name is used for the installed target name. So, for example, if you arrange to install C<foo/bar> in C<baz>, this will create a file C<bar> in directory C<baz> (not C<foo/bar>).
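
The case just described would be written like this (a sketch using the same placeholder names):

  # Installs foo/bar as baz/bar, not baz/foo/bar.
  Install $env 'baz', 'foo/bar';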


=item B<The >B<Command>B< method>

The C<Command> method is a catchall method which can be used to arrange for any external command to be called to update the target. For this command, a target file and a list of inputs are provided. In addition, a construction command line, or lines, is provided as a string (this string may have multiple commands embedded within it, separated by new lines). C<Command> is called as follows:



  Command $env <target>, <inputs>, <construction command>;

The target is made dependent upon the list of input files specified, and the inputs must be built successfully or Cons will not attempt to build the target.

Within the construction command, any variable from the construction environment may be introduced by prefixing the name of the construction variable with C<%>. This is recursive: the command is expanded until no more substitutions can be made. If a construction variable is not defined in the environment, then the null string will be substituted.

There are several pseudo variables which will also be expanded:


=over 10


=item C<%E<gt>>

The target file name (in a multi-target command, this is always the first target mentioned). 

=item C<%0>

Same as C<%E<gt>>.

=item C<%1>, C<%2>, ..., C<%9>

These refer to the first through ninth input file, respectively.

=item C<%<>

The full set of inputs. If any of these have been used anywhere else in the current command line (via C<%1>, C<%2>, etc.), then those will be deleted from the list provided by C<%<>. Consider the following command found in a C<Conscript> file in the directory C<test>: 

  Command $env 'tgt', qw(foo bar baz), qq(
  	echo %< -i %1 > %>
  	echo %< -i %2 >> %>
  	echo %< -i %3 >> %>
  );




If C<tgt> needed to be updated, then this would result in the execution of the following commands, assuming that no remapping has been established for directory C<test>: 

  echo test/bar test/baz -i test/foo > test/tgt
  echo test/foo test/baz -i test/bar >> test/tgt
  echo test/foo test/bar -i test/baz >> test/tgt



=back

Any of the above pseudo variables may be followed immediately by C<:d> or C<:f>, to indicate the directory or file associated with the name. Continuing with the above example, C<%<:f> would expand to C<foo bar baz>, and C<%E<gt>:d> would expand to C<test>.

After substitution occurs, strings of white space are converted into single blanks, and leading and trailing white space is eliminated. It is therefore not possible to introduce variable length white space in strings passed into a command, without resorting to some sort of shell quoting.

If a multi-line command string is provided, the commands are executed sequentially. If any of the commands fails, then none of the rest are executed, and the target is not marked as updated, i.e. a new signature is not stored for the target.

Normally, if all the commands succeed, and return a zero status (or whatever platform-specific indication of success is required), then a new signature is stored for the target. If a command erroneously reports success even after a failure, then Cons will assume that the target file created by that command is accurate and up-to-date.

The first word of each command string, after expansion, is assumed to be an executable command looked up on the C<PATH> environment variable (which is, in turn, specified by the C<ENV> construction variable). If this command is found on the path, then the target will depend upon it: the command will therefore be automatically built, as necessary. It's possible to write multi-part commands to some shells, separated by semi-colons. Only the first command word will be depended upon, however, so if you write your command strings this way, you must either explicitly set up a dependency (with the C<Depends> method), or be sure that the command you are using is a system command which is expected to be available. If it isn't available, you will, of course, get an error.

If there are shell meta characters anywhere in the expanded command line, such as C<E<lt>>, C<E<gt>>, quotes, or semi-colon, then the command will actually be executed by invoking a shell. This means that a command such as:



  cd foo

alone will typically fail, since there is no command C<cd> on the path. But the command string:



  cd %<:d; tar cf %>:f %<:f

when expanded will still contain the shell meta character semi-colon, and a shell will be invoked to interpret the command. Since C<cd> is interpreted by this sub-shell, the command will execute as expected.
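
In a C<Conscript> file, such a command string might be supplied to C<Command> as in the following sketch. The file names are hypothetical, and the example assumes that the target and the single input live in the same directory, so that the archive is written where Cons expects to find it:

  Command $env 'hello.tar', 'hello.c', q(
        cd %<:d; tar cf %>:f %<:f
  );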

To specify a command with multiple targets, you can specify a reference to a list of targets. In Perl, a list reference can be created by enclosing a list in square brackets. Hence the following command:



  Command $env ['foo.h', 'foo.c'], 'foo.template', q(
  	gen %1
  );

could be used in a case where the command C<gen> creates two files, both C<foo.h> and C<foo.c>.


=item B<The >B<Objects>B< method>

The C<Objects> method arranges to create the object files that correspond to the specified source files. It is invoked as shown below:



  @files = Objects $env <source or object files>;

Under Unix, source files ending in C<.s> and C<.c> are currently supported, and will be compiled into a file of the same name ending in C<.o>. By default, all files are created by invoking the external command which results from expanding the C<CCCOM> construction variable, with C<%<> and C<%E<gt>> set to the source and object files, respectively (see the C<Command> method for expansion details). The variable C<CPPPATH> is also used when scanning source files for dependencies. This is a colon-separated list of pathnames, and is also used to create the construction variable C<_IFLAGS>, which will contain the appropriate list of C<-I> options for the compilation. Any relative pathnames in C<CPPPATH> are interpreted relative to the directory in which the associated construction environment was created (absolute and top-relative names may also be used). This variable is used by C<CCCOM>. The behavior of this command can be modified by changing any of the variables which are interpolated into C<CCCOM>, such as C<CC>, C<CFLAGS>, and, indirectly, C<CPPPATH>. It's also possible to replace the value of C<CCCOM> itself. As a convenience, this method returns the list of object filenames.
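
Because C<Objects> returns the object filenames, its result can be handed on to another method. A small sketch, with hypothetical file and library names:

  # Derive foo.o and bar.o, then collect them into a library.
  @objs = Objects $env 'foo.c', 'bar.c';
  Library $env 'libfoobar.a', @objs;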


=item B<The >B<Program>B< method>

The C<Program> method arranges to link the specified program with the specified object files. It is invoked in the following manner:



  Program $env <program name>, <source or object files>;

Source files may be specified in place of object files--the C<Objects> method will be invoked to arrange the conversion of all the files into object files, and hence all the observations about the C<Objects> method, above, apply to this method also. The actual linking of the program will be handled by an external command which results from expanding the C<LINKCOM> construction variable, with C<%<> set to the object files to be linked (in the order presented), and C<%>> set to the target (see the C<Command> method for expansion details). The user may set additional variables in the construction environment, including C<LINK>, to define which program to use for linking, C<LIBPATH>, a colon-separated list of library search paths, for use with library specifications of the form I<-llib>, and C<LIBS>, specifying the list of libraries to link against (in either I<-llib> form or just as pathnames). Relative pathnames in both C<LIBPATH> and C<LIBS> are interpreted relative to the directory in which the associated construction environment was created (absolute and top-relative names may also be used). Cons automatically sets up dependencies on any libraries mentioned in C<LIBS>: those libraries will be built before the program is linked.
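
A minimal sketch of a C<Program> invocation follows. The program, source and library names are hypothetical, and the library is assumed to be built and installed elsewhere in the tree:

  # Hypothetical names throughout; libworld.a is assumed to be installed
  # into export/lib by some other Conscript file.
  $env = new cons(
  	LIBPATH => '#export/lib',
  	LIBS    => '-lworld',
  );

  # hello.c and util.c are compiled via the Objects method, then linked.
  Program $env 'hello', 'hello.c', 'util.c';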


=item B<The >B<Library>B< method>

The C<Library> method arranges to create the specified library from the specified object files. It is invoked as follows:



  Library $env <library name>, <source or object files>;

Source files may be specified in place of object files--the C<Objects> method will be invoked to arrange the conversion of all the files into object files, and hence all the observations about the C<Objects> method, above, apply to this method also. The actual creation of the library will be handled by an external command which results from expanding the C<ARCOM> construction variable, with C<%<> set to the library members (in the order presented), and C<%>> to the library to be created (see the C<Command> method for expansion details). The user may set variables in the construction environment which will affect the operation of the command. These include C<AR>, the archive program to use, C<ARFLAGS>, which can be used to modify the flags given to the program specified by C<AR>, and C<RANLIB>, the name of an archive index generation program, if needed (if your platform does not require the latter functionality, then C<ARCOM> must be redefined to not reference C<RANLIB>).

The C<Library> method allows the same library to be specified in multiple method invocations. All of the contributing objects from all the invocations (which may be from different directories) are combined and generated by a single archive command. Note, however, that if you prune a build so that only part of a library is specified, then only that part of the library will be generated (the rest will disappear!).
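
For instance, the following sketch (with hypothetical file names) names the same library in two invocations; according to the description above, the members from both would be combined into a single archive command:

  # Hypothetical Conscript fragment: build libworld.a from two sources.
  Library $env 'libworld.a', 'world.c', 'sea.c';

  # A further invocation naming the same library contributes another member.
  Library $env 'libworld.a', 'sky.c';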


=item B<The >B<Module>B< method>

The C<Module> method is a combination of the C<Program> and C<Command> methods. Rather than generating an executable program directly, this method allows you to specify your own command to actually generate a module. The method is invoked as follows:



  Module $env <module name>, <source or object files>, <construction command>;

This method is useful in instances where you wish to create, for example, dynamically loaded modules, or statically linked code libraries.
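
A sketch of a C<Module> invocation is shown below. The file names are hypothetical, and the command string is only an assumption about what a particular linker might need; it also assumes that C<LD> and C<LDFLAGS> are defined in the construction environment:

  # Hypothetical: combine two sources into one relocatable object.
  # Substitute whatever command your platform requires.
  Module $env 'plugin.o', 'plugin.c', 'helper.c', q(
  	%LD -r %LDFLAGS %< -o %>
  );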


=item B<The >B<Depends>B< method>

The C<Depends> method allows you to specify additional dependencies for a target. It is invoked as follows:



  Depends $env <target>, <dependencies>;

This may occasionally be useful, especially in cases where no scanner exists (or can be written) for particular types of files. Normally, dependencies are calculated automatically from a combination of the explicit dependencies set up by the method invocation and those found by scanning source files.
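
As an illustration, suppose an object file also depends on a data file that no scanner would discover. The file names below are hypothetical:

  # parser.o is built from parser.c in the usual way...
  Objects $env 'parser.c';

  # ...but the build also reads grammar.def, which no scanner knows about.
  Depends $env 'parser.o', 'grammar.def';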


=head1 B<Extending Cons>


=item B<Overriding construction variables>

There are several ways of extending Cons, which vary in degree of difficulty. The simplest method is to define your own construction environment, based on the default environment, but modified to reflect your particular needs. This will often suffice for C-based applications. You can use the C<new> constructor, and the C<clone> and C<copy> methods to create hybrid environments. These changes can be entirely transparent to the underlying C<Conscript> files.
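
The following sketch shows one way this might look, assuming that C<clone> accepts overrides in the same name/value form as C<new>. The compiler, flags and file names are examples only:

  # Baseline environment; values are hypothetical.
  $env = new cons(
  	CC     => 'gcc',
  	CFLAGS => '-O2',
  );

  # Same environment, but with debugging flags instead of optimization.
  $debug = $env->clone(CFLAGS => '-g');

  Program $env   'hello', 'hello.c';
  Program $debug 'trace', 'trace.c';

A C<Conscript> file that receives either environment can use it without knowing which variables were changed.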


=item B<Adding new methods>

For slightly more demanding changes, you may wish to add new methods to the C<cons> package. Here's an example of a very simple extension, C<InstallScript>, which installs a tcl script in a requested location, but edits the script first to reflect a platform-dependent path that needs to be installed in the script:



  # cons::InstallScript - Create a platform dependent version of a shell script
  # by replacing string ``#!your-path-here'' with platform specific path $BINDIR.
  sub cons::InstallScript {
  	my($env, $dst, $src) = @_;
  	Command $env $dst, $src, qq(
  		sed s+your-path-here+$BINDIR+ %< > %>
  		chmod oug+x %>
  	);
  }

Notice that this method is defined directly in the C<cons> package (by prefixing the name with C<cons::>). A change made in this manner will be globally visible to all environments, and could be called as in the following example:



  InstallScript $env "$BIN/foo", "foo.tcl";

For a small improvement in generality, the C<BINDIR> variable could be passed in as an argument or taken from the construction environment--as C<%BINDIR>.
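
A sketch of the second variant, in which C<BINDIR> is taken from the construction environment, might look as follows. The value given to C<BINDIR> is hypothetical, and C<q()> is used in place of C<qq()> because no Perl interpolation is needed once Cons expands C<%BINDIR>:

  # Variant of InstallScript that lets Cons expand %BINDIR from the
  # construction environment instead of interpolating a Perl variable.
  sub cons::InstallScript {
  	my($env, $dst, $src) = @_;
  	Command $env $dst, $src, q(
  		sed s+your-path-here+%BINDIR+ %< > %>
  		chmod oug+x %>
  	);
  }

  # Hypothetical use: BINDIR is supplied when the environment is created.
  $env = new cons(BINDIR => '/usr/local/bin');
  InstallScript $env "$BIN/foo", "foo.tcl";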


=item B<Overriding methods>

Instead of adding the method to the C<cons> name space, you could define a new package which inherits existing methods from the C<cons> package and overrides or adds others. This can be done using Perl's inheritance mechanisms.

The following example defines a new package C<cons::switch> which overrides the standard C<Library> method. The overridden method builds linked library modules, rather than library archives. A new constructor is provided. Environments created with this constructor will have the new C<Library> method; others won't.



  



  package cons::switch;
  BEGIN {@ISA = 'cons'}



  sub new {
  	shift;
  	bless new cons(@_);
  }



  sub Library {
  	my($env) = shift;
  	my($lib) = shift;
  	my(@objs) = Objects $env @_;
  	Command $env $lib, @objs, q(
  		%LD -r %LDFLAGS %< -o %>
  	);
  }

This functionality could be invoked as in the following example:



  $env = new cons::switch(@overrides);
  ...
  Library $env 'lib.o', 'foo.c', 'bar.c';


=head1 B<Invoking Cons>

The C<cons> command is always invoked from the root of the build tree. A C<Construct> file must exist in that directory. If the C<-f> argument is used, then an alternate C<Construct> file may be used (and, possibly, an alternate root, since C<cons> will cd to the C<Construct> file's containing directory). The command is invoked as follows:



  cons <arguments>

where I<arguments> can be any of the following, in any order:


=over 10


=item I<target>

Build the specified target. If I<target> is a directory, then recursively build everything within that directory. 

=item I<+pattern>

Limit the C<Conscript> files considered to just those that match I<pattern>, which is a Perl regular expression. Multiple C<+> arguments are accepted.

=item I<name>=I<val>

Sets I<name> to value I<val> in the C<ARG> hash passed to the top-level C<Construct> file. 

=item C<-f> I<file>

Use the specified file instead of C<Construct> (but first change to the containing directory of I<file>).

=item C<-o> I<file>

Read override file I<file>.

=item C<-k>

Keep going as far as possible after errors. 

=item C<-p>

Show construction products in specified trees. No build is attempted. 

=item C<-pa>

Show construction products and associated actions. No build is attempted. 

=item C<-pw>

Show products and where they are defined. No build is attempted. 

=item C<-r>

Remove construction products associated with the specified I<targets>. No build is attempted.

=item C<-v>

Show C<cons> version and continue processing. 

=item C<-x>

Show a help message similar to this one, and exit. 

=back

Note that C<cons -r .> is equivalent to a full recursive C<make clean>, but requires no support in the C<Construct> file or any C<Conscript> files. This is most useful if you are compiling files into source directories (if you separate the C<build>/C<export> directories, then you can just remove the directories).

The options C<-p>, C<-pa>, and C<-pw> are extremely useful as an aid in reading scripts or debugging them. If you want to know what script installs C<export/include/foo.h>, for example, just type:



  cons -pw export/include/foo.h
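
The I<name>=I<val> argument form can be used to pass settings into the build. The following C<Construct> fragment is a sketch only: the argument name C<debug> and the flag values are hypothetical, and it assumes the C<ARG> hash is visible in the C<Construct> file as C<%ARG>:

  # Invoked, for example, as:  cons debug=1 .
  # Choose compiler flags according to the hypothetical "debug" argument.
  $cflags = $ARG{debug} ? '-g' : '-O2';
  $env = new cons(CFLAGS => $cflags);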


=head1 B<Using and writing dependency scanners>

T.B.S.






=item B<Last Modified: Thu Jul  2 17:24:03 EDT 1998>