cpphs
cpphs is a simplified re-implementation of cpp, the C pre-processor,
in Haskell.
Why re-implement cpp? Rightly or wrongly, the C pre-processor is
widely used in Haskell source code. It enables conditional compilation
for different compilers, different versions of the same compiler,
and different OS platforms. It is also occasionally used for its
macro language, which can enable certain forms of platform-specific
detail-filling, such as the tedious boilerplate generation of instance
definitions and FFI declarations. However, there are two problems with
cpp, aside from the obvious aesthetic ones:
- For some Haskell systems, notably Hugs on Windows, a true cpp
is not available by default.
- Even for the other Haskell systems, the common cpp provided by
the gcc 3.x series is changing subtly in ways that are
incompatible with Haskell's syntax. There have always been
problems with, for instance, string gaps, and prime characters
in identifiers. These problems are only going to get worse.
So, it seemed right to attempt to provide an alternative to cpp,
both more compatible with Haskell, and itself written in Haskell so
that it can be distributed with compilers.
This version of the C pre-processor is pretty much feature-complete,
and compatible with the -traditional style.
It has two modes:
- conditional compilation only (--nomacro),
- and full macro-expansion (default).
In --nomacro mode, cpphs performs only conditional compilation
actions: #include's, #if's, and #ifdef's are processed according to
text-replacement definitions (both command-line and internal), but no
parameterised macro expansion is performed. In full compatibility mode
(the default), textual replacements and macro expansions are also
processed in the remaining body of non-cpp text.
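As a rough illustration of the two modes, consider the small module
below. The module, the DEBUG symbol and the SQUARE macro are all
invented for this example. Going by the description above, --nomacro
should still resolve the #ifdef but leave the call SQUARE(7) in the
text untouched, whereas the default mode should also expand it.

```haskell
{-# LANGUAGE CPP #-}
-- Hypothetical module, invented for illustration.
-- The pragma above only matters when GHC runs the pre-processor itself.
module Main (main) where

-- A parameterised macro: expanded in the body only in the default mode.
#define SQUARE(x) ((x) * (x))

main :: IO ()
main = do
#ifdef DEBUG
  -- This line survives only when the symbol tested above is defined.
  putStrLn "debug build"
#endif
  print (SQUARE(7) :: Int)
```

The output of --nomacro on this file is therefore not expected to be
compilable Haskell, since SQUARE is never defined as a Haskell name;
that mode suits sources that use cpp for conditionals only.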
Working Features:
- #ifdef: simple conditional compilation
- #if: the full boolean language of defined(), &&, ||, ==, etc.
- #elif: chained conditionals
- #define: in-line definitions (text replacements and macros)
- #undef: in-line revocation of definitions
- #include: file inclusion
- #line: line number directives
- \\n: line continuations within all # directives
- /**/: token catenation within a macro definition
- ##: ANSI-style token catenation
- #: ANSI-style token stringisation
- __FILE__: special text replacement for DIY error messages
- __LINE__: special text replacement for DIY error messages
- __DATE__: special text replacement
- __TIME__: special text replacement
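Several of these features can be combined in one file. The sketch
below is only an illustration: the symbols USE_FAST and USE_UNSAFE and
the BACKEND replacement are hypothetical names, assumed to be supplied
with -D on the command line where needed.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Hypothetical symbols, invented for illustration; see the #if below.
#if defined(USE_FAST) && defined(USE_UNSAFE)
#define BACKEND "fast-unsafe"
#elif defined(USE_FAST)
#define BACKEND "fast"
#else
#define BACKEND "portable"
#endif

backend :: String
backend = BACKEND

-- Revoke the definition so that later text is not rewritten.
#undef BACKEND

main :: IO ()
main = do
  putStrLn ("backend: " ++ backend)
  -- The two specials give a DIY location message.
  putStrLn ("reported from " ++ __FILE__ ++ ", line " ++ show (__LINE__ :: Int))
```

With no symbols defined, the #else branch is the one selected, so
backend should end up bound to the string "portable".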
Macro expansion is recursive. Redefinition of a macro name does not
generate a warning. Macros can be defined on the command-line with
-D just like textual replacements. Macro names are permitted to be
Haskell identifiers e.g. with the prime ' and backtick ` characters,
which is slightly looser than in C, but they still may not include
operator symbols.
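For instance, a macro whose name ends in a prime might look like the
sketch below. The name step' is invented for this example, and a
standard C pre-processor may well reject such a definition, so this
file is meant for cpphs only.

```haskell
module Main (main) where

-- Hypothetical macro, invented for illustration: its name ends in a
-- prime, which the text above says cpphs permits.
#define step'(n) ((n) + 1)

main :: IO ()
main = print (step'(41) :: Int)
```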
Numbering of lines in the output is preserved so that any later
processor can give meaningful error messages. When a file is
#include'd, cpphs inserts #line directives for the
same reason. Numbering should be correct even in the presence of
line continuations. If you don't want #line directives in
the final output, use the --noline option.
Any syntax error in a cpp directive produces a message on stderr and
halts the program. Failure to find a #include'd file produces a
warning on stderr, but processing continues.
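The idea of preserving line numbering can be pictured with a toy
model: lines that are dropped are replaced by blank lines rather than
deleted, so every surviving line keeps its original position. The
function below is an assumption-laden sketch of that idea, not the
cpphs implementation. It handles a single non-nested #ifdef/#endif
pair and nothing else.

```haskell
module Main (main) where

import Data.List (isPrefixOf)

-- Toy model only (hypothetical helper, not part of cpphs): blank out
-- directive lines and skipped lines instead of deleting them, so the
-- output has exactly as many lines as the input.
blankOut :: [String] -> [String] -> [String]
blankOut defined = go True
  where
    go _ [] = []
    go keep (l:ls)
      | "#ifdef " `isPrefixOf` l = "" : go (drop 7 l `elem` defined) ls
      | "#endif"  `isPrefixOf` l = "" : go True ls
      | keep                     = l  : go keep ls
      | otherwise                = "" : go keep ls

main :: IO ()
main = mapM_ putStrLn (blankOut [] ["a", "#ifdef X", "b", "#endif", "c"])
```

Because the line count never changes, a compiler reading the result
reports errors against the same line numbers as the original file.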
Usage: cpphs [ filename | -Dsym | -Dsym=val | -Ipath ]+ [-Ofile]
[--nomacro] [--noline] [--strip] [--hashes] [--layout]
cpphs --version
You can give any number of filenames on the command-line. The
results are catenated on standard output.
Options:
- -Dsym: define a textual replacement (default value is 1)
- -Dsym=val: define a textual replacement with a specific value
- -Ipath: add a directory to the search path for #include's
- -Ofile: specify a file for output (default is stdout)
- --nomacro: only process #ifdef's and #include's, do not expand macros
- --noline: remove #line droppings from the output
- --strip: convert C-style comments to whitespace, even outside
  cpp directives
- --hashes: recognise the ANSI # stringise operator, and ## for
  token catenation, within macros
- --layout: preserve newlines within macro expansions
- --version: report version number of cpphs and stop
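A build script written in Haskell could drive cpphs with these
options. The sketch below assumes that a cpphs executable is on the
PATH and uses readProcess from the process package; the file name
Example.hs, the include directory and the DEBUG symbol are
hypothetical.

```haskell
module Main (main) where

import System.Process (readProcess)

-- Run cpphs on a hypothetical input file and echo the result.
-- Assumes cpphs is installed and on the PATH.
main :: IO ()
main = do
  out <- readProcess "cpphs"
           [ "-DDEBUG"      -- define a textual replacement with value 1
           , "-Iinclude"    -- extra directory for #include lookups
           , "--noline"     -- omit #line directives from the output
           , "Example.hs"   -- hypothetical source file
           ]
           ""               -- nothing is sent on standard input
  putStr out
```

Since the results go to standard output by default, capturing them
this way is an alternative to naming an output file with -O.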
There are NO textual replacements defined by default. (Normal cpp
usually has definitions for machine, OS, etc. These could easily
be added to the cpphs source code if you wish.) The search path is
searched in order of the -I options, except that the directory of the
calling file, then the current directory, are always searched first.
Again, there is no default search path (and again, this could easily
be changed).
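The search order just described can be modelled as a small pure
function. This is only a sketch of the stated rule, with a
hypothetical function name; it lists candidate paths and does not
check whether any of them exist.

```haskell
module Main (main) where

import System.FilePath (takeDirectory, (</>))

-- Candidate locations for an #include, in the order described above:
-- the directory of the including file, then the current directory,
-- then each -I directory in command-line order.
-- (Hypothetical helper, not taken from the cpphs sources.)
candidates :: FilePath -> [FilePath] -> FilePath -> [FilePath]
candidates includer iDirs name =
  [ dir </> name | dir <- takeDirectory includer : "." : iDirs ]

main :: IO ()
main = mapM_ putStrLn
         (candidates "src/Foo.hs" ["include", "/usr/local/include"] "config.h")
```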
Current version:
cpphs-0.7, release date 2004.09.01
By HTTP: .tar.gz, .zip, FreeBSD port.
- Enable the __FILE__, __LINE__, __DATE__,
and __TIME__ specials, which can be useful for creating
DIY error messages.
Older versions:
cpphs-0.6, release date 2004.07.30
By HTTP: .tar.gz, .zip, FreeBSD port.
cpphs-0.5, release date 2004.06.07
By HTTP: .tar.gz, .zip, FreeBSD port.
- Added a --version flag to report the version number.
- Renamed --stringise to --hashes, and use it to turn on ## catenation
as well.
- Bugfix for #if 1, previously interpreted as false.
- Bugfix for --noline: it no longer adds extra spurious newlines.
- File inclusion now looks in the directory of the calling file.
- Failure to find an include file is now merely a warning to stderr
rather than an error.
- Added a --layout flag. Previously, line continuations in a macro
definition were always preserved in the output, permitting use
of the Haskell layout rule even inside a macro. The default is now
to remove line continuations for conformance with cpp, but the option
of using --layout is still possible.
cpphs-0.4, release date 2004.05.19
By HTTP: .tar.gz, .zip.
- New flag -Ofile to redirect output
- Bugfix for precedence of ! in #if !False && False
- Bugfix for whitespace permitted between # and if
- Bugfix for #define F "blah"; #include F
cpphs-0.3, release date 2004.05.18
By HTTP: .tar.gz, .zip.
Fix recursive macro expansion bug. Added option to strip C comments.
Added option to recognise the # stringise operator.
cpphs-0.2, release date 2004.05.15
By HTTP: .tar.gz, .zip.
Implements textual replacement and macro expansion.
cpphs-0.1, release date 2004.04.07
By HTTP: .tar.gz, .zip.
Initial release: implements conditional compilation and file inclusion only.
Building instructions
To build cpphs, use
hmake Main [-package base]; mv Main cpphs
or
ghc --make Main; mv Main cpphs
or
runhugs Main
In general, cpphs is based on the -traditional behaviour, not
ANSI C, and has the following main differences from the standard cpp.
General
- The # that introduces any cpp directive must be in the first
column of a line (whereas ANSI permits whitespace before the #).
- Generates the #line n "filename" syntax, not the # n
"filename" variant.
- C comments are only removed from within cpp directives. They are
not stripped from other text. Consider for instance that in
Haskell, all of the following are valid operator symbols: /*, */
and */*. However, you can turn on C-comment removal with the
--strip option.
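The sketch below illustrates two of these points: every directive
starts in the first column, and an operator spelled like a C comment
opener is left alone because it lies outside any directive. The
operator definition and the VERBOSE symbol are invented for this
example, and the file is not expected to survive the --strip option or
a pre-processor that removes C comments everywhere.

```haskell
module Main (main) where

infixl 7 /*

-- In Haskell this is an ordinary operator name, not a comment opener.
-- (Hypothetical operator, invented for illustration.)
(/*) :: Int -> Int -> Int
x /* y = x * y

-- Directives begin in column one, even though the code around them
-- could be indented.
#ifdef VERBOSE
describe :: String
describe = "verbose"
#else
describe :: String
describe = "quiet"
#endif

main :: IO ()
main = putStrLn (describe ++ ": " ++ show (6 /* 7))
```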
Macro language
- Accepts /**/ for token-pasting in a macro definition.
However, /* */ (with any text between the open/close
comment) inserts whitespace.
- The ANSI ## token-pasting operator is available with
the --hashes flag. This is to avoid misinterpreting
any valid Haskell operator of the same name.
- Replaces a macro formal parameter with the actual, even inside a
string (double or single quoted). This is -traditional behaviour,
not supported in ANSI.
- Recognises the # stringisation operator in a macro
definition only if you use the --hashes option. (It is
an ANSI addition, only needed because quoted stringisation (above)
is prohibited by ANSI.)
- Preserves whitespace within a textual replacement definition
exactly (modulo newlines), but leading and trailing space is eliminated.
- Preserves whitespace within a macro definition (and any trailing
whitespace after it) exactly (modulo newlines), but leading space is
eliminated.
- Preserves whitespace within macro call arguments exactly
(including newlines), but leading and trailing space is eliminated.
- With the --layout option, line continuations in a textual
replacement or macro definition are preserved as line-breaks in the
macro call. (Useful for layout-sensitive code in Haskell.)
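A few of the macro-language points above can be sketched in one file.
All macro and type names here are invented for illustration, and the
expansions described are what the rules above suggest rather than
verified output.

```haskell
module Main (main) where

-- All macro and type names here are invented for illustration.

-- Traditional-style stringisation: the formal parameter e is replaced
-- even inside the double quotes.
#define SHOWEXPR(e) ("e" ++ " = " ++ show (e))

-- Token pasting: the empty comment joins get and the argument.
#define GETTER(field) get/**/field

-- A definition spread over two lines with a line continuation.
#define SHOWAS(T) instance Show T where \
    show _ = "T"

data Config = Config { getName :: String }

data Marker = Marker
SHOWAS(Marker)

main :: IO ()
main = do
  putStrLn SHOWEXPR(1 + 2)
  putStrLn (GETTER(Name) (Config "demo"))
  print Marker
```

By default the SHOWAS expansion should collapse onto one line, which
is still a valid instance declaration; with --layout the line break
would be kept, so the method definition would appear on its own
indented line. With --hashes, the ANSI spellings # and ## would be
available for the first two macros instead.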
I am interested in hearing your feedback on cpphs. Bug reports are
especially welcome. You can send feature requests too, but I won't
guarantee to implement them if they depart much from the ordinary
cpp's behaviour. Please mail
Copyright: © 2004 Malcolm Wallace,
except for ParseLib (Copyright © 1995 Graham Hutton and Erik Meijer)
License: The library modules in cpphs are distributed under
the terms of the LGPL (see file LICENCE-LGPL
for more details). If that's a problem for you, contact me to make
other arrangements. The application module 'Main.hs' itself is GPL
(see file LICENCE-GPL).
This software comes with no warranty. Use at your own risk.