GNU Source-highlight, given a source file, produces a document with syntax highlighting.
This is Edition 3.1.7 of the Source-highlight Library manual.
This file documents GNU Source-highlight Library version 3.1.7.
This manual is for GNU Source-highlight Library (version 3.1.7, 16 December 2011), which given a source file, produces a document with syntax highlighting.
Copyright © 2005-2008 Lorenzo Bettini, http://www.lorenzobettini.it.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with the Front-Cover Texts being “A GNU Manual,” and with the Back-Cover Texts as in (a) below. A copy of the license is included in the section entitled “GNU Free Documentation License.”
(a) The FSF's Back-Cover Text is: “You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.”
GNU Source-highlight, given a source file, produces a document with syntax highlighting; see Introduction for a wider introduction to GNU Source-highlight.
This file documents the library provided by GNU Source-highlight; its audience is therefore programmers who want to use source-highlight features inside their own programs, not users of the source-highlight program. The library has been part of GNU Source-highlight since version 3.0.
The main principles of GNU Source-highlight are taken for granted here, together with the notions needed to write language definition files, output definition files, and so on. For all these features we refer to the documentation of GNU Source-highlight.
The GNU Source-highlight library is part of GNU Source-highlight, so it is installed together with Source-highlight itself; see Installation for further instructions on installing GNU Source-highlight. Here we detail only the parts concerning the library.
If you want to build and install the API documentation of the Source-highlight library, you need to run configure with the option --with-doxygen; building the documentation also requires the program Doxygen, http://www.doxygen.org.
The documentation and the related files will be installed in the following directories:
Library API documentation: prefix/share/doc/source-highlight/api
Library examples: prefix/share/doc/source-highlight/examples
Configuration files: prefix/share/source-highlight
You can use the GNU Source-highlight library in your programs by including its headers and linking to the file libsource-highlight.ext [1].
All the classes of the library are part of the namespace srchilite, and all the header files are in the subdirectory srchilite.
The easiest way to use the GNU Source-highlight library in your program is to rely on the autotools, i.e., Automake, Autoconf, etc. In particular, the library is installed with a pkg-config [2] configuration file (metadata file), source-highlight.pc.
pkg-config is a tool that helps when compiling applications and libraries. It inserts the correct compiler options on the command line, so an application can use the Source-highlight library simply by running
gcc -o test test.c `pkg-config --libs --cflags source-highlight`
rather than hard-coding where to find the library. Moreover, this also provides the correct compiler flags and libraries used by the Source-highlight library itself, e.g., the Boost Regex library.
Note that pkg-config searches for .pc files in its standard directories. If you installed the library in a non-standard directory, you'll need to set the PKG_CONFIG_PATH environment variable accordingly.
For instance, if I install the library into /usr/local/lib, the .pc file will be installed into /usr/local/lib/pkgconfig, and then I'll need to call pkg-config as follows:
    PKG_CONFIG_PATH=/usr/local/lib/pkgconfig \
        pkg-config --libs --cflags source-highlight
In your configure.ac you can use the autoconf macro provided by pkg-config; here is an example:
    # Checks for libraries.
    PKG_CHECK_MODULES(SRCHILITE, [source-highlight >= 3.0])
    AC_SUBST(SRCHILITE_CFLAGS)
    AC_SUBST(SRCHILITE_LIBS)
Then, you can use the variables SRCHILITE_CFLAGS and SRCHILITE_LIBS in your makefiles accordingly. For instance,
    ...
    AM_CPPFLAGS = $(SRCHILITE_CFLAGS)
    ...
    LDADD = $(SRCHILITE_LIBS)
    ...
Here we present the main classes of the Source-highlight library, together with some examples of use. For the documentation of all the classes (and of their methods) we refer to the generated API documentation (see Installation).
You will note that, often, methods and constructors of the classes of the library do not take a pointer or a reference to a class, say MyClass, but an object of type MyClassPtr; these are shared pointers, in particular the ones provided by the Boost libraries (they are typedefs using, e.g., boost::shared_ptr<MyClass>). This avoids dangerous dangling pointers and possible memory leaks in the library.
If, on the contrary, a method or a constructor in a class of the library takes a standard pointer, say MyClass *, then that class will NEVER delete such a pointer. It is up to the actual owner of the MyClass object to delete it when it is not needed anymore.
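The two ownership conventions above can be sketched as follows. This is only an illustration under stated assumptions: MyClass, SharedHolder and BorrowingHolder are hypothetical names, not classes of the library, and std::shared_ptr (C++11) stands in for boost::shared_ptr so that the sketch needs no external dependency.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a library class; not part of srchilite.
struct MyClass {
    std::string name;
    explicit MyClass(const std::string &n) : name(n) {}
};

// Mirrors the naming convention of the shared-pointer typedefs
// (assumption: std::shared_ptr is used here instead of boost::shared_ptr).
typedef std::shared_ptr<MyClass> MyClassPtr;

// Shared ownership: the holder keeps the object alive for as long as it needs it.
class SharedHolder {
    MyClassPtr obj;
public:
    explicit SharedHolder(MyClassPtr p) : obj(p) {}
    const std::string &name() const { return obj->name; }
};

// Raw pointer: the holder only borrows the object and never deletes it;
// the caller remains the owner.
class BorrowingHolder {
    MyClass *obj;
public:
    explicit BorrowingHolder(MyClass *p) : obj(p) {}
    const std::string &name() const { return obj->name; }
};
```

With the shared pointer, the object is destroyed when the last holder goes away; with the raw pointer, the object must outlive the holder, and destroying the holder leaves the object untouched.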
The classes of the library can raise exceptions if errors are encountered (e.g., an input file cannot be opened, or a language definition file cannot be parsed); the exception classes can be found in the API documentation, and all of them inherit from the std::exception class.
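Since every exception class derives from std::exception, a single catch clause is enough to report any failure. The sketch below shows the pattern without the library itself: openOrThrow is a hypothetical helper (not a srchilite function) that fails the way a library call might, by throwing a class derived from std::exception.

```cpp
#include <exception>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of srchilite: throws if the file
// cannot be opened, as a library call might do.
void openOrThrow(const std::string &fileName) {
    std::ifstream in(fileName.c_str());
    if (!in)
        throw std::runtime_error("cannot open " + fileName);
}

// Runs the operation and turns any std::exception into an error message.
// Returns an empty string on success.
std::string tryOpen(const std::string &fileName) {
    try {
        openOrThrow(fileName);
    } catch (const std::exception &e) {
        return e.what();
    }
    return "";
}
```

In a real program the try block would wrap the calls into the library, and the handler would typically print e.what() and exit with a non-zero status.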
The SourceHighlight class is the class of the library that implements essentially all the functionality used by the program source-highlight itself; thus it highlights an input file, generating an output file. It can be configured with many options: it has get/set methods for all the command line options of source-highlight (see also Invoking source-highlight).
For instance, the following example (source-highlight-console-main.cpp) highlights an input file to the console; the colors are obtained through ANSI color escape sequences, so you need a console program that supports them:
    #ifdef HAVE_CONFIG_H
    #include "config.h"
    #endif

    #include <iostream>
    #include "srchilite/sourcehighlight.h"
    #include "srchilite/langmap.h"

    using namespace std;

    #ifndef DATADIR
    #define DATADIR ""
    #endif

    int main(int argc, char *argv[]) {
        // we highlight to the console, through ANSI escape sequences
        srchilite::SourceHighlight sourceHighlight("esc.outlang");

        // make sure we find the .lang and .outlang files
        sourceHighlight.setDataDir(DATADIR);

        // by default we highlight C++ code
        string inputLang = "cpp.lang";

        if (argc > 1) {
            // we have a file name so we detect the input source language
            srchilite::LangMap langMap(DATADIR, "lang.map");
            string lang = langMap.getMappedFileNameFromFileName(argv[1]);
            if (lang != "") {
                inputLang = lang;
            } // otherwise we default to C++

            // output file name is empty => cout
            sourceHighlight.highlight(argv[1], "", inputLang);
        } else {
            // input file name is empty => cin
            sourceHighlight.highlight("", "", inputLang);
        }

        return 0;
    }
Note that if a file name is passed at the command line, the program tries to detect the source language by using a LangMap class object, specifying the map file lang.map, which is the one mapping file extensions to language definition files (e.g., if the file name has extension .java it will use the corresponding java.lang). Otherwise we assume that we want to highlight a C++ file.
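The detection-with-fallback logic described above can be sketched in isolation. TinyLangMap below is a hypothetical, hard-coded miniature and not the library's LangMap (which reads its entries from lang.map); the entries and the function names are assumptions chosen only to illustrate the idea.

```cpp
#include <map>
#include <string>

// Hypothetical miniature of a language map; the real one is read from lang.map.
class TinyLangMap {
    std::map<std::string, std::string> byExtension;
public:
    TinyLangMap() {
        byExtension["java"] = "java.lang";
        byExtension["cpp"] = "cpp.lang";
        byExtension["h"] = "cpp.lang";
    }

    // Returns the language definition file for fileName, or "" if unknown.
    std::string fromFileName(const std::string &fileName) const {
        std::string::size_type dot = fileName.rfind('.');
        if (dot == std::string::npos)
            return "";
        std::map<std::string, std::string>::const_iterator it =
            byExtension.find(fileName.substr(dot + 1));
        return it == byExtension.end() ? "" : it->second;
    }
};

// Same fallback as in the example program: default to C++ when detection fails.
std::string chooseInputLang(const TinyLangMap &langMap,
                            const std::string &fileName) {
    std::string lang = langMap.fromFileName(fileName);
    return lang.empty() ? "cpp.lang" : lang;
}
```

The empty string plays the role of "no mapping found", which is also how the example program tests the result of the lookup before falling back to cpp.lang.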
All the highlighting is performed by the highlight method; since we don't specify an output file name, it outputs the highlighted result directly to the console. If we don't have an input file name either, the highlight method reads from the standard input. Since the highlighting takes place one line at a time, you can test the program this way: enter a line on the console and, when you press enter, the program will echo the same line highlighted.
Setting DATADIR is not even mandatory, provided you installed Source-highlight correctly, or that you set the data dir up using the source-highlight-settings program.
The formatting performed by the Source-highlight library, i.e., how the highlighting is actually carried out, or what to do when something needs to be highlighted, can be completely customized: the library detects (using regular expressions based on language definition files) that something must be highlighted as, say, a keyword, and you can then do whatever you want with this information. The default formatting strategy is to output highlighted text in a specific output format, but you are free to do something entirely different.
This formatting abstraction is provided by the Formatter class, which basically declares only the abstract format method; this method takes as parameters the string to format and further (possibly empty) additional parameters, implemented by the FormatterParams class. Note that the format method does not receive as an argument how the passed string must be formatted (e.g., as a keyword, as a type, etc.); this information must be stored in the formatter from the start. Indeed, the mapping between a language element and a formatter is performed by the FormatterManager class. An object of this class must be created by specifying a default formatter object, which will be used when the formatter manager is queried for a formatter for a language element that it is not able to handle (in this case it falls back by returning the default formatter).
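The relationship between formatters and the formatter manager, including the fallback to the default formatter, can be sketched with a few stand-in classes. All names below (SketchFormatter, SketchFormatterManager, getFormatter) are hypothetical and only echo the library's classes; std::shared_ptr again stands in for boost::shared_ptr, and format returns a string instead of writing to an output so that the behavior is easy to check.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a formatter: the language element is fixed
// at construction time and is not an argument of format().
struct SketchFormatter {
    std::string elem;
    explicit SketchFormatter(const std::string &e) : elem(e) {}
    std::string format(const std::string &s) const { return elem + ": " + s; }
};
typedef std::shared_ptr<SketchFormatter> SketchFormatterPtr;

// Hypothetical stand-in for a formatter manager.
class SketchFormatterManager {
    SketchFormatterPtr defaultFormatter;
    std::map<std::string, SketchFormatterPtr> formatters;
public:
    explicit SketchFormatterManager(SketchFormatterPtr def)
        : defaultFormatter(def) {}

    void addFormatter(const std::string &elem, SketchFormatterPtr f) {
        formatters[elem] = f;
    }

    // Falls back to the default formatter for elements it does not know.
    SketchFormatterPtr getFormatter(const std::string &elem) const {
        std::map<std::string, SketchFormatterPtr>::const_iterator it =
            formatters.find(elem);
        return it == formatters.end() ? defaultFormatter : it->second;
    }
};
```

Because the manager always has a default formatter, a lookup never fails: an element without a registered formatter is simply handled by the default one.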
You can implement a completely customized formatting strategy. For instance, this is a customized formatter (infoformatter.h) which, when requested to format a string, simply writes which kind of language element it is and its position in the line (the start field of the FormatterParams class). Note that the language element is stored in a field of the class, and it is set at object creation time. We avoid writing anything if we are requested to format something as "normal", or if the string to format is empty.
    #include <iostream>
    #include <string>
    #include <boost/shared_ptr.hpp>
    #include "srchilite/formatter.h"
    #include "srchilite/formatterparams.h"

    class InfoFormatter: public srchilite::Formatter {
        /// the language element represented by this formatter
        std::string elem;

    public:
        InfoFormatter(const std::string &elem_ = "normal") :
            elem(elem_) {
        }

        virtual void format(const std::string &s,
                const srchilite::FormatterParams *params = 0) {
            // print only if the element is not "normal" and the string is not empty
            if (elem != "normal" && !s.empty()) {
                std::cout << elem << ": " << s;
                if (params)
                    std::cout << ", start: " << params->start;
                std::cout << std::endl;
            }
        }
    };

    /// shared pointer for InfoFormatter
    typedef boost::shared_ptr<InfoFormatter> InfoFormatterPtr;
For convenience we also declare a typedef for the shared pointer (since the formatter manager takes only shared pointers to formatters).
In order to customize the formatting, there are some more steps to take; in particular, you cannot use the SourceHighlight class anymore, and you need to use more classes.
First of all, you need the LangDefManager class, which takes care of building the regular expressions starting from a language definition file; in order to do this it uses a HighlightRuleFactory class object. For the moment, only the implementation based on Boost regular expressions exists, so you can simply pass an object of the RegexRuleFactory class. Once you have an object of the LangDefManager class, you can use the getHighlightState method to build the automaton that performs the highlighting (in particular the initial state of such automaton, of the HighlightState class), and you should pass this to an object that can use the automaton to perform the highlighting. To do this, you can use the SourceHighlighter class, whose objects can be used to highlight a line of text, using the highlightParagraph method.
You can then create a FormatterManager class object, populate it with your formatters, and set it in the SourceHighlighter class object. The following example (infoformatter-main.cpp) shows how to perform these steps; note that we can share the same formatter for different language elements:
    #ifdef HAVE_CONFIG_H
    #include "config.h"
    #endif

    #include <iostream>
    #include "srchilite/langdefmanager.h"
    #include "srchilite/regexrulefactory.h"
    #include "srchilite/sourcehighlighter.h"
    #include "srchilite/formattermanager.h"
    #include "infoformatter.h"

    using namespace std;

    #ifndef DATADIR
    #define DATADIR ""
    #endif

    int main() {
        srchilite::RegexRuleFactory ruleFactory;
        srchilite::LangDefManager langDefManager(&ruleFactory);

        // we highlight C++ code for simplicity
        srchilite::SourceHighlighter highlighter(langDefManager.getHighlightState(
                DATADIR, "cpp.lang"));

        srchilite::FormatterManager formatterManager(InfoFormatterPtr(
                new InfoFormatter));
        InfoFormatterPtr keywordFormatter(new InfoFormatter("keyword"));

        formatterManager.addFormatter("keyword", keywordFormatter);
        formatterManager.addFormatter("string", InfoFormatterPtr(new InfoFormatter(
                "string")));
        // for "type" we use the same formatter as for "keyword"
        formatterManager.addFormatter("type", keywordFormatter);
        formatterManager.addFormatter("comment", InfoFormatterPtr(
                new InfoFormatter("comment")));
        formatterManager.addFormatter("symbol", InfoFormatterPtr(new InfoFormatter(
                "symbol")));
        formatterManager.addFormatter("number", InfoFormatterPtr(new InfoFormatter(
                "number")));
        formatterManager.addFormatter("preproc", InfoFormatterPtr(
                new InfoFormatter("preproc")));
        highlighter.setFormatterManager(&formatterManager);

        // make sure it uses additional information
        srchilite::FormatterParams params;
        highlighter.setFormatterParams(&params);

        string line;
        // we now highlight a line a time
        while (getline(cin, line)) {
            // reset position counter within a line
            params.start = 0;
            highlighter.highlightParagraph(line);
        }

        return 0;
    }
Note that, since we highlight one line at a time, we must reset the start field each time we start to examine a new line.
For simplicity this example highlights only C++ code and reads directly from the standard input and writes to the standard output. This is a run of the example reading from the standard input (so each time you insert a line you get the output of your formatters):
    // this is a comment
    comment: //, start: 0
    comment: this is a comment, start: 2
    #include <foobar.h>
    preproc: #include, start: 0
    string: <foobar.h>, start: 9
    int abc = 100 + 5;
    keyword: int, start: 0
    symbol: =, start: 8
    number: 100, start: 10
    symbol: +, start: 14
    number: 5, start: 16
    symbol: ;, start: 17
Source-highlight can rely on style (and css style) files for generating formatting. Usually, the formatters are built according to the output format, specified through .outlang files (see Output Language Definitions). However, you can also create your own formatters based on the information of the style file (or css style file). During the parsing of these style files, a FormatterFactory class object is used by the library, and you can provide a customized factory (the one that is used by the library is the TextStyleFormatterFactory class). The only abstract method of the FormatterFactory class is the createFormatter method.
In order to parse a style file, you can use the static methods of the StyleFileParser class, which require the file name of the style file (and possibly the path to search for the style file, otherwise the default one is used), the factory to create formatters, and a reference to a string where the document background color will be stored. The methods are parseStyleFile and parseCssStyleFile.
For instance, let's create a customized formatter styleformatter.h that simply prints how a language element will be formatted (but no formatting will take place); for the sake of simplicity we will use only public fields:
    #include <iostream>
    #include <string>
    #include <boost/shared_ptr.hpp>
    #include "srchilite/formatter.h"
    #include "srchilite/formatterparams.h"

    struct StyleFormatter: public srchilite::Formatter {
        /// the language element represented by this formatter
        std::string elem;

        bool bold, italic, underline, fixed, not_fixed;

        std::string color;

        std::string bgColor;

        StyleFormatter(const std::string &elem_ = "normal") :
            elem(elem_), bold(false), italic(false), underline(false),
                    fixed(false), not_fixed(false) {
        }

        virtual void format(const std::string &s,
                const srchilite::FormatterParams *params = 0) {
            // print only if the element is not "normal" and the string is not empty
            if (elem != "normal" && !s.empty()) {
                std::cout << elem << ": \"" << s << "\"" << std::endl;
                std::cout << "formatted as: " << (bold ? "bold " : "")
                        << (italic ? "italic " : "")
                        << (underline ? "underline " : "");
                std::cout << (color.size() ? "color: " + color + " " : "");
                std::cout << (bgColor.size() ? "bgcolor: " + bgColor : "")
                        << std::endl;
            }
        }
    };

    /// shared pointer for StyleFormatter
    typedef boost::shared_ptr<StyleFormatter> StyleFormatterPtr;
Now, we create a customized factory (file styleformatterfactory.h), implementing the createFormatter method. Note that the base class FormatterFactory does not provide any means to store the created formatters, so it's up to the derived classes to store them somewhere:
    #include <map>
    #include <string>
    #include "srchilite/formatterfactory.h"
    #include "styleformatter.h"

    using std::string;

    /// maps a language element name to its formatter
    /// (this typedef is not shown in the original listing; it is assumed here)
    typedef std::map<string, StyleFormatterPtr> StyleFormatterMap;

    struct StyleFormatterFactory: public srchilite::FormatterFactory {
        StyleFormatterMap formatterMap;

        bool hasFormatter(const string &key) const {
            return formatterMap.find(key) != formatterMap.end();
        }

        bool createFormatter(const string &key, const string &color,
                const string &bgcolor,
                srchilite::StyleConstantsPtr styleconstants) {

            if (hasFormatter(key))
                return false;

            StyleFormatter *formatter = new StyleFormatter(key);
            formatterMap[key] = StyleFormatterPtr(formatter);

            if (styleconstants.get()) {
                for (srchilite::StyleConstantsIterator it =
                        styleconstants->begin(); it != styleconstants->end(); ++it) {
                    switch (*it) {
                    case srchilite::ISBOLD:
                        formatter->bold = true;
                        break;
                    case srchilite::ISITALIC:
                        formatter->italic = true;
                        break;
                    case srchilite::ISUNDERLINE:
                        formatter->underline = true;
                        break;
                    case srchilite::ISFIXED:
                        formatter->fixed = true;
                        break;
                    case srchilite::ISNOTFIXED:
                        formatter->not_fixed = true;
                        break;
                    case srchilite::ISNOREF: // ignore references here
                        break;
                    }
                }
            }

            formatter->color = color;
            formatter->bgColor = bgcolor;

            return true;
        }
    };
The createFormatter method will be called when parsing a style file to create a formatter corresponding to a specific language element; this method should return false if the creation of a formatter failed (e.g., in this case, if a formatter for a given element had already been created). The method is passed the language element name, the colors for the element as specified in the style file (which can be empty if no color was specified), and a StyleConstants shared pointer (a collection of enum values) with formatting information such as boldface, italics, etc. The factory can use this information to create the customized formatter.
Now, we can use this customized formatter factory in our program (file styleformatter-main.cpp):
    #ifdef HAVE_CONFIG_H
    #include "config.h"
    #endif

    #include <iostream>
    #include "srchilite/langdefmanager.h"
    #include "srchilite/regexrulefactory.h"
    #include "srchilite/sourcehighlighter.h"
    #include "srchilite/formattermanager.h"
    #include "srchilite/stylefileparser.h" // for parsing style files
    #include "styleformatterfactory.h"

    using namespace std;

    #ifndef DATADIR
    #define DATADIR ""
    #endif

    int main() {
        srchilite::RegexRuleFactory ruleFactory;
        srchilite::LangDefManager langDefManager(&ruleFactory);

        // we highlight C++ code for simplicity
        srchilite::SourceHighlighter highlighter(langDefManager.getHighlightState(
                DATADIR, "cpp.lang"));

        // our factory for our formatters
        StyleFormatterFactory factory;

        // the background color for the document (not used here)
        string docBgColor;

        // let's parse the default.style and create our formatters
        srchilite::StyleFileParser::parseStyleFile(DATADIR, "default.style",
                &factory, docBgColor);

        // now we need to fill up the formatter manager with our formatters
        srchilite::FormatterManager formatterManager(StyleFormatterPtr(
                new StyleFormatter));
        for (StyleFormatterMap::const_iterator it = factory.formatterMap.begin();
                it != factory.formatterMap.end(); ++it) {
            formatterManager.addFormatter(it->first, it->second);
        }
        highlighter.setFormatterManager(&formatterManager);

        string line;
        // we now highlight a line a time
        while (getline(cin, line)) {
            highlighter.highlightParagraph(line);
        }

        return 0;
    }
Note that, once we have created all the formatters with our factory (while parsing the style file default.style), we still need to manually set these formatters in the FormatterManager class object used by our highlighter.
For simplicity this example highlights only C++ code and reads directly from the standard input and writes to the standard output. This is a run of the example reading from the standard input (so each time you insert a line you get the output of your formatters):
    /// my class TODO: nothing special
    comment: "///"
    formatted as: italic color: brown
    comment: " my class "
    formatted as: italic color: brown
    todo: "TODO:"
    formatted as: bold bgcolor: cyan
    comment: " nothing special"
    formatted as: italic color: brown
    #include <foobar.h>
    preproc: "#include"
    formatted as: bold color: darkblue
    string: "<foobar.h>"
    formatted as: color: red
During the highlighting (and regular expression matching) the library generates events that can be “listened” to by using a customized event listener. An event is represented by an object of the HighlightEvent class, which stores the HighlightToken class object and the type (a HighlightEventType enum) of the event.
A customized listener can be implemented by deriving from the HighlightEventListener class and by defining the virtual notify method, which, of course, takes a HighlightEvent class object as parameter.
For instance, source-highlight implements its debugging functionality by using a customized listener, the DebugListener class, whose method implementation we report here as an example:
    void DebugListener::notify(const HighlightEvent &event) {
        switch (event.type) {
        case HighlightEvent::FORMAT:
            // print information about the rule
            if (event.token.rule) {
                os << event.token.rule->getAdditionalInfo() << endl;
                os << "expression: \"" << event.token.rule->toString() << "\""
                        << endl;
            }

            // now format the matched strings
            for (MatchedElements::const_iterator it = event.token.matched.begin();
                    it != event.token.matched.end(); ++it) {
                os << "formatting \"" << it->second << "\" as " << it->first
                        << endl;
            }
            step();
            break;
        case HighlightEvent::FORMATDEFAULT:
            os << "formatting \"" << event.token.matched.front().second
                    << "\" as default" << endl;
            step();
            break;
        case HighlightEvent::ENTERSTATE:
            os << "entering state: " << event.token.rule->getNextState()->getId()
                    << endl;
            break;
        case HighlightEvent::EXITSTATE:
            int level = event.token.rule->getExitLevel();
            os << "exiting state, level: ";
            if (level < 0)
                os << "all";
            else
                os << level;
            os << endl;
            break;
        }
    }
The Source-highlight library reads language map files, language definition files, output format definitions, styles, and the other files it needs during execution from a specific directory, which we call data dir; the library comes with a hardcoded value for this path, which is based on the --prefix value specified at configuration time (in particular, it is prefix/share/source-highlight).
The user can also set the value with the environment variable SOURCE_HIGHLIGHT_DATADIR (see also the program source-highlight-settings, which can store settings in a configuration file in the user's home; see The program source-highlight-settings). When running the program source-highlight, this value can be overridden with the command line option --data-dir (see Configuration files).
When using the Source-highlight library from a program, one might need to change the value for data dir, dynamically, and in a consistent way, i.e., to have a static and single point where this setting can be set and retrieved. Note that for the moment, the only setting you can manage is the value of data dir.
The library provides the Settings class for this purpose. Although you can create objects of this class to manipulate, check and save settings (you may want to look at the source code of the program source-highlight-settings), you probably only need the static methods of this class. You can set the global value of data dir with the setGlobalDataDir method. The retrieveDataDir method retrieves the value for the data dir. If the global value was set with the setGlobalDataDir method, then it always returns this global value. Otherwise, it returns the value of the environment variable SOURCE_HIGHLIGHT_DATADIR, if set. Otherwise, it returns the value read from the configuration file. If reading the configuration file also fails, then it returns the hardcoded value.
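The lookup order just described can be written down as a small function. This is only a sketch of the precedence rules, not the implementation of the Settings class: DataDirSources and resolveDataDir are hypothetical names, and the configuration-file value is modeled as a plain string that is empty when the file could not be read.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical inputs to the lookup; in the library these would come from
// the global setting, the user's configuration file and the build-time prefix.
struct DataDirSources {
    std::string globalValue; // set programmatically, "" if unset
    std::string configValue; // read from the configuration file, "" if unreadable
    std::string hardcoded;   // compiled-in default
};

// Resolves the data dir following the order described in the text.
// envValue is passed in (instead of calling getenv here) to keep this testable;
// a null pointer means that the environment variable is not set.
std::string resolveDataDir(const DataDirSources &src, const char *envValue) {
    if (!src.globalValue.empty())
        return src.globalValue;
    if (envValue)
        return envValue;
    if (!src.configValue.empty())
        return src.configValue;
    return src.hardcoded;
}

// Convenience wrapper reading the real environment variable.
std::string resolveDataDir(const DataDirSources &src) {
    return resolveDataDir(src, std::getenv("SOURCE_HIGHLIGHT_DATADIR"));
}
```

Each source is consulted only if all the previous ones are missing, so a value set globally from the program always wins over the environment, the configuration file and the hardcoded default.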
If you need to get a list of all the files in the data dir with a specific role (e.g., language definition files, style files, etc.) you can use the static methods of the SourceHighlightUtils class, which will take care of using the data dir specified in the settings (see Settings).
The Instances class provides access to static instances of some classes that can be used, e.g., to read a language definition file and create the automaton for the highlighting, using the LangDefManager class, or to access the map of language definition files, using the LangMap class. This class ensures that these instances use the global settings; in particular, if you change the global settings, you should call the static reload method, so that the instances are updated.
Using these instances also makes the use of some classes easier; for instance, the beginning part of the main of the examples shown in Customizing Formatting can be written as follows:
    #include "srchilite/langdefmanager.h"
    #include "srchilite/instances.h"

    int main() {
        // we highlight C++ code for simplicity
        srchilite::SourceHighlighter highlighter
            (srchilite::Instances::getLangDefManager().getHighlightState(
                DATADIR, "cpp.lang"));
If you know that you will not use these instances anymore in your application, and it is crucial to recover all the memory used by these instances, you then need to call the static unload method, and the memory of these instances will be released.
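The interplay between global settings, shared instances, reload and unload can be illustrated with a minimal sketch. It shows one plausible way to implement such a pattern and makes no claim about how the Instances class works internally; SketchSettings, SketchLangMap and SketchInstances are hypothetical names.

```cpp
#include <string>

// Hypothetical global setting; a stand-in for the global data dir.
struct SketchSettings {
    static std::string &globalDataDir() {
        static std::string dir = "/usr/share/source-highlight";
        return dir;
    }
};

// Hypothetical shared object that captures the setting when it is built.
struct SketchLangMap {
    std::string dataDir;
    explicit SketchLangMap(const std::string &d) : dataDir(d) {}
};

class SketchInstances {
    static SketchLangMap *&slot() {
        static SketchLangMap *instance = 0;
        return instance;
    }
public:
    // Lazily creates the shared instance from the current global setting.
    static SketchLangMap *getLangMap() {
        if (!slot())
            slot() = new SketchLangMap(SketchSettings::globalDataDir());
        return slot();
    }
    // Releases the memory held by the shared instance.
    static void unload() {
        delete slot();
        slot() = 0;
    }
    // Rebuilds the instance so that it picks up changed global settings.
    static void reload() {
        unload();
        getLangMap();
    }
};
```

In this sketch an instance built before a settings change keeps the old value until reload is called, which is why changing the global settings should be followed by a reload.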
If you find a bug in source-highlight, please send electronic mail to bug-source-highlight at gnu dot org.
Include the version number, which you can find by running ‘source-highlight --version’. Also include in your message the output that the program produced and the output you expected.
If you have other questions, comments or suggestions about source-highlight, contact the author via electronic mail (find the address at http://www.lorenzobettini.it). The author will try to help you out, although he may not have time to fix your problems.
The following mailing lists are available:
help-source-highlight at gnu dot org: for generic discussions about the program and for asking for help about it (open mailing list), http://mail.gnu.org/mailman/listinfo/help-source-highlight
info-source-highlight at gnu dot org: for receiving information about new releases and features (read-only mailing list), http://mail.gnu.org/mailman/listinfo/info-source-highlight
If you want to subscribe to a mailing list just go to the URL and follow the instructions, or send me an e-mail and I'll subscribe you.
I'll describe new features in new releases also in my blog, at this URL:
http://tronprog.blogspot.com/search/label/source-highlight
--with-doxygen: Installation
createFormatter method: Style-based Customized Formatting
DebugListener class: Events and Listeners
format method: Customizing Formatting
Formatter class: Customizing Formatting
FormatterFactory class: Style-based Customized Formatting
FormatterManager class: Style-based Customized Formatting
FormatterManager class: Completely Customized Formatting
FormatterManager class: Customizing Formatting
FormatterParams class: Completely Customized Formatting
FormatterParams class: Customizing Formatting
getHighlightState method: Completely Customized Formatting
highlight method: SourceHighlight class
HighlightEvent class: Events and Listeners
HighlightEventListener class: Events and Listeners
HighlightEventType enum: Events and Listeners
highlightParagraph method: Completely Customized Formatting
HighlightRuleFactory class: Completely Customized Formatting
HighlightState class: Completely Customized Formatting
HighlightToken class: Events and Listeners
Instances class: Global instances
LangDefManager class: Global instances
LangDefManager class: Completely Customized Formatting
LangMap class: Global instances
LangMap class: SourceHighlight class
notify method: Events and Listeners
parseCssStyleFile method: Style-based Customized Formatting
parseStyleFile method: Style-based Customized Formatting
PKG_CONFIG_PATH: Using Automake and Autotools
RegexRuleFactory class: Completely Customized Formatting
reload method: Global instances
retrieveDataDir method: Settings
setGlobalDataDir method: Settings
Settings class: Settings
source-highlight: Settings
source-highlight: Events and Listeners
source-highlight: SourceHighlight class
source-highlight-settings: Settings
source-highlight-settings: SourceHighlight class
SOURCE_HIGHLIGHT_DATADIR: Settings
SourceHighlight class: Completely Customized Formatting
SourceHighlight class: SourceHighlight class
SourceHighlighter class: Completely Customized Formatting
SourceHighlightUtils class: Utility functions
start field: Completely Customized Formatting
std::exception class: Main Classes
StyleConstants enum: Style-based Customized Formatting
StyleFileParser class: Style-based Customized Formatting
TextStyleFormatterFactory class: Style-based Customized Formatting
unload method: Global instances
[1] The extension of course depends on the library being shared or static, e.g., .so, .la, .a, and on the system.
[2] http://pkg-config.freedesktop.org.