File: README

package info (click to toggle)
libmakefile-dom-perl 0.008-2
  • links: PTS, VCS
  • area: main
  • in suites: bullseye, buster
  • size: 680 kB
  • ctags: 620
  • sloc: perl: 7,616; makefile: 2
file content (253 lines) | stat: -rw-r--r-- 10,479 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
NAME
    Makefile::DOM - Simple DOM parser for Makefiles

VERSION
    This document describes Makefile::DOM 0.008 released on 18 November
    2014.

DESCRIPTION
    This library can serve as an advanced lexer for (GNU) makefiles. It
    parses makefiles as "documents" and the parsing is lossless. The results
    are data structures similar to DOM trees. The DOM trees hold every
    single bit of the information in the original input files, including
    white spaces, blank lines and makefile comments. That means it's
    possible to reproduce the original makefiles from the DOM trees. In
    addition, each node of the DOM trees is modifiable and so is the whole
    tree, just like the PPI module used for Perl source parsing and the
    HTML::TreeBuilder module used for parsing HTML source.

    If you're looking for a true GNU make parser that generates an AST,
    please see Makefile::Parser::GmakeDB instead.

    The interface of "Makefile::DOM" mimics the API design of PPI. In fact,
    I've directly stolen the source code and POD documentation of PPI::Node,
    PPI::Element, and PPI::Dumper, with the full permission from the author
    of PPI, Adam Kennedy.

    "Makefile::DOM" tries to be independent of specific makefile's syntax.
    The same set of DOM node types is supposed to get shared by different
    makefile DOM generators. For example, MDOM::Document::Gmake parses GNU
    makefiles and returns an instance of MDOM::Document, i.e., the root of
    the DOM tree while the NMAKE makefile lexer in the future,
    "MDOM::Document::Nmake", also returns instances of the MDOM::Document
    class. Later, I'll also consider adding support for dmake and bsdmake.

Structure of the DOM
    Makefile DOM (MDOM) is a structured set of a series of data types. They
    provide a flexible document model conformed to the makefile syntax.
    Below is a complete list of the 19 MDOM classes in the current
    implementation where the indentation indicates the class inheritance
    relationships.

        MDOM::Element
            MDOM::Node
                MDOM::Unknown
                MDOM::Assignment
                MDOM::Command
                MDOM::Directive
                MDOM::Document
                    MDOM::Document::Gmake
                MDOM::Rule
                    MDOM::Rule::Simple
                    MDOM::Rule::StaticPattern
            MDOM::Token
                MDOM::Token::Bare
                MDOM::Token::Comment
                MDOM::Token::Continuation
                MDOM::Token::Interpolation
                MDOM::Token::Modifier
                MDOM::Token::Separator
                MDOM::Token::Whitespace

    It's not hard to see that all of the MDOM classes inherit from the
    MDOM::Element class. MDOM::Token and MDOM::Node are its direct children.
    The former represents a string token which is atomic from the
    perspective of the lexer while the latter represents a structured node,
    which usually has one or more children, and serves as the container for
    other DOM::Element objects.

    Next we'll show a few examples to demonstrate how to map DOM trees to
    particular makefiles.

    Case 1
        Consider the following simple "hello, world" makefile:

            all : ; echo "hello, world"

        We can use the MDOM::Dumper class provided by Makefile::DOM to dump
        out the internal structure of its corresponding MDOM tree:

            MDOM::Document::Gmake
              MDOM::Rule::Simple
                MDOM::Token::Bare         'all'
                MDOM::Token::Whitespace   ' '
                MDOM::Token::Separator    ':'
                MDOM::Token::Whitespace   ' '
                MDOM::Command
                  MDOM::Token::Separator    ';'
                  MDOM::Token::Whitespace   ' '
                  MDOM::Token::Bare         'echo "hello, world"'
                  MDOM::Token::Whitespace   '\n'

        In this example, speparators ":" and ";" are all instances of the
        MDOM::Token::Separator class while spaces and new line characters
        are all represented as MDOM::Token::Whitespace. The other two leaf
        nodes, "all" and "echo "hello, world"" both belong to
        MDOM::Token::Bare.

        It's worth mentioning that, the space characters in the rule command
        "echo "hello, world"" were not represented as
        MDOM::Token::Whitespace. That's because in makefiles, the spaces in
        commands do not make any sense to "make" in syntax; those spaces are
        usually sent to shell programs verbatim. Therefore, the DOM parser
        does not try to recognize those spaces specifially so as to reduce
        memory use and the number of nodes. However, leading spaces and
        trailing new lines will still be recognized as
        MDOM::Token::Whitespace.

        On a higher level, it's a MDOM::Rule::Simple instance holding
        several "Token" and one MDOM::Command. On the highest level, it's
        the root node of the whole DOM tree, i.e., an instance of
        MDOM::Document::Gmake.

    Case 2
        Below is a relatively complex example:

            a: foo.c  bar.h $(baz) # hello!
                @echo ...

        It's corresponding DOM structure is

          MDOM::Document::Gmake
            MDOM::Rule::Simple
              MDOM::Token::Bare         'a'
              MDOM::Token::Separator    ':'
              MDOM::Token::Whitespace   ' '
              MDOM::Token::Bare         'foo.c'
              MDOM::Token::Whitespace   '  '
              MDOM::Token::Bare         'bar.h'
              MDOM::Token::Whitespace   '\t'
              MDOM::Token::Interpolation   '$(baz)'
              MDOM::Token::Whitespace      ' '
              MDOM::Token::Comment         '# hello!'
              MDOM::Token::Whitespace      '\n'
            MDOM::Command
              MDOM::Token::Separator    '\t'
              MDOM::Token::Modifier     '@'
              MDOM::Token::Bare         'echo ...'
              MDOM::Token::Whitespace   '\n'

        Compared to the previous example, here appears several new node
        types.

        The variable interpolation "$(baz)" on the first line of the
        original makefile corresponds to a MDOM::Token::Interpolation node
        in its MDOM tree. Similarly, the comment "# hello" corresponds to a
        MDOM::Token::Comment node.

        On the second line, the rule command indented by a tab character is
        still represented by a MDOM::Command object. Its first child node
        (or its first element) is also an MDOM::Token::Seperator instance
        corresponding to that tab. The command modifier "@" follows the
        "Separator" immediately, which is of type MDOM::Token::Modifier.

    Case 3
        Now let's study a sample makefile with various global structures:

          a: b
          foo = bar
              # hello!

        Here on the top level, there are three language structures: one rule
        ""a: b"", one assignment statement "foo = bar", and one comment "#
        hello!".

        Its MDOM tree is shown below:

          MDOM::Document::Gmake
            MDOM::Rule::Simple
              MDOM::Token::Bare                  'a'
              MDOM::Token::Separator            ':'
              MDOM::Token::Whitespace           ' '
              MDOM::Token::Bare                   'b'
              MDOM::Token::Whitespace           '\n'
            MDOM::Assignment
              MDOM::Token::Bare                  'foo'
              MDOM::Token::Whitespace           ' '
              MDOM::Token::Separator            '='
              MDOM::Token::Whitespace           ' '
              MDOM::Token::Bare                  'bar'
              MDOM::Token::Whitespace           '\n'
            MDOM::Token::Whitespace            '\t'
            MDOM::Token::Comment               '# hello!'
            MDOM::Token::Whitespace            '\n'

        We can see that below the root node MDOM::Document::Gmake, there are
        MDOM::Rule::Simple, MDOM::Assignment, and MDOM::Comment three
        elements, as well as two MDOM::Token::Whitespace objects.

    It can be observed from the examples above that the MDOM representation
    for the makefile's lexical elements is rather loose. It only provides
    very limited structural representation instead of making a bad guess.

OPERATIONS FOR MDOM TREES
    Generating an MDOM tree from a GNU makefile only requires two lines of
    Perl code:

        use MDOM::Document::Gmake;
        my $dom = MDOM::Document::Gmake->new('Makefile');

    If the makefile source code being parsed is already stored in a Perl
    variable, say, $var, then we can construct an MDOM via the following
    code:

        my $dom = MDOM::Document::Gmake->new(\$var);

    Now $dom becomes the reference to the root of the MDOM tree and its type
    is now MDOM::Document::Gmake, which is also an instance of the
    MDOM::Node class.

    Just as mentioned above, "MDOM::Node" is the container for other
    MDOM::Element instances. So we can retrieve some element node's value
    via its "child" method:

        $node = $dom->child(3);
        # or $node = $dom->elements(0);

    And we may also use the "elements" method to obtain the values of all
    the nodes:

        @elems = $dom->elements;

    For every MDOM node, its corresponding makefile source can be generated
    by invoking its "content" method.

BUGS AND TODO
    The current implementation of the MDOM::Document::Gmake lexer is based
    on a hand-written state machie. Although the efficiency of the engine is
    not bad, the code is rather complicated and messy, which hurts both
    extensibility and maintanabilty. So it's expected to rewrite the parser
    using some grammatical tools like the Perl 6 regex engine
    Pugs::Compiler::Rule or a yacc-style one like Parse::Yapp.

SOURCE REPOSITORY
    You can always get the latest source code of this module from its GitHub
    repository:

    <http://github.com/agentzh/makefile-dom-pm>

    If you want a commit bit, please let me know.

AUTHOR
    Yichun "agentzh" Zhang (章亦春) <agentzh@gmail.com>

COPYRIGHT
    Copyright 2006-2014 by Yichun "agentzh" Zhang (章亦春).

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    MDOM::Document, MDOM::Document::Gmake, PPI, Makefile::Parser::GmakeDB,
    makesimple.