===============
Design overview
===============

pygccxml consists of the following packages:

* :mod:`declarations <pygccxml.declarations>`

  This package defines classes that describe C++ declarations and types.

* :mod:`parser <pygccxml.parser>`

  This package defines classes that parse files generated by `GCC-XML`_
  or `CastXML`_. It also defines a few classes that help you avoid
  unnecessary parsing of C++ source files.

* :mod:`utils <pygccxml.utils>`

  This package defines a few functions useful for the whole project,
  but which are mainly used internally by pygccxml.

------------------------
``declarations`` package
------------------------

Take a look at the `UML diagram`_. It describes almost all of the classes
defined in the package and the relationships between them. The ``declarations``
package defines two class hierarchies:

1. types hierarchy - used to represent a C++ type

2. declarations hierarchy - used to represent a C++ declaration


Types hierarchy
---------------

The types hierarchy is used to represent an arbitrary C++ type. The ``type_t``
class is its base class.

``type_traits``
~~~~~~~~~~~~~~~

Are you aware of the `boost::type_traits`_ library? It contains a set of very
specific traits classes, each of which encapsulates a single trait from the
C++ type system; for example, is a type a pointer or a reference? Or does a
type have a trivial constructor, or a const-qualifier?

pygccxml implements a large part of that library's functionality:

* Many of its algorithms have been implemented:

  + ``is_same``

  + ``is_enum``

  + ``is_void``

  + ``is_const``

  + ``is_array``

  + ``is_pointer``

  + ``is_volatile``

  + ``is_integral``

  + ``is_reference``

  + ``is_arithmetic``

  + ``is_convertible``

  + ``is_fundamental``

  + ``is_floating_point``

  + ``is_base_and_derived``

  + ``is_unary_operator``

  + ``is_binary_operator``

  + ``remove_cv``

  + ``remove_const``

  + ``remove_alias``

  + ``remove_pointer``

  + ``remove_volatile``

  + ``remove_reference``

  + ``has_trivial_copy``

  + ``has_trivial_constructor``

  + ``has_any_non_copyconstructor``

  For a full list of implemented algorithms, please consult the API documentation.

* Many unit tests have been written, based on the unit tests of the
  `boost::type_traits`_ library.


If you are going to build a code generator, you will find ``type_traits`` very handy.
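
As a rough illustration of how such traits work, the sketch below models a tiny
types hierarchy and a few traits over it. The classes are simplified stand-ins
written for this example (``compound_t`` and the ``decl_string`` attribute are
assumptions), not the pygccxml implementation; consult the API documentation
for the real signatures.

.. code-block:: python

  class type_t:
      """Base class of the toy types hierarchy."""
      def __init__(self, decl_string):
          self.decl_string = decl_string


  class int_t(type_t):
      def __init__(self):
          super().__init__("int")


  class compound_t(type_t):
      """A type built on top of another type, kept in ``base``."""
      def __init__(self, base, decl_string):
          super().__init__(decl_string)
          self.base = base


  class const_t(compound_t):
      def __init__(self, base):
          super().__init__(base, base.decl_string + " const")


  class pointer_t(compound_t):
      def __init__(self, base):
          super().__init__(base, base.decl_string + " *")


  def is_pointer(type_):
      return isinstance(type_, pointer_t)


  def is_const(type_):
      return isinstance(type_, const_t)


  def remove_pointer(type_):
      # Strip one level of pointer, or return the type unchanged.
      return type_.base if is_pointer(type_) else type_


  def remove_const(type_):
      # Strip a top-level const, or return the type unchanged.
      return type_.base if is_const(type_) else type_


  # "int const *": a pointer to a constant int
  ptr = pointer_t(const_t(int_t()))
  pointee = remove_const(remove_pointer(ptr))
  print(ptr.decl_string, "->", pointee.decl_string)

A code generator can chain such traits to reduce an argument type to the
underlying type it has to wrap.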

Declarations hierarchy
----------------------

The declarations hierarchy is used to represent an arbitrary C++ declaration.
Most of the classes defined in this package are little more than a set of
properties.

``declaration_t`` is the base class of the declarations hierarchy. Every
declaration has a ``parent`` property, which keeps a reference to the scope
declaration instance in which the declaration is defined.

The ``scopedef_t`` class derives from ``declaration_t``. It is used to say
"I may have other declarations inside"; the "composite" design pattern is
used here. The ``class_t`` and ``namespace_t`` declaration classes derive from
``scopedef_t``.
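
The composite relationship described above can be sketched as follows. This is
a minimal model for illustration only: the ``adopt`` method, the ``full_name``
property and the ``variable_t`` leaf class as written here are assumptions of
this example, not the real pygccxml API.

.. code-block:: python

  class declaration_t:
      def __init__(self, name):
          self.name = name
          self.parent = None  # the scope in which this declaration is defined

      @property
      def full_name(self):
          # Hypothetical helper: walk the parent chain up to the global scope.
          if self.parent is None:
              return self.name
          return self.parent.full_name + "::" + self.name


  class scopedef_t(declaration_t):
      """A declaration that may contain other declarations."""
      def __init__(self, name):
          super().__init__(name)
          self.declarations = []

      def adopt(self, decl):
          # Hypothetical helper: register a child and set its parent.
          decl.parent = self
          self.declarations.append(decl)
          return decl


  class namespace_t(scopedef_t):
      pass


  class class_t(scopedef_t):
      pass


  class variable_t(declaration_t):
      pass


  global_ns = namespace_t("")  # the global namespace has an empty name
  ns1 = global_ns.adopt(namespace_t("ns1"))
  cls = ns1.adopt(class_t("widget"))
  var = cls.adopt(variable_t("size"))
  print(var.full_name)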

------------------
``parser`` package
------------------

Take a look at the `parser package UML diagram`_. The classes defined in this
package implement the parsing and linking functionality. The package defines
a few kinds of classes:

* classes that implement the algorithms for parsing a `GCC-XML`_ generated XML file

* parser configuration classes

* cache classes, which help you eliminate unnecessary parsing

* patchers, classes that fix `GCC-XML`_ generated declarations. (Yes, sometimes
  GCC-XML generates a wrong description of a C++ declaration.)

Parser classes
--------------

``source_reader_t`` is the only class that has detailed knowledge of `GCC-XML`_.
It has a single responsibility: it calls `GCC-XML`_ with a source file specified
by the user and creates a declarations tree. The implementation of this class is
split into two classes:

1. ``scanner_t`` - this class scans the XML file generated by `GCC-XML`_ and
   creates instances of the pygccxml declaration and type classes. After the XML
   file has been processed, these instances refer to each other by the ids that
   `GCC-XML`_ generated.

2. ``linker_t`` - this class contains the logic that replaces the `GCC-XML`_
   generated ids with references to declaration or type class instances.

Both classes are implementation details and should not be used directly.
Performance note: ``scanner_t`` uses the Python ``xml.sax`` package to parse
the XML. As a result, it is able to parse even big XML files quickly.
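
The two-phase "scan, then link" idea can be sketched with nothing but the
standard library. The XML fragment below is hand-written and only loosely
modelled on generator output, and the ``Node``, ``Scanner`` and ``link`` names
are hypothetical; the real ``scanner_t`` and ``linker_t`` handle far more
cases.

.. code-block:: python

  import xml.sax

  XML = """<GCC_XML>
    <FundamentalType id="_1" name="int"/>
    <PointerType id="_2" type="_1"/>
    <Variable id="_3" name="counter" type="_2"/>
  </GCC_XML>"""


  class Node:
      def __init__(self, kind, attrs):
          self.kind = kind
          self.name = attrs.get("name", "")
          self.type = attrs.get("type")  # an id string until linked


  class Scanner(xml.sax.ContentHandler):
      """Phase 1: create one object per element, keyed by its id."""
      def __init__(self):
          super().__init__()
          self.by_id = {}

      def startElement(self, name, attrs):
          if "id" in attrs:
              self.by_id[attrs["id"]] = Node(name, attrs)


  def link(by_id):
      """Phase 2: replace id strings with references to the objects."""
      for node in by_id.values():
          if node.type is not None:
              node.type = by_id[node.type]


  scanner = Scanner()
  xml.sax.parseString(XML.encode("utf-8"), scanner)
  link(scanner.by_id)

  counter = scanner.by_id["_3"]
  print(counter.name, counter.type.kind, counter.type.type.name)

Linking in a second pass means the scanner never has to care about the order in
which elements appear in the file.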

``project_reader_t`` - think of this class as a linker. In most cases you work
with several source files, and GCC-XML does not support this mode of work. So
pygccxml implements all the functionality needed to parse several source files
at once. ``project_reader_t`` implements two different algorithms that solve
the problem:

1. ``project_reader_t`` creates a temporary source file, which includes all the
   source files.

2. ``project_reader_t`` parses every source file separately, using the
   ``source_reader_t`` class, and then joins the resulting declarations trees
   into a single declarations tree.

The two approaches have different trade-offs. The first does not allow you to
reuse information from source files that have already been parsed, while the
second allows you to set up a cache.
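
The first algorithm boils down to generating a small "umbrella" file. The
sketch below shows one way that could look; ``create_umbrella_source`` is a
hypothetical helper written for this example and is not part of pygccxml.

.. code-block:: python

  import os
  import tempfile


  def create_umbrella_source(headers, directory):
      """Write a temporary header that includes every file in ``headers``."""
      lines = ['#include "%s"' % os.path.abspath(h) for h in headers]
      fd, path = tempfile.mkstemp(suffix=".h", dir=directory, text=True)
      with os.fdopen(fd, "w") as umbrella:
          umbrella.write("\n".join(lines) + "\n")
      return path


  with tempfile.TemporaryDirectory() as tmp:
      umbrella_path = create_umbrella_source(["a.h", "b.h"], tmp)
      with open(umbrella_path) as umbrella:
          content = umbrella.read()

  print(content)

Because the generator then sees a single translation unit, nothing parsed for
one header can be reused on its own, which is the trade-off mentioned above.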

Parser configuration classes
----------------------------

``gccxml_configuration_t`` - a class that accumulates all the settings needed to invoke `GCC-XML`_.


``file_configuration_t`` - a class that contains some data and a description of
how to treat that data. ``file_configuration_t`` can refer to the following
types of data:

(1) path to C++ source file

(2) path to `GCC-XML`_ generated XML file

(3) path to C++ source file and path to `GCC-XML`_ generated XML file

    In this case, if the XML file does not exist, it will be created. The next
    time you ask to parse the source file, the XML file will be used instead.

    Small tip: you can set up your makefile to delete the XML file every time
    the relevant source file changes.

(4) Python string that contains valid C++ code

There are a few functions that will help you construct a ``file_configuration_t``
object:

* ``def create_source_fc( header )``

  ``header`` contains the path to a C++ source file

* ``def create_gccxml_fc( xml_file )``

  ``xml_file`` contains the path to a `GCC-XML`_ generated XML file

* ``def create_cached_source_fc( header, cached_source_file )``

  - ``header`` contains the path to a C++ source file
  - ``cached_source_file`` contains the path to a `GCC-XML`_ generated XML file

* ``def create_text_fc( text )``

  ``text`` - Python string that contains valid C++ code
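
To make the four kinds of data concrete, here is a toy model of the idea. The
``ContentType`` enum, the ``FileConfiguration`` dataclass and the ``plan``
function are hypothetical names invented for this sketch. The factory
functions reuse the names listed above, but their bodies are illustrative and
do not reproduce the pygccxml implementation.

.. code-block:: python

  import enum
  import os
  from dataclasses import dataclass
  from typing import Optional


  class ContentType(enum.Enum):
      SOURCE_FILE = 1
      XML_FILE = 2
      CACHED_SOURCE_FILE = 3
      TEXT = 4


  @dataclass
  class FileConfiguration:
      data: str
      content_type: ContentType
      cached_source_file: Optional[str] = None


  def create_source_fc(header):
      return FileConfiguration(header, ContentType.SOURCE_FILE)


  def create_gccxml_fc(xml_file):
      return FileConfiguration(xml_file, ContentType.XML_FILE)


  def create_cached_source_fc(header, cached_source_file):
      return FileConfiguration(
          header, ContentType.CACHED_SOURCE_FILE, cached_source_file)


  def create_text_fc(text):
      return FileConfiguration(text, ContentType.TEXT)


  def plan(fc, exists=os.path.exists):
      """Describe the work a reader would have to do for ``fc``."""
      if fc.content_type is ContentType.XML_FILE:
          return "read XML"
      if fc.content_type is ContentType.CACHED_SOURCE_FILE:
          if exists(fc.cached_source_file):
              return "read XML"
          return "run generator, save XML, read XML"
      return "run generator, read XML"


  fc = create_cached_source_fc("shapes.h", "shapes.xml")
  first_run = plan(fc, exists=lambda path: False)
  second_run = plan(fc, exists=lambda path: True)
  print(first_run)
  print(second_run)

The ``exists`` parameter is only there so that the example runs without
touching the file system.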


Cache classes
-------------

There are a few cache classes, which implement different cache strategies.

1. ``file_configuration_t`` class, which keeps the path to a C++ source file and
   the path to a `GCC-XML`_ generated XML file.

2. ``file_cache_t`` class, which saves all declarations from all files in a
   single binary file.

3. ``directory_cache_t`` class, which stores one index file called "index.dat"
   that is always read when the cache object is created. Each header file
   has a corresponding \*.cache file that stores the declarations found
   in that header file. The index file is used to determine whether a \*.cache
   file is still valid, by checking whether any of the files it depends on
   (i.e. the header file itself and all included files) has been modified since
   the last run.

In some cases, the ``directory_cache_t`` class gives much better performance than
``file_cache_t``. Many thanks to Matthias Baas for its implementation.
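
The validity check performed through the index can be sketched as below. This
is an assumption-laden model: the ``signature``, ``record`` and ``is_valid``
helpers are hypothetical, and using the modification time plus the file size as
a signature is a choice made for this example, not necessarily what
``directory_cache_t`` does.

.. code-block:: python

  import os
  import tempfile


  def signature(path):
      # Assumed signature: modification time and size of the file.
      st = os.stat(path)
      return (st.st_mtime_ns, st.st_size)


  def record(index, header, dependencies):
      """Remember the signature of the header and of every file it includes."""
      index[header] = {p: signature(p) for p in [header, *dependencies]}


  def is_valid(index, header):
      """A cache entry is valid only if no dependent file has changed."""
      recorded = index.get(header)
      if recorded is None:
          return False
      return all(
          os.path.exists(path) and signature(path) == sig
          for path, sig in recorded.items())


  with tempfile.TemporaryDirectory() as tmp:
      header = os.path.join(tmp, "a.h")
      included = os.path.join(tmp, "b.h")
      for path in (header, included):
          with open(path, "w") as source:
              source.write("// empty\n")

      index = {}
      record(index, header, [included])
      fresh = is_valid(index, header)

      # Touching an included file invalidates the entry for the header.
      with open(included, "a") as source:
          source.write("int x;\n")
      stale = not is_valid(index, header)

  print(fresh, stale)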

**Warning**: when pygccxml writes information to files using the cache classes,
it does not write any version information. This means that when you upgrade
pygccxml you have to delete all your cache files. Otherwise you will get very
strange errors, for example a missing attribute.


Patchers
--------

Well, `GCC-XML`_ has a few bugs that could not be fixed in GCC-XML itself. For example:

.. code-block:: c++

  namespace ns1{ namespace ns2{
      enum fruit{ apple, orange };
  } }

.. code-block:: c++

  void fix_enum( ns1::ns2::fruit arg=ns1::ns2::apple );

`GCC-XML`_ will report the default value of ``arg`` as ``apple``. Obviously
this is an error. pygccxml knows how to fix this bug.

This is not the only bug that can be fixed; there are a few of them. pygccxml
introduces a few classes, each of which knows how to deal with a specific bug.
Moreover, those bugs are fixed only if I am 101% sure that this is the right
thing to do.
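
The kind of repair involved in the enum example can be sketched as a small
function. ``fix_enum_default`` is a hypothetical name and this is not the
patcher shipped with pygccxml; it only shows the conservative rule "qualify the
value when it is certainly a bare enumerator, otherwise leave it alone".

.. code-block:: python

  def fix_enum_default(default_value, enum_full_name, enum_values):
      """Qualify a bare enumerator name with the scope of its enum."""
      if default_value not in enum_values:
          # Not a bare enumerator of this enum: do not touch it.
          return default_value
      scope, _, _ = enum_full_name.rpartition("::")
      if not scope:
          # The enum lives in the global namespace; nothing to add.
          return default_value
      return scope + "::" + default_value


  fruit_values = ["apple", "orange"]
  fixed = fix_enum_default("apple", "ns1::ns2::fruit", fruit_values)
  untouched = fix_enum_default("ns1::ns2::apple", "ns1::ns2::fruit", fruit_values)
  print(fixed)
  print(untouched)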

-----------------
``utils`` package
-----------------

Used internally by pygccxml.
Some functions and classes may still be useful to you: the loggers and ``find_xml_generator``.

-------
Summary
-------

That's all. I hope I was clear; at least I tried. Anyway, pygccxml is an open
source project, so you can always take a look at the source code. If you need
more information, please read the API documentation.


.. _`SourceForge`: http://sourceforge.net/index.php
.. _`Python`: http://www.python.org
.. _`GCC-XML`: http://www.gccxml.org
.. _`CastXML`: https://github.com/CastXML/CastXML
.. _`UML diagram` : declarations_uml.png
.. _`parser package UML diagram` : parser_uml.png
.. _`ReleaseForge` : http://releaseforge.sourceforge.net
.. _`boost::type_traits` : http://www.boost.org/libs/type_traits/index.html