.. _cxx:

======================================================================
The new C/C++ parser
======================================================================

:Maintainer: Szymon Tomasz Stefanek <s.stefanek@gmail.com>

Introduction
---------------------------------------------------------------------

The C++ language has evolved considerably since the old C/C++ parser was
written. The old parser struggled with some of the newer features
of the language and showed signs of reaching its limits. For this
reason the C/C++ parser was rewritten from scratch in February/March
2016.

In the first release several outstanding bugs were fixed and some new
features were added. Among them:

- Tagging of "using namespace" declarations
- Tagging of function parameters
- Extraction of function parameter types
- Tagging of anonymous structures/unions/classes/enums
- Support for C++11 lambdas (as anonymous functions)
- Support for function-level scopes (for local variables and parameters)
- Extraction of local variables which include calls to constructors
- Extraction of local variables from within the for(), while(), if()
  and switch() parentheses.
- Support for function prototypes/declarations with trailing return type

At the time of writing (March 2016) more features are planned.
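
As an illustration of the constructs listed above, the following input
is a minimal sketch that exercises several of them at once. All names in
it (``add``, ``origin``, ``sum``, ``twice``) are made up for this example,
and the comments only name the construct being shown; they do not
reproduce actual ctags output.

.. code-block:: C++

	#include <vector>

	// "using namespace" declaration
	using namespace std;

	// Function prototype with a trailing return type
	auto add(int a, int b) -> int;

	auto add(int a, int b) -> int
	{
		return a + b;
	}

	// Anonymous structure with a variable declared alongside it
	struct {
		int x;
		int y;
	} origin;

	int sum(const vector<int> & values)
	{
		// Local variable initialised through a constructor-style call
		int total(0);

		// Local variable declared inside the for() parentheses
		for(vector<int>::const_iterator it = values.begin(); it != values.end(); ++it)
			total = add(total, *it);

		// C++11 lambda, handled as an anonymous function
		auto twice = [](int v) { return v * 2; };

		// Local variable declared inside the if() parentheses
		if(int doubled = twice(total))
			return doubled;

		return 0;
	}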

Notable New Features
---------------------------------------------------------------------

Some of the notable new features are described below.

Properties
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Several properties of functions and variables can be extracted
and placed in a new field called ``properties``.
The syntax to enable it is:

.. code-block:: console

	$ ctags ... --fields-c++=+{properties} ...

At the time of writing the following properties are reported:

- ``virtual``: a function is marked as virtual
- ``static``: a function/variable is marked as static
- ``inline``: a function implementation is marked as inline
- ``explicit``: a function is marked as explicit
- ``extern``: a function/variable is marked as extern
- ``const``: a function is marked as const
- ``pure``: a virtual function is pure (i.e. declared with ``= 0``)
- ``override``: a function is marked as override
- ``default``: a function is marked as default
- ``final``: a function is marked as final
- ``delete``: a function is marked as delete
- ``mutable``: a variable is marked as mutable
- ``volatile``: a function is marked as volatile
- ``specialization``: a function is a template specialization
- ``scopespecialization``: template specialization of scope ``a<x>::b()``
- ``deprecated``: a function is marked as deprecated via ``__attribute__``
- ``scopedenum``: a scoped enumeration (C++11)
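
The declarations below are a small sketch of input that would be
expected to trigger many of these properties. The class and function
names are invented for the example, and each comment names the
properties the following declaration should carry according to the list
above; the exact output has not been verified here and may differ
between ctags versions.

.. code-block:: C++

	// scopedenum
	enum class Mode { Fast, Safe };

	class Shape
	{
	public:
		// default
		Shape() = default;
		// delete
		Shape(const Shape &) = delete;
		// explicit
		explicit Shape(int sides) : m_sides(sides) {}
		// virtual
		virtual ~Shape() {}
		// virtual, const, pure
		virtual int area() const = 0;
		// static
		static int count() { return 0; }
	protected:
		int m_sides = 0;
		// mutable
		mutable int m_cache = 0;
	};

	class Square : public Shape
	{
	public:
		// const, override, final
		int area() const override final { return 16; }
	};

	// extern
	extern int g_total;

	// inline
	inline int twice(int v) { return v * 2; }

	template<typename T> T identity(T v) { return v; }
	// specialization
	template<> int identity<int>(int v) { return v; }

	// deprecated (gcc/clang attribute syntax)
	__attribute__((deprecated)) int oldApi();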

Preprocessor macros
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The new parser supports the definition of real preprocessor macros
via the ``-D`` option. All types of macros are supported,
including the ones with parameters and variable arguments.
Stringification, token pasting and recursive macro expansion are also supported.

Option ``-I`` is now simply a backward-compatible syntax to define a
macro with no replacement.

The syntax is similar to the corresponding gcc ``-D`` option.

Some examples follow.

.. code-block:: console

	$ ctags ... -D IGNORE_THIS ...

With this command line the following C/C++ input

.. code-block:: C

	int IGNORE_THIS a;

will be processed as if it were

.. code-block:: C

	int a;

Defining a macro with parameters uses the following syntax:

.. code-block:: console

	$ ctags ... -D "foreach(arg)=for(arg;;)" ...

This example defines ``for(arg;;)`` as the replacement for ``foreach(arg)``.
So the following C/C++ input

.. code-block:: C

	foreach(char * p,pointers)
	{

	}

is processed by the new C/C++ parser as:

.. code-block:: C

	for(char * p;;)
	{

	}

and the local variable ``p`` can be extracted.

The command line above is quoted because macro definitions usually contain
characters that shells treat specially. Depending on your shell, additional
escaping may be needed.

Token pasting is performed by the ``##`` operator, just like in the normal
C preprocessor.

.. code-block:: console

	$ ctags ... -D "DECLARE_FUNCTION(prefix)=int prefix ## Call();"

So the following code

.. code-block:: C

	DECLARE_FUNCTION(a)
	DECLARE_FUNCTION(b)

will be processed as

.. code-block:: C

	int aCall();
	int bCall();

Macros with variable arguments use the gcc ``__VA_ARGS__`` syntax.

.. code-block:: console

	$ ctags ... -D "DECLARE_FUNCTION(name,...)=int name(__VA_ARGS__);"

So the following code

.. code-block:: C

	DECLARE_FUNCTION(x,int a,int b)

will be processed as

.. code-block:: C

	int x(int a,int b);
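
Stringification and recursive expansion are mentioned above but not
shown. The sketch below demonstrates both as a standard C preprocessor
handles them; the macro names (``STRINGIFY``, ``CONCAT``,
``DECLARE_COUNTER``) are invented for this example and are written as
``#define`` lines so that the fragment compiles on its own. Equivalent
definitions could presumably be passed through ``-D``, but that
the parser's expansion matches a real preprocessor in every corner case
is an assumption, not something stated here.

.. code-block:: C++

	// Stringification: turns the argument into a string literal
	#define STRINGIFY(x) #x

	// Token pasting: joins two tokens into one identifier
	#define CONCAT(a, b) a ## b

	// Recursive expansion: the replacement text uses other macros,
	// which are expanded in turn when the result is rescanned
	#define DECLARE_COUNTER(name) \
		constexpr int CONCAT(name, Counter) = 0; \
		constexpr const char * CONCAT(name, Label) = STRINGIFY(name)

	// Expands to:
	//   constexpr int requestCounter = 0;
	//   constexpr const char * requestLabel = "request";
	DECLARE_COUNTER(request);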

Incompatible Changes
---------------------------------------------------------------------

The parser is mostly compatible with the old one. There are some minor
incompatible changes which are described below.


Anonymous structure names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The old parser produced structure names in the form ``__anonN`` where N
was a number starting at 1 in each file and increasing at each new
structure. This caused collisions in symbol names when ctags was run
on multiple files.

In the new parser the anonymous structure names depend on the file name
being processed and on the type of the structure itself. Collisions are
far less likely (though not impossible, since the names are derived from
a hash and hash functions are unavoidably imperfect).

Pitfall: the file name used for hashing includes the path as passed to the
ctags executable. So the same file "seen" from different paths will produce
different structure names. This is unavoidable, and it is up to the user to
ensure that multiple ctags runs are started from a common directory root.
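
The pitfall can be illustrated with a stand-in hash. The function below
is the well-known djb2 string hash, chosen only for brevity; it is **not**
the hash ctags uses, and ``pathHash`` and the two paths are hypothetical.
The point is only that any hash computed over the path string as given
yields different values for two spellings of the same file.

.. code-block:: C++

	// Hypothetical stand-in hash (djb2); NOT the function used by ctags.
	// It hashes the path string exactly as it was passed in.
	constexpr unsigned long pathHash(const char * s, unsigned long h = 5381)
	{
		return *s ? pathHash(s + 1, h * 33 + static_cast<unsigned char>(*s)) : h;
	}

	// The same file, named in two different ways on the command line
	constexpr unsigned long fromRoot = pathHash("src/a.c");
	constexpr unsigned long fromDot = pathHash("./src/a.c");

	// The hashes differ, so names derived from them would differ too
	constexpr bool sameName = (fromRoot == fromDot);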

File scope
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The file scope information is not 100% reliable. It never was.
There are several cases in which compiler, linker or even source code
tricks can "unhide" file scope symbols (for instance \*.c files can be
included into each other) and several other cases in which the limitation
of the scope of a symbol to a single file simply cannot be determined
with a single pass or without looking at a program as a whole.

The new parser defines a simple policy for file scope association
that tries to be as compatible as possible with the old parser and
should reflect the most common usages. The policy is the following:

- Namespaces are in file scope if declared inside a .c or .cpp file

- Function prototypes are in file scope if declared inside a .c or .cpp file

- K&R style function definitions are in file scope if declared static
  inside a .c file.

- Function definitions appearing inside a namespace are in file scope only
  if declared static inside a .c or .cpp file.
  Note that this rule includes both global functions (global namespace)
  and class/struct/union members defined outside of the class/struct/union
  declaration.

- Function definitions appearing inside a class/struct/union declaration
  are in file scope only if declared static inside a .cpp file

- Function parameters are always in file scope

- Local variables are always in file scope

- Variables appearing inside a namespace are in file scope only if
  they are declared static inside a .c or .cpp file

- Variables that are members of a class/struct/union are in file scope
  only if declared in a .c or .cpp file

- Typedefs are in file scope if appearing inside a .c or .cpp file

Most of these rules are debatable in one way or the other. Just keep in mind
that the file scope information is not 100% reliable.
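
The policy above can be read against a concrete input. The sketch below
assumes the text is saved as a ``.cpp`` file (the names in it are
invented), and each comment states what the rules listed above imply for
the declaration that follows. It is a reading of the policy, not
captured ctags output.

.. code-block:: C++

	// Namespace declared in a .cpp file: file scope
	namespace util
	{
		// Static variable inside a namespace: file scope
		static int hidden = 1;

		// Non-static variable inside a namespace: not file scope
		int exported = 2;
	}

	// Typedef appearing in a .cpp file: file scope
	typedef unsigned int counter_t;

	// Function prototype in a .cpp file: file scope
	int helper(int value);

	// Static function definition: file scope
	// Its parameter "value" is always in file scope
	static int twice(int value)
	{
		return value * 2;
	}

	// Non-static function definition: not file scope
	int helper(int value)
	{
		// Local variable: always in file scope
		int local = twice(value);
		return local + util::hidden + util::exported;
	}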

Inheritance information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The new parser does not strip template names from base classes.
For a declaration like

.. code-block:: C

	template<typename A> class B : public C<A>

the old parser reported ``C`` as base class while the new one reports
``C<A>``.

Typeref
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The syntax of the typeref field (``typeref:A:B``) was designed with only
struct/class/union/enum types in mind. Generic types don't have ``A``
information, and since the keywords became entirely optional in C++
the kind of a type often cannot be told from a declaration alone.
Furthermore, struct/class/union/enum types
share the same namespace and their names can't collide, so the ``A``
information is redundant for most purposes.

To accommodate generic types and preserve some degree of backward
compatibility the new parser uses struct/class/union/enum in place
of ``A`` where such a keyword can be inferred. Where the information is
not available it uses the ``typename`` keyword.

Generally, you should ignore the information in field ``A`` and use
only information in field ``B``.
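
The following sketch shows the three situations described above. The
type and variable names are invented, and the comments describe the
``A`` and ``B`` values one would expect from the rules in this section;
they are an interpretation, not verified output.

.. code-block:: C++

	struct Point { int x; int y; };
	typedef unsigned long Handle;

	// Keyword present in the declaration: A is expected to be "struct",
	// B is expected to be "Point"
	struct Point origin;

	// Keyword omitted, which C++ allows: A presumably falls back to
	// "typename", B is still "Point"
	Point corner;

	// Generic type with no struct/class/union/enum keyword at all:
	// A is expected to be "typename", B is "Handle"
	Handle session;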