.. _optlib:
Extending ctags with Regex parser (*optlib*)
---------------------------------------------------------------------
:Maintainer: Masatake YAMATO <yamato@redhat.com>
.. contents:: `Table of contents`
:depth: 3
:local:
.. TODO:
add a section on debugging
Exuberant Ctags allows a user to add a new parser to ctags with ``--langdef=<LANG>``
and ``--regex-<LANG>=...`` options.
Universal Ctags follows and extends the design of Exuberant Ctags in more
powerful ways and calls the feature the *optlib parser*, which is described in
:ref:`ctags-optlib(7) <ctags-optlib(7)>` and the following sections.
:ref:`ctags-optlib(7) <ctags-optlib(7)>` is the primary document of the optlib
parser feature. The following sections provide additional information and more
advanced features. Note that some of the features are experimental, and will be
marked as such in the documentation.
Many optlib parsers are included in Universal Ctags; see
`optlib/*.ctags <https://github.com/universal-ctags/ctags/tree/master/optlib>`_.
They are good examples to follow when you develop your own parsers.
An optlib parser can be translated into C source code. Your optlib parser can
thus easily become a built-in parser. See ":ref:`optlib2c`" for details.
Regular expression (regex) engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
By default, Universal Ctags uses `the POSIX Extended Regular Expressions (ERE)
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html>`_
syntax, the same as Exuberant Ctags.
When Universal Ctags is built, the ``configure`` script runs compatibility
tests on the regex engine in the system library. If the tests pass, that engine is
used; otherwise the regex engine imported from `the GNU Gnulib library
<https://www.gnu.org/software/gnulib/manual/gnulib.html#Regular-expressions>`_
is used. In the latter case, the output of ``ctags --list-features`` contains
``gnulib_regex``.
See ``regex(7)`` or `the GNU Gnulib Manual
<https://www.gnu.org/software/gnulib/manual/gnulib.html#Regular-expressions>`_
for the details of the regular expression syntax.
.. note::
The GNU regex engine supports some GNU extensions described `here
<https://www.gnu.org/software/gnulib/manual/gnulib.html#posix_002dextended-regular-expression-syntax>`_.
Note that an optlib parser using the extensions may not work with Universal
Ctags on some other systems.
The POSIX Extended Regular Expressions (ERE) syntax does
*not* support many of the "modern" extensions such as lazy (non-greedy) quantifiers,
non-capturing grouping, atomic grouping, possessive quantifiers, look-ahead/behind,
etc. An ERE engine can also be notoriously slow when backtracking.
A common error is forgetting that a
POSIX ERE engine is always *greedy*; the '``*``' and '``+``' quantifiers match
as much as possible, before backtracking from the end of their match.
For example this pattern::
foo.*bar
will match this entire string, not just the first part::
foobar, bar, and even more bar
Another detail to keep in mind is how the regex engine treats newlines.
Universal Ctags compiles the regular expressions in the ``--regex-<LANG>`` and
``--mline-regex-<LANG>`` options with ``REG_NEWLINE`` set. What that means is documented
in the
`POSIX specification <https://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html>`_.
The obvious effects are that the any-character dot '``.``' does not match
newline characters, the '``^``' anchor *does* match right after a newline, and
the '``$``' anchor matches right before a newline. A more subtle issue is this text from the
chapter "`Regular Expressions <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html>`_":
"the use of literal <newline>s or any escape sequence equivalent produces undefined
results". This means that a regex pattern containing ``[^\n]+`` is invalid,
and indeed glibc produces very odd results for it. **Never use** '``\n``' in patterns
for ``--regex-<LANG>``, and **never use it** in non-matching bracket expressions
for ``--mline-regex-<LANG>`` patterns. For the experimental ``--_mtable-regex-<LANG>``
you can safely use '``\n``' because that regex is not compiled with ``REG_NEWLINE``.
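The effect of ``REG_NEWLINE`` on '``.``', '``^``' and '``$``' can be
approximated outside ctags. The following is only a rough sketch in Python,
whose ``re`` module is *not* a POSIX engine; it assumes that ``re.MULTILINE``
is the closest analogue of ``REG_NEWLINE`` for the anchors, and the sample
text is made up for illustration.

.. code-block:: python

   import re

   text = "alpha = 1\nbeta = 2\n"

   # '.' never crosses a newline (Python's default, similar to REG_NEWLINE).
   dot_runs = re.findall(r".+", text)

   # With re.MULTILINE, '^' matches right after a newline and '$' right
   # before one, so every line is matched on its own.
   names = re.findall(r"^([a-z]+) =", text, re.MULTILINE)

   print(dot_runs)
   print(names)

Each element of ``dot_runs`` is a single line, which is why a pattern such as
``.+`` cannot span lines in ``--regex-<LANG>``.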
A POSIX ERE engine may also have some known "quirks"
with respect to escaping special characters in bracket expressions.
For example, a pattern of ``[^\]]+`` is invalid in POSIX ERE, because '``\``' is
*not* an escape character inside a bracket expression, and thus the '``]``' should **not** be escaped.
Most regex engines ignore this subtle detail in POSIX ERE, and instead allow
escaping it with '``\]``' inside the bracket expression and treat it as the
literal character '``]``'. GNU glibc, however, does not generate an error but
instead considers it undefined behavior, and in fact it will match very odd
things. Instead you **must** use the more unintuitive ``[^]]+`` syntax. The same
is technically true of other special characters inside a bracket expression,
such as ``[^\)]+``, which should instead be ``[^)]+``. The ``[^\)]+`` form will
usually appear to work, but only because it really matches any
character except '``\``' *or* '``)``'. The only exceptions for using '``\``' inside a
bracket expression are '``\t``' and '``\n``', which ctags converts to their
single literal control characters before passing the pattern to glibc.
You should always test your regex patterns against test files with strings that
do and do not match. Pay particular attention to cases where it should *not* match, and to
how *much* it matches when it should.
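Such a check can be prototyped before writing the optlib file. The following
is a minimal sketch in Python; the helper ``check`` and the sample strings are
hypothetical, and Python's ``re`` module is not a POSIX ERE engine, so the
final verdict should still come from running ctags on real test files.

.. code-block:: python

   import re

   def check(pattern, should_match, should_not_match):
       """Return (text, matched_text, passed) for every sample string."""
       regex = re.compile(pattern)
       results = []
       for text in should_match:
           m = regex.search(text)
           results.append((text, m.group(0) if m else None, m is not None))
       for text in should_not_match:
           m = regex.search(text)
           results.append((text, m.group(0) if m else None, m is None))
       return results

   results = check(
       r"[^)]+\)",                      # hypothetical pattern under test
       should_match=["f(a) g(b)"],
       should_not_match=["no closing paren"],
   )
   for text, matched, passed in results:
       print("PASS" if passed else "FAIL", repr(text), "matched", repr(matched))

Printing the matched text, not just pass or fail, shows how *much* of the
line the pattern consumed.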
Perl-compatible regular expressions (PCRE2) engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Universal Ctags optionally supports `Perl-Compatible Regular Expressions (PCRE2)
<https://www.pcre.org/current/doc/html/pcre2syntax.html>`_ syntax,
but only if Universal Ctags is built with the ``pcre2`` library.
Check the output of the ``--list-features`` option to find out whether your Universal
Ctags is built with ``pcre2``.
PCRE2 *does* support many "modern" extensions.
For example this pattern::
foo.*?bar
will match just the first part, ``foobar``, not this entire string::
foobar, bar, and even more bar
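The difference between the greedy and the lazy quantifier can be observed
outside ctags. This is a small sketch that uses Python's ``re`` module as a
stand-in, on the assumption that it treats ``*`` and ``*?`` the same way as
PCRE2 for this pattern; it does not exercise ctags itself.

.. code-block:: python

   import re

   text = "foobar, bar, and even more bar"

   greedy = re.search(r"foo.*bar", text).group(0)   # longest possible match
   lazy = re.search(r"foo.*?bar", text).group(0)    # shortest possible match

   print(repr(greedy))   # the entire string
   print(repr(lazy))     # 'foobar'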
Regex option argument flags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Many regex-based options described in this document support additional arguments
in the form of long flags. Long flags are specified with surrounding '``{``' and
'``}``'.
The general format and placement is as follows:
.. code-block:: ctags
--regex-<LANG>=<PATTERN>/<NAME>/[<KIND>/]LONGFLAGS
Some examples:
.. code-block:: ctags
--regex-Pod=/^=head1[ \t]+(.+)/\1/c/
--regex-Foo=/set=[^;]+/\1/v/{icase}
--regex-Man=/^\.TH[[:space:]]{1,}"([^"]{1,})".*/\1/t/{exclusive}{icase}{scope=push}
--regex-Gdbinit=/^#//{exclusive}
Note that the last example only has two '``/``' forward-slashes following
the regex pattern, as a shortened form when no kind-spec exists.
The ``--mline-regex-<LANG>`` option also follows the above format. The
experimental ``--_mtable-regex-<LANG>`` option follows a slightly
modified version as well.
Regex control flags
......................................................................
.. Q: why even discuss the single-character version of the flags? Just
make everyone use the long form.
The regex matching can be controlled by adding flags to the ``--regex-<LANG>``,
``--mline-regex-<LANG>``, and experimental ``--_mtable-regex-<LANG>`` options.
This is done either by using the single-character short flags ``b``, ``e`` and
``i``, as explained in the *ctags.1* man page, or by using the long flags
described earlier. The long flags require more typing but are much more
readable.
The mapping between the older short flag names and long flag names is:
=========== =========== ===========
short flag  long flag   description
=========== =========== ===========
b           basic       POSIX basic regular expression syntax.
e           extend      POSIX extended regular expression syntax (default).
i           icase       Case-insensitive matching.
=========== =========== ===========
So the following ``--regex-<LANG>`` expression:
.. code-block:: ctags
--kinddef-m4=d,definition,definitions
--regex-m4=/^m4_define\(\[([^]$\(]+).+$/\1/d/e
is the same as:
.. code-block:: ctags
--kinddef-m4=d,definition,definitions
--regex-m4=/^m4_define\(\[([^]$\(]+).+$/\1/d/{extend}
The characters '``{``' and '``}``' may not be suitable for command line
use, but long flags are mostly intended for option files.
Exclusive flag in regex
......................................................................
By default, lines read from the input files will be matched against all the
regular expressions defined with ``--regex-<LANG>``. Each successfully matched
regular expression will emit a tag.
In some cases another policy, exclusive-matching, is preferable to the
all-matching policy. Exclusive-matching means that once one regular
expression matches an input line, the remaining regular expressions are not
tried for that line.
The flags ``exclusive`` (long) and ``x`` (short) specify exclusive-matching.
For example, :file:`optlib/gdbinit.ctags` uses the flag to ignore comment
lines in gdb files, as follows:
.. code-block:: ctags
--regex-Gdbinit=/^#//{exclusive}
Comments in gdb files start with '``#``', so the above line is the first regex
line in :file:`gdbinit.ctags`; once it matches a comment line, the subsequent
regexes are not tried for that input line.
If an empty name pattern (``//``) is used for the ``--regex-<LANG>`` option,
ctags warns that the option is used incorrectly. However, if the flag
``exclusive`` or ``x`` is specified, the warning is suppressed.
This is useful for ignoring matched patterns, as above.
NOTE: This flag does not make sense in the multi-line ``--mline-regex-<LANG>``
option nor the multi-table ``--_mtable-regex-<LANG>`` option.
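The two matching policies can be modelled in a few lines. The following
Python sketch is an illustration only, not the ctags implementation: the rule
list, the second pattern and the sample lines are hypothetical, and Python's
``re`` stands in for the POSIX engine.

.. code-block:: python

   import re

   # Hypothetical rules: (pattern, name template, kind, exclusive?)
   RULES = [
       (re.compile(r"^#"), "", "", True),                       # skip comments
       (re.compile(r"define[ \t]+([a-z_]+)"), r"\1", "f", False),
   ]

   def tags_for_line(line, rules):
       """Apply every rule to one line; stop after an exclusive match."""
       tags = []
       for regex, name_template, kind, exclusive in rules:
           m = regex.search(line)
           if m is None:
               continue
           name = m.expand(name_template)
           if name:                      # an empty name emits no tag
               tags.append((name, kind))
           if exclusive:
               break                     # remaining rules are not tried
       return tags

   lines = ["# define not_a_tag", "define real_tag"]
   with_exclusive = [tags_for_line(line, RULES) for line in lines]

   # The same rules with the exclusive flag cleared on every rule.
   all_matching = [(r, n, k, False) for r, n, k, _ in RULES]
   without_exclusive = [tags_for_line(line, all_matching) for line in lines]

   print(with_exclusive)
   print(without_exclusive)

In this model the commented-out ``define`` produces a tag only under the
all-matching policy.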
Experimental flags
......................................................................
.. note:: These flags are experimental. They apply to all regex option
types: basic ``--regex-<LANG>``, multi-line ``--mline-regex-<LANG>``,
and the experimental multi-table ``--_mtable-regex-<LANG>`` option.
``_extra``
This flag indicates the tag should only be generated if the given
``extra`` type is enabled, as explained in ":ref:`extras`".
``_field``
This flag allows a regex match to add additional custom fields to the
generated tag entry, as explained in ":ref:`fields`".
``_role``
This flag allows a regex match to generate a reference tag entry and
specify the role of the reference, as explained in ":ref:`roles`".
.. NOT REVIEWED YET
``_anonymous=PREFIX``
This flag allows a regex match to generate an anonymous tag entry.
ctags gives a name starting with ``PREFIX`` and emits it.
This flag is useful to record the position for a language object
having no name. A lambda function in a functional programming
language is a typical example of a language object having no name.
Consider the following input (``input.foo``):
.. code-block:: lisp
(let ((f (lambda (x) (+ 1 x))))
...
)
Consider the following optlib file (``foo.ctags``):
.. code-block:: ctags
:emphasize-lines: 4
--langdef=Foo
--map-Foo=+.foo
--kinddef-Foo=l,lambda,lambda functions
--regex-Foo=/.*\(lambda .*//l/{_anonymous=L}
You get the following tags file:
.. code-block:: console
$ u-ctags --options=foo.ctags -o - /tmp/input.foo
Le4679d360100 /tmp/input.foo /^(let ((f (lambda (x) (+ 1 x))))$/;" l
.. _extras:
Conditional tagging with extras
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. NEEDS MORE REVIEWS
If a matched pattern should only be tagged when an ``extra`` flag is enabled,
mark the pattern with ``{_extra=XNAME}`` where ``XNAME`` is the name of the
extra. You must define ``XNAME`` with the
``--_extradef-<LANG>=XNAME,DESCRIPTION`` option before defining a regex
marked ``{_extra=XNAME}``.
.. code-block:: python
if __name__ == '__main__':
do_something()
To capture the lines above in a Python program (``input.py``), an ``extra`` flag can
be used.
.. code-block:: ctags
:emphasize-lines: 1-2
--_extradef-Python=main,__main__ entry points
--regex-Python=/^if __name__ == '__main__':/__main__/f/{_extra=main}
The above optlib (``python-main.ctags``) introduces the ``main`` extra to the Python parser.
The pattern matching is done only when the ``main`` extra is enabled.
.. code-block:: console
$ ctags --options=python-main.ctags -o - --extras-Python='+{main}' input.py
__main__ input.py /^if __name__ == '__main__':$/;" f
.. TODO: this "fields" section should probably be moved up this document, as a
subsection in the "Regex option argument flags" section
.. _fields:
Adding custom fields to the tag output
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. NEEDS MORE REVIEWS
Exuberant Ctags allows just one of the specified groups in a regex pattern to
be used as a part of the name of a tag entry.
Universal Ctags allows using the other groups in the regex pattern.
An optlib parser can have its specific fields. The groups can be used as a
value of the fields of a tag entry.
Let's think about `Unknown`, an imaginary language.
Here is a source file (``input.unknown``) written in `Unknown`:
.. code-block:: java
public func foo(n, m);
protected func bar(n);
private func baz(n,...);
With ``--regex-Unknown=...`` Exuberant Ctags can capture ``foo``, ``bar``, and ``baz``
as names. Universal Ctags can attach extra context information to the
names as values for fields. Let's focus on ``bar``. ``protected`` is a
keyword to control how widely the identifier ``bar`` can be accessed.
``(n)`` is the parameter list of ``bar``. ``protected`` and ``(n)`` are
extra context information of ``bar``.
With the following optlib file (``unknown.ctags``), ctags can attach
``protected`` to the field protection and ``(n)`` to the field signature.
.. code-block:: ctags
:emphasize-lines: 5-9
--langdef=unknown
--kinddef-unknown=f,func,functions
--map-unknown=+.unknown
--_fielddef-unknown=protection,access scope
--_fielddef-unknown=signature,signatures
--regex-unknown=/^((public|protected|private) +)?func ([^\(]+)\((.*)\)/\3/f/{_field=protection:\1}{_field=signature:(\4)}
--fields-unknown=+'{protection}{signature}'
For the line ``protected func bar(n);`` you will get the following tags output::
bar input.unknown /^protected func bar(n);$/;" f protection:protected signature:(n)
Let's look at ``unknown.ctags`` in detail.
.. code-block:: ctags
--_fielddef-unknown=protection,access scope
``--_fielddef-<LANG>=name,description`` defines a new field for a parser
specified by *<LANG>*. Before defining a new field for the parser,
the parser must be defined with ``--langdef=<LANG>``. ``protection`` is
the field name used in tags output. ``access scope`` is the description
used in the output of ``--list-fields`` and ``--list-fields=Unknown``.
.. code-block:: ctags
--_fielddef-unknown=signature,signatures
This defines a field named ``signature``.
.. code-block:: ctags
--regex-unknown=/^((public|protected|private) +)?func ([^\(]+)\((.*)\)/\3/f/{_field=protection:\1}{_field=signature:(\4)}
This option requests making a tag whose name is taken from group 3 of the
pattern, attaching group 1 to the tag as the value of the ``protection`` field, and attaching
group 4 as the value of the ``signature`` field. You can use the long regex flag
``_field`` for attaching fields to a tag with the following notation rule::
{_field=FIELDNAME:GROUP}
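To see which text each numbered group captures, the pattern can be tried
outside ctags. This is a sketch only: it reuses the pattern from
``unknown.ctags`` unchanged as a Python regex, on the assumption that Python's
``re`` module groups this particular pattern the same way as the engine used
by ctags.

.. code-block:: python

   import re

   # The pattern from unknown.ctags, reused unchanged as a Python regex.
   pattern = re.compile(
       r"^((public|protected|private) +)?func ([^\(]+)\((.*)\)")

   m = pattern.search("protected func bar(n);")
   groups = {number: m.group(number) for number in range(1, 5)}
   for number, text in groups.items():
       print(number, repr(text))

In this model group 1 carries the blank that follows the keyword, because the
``+`` quantifier sits inside that group, while group 2 holds the bare keyword.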
``--fields-<LANG>=[+|-]{FIELDNAME}`` can be used to enable or disable the specified field.
A newly defined parser-specific field is disabled by default. Enable the
field explicitly to use it. See ":ref:`Parser specific fields <parser-specific-fields>`"
for details of the ``--fields-<LANG>`` option.
The `passwd` parser is a simple example that uses the ``--fields-<LANG>`` option.
.. _roles:
Capturing reference tags
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. NOT REVIEWED YET
To make a reference tag with an optlib parser, specify a role with the
``_role`` long regex flag. Let's see an example:
.. code-block:: ctags
:emphasize-lines: 3-6
--langdef=FOO
--kinddef-FOO=m,module,modules
--_roledef-FOO.m=imported,imported module
--regex-FOO=/import[ \t]+([a-z]+)/\1/m/{_role=imported}
--extras=+r
--fields=+r
A role must be defined before it is specified as the value of the ``_role`` flag.
The ``--_roledef-<LANG>.<KIND>=<ROLE>,<ROLEDESC>`` option defines a role.
See the line ``--regex-FOO=...``. In this parser `FOO`, the name of an
imported module is captured as a reference tag with the role ``imported``.
To specify the *<KIND>* in which the role is defined, you can use either a
kind letter or a kind name surrounded by '``{``' and '``}``'.
The option has two parameters separated by a comma:
*<ROLE>*
the role name, and
*<ROLEDESC>*
the description of the role.
The first parameter is the name of the role. The role is defined in
the kind *<KIND>* of the language *<LANG>*. In the example, the
``imported`` role is defined in the ``module`` kind, which is specified
with ``m``. You can use ``{module}``, the name of the kind, instead.
The kind specified in the ``--_roledef-<LANG>.<KIND>`` option must be
defined *before* using the option. See the description of
``--kinddef-<LANG>`` for defining a kind.
The roles are listed with ``--list-roles=<LANG>``. The name and description
passed to ``--_roledef-<LANG>.<KIND>`` option are used in the output like::
$ ctags --langdef=FOO --kinddef-FOO=m,module,modules \
--_roledef-FOO.m='imported,imported module' --list-roles=FOO
#KIND(L/N) NAME ENABLED DESCRIPTION
m/module imported on imported module
By specifying the ``_role`` regex flag multiple times with different roles, you can
assign multiple roles to a reference tag. Consider the following C input:
.. code-block:: C
x = 0;
i += 1;
An ultra-fine-grained C parser may capture the variable ``x`` with the
``lvalue`` role and the variable ``i`` with the ``lvalue`` and ``incremented``
roles.
You can implement such roles by extending the built-in C parser:
.. code-block:: ctags
:emphasize-lines: 2-5
# c-extra.ctags
--_roledef-C.v=lvalue,locator values
--_roledef-C.v=incremented,incremented with ++ operator
--regex-C=/([a-zA-Z_][a-zA-Z_0-9]*) *=/\1/v/{_role=lvalue}
--regex-C=/([a-zA-Z_][a-zA-Z_0-9]*) *\+=/\1/v/{_role=lvalue}{_role=incremented}
.. code-block:: console
$ ctags --options=c-extra.ctags --extras=+r --fields=+r -o - input.c
i input.c /^i += 1;$/;" v roles:lvalue,incremented
x input.c /^x = 0;$/;" v roles:lvalue
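Why ``i`` receives two roles while ``x`` receives one can be traced by trying
both patterns on each line. The sketch below is a Python model, not ctags
itself; the helper ``reference_tags`` is hypothetical, and Python's ``re`` is
assumed to match these two simple patterns like the POSIX engine does.

.. code-block:: python

   import re

   # The two patterns from c-extra.ctags with the roles each one assigns.
   RULES = [
       (re.compile(r"([a-zA-Z_][a-zA-Z_0-9]*) *="), ["lvalue"]),
       (re.compile(r"([a-zA-Z_][a-zA-Z_0-9]*) *\+="),
        ["lvalue", "incremented"]),
   ]

   def reference_tags(line):
       """Return (name, roles) for every rule that matches the line."""
       found = []
       for regex, roles in RULES:
           m = regex.search(line)
           if m is not None:
               found.append((m.group(1), roles))
       return found

   print(reference_tags("x = 0;"))
   print(reference_tags("i += 1;"))

The first pattern does not match ``i += 1;`` because the identifier is
followed by ``+`` rather than ``=``, so only the second rule tags ``i``.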
Scope tracking in a regex parser
......................................................................
About the ``{scope=..}`` flag itself for scope tracking, see "FLAGS FOR
--regex-<LANG> OPTION" section of :ref:`ctags-optlib(7) <ctags-optlib(7)>`.
Example 1:
.. code-block:: python
# in /tmp/input.foo
class foo:
def bar(baz):
print(baz)
class goo:
def gar(gaz):
print(gaz)
.. code-block:: ctags
:emphasize-lines: 7,8
# in /tmp/foo.ctags:
--langdef=Foo
--map-Foo=+.foo
--kinddef-Foo=c,class,classes
--kinddef-Foo=d,definition,definitions
--regex-Foo=/^class[[:blank:]]+([[:alpha:]]+):/\1/c/{scope=set}
--regex-Foo=/^[[:blank:]]+def[[:blank:]]+([[:alpha:]]+).*:/\1/d/{scope=ref}
.. code-block:: console
$ ctags --options=/tmp/foo.ctags -o - /tmp/input.foo
bar /tmp/input.foo /^ def bar(baz):$/;" d class:foo
foo /tmp/input.foo /^class foo:$/;" c
gar /tmp/input.foo /^ def gar(gaz):$/;" d class:goo
goo /tmp/input.foo /^class goo:$/;" c
Example 2:
.. code-block:: c
// in /tmp/input.pp
class foo {
int bar;
}
.. code-block:: ctags
:emphasize-lines: 7-9
# in /tmp/pp.ctags:
--langdef=pp
--map-pp=+.pp
--kinddef-pp=c,class,classes
--kinddef-pp=v,variable,variables
--regex-pp=/^[[:blank:]]*\}//{scope=pop}{exclusive}
--regex-pp=/^class[[:blank:]]*([[:alnum:]]+)[[:blank:]]*\{/\1/c/{scope=push}
--regex-pp=/^[[:blank:]]*int[[:blank:]]*([[:alnum:]]+)/\1/v/{scope=ref}
.. code-block:: console
$ ctags --options=/tmp/pp.ctags -o - /tmp/input.pp
bar /tmp/input.pp /^ int bar;$/;" v class:foo
foo /tmp/input.pp /^class foo {$/;" c
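The scope stack behind Example 2 can be modelled as follows. This Python
sketch is an assumption-laden illustration, not the ctags implementation: it
assumes that ``push`` records the current top of the stack as the scope of the
new tag before pushing it, that ``ref`` only records the top, and that ``pop``
removes the top; the patterns are Python approximations of the three rules in
``pp.ctags``.

.. code-block:: python

   import re

   # (pattern, kind, scope action, exclusive?)
   RULES = [
       (re.compile(r"^[ \t]*\}"), None, "pop", True),
       (re.compile(r"^class[ \t]*([A-Za-z0-9]+)[ \t]*\{"),
        "class", "push", False),
       (re.compile(r"^[ \t]*int[ \t]*([A-Za-z0-9]+)"),
        "variable", "ref", False),
   ]

   def parse(lines):
       """Return (name, kind, scope) tuples; scope is a tag or None."""
       stack, tags = [], []
       for line in lines:
           for regex, kind, action, exclusive in RULES:
               m = regex.search(line)
               if m is None:
                   continue
               if action == "pop":
                   if stack:
                       stack.pop()
               else:
                   # 'push' and 'ref' both record the current top as scope.
                   scope = stack[-1] if stack else None
                   tag = (m.group(1), kind, scope)
                   tags.append(tag)
                   if action == "push":
                       stack.append(tag)
               if exclusive:
                   break
       return tags

   source = ["class foo {", "    int bar;", "}"]
   for name, kind, scope in parse(source):
       where = f"{scope[1]}:{scope[0]}" if scope else "-"
       print(name, kind, where)

The model prints ``bar`` with the scope ``class:foo`` and ``foo`` without a
scope, which corresponds to the scope fields in the tags output above.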
Example 3:
.. code-block::
# in /tmp/input.docdoc
title T
...
section S0
...
section S1
...
.. code-block:: ctags
:emphasize-lines: 15,21
# in /tmp/doc.ctags:
--langdef=doc
--map-doc=+.docdoc
--kinddef-doc=s,section,sections
--kinddef-doc=S,subsection,subsections
--_tabledef-doc=main
--_tabledef-doc=section
--_tabledef-doc=subsection
--_mtable-regex-doc=main/section +([^\n]+)\n/\1/s/{scope=push}{tenter=section}
--_mtable-regex-doc=main/[^\n]+\n|[^\n]+|\n//
--_mtable-regex-doc=main///{scope=clear}{tquit}
--_mtable-regex-doc=section/section +([^\n]+)\n/\1/s/{scope=replace}
--_mtable-regex-doc=section/subsection +([^\n]+)\n/\1/S/{scope=push}{tenter=subsection}
--_mtable-regex-doc=section/[^\n]+\n|[^\n]+|\n//
--_mtable-regex-doc=section///{scope=clear}{tquit}
--_mtable-regex-doc=subsection/(section )//{_advanceTo=0start}{tleave}{scope=pop}
--_mtable-regex-doc=subsection/subsection +([^\n]+)\n/\1/S/{scope=replace}
--_mtable-regex-doc=subsection/[^\n]+\n|[^\n]+|\n//
--_mtable-regex-doc=subsection///{scope=clear}{tquit}
.. code-block:: console
% ctags --sort=no --fields=+nl --options=/tmp/doc.ctags -o - /tmp/input.docdoc
SEC0 /tmp/input.docdoc /^section SEC0$/;" s line:1 language:doc
SUB0-1 /tmp/input.docdoc /^subsection SUB0-1$/;" S line:3 language:doc section:SEC0
SUB0-2 /tmp/input.docdoc /^subsection SUB0-2$/;" S line:5 language:doc section:SEC0
SEC1 /tmp/input.docdoc /^section SEC1$/;" s line:7 language:doc
SUB1-1 /tmp/input.docdoc /^subsection SUB1-1$/;" S line:9 language:doc section:SEC1
SUB1-2 /tmp/input.docdoc /^subsection SUB1-2$/;" S line:11 language:doc section:SEC1
NOTE: This flag doesn't work well with ``--mline-regex-<LANG>=``.
Overriding the letter for file kind
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. Q: this was fixed in https://github.com/universal-ctags/ctags/pull/331
so can we remove this section?
One of the built-in tag kinds in Universal Ctags is the ``F`` file kind.
Overriding the letter for file kind is not allowed in Universal Ctags.
.. warning::
Don't use ``F`` as a kind letter in your parser. (See issue `#317
<https://github.com/universal-ctags/ctags/issues/317>`_ on github)
Generating fully qualified tags automatically from scope information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If scope fields are filled properly with ``{scope=...}`` regex flags,
you can use the field values for generating fully qualified tags.
About the ``{scope=..}`` flag itself, see "FLAGS FOR --regex-<LANG>
OPTION" section of :ref:`ctags-optlib(7) <ctags-optlib(7)>`.
Append ``{_autoFQTag}`` to the end of the ``--langdef=<LANG>`` option, as in
``--langdef=Foo{_autoFQTag}``, to make ctags generate fully qualified
tags automatically.
'``.``' is the (ctags global) default separator combining names into a
fully qualified tag. You can customize separators with
``--_scopesep-<LANG>=...`` option.
input.foo::
class X
var y
end
foo.ctags:
.. code-block:: ctags
:emphasize-lines: 1
--langdef=foo{_autoFQTag}
--map-foo=+.foo
--kinddef-foo=c,class,classes
--kinddef-foo=v,var,variables
--regex-foo=/class ([A-Z]*)/\1/c/{scope=push}
--regex-foo=/end///{placeholder}{scope=pop}
--regex-foo=/[ \t]*var ([a-z]*)/\1/v/{scope=ref}
Output::
$ u-ctags --quiet --options=./foo.ctags -o - input.foo
X input.foo /^class X$/;" c
y input.foo /^ var y$/;" v class:X
$ u-ctags --quiet --options=./foo.ctags --extras=+q -o - input.foo
X input.foo /^class X$/;" c
X.y input.foo /^ var y$/;" v class:X
y input.foo /^ var y$/;" v class:X
``X.y`` is printed as a fully qualified tag when ``--extras=+q`` is given.
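The relation between the scope field and the extra tag can be sketched in
Python. The function ``emit`` is hypothetical and only models the naming: the
plain name is always emitted, and the qualified name is added when the
``q`` extra is enabled and the tag has a scope.

.. code-block:: python

   def emit(name, scope_names, qualified=False, sep="."):
       """Return the tag names emitted for one language object."""
       names = [name]
       if qualified and scope_names:
           # Join the enclosing scope names and the tag name.
           names.append(sep.join([*scope_names, name]))
       return names

   print(emit("y", ["X"]))                  # ['y']
   print(emit("y", ["X"], qualified=True))  # ['y', 'X.y']
   print(emit("X", [], qualified=True))     # ['X']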
.. NOT REVIEWED YET (--_scopesep)
Customizing scope separators
......................................................................
Use the ``--_scopesep-<LANG>=[<parent-kindLetter>]/<child-kindLetter>:<sep>``
option to customize the separators of a language that uses ``{_autoFQTag}``.
``parent-kindLetter``
The kind letter for a tag of outer-scope.
You can use '``*``' as a wildcard that means
*any kind* of tag of outer-scope.
If you omit ``parent-kindLetter``, the separator is used as
a prefix for tags having the kind specified with ``child-kindLetter``.
This prefix can be used to refer to the global namespace or similar concepts if the
language has one.
``child-kindLetter``
The kind letter for a tag of inner-scope.
You can use '``*``' as a wildcard that means
*any kind* of tag of inner-scope.
``sep``
In a qualified tag, if the outer-scope has kind ``parent-kindLetter`` and
the inner-scope has kind ``child-kindLetter``, then ``sep`` is inserted
between the scope names in the generated tags file.
Specifying '``*``' as both ``parent-kindLetter`` and ``child-kindLetter``
sets ``sep`` as the language default separator. It is used as a fallback.
Specifying '``*``' as ``child-kindLetter`` and omitting ``parent-kindLetter``
sets ``sep`` as the language default prefix. It is used as a fallback.
NOTE: There is no ctags global default prefix.
NOTE: The ``--_scopesep-<LANG>=...`` option affects only a parser that
enables ``_autoFQTag``. A parser building fully qualified tags
manually ignores the option.
Let's see an example.
The input file is written in Tcl. The Tcl parser is not an optlib
parser. However, it uses the ``_autoFQTag`` feature internally.
Therefore, the ``--_scopesep-Tcl=`` option works well. The Tcl parser
defines two kinds, ``n`` (``namespace``) and ``p`` (``procedure``).
By default, the Tcl parser uses ``::`` as the scope separator. The parser also
uses ``::`` as the root prefix.
.. code-block:: tcl
namespace eval N {
namespace eval M {
proc pr0 {s} {
puts $s
}
}
}
proc pr1 {s} {
puts $s
}
``M`` is defined under the scope of ``N``. ``pr0`` is defined under the scope
of ``M``. ``N`` and ``pr1`` are at top level (so they are candidates for having
a prefix added). ``M`` and ``N`` are language objects with ``n`` (``namespace``) kind.
``pr0`` and ``pr1`` are language objects with ``p`` (``procedure``) kind.
.. code-block:: console
$ ctags -o - --extras=+q input.tcl
::N input.tcl /^namespace eval N {$/;" n
::N::M input.tcl /^ namespace eval M {$/;" n namespace:::N
::N::M::pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:::N::M
::pr1 input.tcl /^proc pr1 {s} {$/;" p
M input.tcl /^ namespace eval M {$/;" n namespace:::N
N input.tcl /^namespace eval N {$/;" n
pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:::N::M
pr1 input.tcl /^proc pr1 {s} {$/;" p
Let's change the default separator to ``->``:
.. code-block:: console
:emphasize-lines: 1
$ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' input.tcl
::N input.tcl /^namespace eval N {$/;" n
::N->M input.tcl /^ namespace eval M {$/;" n namespace:::N
::N->M->pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:::N->M
::pr1 input.tcl /^proc pr1 {s} {$/;" p
M input.tcl /^ namespace eval M {$/;" n namespace:::N
N input.tcl /^namespace eval N {$/;" n
pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:::N->M
pr1 input.tcl /^proc pr1 {s} {$/;" p
Let's define '``^``' as default prefix:
.. code-block:: console
:emphasize-lines: 1
$ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' input.tcl
M input.tcl /^ namespace eval M {$/;" n namespace:^N
N input.tcl /^namespace eval N {$/;" n
^N input.tcl /^namespace eval N {$/;" n
^N->M input.tcl /^ namespace eval M {$/;" n namespace:^N
^N->M->pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:^N->M
^pr1 input.tcl /^proc pr1 {s} {$/;" p
pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:^N->M
pr1 input.tcl /^proc pr1 {s} {$/;" p
Let's override the separator used for combining a
namespace and a procedure with '``+``'. (For combining a namespace
with another namespace, ctags still uses the default separator.)
.. code-block:: console
:emphasize-lines: 1
$ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' --_scopesep-Tcl='n/p:+' input.tcl
M input.tcl /^ namespace eval M {$/;" n namespace:^N
N input.tcl /^namespace eval N {$/;" n
^N input.tcl /^namespace eval N {$/;" n
^N->M input.tcl /^ namespace eval M {$/;" n namespace:^N
^N->M+pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:^N->M
^pr1 input.tcl /^proc pr1 {s} {$/;" p
pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:^N->M
pr1 input.tcl /^proc pr1 {s} {$/;" p
Let's override the prefix for a namespace with '``@``'.
(For procedures, ctags still uses the default prefix.)
.. code-block:: console
:emphasize-lines: 1
$ ctags -o - --extras=+q --_scopesep-Tcl='*/*:->' --_scopesep-Tcl='/*:^' --_scopesep-Tcl='n/p:+' --_scopesep-Tcl='/n:@' input.tcl
@N input.tcl /^namespace eval N {$/;" n
@N->M input.tcl /^ namespace eval M {$/;" n namespace:@N
@N->M+pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:@N->M
M input.tcl /^ namespace eval M {$/;" n namespace:@N
N input.tcl /^namespace eval N {$/;" n
^pr1 input.tcl /^proc pr1 {s} {$/;" p
pr0 input.tcl /^ proc pr0 {s} {$/;" p namespace:@N->M
pr1 input.tcl /^proc pr1 {s} {$/;" p
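
The four options accumulated on the command line above can also be kept in an
option file, where no shell quoting is needed. The following is a minimal
sketch; the file name ``tcl-scopesep.ctags`` is hypothetical.

.. code-block:: ctags

    # tcl-scopesep.ctags (hypothetical file name)
    # language default separator
    --_scopesep-Tcl=*/*:->
    # language default prefix
    --_scopesep-Tcl=/*:^
    # separator between a namespace and a procedure
    --_scopesep-Tcl=n/p:+
    # prefix for a top-level namespace
    --_scopesep-Tcl=/n:@

Loading it with ``--options=./tcl-scopesep.ctags`` together with
``--extras=+q`` should be equivalent to the last command line shown above.
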
Multi-line pattern match
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We often need to scan multiple lines to generate a tag, whether due to
needing contextual information to decide whether to tag or not, or to
constrain generating tags to only certain cases, or to grab multiple
substrings to generate the tag name.
Universal Ctags has two ways to accomplish this: *multi-line regex options*,
and an experimental *multi-table regex options* described later.
The newly introduced ``--mline-regex-<LANG>`` is similar to ``--regex-<LANG>``
except the pattern is applied to the whole file's contents, not line by line.
This example is based on an issue `#219
<https://github.com/universal-ctags/ctags/issues/219>`_ posted by
@andreicristianpetcu:
.. code-block:: java
// in input.java:
@Subscribe
public void catchEvent(SomeEvent e)
{
return;
}
@Subscribe
public void
recover(Exception e)
{
return;
}
The above Java code resembles code written for the Java `Spring <https://spring.io>`_
framework. The ``@Subscribe`` annotation is a keyword for the framework, and the
developer would like to have a tag generated for each method annotated with
``@Subscribe``, using the name of the method followed by a dash followed by the
type of the argument. For example the developer wants the tag name
``Event-SomeEvent`` generated for the first method shown above.
To accomplish this, the developer creates a :file:`spring.ctags` file with
the following:
.. code-block:: ctags
:emphasize-lines: 4
# in spring.ctags:
--langdef=javaspring
--map-javaspring=+.java
--mline-regex-javaspring=/@Subscribe([[:space:]])*([a-z ]+)[[:space:]]*([a-zA-Z]*)\(([a-zA-Z]*)/\3-\4/s,subscription/{mgroup=3}
--fields=+ln
And now using :file:`spring.ctags` the tag file has this:
.. code-block:: console
$ ctags -o - --options=./spring.ctags input.java
Event-SomeEvent input.java /^public void catchEvent(SomeEvent e)$/;" s line:2 language:javaspring
recover-Exception input.java /^ recover(Exception e)$/;" s line:10 language:javaspring
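
Note that the first tag name is ``Event-SomeEvent`` rather than
``catchEvent-SomeEvent``: the group ``([a-z ]+)`` also consumes the leading
lowercase letters of ``catchEvent``. If the whole method name is wanted, one
possible variant is sketched below. It assumes that every annotated method is
declared as ``public void``, which the original pattern does not require.

.. code-block:: ctags

    # in spring.ctags (variant; assumes "public void" methods only):
    --langdef=javaspring
    --map-javaspring=+.java
    --mline-regex-javaspring=/@Subscribe[[:space:]]+public[[:space:]]+void[[:space:]]+([a-zA-Z]+)\(([a-zA-Z]+)/\1-\2/s,subscription/{mgroup=1}
    --fields=+ln

For the input above, this pattern should yield the tag names
``catchEvent-SomeEvent`` and ``recover-Exception``. The method name is now
capture group 1, so ``{mgroup=1}`` is used to locate the tag.
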
Multiline pattern flags
......................................................................
.. note:: These flags also apply to the experimental ``--_mtable-regex-<LANG>``
option described later.
``{mgroup=N}``
This flag indicates the pattern should be applied to the whole file
contents, not line by line. ``N`` is the number of a capture group in the
pattern, which is used to record the line number location of the tag. In the
above example ``3`` is specified. The start position of the regex capture
group 3, relative to the whole file is used.
.. warning:: You **must** add an ``{mgroup=N}`` flag to the multi-line
``--mline-regex-<LANG>`` option, even if the ``N`` is ``0`` (meaning the
start position of the whole regex pattern). You do not need to add it for
the multi-table ``--_mtable-regex-<LANG>``.
.. TODO: Q: isn't the above restriction really a bug? I think it is. I should fix it.
Q to @masatake-san: Do you mean that {mgroup=0} can be omitted? -> #2918 is opened
A. as proposed in #3514, I made {mgroup=N} be a must flag.
``{_advanceTo=N[start|end]}``
A regex pattern is applied to the whole file's contents iteratively. This long
flag specifies from where the pattern should be applied in the next
iteration for regex matching. When a pattern matches, the next pattern
matching starts from the start or end of capture group ``N``. By default it
advances to the end of the whole match (i.e., ``{_advanceTo=0end}`` is
the default).
Consider the following input
::
def def abc
Consider two sets of options, ``foo.ctags`` and ``bar.ctags``.
.. code-block:: ctags
:emphasize-lines: 5
# foo.ctags:
--langdef=foo
--langmap=foo:.foo
--kinddef-foo=a,something,something
--mline-regex-foo=/def *([a-z]+)/\1/a/{mgroup=1}
.. code-block:: ctags
:emphasize-lines: 5
# bar.ctags:
--langdef=bar
--langmap=bar:.bar
--kinddef-bar=a,something,something
--mline-regex-bar=/def *([a-z]+)/\1/a/{mgroup=1}{_advanceTo=1start}
``foo.ctags`` emits the following tags output::
def input.foo /^def def abc$/;" a
``bar.ctags`` emits the following tags output::
def input-0.bar /^def def abc$/;" a
abc input-0.bar /^def def abc$/;" a
``_advanceTo=1start`` is specified in ``bar.ctags``.
This allows ctags to capture ``abc``.
At the first iteration, the patterns of both
``foo.ctags`` and ``bar.ctags`` match as follows
::
0 1 (start)
v v
def def abc
^
0,1 (end)
``def`` in group 1 is captured as a tag in
both languages. At the next iteration, the position
from which pattern matching resumes differs between
the two languages.
``foo.ctags``
::
0end (default)
v
def def abc
``bar.ctags``
::
1start (as specified in _advanceTo long flag)
v
def def abc
This difference in position accounts for the difference in the tags output.
A more relevant use-case is when ``{_advanceTo=N[start|end]}`` is used in
the experimental ``--_mtable-regex-<LANG>``, to "advance" back to the
beginning of a match, so that one can generate multiple tags for the same
input line(s).
.. note:: This flag doesn't work well with scope related flags and ``exclusive`` flags.
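
To illustrate ``{mgroup=N}`` with ``N`` set to ``0``, the sketch below reuses
the ``javaspring`` definition from the earlier example and changes only the
flag. Based on the description above, the tag location should then be taken
from the start of the whole match, that is, from the ``@Subscribe`` annotation
rather than from the method name; this has not been checked against a
particular ctags release.

.. code-block:: ctags

    # in spring.ctags (variant recording the annotation's position):
    --langdef=javaspring
    --map-javaspring=+.java
    --mline-regex-javaspring=/@Subscribe([[:space:]])*([a-z ]+)[[:space:]]*([a-zA-Z]*)\(([a-zA-Z]*)/\3-\4/s,subscription/{mgroup=0}
    --fields=+ln

The tag names are unchanged by this flag; only the recorded location differs.
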
.. Q: this was previously titled "Byte oriented pattern matching...", presumably
because it "matched against the input at the current byte position, not line".
But that's also true for --mline-regex-<LANG>, as far as I can tell.
Advanced pattern matching with multiple regex tables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. note:: This is a highly experimental feature. This will not go into
the man page of 6.0. But let's be honest, it's the most exciting feature!
In some cases, the ``--regex-<LANG>`` and ``--mline-regex-<LANG>`` options are not
sufficient to generate the tags for a particular language. Some of the common
reasons for this are:
* To ignore commented lines or sections for the language file, so that
tags aren't generated for symbols that are within the comments.
* To enter and exit scope, and use it for tagging based on contextual
state or with end-scope markers that are difficult to match to their
associated scope entry point.
* To support nested scopes.
* To change the pattern searched for, or the resultant tag for the same
pattern, based on scoping or contextual location.
* To break up an overly complicated ``--mline-regex-<LANG>`` pattern into
separate regex patterns, for performance or readability reasons.
To help handle such things, Universal Ctags has been enhanced with multi-table
regex matching. The feature is inspired by `lex`, the fast lexical analyzer
generator, which is a popular tool on Unix environments for writing parsers, and
`RegexLexer <http://pygments.org/docs/lexerdevelopment/>`_ of Pygments.
Knowledge about them will help you understand the new options.
The new options are:
``--_tabledef-<LANG>``
Declares a new regex matching table of a given name for the language,
as described in ":ref:`tabledef`".
``--_mtable-regex-<LANG>``
Adds a regex pattern and associated tag generation information and flags, to
the given table, as described in ":ref:`mtable_regex`".
``--_mtable-extend-<LANG>``
Includes a previously-defined regex table into the named one.
The above will be discussed in more detail shortly.
First, let's explain the feature with an example. Consider an
imaginary language `X` whose syntax is similar to JavaScript: ``var`` is
used for defining variable(s), and "``/* ... */``" is used for block
comments.
Here is our input, :file:`input.x`:
.. code-block:: java
/* BLOCK COMMENT
var dont_capture_me;
*/
var a /* ANOTHER BLOCK COMMENT */, b;
We want ctags to capture ``a`` and ``b`` - but it is difficult to write a parser
that will ignore ``dont_capture_me`` in the comment with a classical regex
parser defined with ``--regex-<LANG>`` or ``--mline-regex-<LANG>``, because of
the block comments.
The ``--regex-<LANG>`` option only works on one line at a time, so it cannot know that
``dont_capture_me`` is within a comment. The ``--mline-regex-<LANG>`` option could
do it in theory, but due to the greedy nature of the regex engine it is
impractical and potentially inefficient to do so, given that there could be
multiple block comments in the file, with '``*``' inside them, etc.
A parser written with multi-table regex, on the other hand, can capture only
``a`` and ``b`` safely. But it is more complicated to understand.
Here is the 1st version of :file:`X.ctags`:
.. code-block:: ctags
--langdef=X
--map-X=.x
--kinddef-X=v,var,variables
Not so interesting. It doesn't really *do* anything yet. It just creates a new
language named ``X``, for files ending with a :file:`.x` suffix, and defines a
new kind, ``var``, for variables.
When writing a multi-table parser, you have to think about the necessary states
of parsing. For the parser of language `X`, we need the following states:
* `toplevel` (initial state)
* `comment` (inside comment)
* `vars` (var statements)
.. _tabledef:
Declaring a new regex table
......................................................................
Before adding regular expressions, you have to declare tables for each state
with the ``--_tabledef-<LANG>=<TABLE>`` option.
Here is the 2nd version of :file:`X.ctags` doing so:
.. code-block:: ctags
:emphasize-lines: 5-7
--langdef=X
--map-X=.x
--kinddef-X=v,var,variables
--_tabledef-X=toplevel
--_tabledef-X=comment
--_tabledef-X=vars
For table names, only characters in the range ``[0-9a-zA-Z_]`` are acceptable.
For a given language, for each file's input the ctags multi-table parser begins
with the first declared table. For :file:`X.ctags`, ``toplevel`` is the one.
The other tables are only ever entered/checked if another table specifies to do
so, starting with the first table. In other words, if the first declared table
does not find a match for the current input, and does not specify to go to
another table, the other tables for that language won't be used. The flags to go
to another table are ``{tenter}``, ``{tleave}``, and ``{tjump}``, as described
later.
.. _mtable_regex:
Adding a regex to a regex table
......................................................................
The new option to add a regex to a declared table is ``--_mtable-regex-<LANG>``,
and it follows this form:
.. code-block:: ctags
--_mtable-regex-<LANG>=<TABLE>/<PATTERN>/<NAME>/[<KIND>]/LONGFLAGS
The parameters for ``--_mtable-regex-<LANG>`` look complicated. However,
``<PATTERN>``, ``<NAME>``, and ``<KIND>`` are the same as the parameters of the
``--regex-<LANG>`` and ``--mline-regex-<LANG>`` options. ``<TABLE>`` is simply
the name of a table previously declared with the ``--_tabledef-<LANG>`` option.
A regex pattern added to a parser with ``--_mtable-regex-<LANG>`` is matched
against the input at the current byte position, not line. Even if you do not
specify the '``^``' anchor at the start of the pattern, ctags adds '``^``' to
the pattern automatically. Unlike the ``--regex-<LANG>`` and
``--mline-regex-<LANG>`` options, a '``^``' anchor does not mean "beginning of
line" in ``--_mtable-regex-<LANG>``; instead it means the beginning of the
input string (i.e., the current byte position).
The ``LONGFLAGS`` include the already discussed flags for ``--regex-<LANG>`` and
``--mline-regex-<LANG>``: ``{scope=...}``, ``{mgroup=N}``, ``{_advanceTo=N}``,
``{basic}``, ``{extend}``, and ``{icase}``. The ``{exclusive}`` flag does not
make sense for multi-table regex.
In addition, several new flags are introduced exclusively for multi-table
regex use:
``{tenter}``
Push the current table on the stack, and enter another table.
``{tleave}``
Leave the current table, pop the stack, and go to the table that was
just popped from the stack.
``{tjump}``
Jump to another table, without affecting the stack.
``{treset}``
Clear the stack, and go to another table.
``{tquit}``
Clear the stack, and stop processing the current input file for this
language.
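
As a small sketch of two of these flags in context, consider a hypothetical
language ``Y`` (not part of the running example) whose files consist of a
header terminated by a ``---`` line, followed by a body that may end with a
literal ``__END__`` marker. The language, its kind, and both table names are
assumptions made for illustration; ``{tjump}`` is written here in the form
``{tjump=<TABLE>}``.

.. code-block:: ctags

    --langdef=Y
    --map-Y=.y
    --kinddef-Y=s,section,sections

    --_tabledef-Y=header
    --_tabledef-Y=body

    # "---" ends the header: switch to body without pushing header on the stack.
    --_mtable-regex-Y=header/---\n//{tjump=body}
    --_mtable-regex-Y=header/.//

    # Tag "section NAME" in the body.
    --_mtable-regex-Y=body/section ([a-z]+)/\1/s/
    # Stop processing this file at the __END__ marker.
    --_mtable-regex-Y=body/__END__//{tquit}
    --_mtable-regex-Y=body/.//

Because the parser never needs to return to ``header``, ``{tjump}`` is used
instead of ``{tenter}``, and nothing is left on the stack.
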
To explain the above new flags, we'll continue using our example in the
next section.
Skipping block comments
......................................................................
Let's continue with our example. Here is the 3rd version of :file:`X.ctags`:
.. code-block:: ctags
:emphasize-lines: 9-13
:linenos:
--langdef=X
--map-X=.x
--kinddef-X=v,var,variables
--_tabledef-X=toplevel
--_tabledef-X=comment
--_tabledef-X=vars
--_mtable-regex-X=toplevel/\/\*//{tenter=comment}
--_mtable-regex-X=toplevel/.//
--_mtable-regex-X=comment/\*\///{tleave}
--_mtable-regex-X=comment/.//
Four ``--_mtable-regex-X`` lines are added for skipping the block comments. Let's
discuss them one by one.
For each new file it scans, ctags always chooses the first pattern of the
first table of the parser. Even if it's an empty table, ctags will only try
the first declared table. (In such a case it would immediately fail to match
anything, and thus stop processing the input file and effectively do nothing.)
The first declared table (``toplevel``) has the following regex added to
it first:
.. code-block:: ctags
:linenos:
:lineno-start: 9
--_mtable-regex-X=toplevel/\/\*//{tenter=comment}
A pattern of ``\/\*`` is added to the ``toplevel`` table, to match the
beginning of a block comment. A backslash character is used in front of the
leading '``/``' to escape the separation character '``/``' that separates the fields
of ``--_mtable-regex-<LANG>``. Another backslash inside the pattern is used
before the asterisk '``*``', to make it a literal asterisk character in regex.
The last ``//`` means ctags should not tag something matching this pattern.
In ``--regex-<LANG>`` you never use ``//`` because it would be pointless to
match something and not tag it using a single-line ``--regex-<LANG>``; in
multi-line ``--mline-regex-<LANG>`` you rarely see it, because it would rarely
be useful. But in multi-table regex it's quite common, since you frequently
want to transition from one state to another (i.e., ``tenter`` or ``tjump``
from one table to another).
The long flag added to our first regex of our first table is ``tenter``, which
is a long flag for switching the table and pushing on the stack. ``{tenter=comment}``
means "switch the table from toplevel to comment".
So given the input file :file:`input.x` shown earlier, ctags will begin at
the ``toplevel`` table and try to match the first regex. It will succeed, and
thus push on the stack and go to the ``comment`` table.
It will begin at the top of the ``comment`` table (it always begins at the top
of a given table), and try each regex line in sequence until it finds a match.
If it fails to find a match, it will pop the stack and go to the table that was
just popped from the stack, and begin trying to match at the top of *that* table.
If it continues failing to find a match, and ultimately reaches the end of the
stack, it will stop processing for this file. For the next input file, it will
begin again from the top of the first declared table.
Getting back to our example, the top of the ``comment`` table has this regex:
.. code-block:: ctags
:linenos:
:lineno-start: 12
--_mtable-regex-X=comment/\*\///{tleave}
Similar to the previous ``toplevel`` table pattern, this one for ``\*\/`` uses
a backslash to escape the separator '``/``', as well as one before the '``*``' to
make it a literal asterisk in regex. So what it's looking for, from a simple
string perspective, is the sequence ``*/``. Note that this means even though
you see three slashes ``///`` at the end, the first one is escaped and used
for the pattern itself, and the ``--_mtable-regex-X`` only has ``//`` to
separate the regex pattern from the long flags, instead of the usual ``///``.
Thus it's using the shorthand form of the ``--_mtable-regex-X`` option.
It could instead have been:
.. code-block:: ctags
--_mtable-regex-X=comment/\*\////{tleave}
The above would have worked exactly the same.
Getting back to our example, remember we're looking at the :file:`input.x`
file, currently using the ``comment`` table, and trying to match the first
regex of that table, shown above, at the following location::
,ctags is trying to match starting here
v
/* BLOCK COMMENT
var dont_capture_me;
*/
var a /* ANOTHER BLOCK COMMENT */, b;
The pattern doesn't match for the position just after ``/*``, because that
position is a space character. So ctags tries the next pattern in the same
table:
.. code-block:: ctags
:linenos:
:lineno-start: 13
--_mtable-regex-X=comment/.//
This pattern matches any one character, including newline; the current
position moves one character forward. Now the character at the current position is
'``B``'. The first pattern of the table, ``\*\/``, still does not match the input. So
ctags falls through to the next pattern again. When the current position moves to the ``*/``
of the 3rd line of :file:`input.x`, it will finally match this:
.. code-block:: ctags
:linenos:
:lineno-start: 12
--_mtable-regex-X=comment/\*\///{tleave}
In this pattern, the long flag ``{tleave}`` is specified. This triggers table
switching again. ``{tleave}`` makes ctags switch the table back to the last
table used before doing ``{tenter}``. In this case, ``toplevel`` is the table.
ctags manages a stack where references to tables are put. ``{tenter}`` pushes
the current table to the stack. ``{tleave}`` pops the table at the top of the
stack and chooses it.
So now ctags is back to the ``toplevel`` table, and tries the first regex
of that table, which was this:
.. code-block:: ctags
:linenos:
:lineno-start: 9
--_mtable-regex-X=toplevel/\/\*//{tenter=comment}
It tries to match that against its current position, which is now the
newline on line 3, between the ``*/`` and the word ``var``::
/* BLOCK COMMENT
var dont_capture_me;
*/ <--- ctags is now at this newline (\n) character
var a /* ANOTHER BLOCK COMMENT */, b;
The first regex of the ``toplevel`` table does not match a newline, so it tries
the second regex:
.. code-block:: ctags
:linenos:
:lineno-start: 10
--_mtable-regex-X=toplevel/.//
This matches a newline successfully, but has no actions to perform. So ctags
moves one character forward (the newline it just matched), and goes back to the
top of the ``toplevel`` table, and tries the first regex again. Eventually we'll
reach the beginning of the second block comment, and do the same things as before.
When ctags finally reaches the end of the file (the position after ``b;``),
it will not be able to match either the first or second regex of the
``toplevel`` table, and quit processing the input file.
So far, we've successfully skipped over block comments for our new ``X``
language, but haven't generated any tags. The point of ctags is to generate
tags, not just keep your computer warm. So now let's move onto actually tagging
variables...
Capturing variables in a sequence
......................................................................
Here is the 4th version of :file:`X.ctags`:
.. code-block:: ctags
:emphasize-lines: 10,16-19
:linenos:
--langdef=X
--map-X=.x
--kinddef-X=v,var,variables
--_tabledef-X=toplevel
--_tabledef-X=comment
--_tabledef-X=vars
--_mtable-regex-X=toplevel/\/\*//{tenter=comment}
--_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}
--_mtable-regex-X=toplevel/.//
--_mtable-regex-X=comment/\*\///{tleave}
--_mtable-regex-X=comment/.//
--_mtable-regex-X=vars/;//{tleave}
--_mtable-regex-X=vars/\/\*//{tenter=comment}
--_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/
--_mtable-regex-X=vars/.//
One pattern in ``toplevel`` was added, and a new table ``vars`` with four
patterns was also added.
The new regex in ``toplevel`` is this:
.. code-block:: ctags
:linenos:
:lineno-start: 10
--_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}
The purpose of this being in `toplevel` is to switch to the `vars` table when
the keyword ``var`` is found in the input stream. We need to switch states
(i.e., tables) because we can't simply capture the variables ``a`` and ``b``
with a single regex pattern in the ``toplevel`` table, because there might be
block comments inside the ``var`` statement (as there are in our
:file:`input.x`), and we also need to create *two* tags: one for ``a`` and one
for ``b``, even though the word ``var`` only appears once. In other words, we
need to "remember" that we saw the keyword ``var``, when we later encounter the
names ``a`` and ``b``, so that we know to tag each of them; and saving that
"in-variable-statement" state is accomplished by switching tables to the
``vars`` table.
The first regex in our new ``vars`` table is:
.. code-block:: ctags
:linenos:
:lineno-start: 16
--_mtable-regex-X=vars/;//{tleave}
This pattern is used to match a single semi-colon '``;``', and if it matches
pop back to the ``toplevel`` table using the ``{tleave}`` long flag. We
didn't have to make this the first regex pattern, because it doesn't overlap
with any of the other ones other than the ``/.//`` last one (which must be
last for this example to work).
The second regex in our ``vars`` table is:
.. code-block:: ctags
:linenos:
:lineno-start: 17
--_mtable-regex-X=vars/\/\*//{tenter=comment}
We need this because block comments can be in variable definitions::
var a /* ANOTHER BLOCK COMMENT */, b;
So to skip block comments in such a position, the pattern ``\/\*`` is used just
like it was used in the ``toplevel`` table: to find the literal ``/*`` beginning
of the block comment and enter the ``comment`` table. Because we're using
``{tenter}`` and ``{tleave}`` to push/pop from a stack of tables, we can
use the same ``comment`` table for both ``toplevel`` and ``vars`` to go to,
because ctags will *remember* the previous table and ``{tleave}`` will
pop back to the right one.
The third regex in our ``vars`` table is:
.. code-block:: ctags
:linenos:
:lineno-start: 18
--_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/
This is nothing special, but is the one that actually tags something: it
captures the variable name and uses it for generating a tag of the ``var``
kind (kind letter ``v``).
The last regex in the ``vars`` table we've seen before:
.. code-block:: ctags
:linenos:
:lineno-start: 19
--_mtable-regex-X=vars/.//
This makes ctags ignore any other characters, such as whitespace or the
comma '``,``'.
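
In the 4th version, the pattern that enters the ``comment`` table is written
twice, once in ``toplevel`` and once in ``vars``. The ``--_mtable-extend-<LANG>``
option mentioned earlier is meant for sharing such patterns. The following is a
sketch only: it assumes the argument form ``<DST>+<SRC>``, and that a copied
``{tenter}`` pattern pushes the table it was copied into. The table name
``skipcomment`` is hypothetical, and it is declared last so that ``toplevel``
remains the first declared table.

.. code-block:: ctags

    --langdef=X
    --map-X=.x
    --kinddef-X=v,var,variables

    --_tabledef-X=toplevel
    --_tabledef-X=comment
    --_tabledef-X=vars
    --_tabledef-X=skipcomment

    # Shared pattern: enter the comment table at "/*".
    --_mtable-regex-X=skipcomment/\/\*//{tenter=comment}

    --_mtable-extend-X=toplevel+skipcomment
    --_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}
    --_mtable-regex-X=toplevel/.//

    --_mtable-regex-X=comment/\*\///{tleave}
    --_mtable-regex-X=comment/.//

    --_mtable-regex-X=vars/;//{tleave}
    --_mtable-extend-X=vars+skipcomment
    --_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/
    --_mtable-regex-X=vars/.//

The position of each ``--_mtable-extend-X`` line matters in the same way as the
position of a ``--_mtable-regex-X`` line, because patterns in a table are tried
in the order they were added.
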
Running our example
......................................................................
.. code-block:: console
$ cat input.x
/* BLOCK COMMENT
var dont_capture_me;
*/
var a /* ANOTHER BLOCK COMMENT */, b;
$ u-ctags -o - --fields=+n --options=X.ctags input.x
a input.x /^var a \/* ANOTHER BLOCK COMMENT *\/, b;$/;" v line:4
b input.x /^var a \/* ANOTHER BLOCK COMMENT *\/, b;$/;" v line:4
It works!
You can find additional examples of multi-table regex in our GitHub repository, under
the ``optlib`` directory. For example ``puppetManifest.ctags`` is a serious
example. It is the primary parser for testing multi-table regex parsers, and
used in the actual ctags program for parsing puppet manifest files.
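
As a further exercise, suppose (hypothetically) that language `X` also had
``//`` line comments. Since a line comment ends at the end of the line, no extra
table is needed; a single pattern that consumes everything up to the newline is
enough. The sketch below shows only the ``toplevel`` and ``vars`` tables of the
4th version of :file:`X.ctags`, each with one added pattern.

.. code-block:: ctags

    --_mtable-regex-X=toplevel/\/\*//{tenter=comment}
    # Added: skip "//" and the rest of the line.
    --_mtable-regex-X=toplevel/\/\/[^\n]*//
    --_mtable-regex-X=toplevel/var[ \n\t]//{tenter=vars}
    --_mtable-regex-X=toplevel/.//

    --_mtable-regex-X=vars/;//{tleave}
    --_mtable-regex-X=vars/\/\*//{tenter=comment}
    # Added: skip "//" and the rest of the line.
    --_mtable-regex-X=vars/\/\/[^\n]*//
    --_mtable-regex-X=vars/([a-zA-Z][a-zA-Z0-9]*)/\1/v/
    --_mtable-regex-X=vars/.//

The added patterns must come before the catch-all ``/.//`` pattern of their
table; otherwise the first ``/`` of the comment would be consumed as an ordinary
character and the comment text would be scanned as code.
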
.. _guest-regex-flag:
Scheduling a guest parser with ``_guest`` regex flag
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. NOT REVIEWED YET
With the ``_guest`` regex flag, you can run a parser (a guest parser) on an
area of the current input file.
See ":ref:`host-guest-parsers`" about the concept of the guest parser.
The ``_guest`` regex flag specifies a *guest spec*, and attaches it to
the associated regex pattern.
A guest spec has three fields: *<PARSER>*, *<START>* of area, and *<END>* of area.
The ``_guest`` regex flag has the following form::
{_guest=<PARSER>,<START>,<END>}
ctags maintains data called a *guest request* during parsing. A
guest request also has three fields: `parser`, `start of area`, and
`end of area`.
You, a parser developer, have to fill the fields of guest specs.
ctags consults the guest spec when the regex pattern associated with
it matches, tries to fill the fields of the guest request,
and runs a guest parser when all the fields of the guest request are
filled.
If you use `Multi-line pattern match`_ to define a host parser,
you must specify all the fields of the `guest request`.
On the other hand, if you don't use `Multi-line pattern match`_ to define a host parser,
ctags can fill the fields of the `guest request` incrementally; more than
one guest spec can be used to fill the fields. In other words, you can
leave some of the fields of a guest spec empty.
The *<PARSER>* field of ``_guest`` regex flag
......................................................................
For *<PARSER>*, you can specify one of the following items:
a name of a parser
If you know the guest parser you want to run before parsing
the input file, specify the name of the parser. Aliases of parsers
are also considered when finding a parser for the name.
An example of running C parser as a guest parser::
{_guest=C,...
the group number of a regex pattern, prefixed with '``\``' (backslash)
If a parser name appears in an input file, write a regex pattern
to capture the name, and specify the number of the capturing group
as *<PARSER>*, using '``\``' as the prefix for
the number. Aliases of parsers are also considered when finding
a parser for the name.
Let's see an example. GitHub Flavored Markdown (GFM) is a language for
documentation. It provides a notation for quoting a snippet of
program code; the language treats the area from ``~~~`` to
``~~~`` as a snippet. You can specify the programming language of
the snippet by starting the area with
``~~~<THE_NAME_OF_LANGUAGE>``, like ``~~~C`` or ``~~~Java``.
To run a guest parser on the area, you have to capture the
*<THE_NAME_OF_LANGUAGE>* with a regex pattern:
.. code-block:: ctags
--_mtable-regex-Markdown=main/~~~([a-zA-Z0-9][-#+a-zA-Z0-9]*)[\n]//{_guest=\1,0end,}
The pattern captures the language name in the input file with the
regex group 1, and specifies it as *<PARSER>*::
{_guest=\1,...
the group number of a regex pattern, prefixed with '``*``' (asterisk)
If a file name implying a programming language appears in an input
file, capture the file name with the regex pattern that the guest
spec is attached to. ctags tries to find a proper parser for the
file name by consulting the langmap.
Use '``*``' as the prefix to the number for specifying the group of
the regex pattern that captures the file name.
Let's see an example. Suppose you have a shell script that emits
program code instantiated from templates. Here documents
are used to represent the templates, like:
.. code-block:: sh
i=...
cat > foo.c <<EOF
int main (void) { return $i; }
EOF
cat > foo.el <<EOF
(defun foo () (1+ $i))
EOF
To run guest parsers for the here document areas, the shell
script parser of ctags must choose the parsers from the file
names (``foo.c`` and ``foo.el``):
.. code-block:: ctags
--regex-sh=/cat > ([a-z.]+) <<EOF//{_guest=*1,0end,}
The pattern captures the file name in the input file with the
regex group 1, and specifies it as *<PARSER>*::
{_guest=*1,...
The *<START>* and *<END>* fields of ``_guest`` regex flag
......................................................................
The *<START>* and *<END>* fields specify the area the *<PARSER>* parses. *<START>*
specifies the start of the area. *<END>* specifies the end of the area.
The forms of the two fields are the same: a regex group number
followed by ``start`` or ``end``, e.g. ``3start`` or ``0end``. The suffixes
``start`` and ``end`` each represent one of the two boundaries of the group.
Let's see an example::
{_guest=C,2end,3start}
This guest regex flag runs the C parser on the area between
``2end`` and ``3start``. ``2end`` means the area starts at the end of
the match of the 2nd regex group associated with the flag. ``3start``
means the area ends at the beginning of the match of the 3rd regex
group associated with the flag.
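To make the boundary arithmetic concrete, here is a small Python sketch that turns specs such as ``2end`` and ``3start`` into string offsets. The three-group pattern and the input line are hypothetical, chosen only so that groups 2 and 3 exist. The helper names are not part of ctags.

```python
import re

# Hypothetical three-group pattern, used only to illustrate the flag.
PATTERN = re.compile(r"^(%%)(\{)[^}]*(\})$")


def boundary(match, spec):
    """Translate a spec such as '2end' or '3start' into a string offset."""
    m = re.fullmatch(r"(\d+)(start|end)", spec)
    if m is None:
        raise ValueError("bad boundary spec: " + spec)
    group = int(m.group(1))
    return match.start(group) if m.group(2) == "start" else match.end(group)


def guest_area(line, start_spec, end_spec):
    """Return the text a guest parser would see, or None if no match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    return line[boundary(match, start_spec):boundary(match, end_spec)]


print(guest_area("%%{int x;}", "2end", "3start"))
```

Group 0 stands for the whole match, so ``0start`` and ``0end`` refer to the beginning and end of everything the pattern matched.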
Let's see a more realistic example.
Here is an optlib file for an imaginary language `single`:
.. code-block:: ctags
:emphasize-lines: 3
--langdef=single
--map-single=.single
--regex-single=/^(BEGIN_C<).*(>END_C)$//{_guest=C,1end,2start}
This parser can run the C parser and extract the ``main`` function from the
following input file::
BEGIN_C<int main (int argc, char **argv) { return 0; }>END_C
^ ^
`- "1end" points here. |
"2start" points here. -+
.. NOT REVIEWED YET
.. _defining-subparsers:
Defining a subparser
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Basic
.........................................................................
For the concept of a subparser, see ":ref:`base-sub-parsers`".
The ``--langdef=<LANG>`` option is extended as
``--langdef=<LANG>[{base=<LANG>}[{shared|dedicated|bidirectional}]][{_autoFQTag}]`` to define
a subparser for a specified base parser. Combined with the ``--kinddef-<LANG>``
and ``--regex-<LANG>`` options, it lets you extend an existing parser
without risking kind conflicts.
Let's see an example.
input.c
.. code-block:: C
static int set_one_prio(struct task_struct *p, int niceval, int error)
{
}
SYSCALL_DEFINE3(setpriority, int, which, int, who, int, niceval)
{
...;
}
.. code-block:: console
$ ctags -x --_xformat="%20N %10K %10l" -o - input.c
set_one_prio function C
SYSCALL_DEFINE3 function C
The C parser doesn't understand that ``SYSCALL_DEFINE3`` is a macro for defining an
entry point for a system call.
Let's define a `linux` subparser that uses the C parser as its base parser (``linux.ctags``):
.. code-block:: ctags
:emphasize-lines: 1,3
--langdef=linux{base=C}
--kinddef-linux=s,syscall,system calls
--regex-linux=/SYSCALL_DEFINE[0-9]\(([^, )]+)[\),]*/\1/s/
The output changes as follows with the `linux` parser:
.. code-block:: console
:emphasize-lines: 2
$ ctags --options=./linux.ctags -x --_xformat="%20N %10K %10l" -o - input.c
setpriority syscall linux
set_one_prio function C
SYSCALL_DEFINE3 function C
``setpriority`` is recognized as a ``syscall`` of `linux`.
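To see why the pattern yields ``setpriority``, the regex from ``linux.ctags`` can be tried in Python. This is a sketch under the assumption that Python's ``re`` treats this pattern the same way as the engine in ctags. The function name ``syscall_names`` is hypothetical.

```python
import re

# Python rendering of the pattern used in the --regex-linux option.
SYSCALL = re.compile(r"SYSCALL_DEFINE[0-9]\(([^, )]+)[\),]*")


def syscall_names(source):
    """Return the text captured by group 1 for every match in `source`."""
    return [m.group(1) for m in SYSCALL.finditer(source)]


source = """static int set_one_prio(struct task_struct *p, int niceval, int error)
{
}
SYSCALL_DEFINE3(setpriority, int, which, int, who, int, niceval)
{
}
"""
print(syscall_names(source))
```

Group 1 stops at the first comma, space, or closing parenthesis, so only the first macro argument becomes the tag name.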
You can also capture ``setpriority`` using only ``--regex-C=...``.
However, that approach risks kind conflicts: when introducing
a new kind with ``--regex-C=...``, you cannot use a letter or name already
used by the C parser or by ``--regex-C=...`` options specified elsewhere.
A newly defined subparser gives you a new namespace for kinds.
In addition, you can enable or disable the subparser with the
``--languages=[+|-]`` option:
.. code-block:: console
$ ctags --options=./linux.ctags --languages=-linux -x --_xformat="%20N %10K %10l" -o - input.c
set_one_prio function C
SYSCALL_DEFINE3 function C
.. _optlib_directions:
Direction flags
.........................................................................
.. TESTCASE: Units/flags-langdef-directions.r
As explained in ":ref:`multiple_parsers_directions`" in
":ref:`multiple_parsers`", you can use direction flags to choose how a base
parser and a subparser work together.
The following examples are taken from `#1409
<https://github.com/universal-ctags/ctags/issues/1409>`_ submitted by @sgraham to
the Universal Ctags repository on GitHub.
``input.cc`` and ``input.mojom`` are input files, and have the same
contents::
ABC();
int main(void)
{
}
The C++ parser can capture ``main`` as a function. The `Mojom` subparser defined
later runs on top of the C++ parser and captures ``ABC``.
shared combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{shared}`` is specified, for ``input.cc``, tags captured by both the C++ parser
and the mojom parser are recorded in the tags file. For ``input.mojom``, only
tags captured by the mojom parser are recorded in the tags file.
mojom-shared.ctags:
.. code-block:: ctags
:emphasize-lines: 1
--langdef=mojom{base=C++}{shared}
--map-mojom=+.mojom
--kinddef-mojom=f,function,functions
--regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/
.. code-block:: ctags
:emphasize-lines: 2
$ ctags --options=mojom-shared.ctags --fields=+l -o - input.cc
ABC input.cc /^ ABC();$/;" f language:mojom
main input.cc /^int main(void)$/;" f language:C++ typeref:typename:int
.. code-block:: ctags
:emphasize-lines: 2
$ ctags --options=mojom-shared.ctags --fields=+l -o - input.mojom
ABC input.mojom /^ ABC();$/;" f language:mojom
The mojom parser uses the C++ parser internally, but tags captured by the C++ parser are
dropped from the output.
dedicated combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{dedicated}`` is specified, for ``input.cc``, only tags captured by the C++
parser are recorded in the tags file. For ``input.mojom``, tags captured
by both the C++ parser and the mojom parser are recorded in the tags file.
mojom-dedicated.ctags:
.. code-block:: ctags
:emphasize-lines: 1
--langdef=mojom{base=C++}{dedicated}
--map-mojom=+.mojom
--kinddef-mojom=f,function,functions
--regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/
.. code-block:: ctags
$ ctags --options=mojom-dedicated.ctags --fields=+l -o - input.cc
main input.cc /^int main(void)$/;" f language:C++ typeref:typename:int
.. code-block:: ctags
:emphasize-lines: 2-3
$ ctags --options=mojom-dedicated.ctags --fields=+l -o - input.mojom
ABC input.mojom /^ ABC();$/;" f language:mojom
main input.mojom /^int main(void)$/;" f language:C++ typeref:typename:int
The mojom parser works only when a ``.mojom`` file is given as input.
bidirectional combination
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When ``{bidirectional}`` is specified, tags captured by both the C++ parser and
the mojom parser are recorded in the tags file for both ``input.cc`` and
``input.mojom``.
mojom-bidirectional.ctags:
.. code-block:: ctags
:emphasize-lines: 1
--langdef=mojom{base=C++}{bidirectional}
--map-mojom=+.mojom
--kinddef-mojom=f,function,functions
--regex-mojom=/^[ ]+([a-zA-Z]+)\(/\1/f/
.. code-block:: ctags
:emphasize-lines: 2
$ ctags --options=mojom-bidirectional.ctags --fields=+l -o - input.cc
ABC input.cc /^ ABC();$/;" f language:mojom
main input.cc /^int main(void)$/;" f language:C++ typeref:typename:int
.. code-block:: ctags
:emphasize-lines: 2-3
$ ctags --options=mojom-bidirectional.ctags --fields=+l -o - input.mojom
ABC input.mojom /^ ABC();$/;" f language:mojom
main input.mojom /^int main(void)$/;" f language:C++ typeref:typename:int
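The three combinations can be summarized as a lookup table. The Python sketch below only restates the behavior described in this section. The function ``recorded_parsers`` and its arguments are hypothetical and do not correspond to anything inside ctags.

```python
def recorded_parsers(direction, input_language, base="C++", sub="mojom"):
    """Return the set of parsers whose tags end up in the tags file.

    `input_language` is the language the input file is mapped to:
    the base parser for input.cc, the subparser for input.mojom.
    """
    both = {base, sub}
    table = {
        "shared": {base: both, sub: {sub}},
        "dedicated": {base: {base}, sub: both},
        "bidirectional": {base: both, sub: both},
    }
    return table[direction][input_language]


for direction in ("shared", "dedicated", "bidirectional"):
    for language in ("C++", "mojom"):
        print(direction, language, sorted(recorded_parsers(direction, language)))
```

Read this way, ``{shared}`` and ``{dedicated}`` are mirror images of each other, and ``{bidirectional}`` is the union of the two.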
.. _optlib2c:
Translating an option file into C source code (optlib2c)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Universal Ctags has an ``optlib2c`` script that translates an option file into C
source code. Your optlib parser can thus easily become a built-in parser.
To add your optlib file, ``foo.ctags``, to ctags, follow these steps:
* copy the ``foo.ctags`` file into the ``optlib/`` directory
* add ``foo.ctags`` to the ``OPTLIB2C_INPUT`` variable in ``source.mak``
* add ``fooParser`` to the ``PARSER_LIST`` macro variable in ``main/parser_p.h``
You are encouraged to submit your :file:`.ctags` file to our repository on
GitHub through a pull request. See ":ref:`contributions`" for more details.