0.090 2025-09-21 T. R. Wyant
Explain s///eee... Perl commit 040a4d7 (perlop: properly document
s///e modifier) by mauke makes perlop explicitly state that more
than 2 'e' modifiers are permitted, and cause the result of the
expression to be eval-ed n-1 times, where n is the number of 'e'
modifiers.
Fix typo in comment. Thanks to Michal Josef Špaček for picking this
up and providing the pull request.
0.089 2025-05-18 T. R. Wyant
The /x modifier should not affect the parse of the replacement
string in a substitution operator. That is, the '#' in s/x/#/x does
NOT introduce a comment.
Correct POD link in PPIx::Regexp::Token::Literal.
Annotate sub class(), since with Perl 5.38 it is a built-in.
0.088 2023-02-28 T. R. Wyant
Remove support for (**{ ... code ... }). This was introduced in
Perl 5.37.8 along with a single-splat version. The double-splat
version was removed without deprecation in Perl 5.37.9, so it is
being removed without deprecation here as well, per my stated policy
about development functionality. The single-splat version still
exists (and is documented) in Perl 5.37.9, and in this package.
0.087 2023-01-28 T. R. Wyant
Add support for code in optimized regex, a.k.a. (*{...}). This
involved making the recognition of backtracking control more
specific, since it also uses (*...).
If (*{...}) and (**{...}) are removed from Perl before Perl 5.38.0,
support for them will be removed from this package.
0.086 2022-12-13 T. R. Wyant
Add width(), which returns the number of characters matched. Note
that an indefinite upper bound is represented as IEEE 754 Inf if
that appears to be supported; otherwise by a singleton object
overloaded to allow stringification, numification, and numeric
tests.
Use width() to enhance the detection of variable-width look-behinds.
Serious clean-up on accepts_perl() subsystem.
0.085 2022-04-16 T. R. Wyant
Remove 'postderef' argument to PPIx::Regexp->new(). Postfix
dereference is always recognized.
0.084 2022-04-02 T. R. Wyant
Require PPI 1.238 for postfix deref support, and recode the
postfix deref logic in terms of 1.238's functionality.
Parse '@{[ ... ]}' as code, not interpolation. This is more in line
with what it actually represents, and allows correct versioning of
postfix dereferences. But it is an incompatible change.
0.083 2022-03-17 T. R. Wyant
Correct and optimize the computation of logical column position (the
one that takes account of tabs).
0.082 2021-11-29 T. R. Wyant
Add --version to eg/predump, and document all options with double
dashes.
Silence 'uninitialized' warning generated by /(?<=.{35})/.
Thanks to Brian Fraser for reporting this.
Try to quell weird Win32 test failures which seem to occur only in
tests where I am using 'use open' to put the standard handles into
UTF-8 mode. The fix (I hope) is to do this to the Test::Harness
handles at run time instead of to the standard handles at compile
time.
Add file CONTRIBUTING.
0.081 2021-10-22 T. R. Wyant
Any use of the postderef argument is now fatal.
Correct generation of 'provides' metadata. Thanks to Flavio Poletti
for blogging
https://github.polettix.it/ETOOBUSY/2021/06/15/the-real-pause-workaround/,
and ultimately to Joel Berger for the pointer to
https://metacpan.org/pod/CPAN::Meta::Spec#no_index
Add YAPE::Regex to SEE ALSO
0.080 2021-04-16 T. R. Wyant
All uses of the postderef argument to new() now warn.
0.079 2021-03-26 T. R. Wyant
Get prerequisites up to snuff, and add xt/author/prereq.t to ensure
they stay that way.
Add rt.cpan.org back to bug reporting methods. Long live RT!
0.078 2021-01-28 T. R. Wyant
Allow CPAN to index Script_Run, Atomic_Script_Run, since they made
it into a production release.
Allow {,3} and { 0 , 3 } as quantifiers, requiring at least Perl
5.33.6. Previously these parsed as literals. This parse will be
retracted if it does not make it into 5.34.0.
0.077 2021-01-14 T. R. Wyant
Add Travis CI testing.
Use GitHub as bug tracker. R.I.P. rt.cpan.org.
0.076 2020-11-28 T. R. Wyant
Correct (I hope) detection of \K in nested assertions.
Variable-length look-behind was introduced in version 5.029009.
Look-behinds quantified longer than 255 characters are an error, and
are made into unknown tokens or structures. I ended up refactoring
the PPIx::Regexp::Token::GroupType class initialization for the
latter two changes.
0.075 2020-10-08 T. R. Wyant
Warn on first use of attribute 'postderef'.
0.074 2020-09-08 T. R. Wyant
Remove PPIx::Regexp::StringTokenizer itself and all documentation
referring to it or the 'parse' argument to PPIx::Regexp->new().
0.073 2020-07-28 T. R. Wyant
Remove prototypes from testing subroutines defined in t/*.t.
0.072 2020-05-20 T. R. Wyant
Drop dependency on List::MoreUtils. Thanks to Graham Ollis
(plicease) for bringing the issues with this module to my attention.
0.071 2020-03-28 T. R. Wyant
Recognize wildcard Unicode names (Perl 5.31.10).
Try to get correct line number in derived PPI. This is done by
injecting "\n" as needed. The initial #line directive becomes "#line
2", but is suppressed if I need to generate line 1.
Improve normalization of content for ppi(). This involves the
un-bracketing of things like ${foo}.
Deprecate new() argument postderef. At this stage it is only
documented as deprecated. In the first release after October 1 2020
it will warn on the first use. Eventually it will be retracted, and
postfix dereferences will always be recognized. This is the default
behavior now.
Add dump argument/option 'short' which, if true, causes leading
'PPIx::Regexp::' to be removed from class names.
0.070 2020-02-27 T. R. Wyant
Add index_locations option to PPIx::Regexp->new(). This defaults to
true if the regexp is specified as a PPI::Element object. The
locations are consistent with the containing PPI::Document.
Add methods location(), column_number(), line_number(),
logical_filename(), logical_line_number(), and
visual_column_number() to PPIx::Regexp::Element. All return undef if
the locations could not be determined.
Add method statement() to PPIx::Regexp::Element. This returns the
PPI statement containing the regexp element, or nothing if none.
Add method is_matcher() to PPIx::Regexp::Element. This classifies
objects as to whether they actually match something in the target
string. Possible returns are true (they do), false but defined (they
do not) and undef (no clue).
Add methods first_token() and last_token() to PPIx::Regexp::Node.
Add methods next_token() and previous_token() to
PPIx::Regexp::Element.
0.069 2020-02-07 T. R. Wyant
The PPIx::Regexp->new() 'parse' option is now fatal. This selected
either string or regex parse. I consider the string parse a failed
experiment. This is the latest step in removing it in favor of the
PPIx::QuoteLike package.
0.068 2020-01-21 T. R. Wyant
Expose PPIx::Regexp::Util::is_ppi_regexp_element()
explain() on [[=x=]] now calls it a Character Equivalence.
It's still a PPIx::Regexp::Token::CharClass::POSIX::Unknown (and
therefore an error), though.
0.067 2019-08-30 T. R. Wyant
\K was retracted in Perl 5.31.3, but only inside look-around
assertions.
0.066 2019-08-16 T. R. Wyant
Fix broken POD, and add tests to ensure it remains fixed.
0.065 2019-05-25 T. R. Wyant
Quash undef error in __is_ppi_regexp_element() when passed a
PPI::Token::Regexp::Transliterate
Support proper version for qr'\N{name}'. Until 5.29.10 this
construction failed to parse because it did not interpolate. But
PPIx::Regexp blithely ignored this detail. As of 5.29.10, something
like m'\N{LATIN CAPITAL LETTER L}' matches identically to m'L'. So I
implemented introduction as of that version.
Have explain() recognize Unicode property wildcards.
0.064 2019-04-01 T. R. Wyant
Empty \p{} should be an error.
\x{} and \x{ non-hex } should be errors under "use re 'strict'"
\o{} should be an error
\o{ non-octal } should be an error under "use re 'strict'"
Support wildcard Unicode property values. These were added in
5.29.9.
Add eg/find-variable-length-lookarounds
Add convenience method extract_regexps(). This is a static method
on PPIx::Regexp that takes as its argument a PPI::Document and
manufactures PPIx::Regexp objects out of anything that parses to a
regexp of some sort.
Don't run illegal character tests before Perl 5.18 unless we're
author testing, because they are noisy. I think the issue is not the
Perl version per se, but the version of Unicode; perl5180delta says
it shipped with Unicode 6.2.
0.063 2018-11-08 T. R. Wyant
Silence weird-character parse tests and make them no longer
author-only.
Further deprecate 'parse' argument to new(). You now get a warning
on each use.
0.062 2018-08-12 T. R. Wyant
Remove tokenizer method prior(). This is the last step in its
deprecation.
0.061 2018-07-09 T. R. Wyant
Only standalone graphemes and non-characters allowed as delimiters
starting with Perl 5.29.0.
Non-ASCII delimiters started working in 5.8.3, so that is what
perl_version_introduced() returns for them.
Collateral with all this, accept word characters as delimiters, but
only with at least one space between the operator and the expression
-- that is, 'qr xyx' is OK, but 'qrxyx' is not.
0.060 2018-06-16 T. R. Wyant
\N{} now parses as the unknown token, not NoOp, regardless of the
setting of 'use re qw< strict >;'. \N{} became unconditionally fatal
in 5.28.0 (5.27.1, actually). The policy when the parse changes is
to use the most-modern parse. Hence this change.
As a side effect of this, the unknown token's explain() method now
returns something -- normally the associated error.
Add method remove_insignificant(). If the invocant isa Node, this
returns a clone of the invocant with non-significant elements
removed. Otherwise it returns either the invocant or nothing.
0.059 2018-05-08 T. R. Wyant
Install @CARP_NOT everywhere so that warnings and exceptions
generated in the bowels of the system appear to come from the point
where the system is entered.
Further deprecate string (versus regexp) parsing. The first use of
the 'parse' argument to new() will result in a warning. If the
value of the argument is 'guess' or 'string', the warning refers to
PPIx::QuoteLike.
0.058 2018-04-26 T. R. Wyant
Prefer /[0-9]/ over /\d/ for numeric checks. The latter can match
non-ASCII digits.
Explain the negated POSIX character classes. Also tweak some of the
asserted explanations -- mostly for readability and parallel
construction with the negated explanations, but it turns out
[[:digit:]] is NOT equivalent to [0-9].
0.057 2018-04-17 T. R. Wyant
Allow ->asserts( 'a*' ). This modification actually allows wild
cards in asserts() on all match semantic modifiers, but it is
probably only useful in the case of 'a*', because that is the only
one that can be doubled.
Explain grouping structure as 'Grouping', not 'Capture or grouping'.
Caret modifier was not turning off /n. This was complicated by the
fact that (?^) was introduced in 5.13.6, but (?n) was not introduced
until 5.21.8. The solution was to include -n in the expansion of the
caret if and only if /n had been seen in the scope of the caret.
Recognize caret in /(?^)x/.
Acknowledge Regexp::Parsertron in SEE ALSO
0.056 2018-03-07 T. R. Wyant
Support removal of unescaped literal left curlys after left parens,
which was deprecated in 5.27.8. No actual change in output yet,
since deprecation is not tracked, but the perl_version_removed()
logic is there.
Add next_element() and kin. These are analogous to next_sibling()
and kin, but will cross over from content proper into structure
(beginning and end delimiters, etc) and vice versa.
Correct requirements_for_perl() for impossible regular expression.
It now returns '! $]' when the components of the regexp are valid, but
none are valid under any specific version of Perl. It used to think all
Perls were OK when this happened.
Add the alpha_assertions introduced in 5.27.9.
Handle 5.27.9's change from +script_run to *script_run, and support
*sr as a synonym.
0.055 2018-02-08 T. R. Wyant
Tokenizer method prior() is now fatal. This was documented as
package-private, but as it WAS documented, I am putting it through a
deprecation cycle anyway. Six months from now it will be removed.
Add Script_Run classes as subclasses to their superclass docs. This
was missed in the last update.
0.054 2018-01-29 T. R. Wyant
Add support for (+script_run:...). This is an experimental feature
added in Perl 5.27.8. It imposes on any matches it contains the
additional restriction that everything matched has to belong to the
same Unicode script. This support will be retracted if the
functionality does not make it into Perl 5.28.
Add method scontent(). This returns significant content only. That
is, if called on the parse of '/ f u b a r /x', it returns
'/fubar/x'.
0.053 2017-10-30 T. R. Wyant
Recognize \px as Unicode char class. At least, when the x is C, L,
M, N, P, S or Z.
The 'parse' argument to new() is now deprecated.
0.052 2017-09-07 T. R. Wyant
RT 122715: Clarify Node->find_parents() documentation. Thanks to
Salvatore Bonaccorso for letting me know about this problem.
Further deprecate tokenizer method prior() in favor of
prior_significant_token().
Add requirements_for_perl(). This is analogous to the
CPAN::Meta::Requirements method requirements_for_module(), though
the output is formatted differently. Also put in the actual
requirements for an un-escaped literal left curly after a constant,
which was removed in 5.25.1 and reinstated in 5.27.1.
Add accepts_perl(). This is analogous to
CPAN::Meta::Requirements->accepts_module(). I decided that
CPAN::Meta::Requirements was overkill, but this may turn out to be
the wrong decision, so I will be careful what I expose.
Document behavior of perl_version_introduced() and
perl_version_removed() when a feature is re-introduced after
removal, or re-removed after re-introduction.
\N{} (empty curlys) removed in 5.27.1.
0.051 2017-01-29 T. R. Wyant
Support whitespace inside [] if /xx in effect.
Starting with Perl 5.25.9, a space or tab appearing inside a bracketed
character class is not significant if /xx is asserted.
Further deprecate tokenizer method prior()
Add 'provides' data to ExtUtils::MakeMaker output
SOME unescaped literal '{' removed in 5.025001.
After '.', Unicode classes, and bracketed classes (including extended)
they are still legal.
Make /\b{/ an error
Perl fails to parse the above, because once it sees the '\b{' it wants
to find one of the extended boundary assertions (like \b{wb}), and
declares an error when it does not. So we check for this and rebless the
curly into an unknown token, not a literal.
0.050 2016-05-06 T. R. Wyant
Parse bracketed substitution with embedded comment. This is something
like s{foo}\n#{bar}\n{baz} which is equivalent to s/foo/baz/. PPI
gets this wrong, and we're not smart enough to fix up the PPI parse,
but if given this as text, we now parse it correctly.
We now recognize postfix dereferences by default, since Perl does
beginning with 5.24. In other words, default new() argument
'postderef' to true.
Unterminated substitutions (i.e. 's//') should no longer cause an
exception. Instead they parse as an unknown token.
0.049 2016-04-19 T. R. Wyant
Robustify PPIx::Regexp->perl_version_removed()
The problem here was that if the expression being parsed was
sufficiently badly-formed, $self->delimiters() would be undef, throwing
a warning.
Correct dump of embedded modifiers (eg: (?i:...))
0.048 2016-02-29 T. R. Wyant
Add option 'strict', like 'use re "strict"'
In the presence of strict(), I opted to set perl_version_introduced
to the version of Perl where the construct became an error.
Parse '\N{}' as no-op.
The previous parse was a character class ('\N') followed by two
literals ('{' and '}'). But perl5238delta said that it had been
ignored up to that time. Starting with 5.23.8 it is an error if 'use
re strict' is in effect.
Quash 'NOT a POSIX class ...' warning under 5.23.8
Add Makefile targets authortest and testcover.
0.047 2016-01-29 T. R. Wyant
Recognize \b{lb}, introduced in 5.23.7. If this is retracted before
5.24, it will be removed outright.
0.046 2016-01-08 T. R. Wyant
Add GitHub repository to metadata.
0.045 2015-12-31 T. R. Wyant
Deprecate tokenizer method prior() in favor of
prior_significant_token(). This is not part of the public interface,
so I suppose I could have just slam-dunked it, but ...
Add ability to parse strings as well as regexes
The new functionality is controlled by the new new() argument
'parse', whose permitted values are 'regex' (the default), 'string',
or 'guess'. String parsing, and the 'string' and 'guess' values of
'parse', are experimental.
0.044 2015-12-08 T. R. Wyant
Allow nesting of \Q with \U, \L, and \F
The perlop docs say these nest with each other. Playing with Perl
suggests that \U, \L and \F supersede each other, but that they as a
group nest with \Q in either order, so that if you specify \Q and
one of the \U, \L, \F group you need two \Es to turn them all back
off.
Restrict recognition of back references in replacement strings to
\number form, since Perl itself does not recognize \g{...} or
\k{...} there.
Recognize postfix dereference if desired. This is controlled by the
Boolean argument 'postderef' passed to PPIx::Regexp->new(). The
default is false, but will become true if postfix dereference
becomes mainstream Perl 5.
Add explain() and supporting methods main_structure() and
in_regex_set(). The explain() method returns a brief explanation of
what the element does.
0.043 2015-11-18 T. R. Wyant
Do not end regex set prematurely on finding '])'
The problem is that '])' can occur within an extended bracketed
character class if it contains grouping parentheses and the last
item in a group is a regular bracketed character class and there is
no white space between the end of the character class and the end of
the group.
Record parse failure if switch condition is unknown
The structure was being reblessed to
PPIx::Regexp::Structure::Unknown, but the number of parse failures
was not being incremented.
Parse \U and friends as meta-characters inside \Q...\E
This turns out to be what Perl itself does, as shown by
$ perl -E 'say qr{\Q\Ufoo}'
Clear error when lexer identifies unknown token. Those who peruse the
changes in this release will see that a bunch of refactoring was
done as part of this.
Parse white space inside bracketed character classes inside extended
bracketed character classes (whew!) as literals, except for the
space character itself and the horizontal tab. This tracks the
corresponding change in Perl 5.23.4. This will be reverted if the
corresponding Perl change does not make it into 5.24.0.
Beginning with version 0.035, PPIx::Regexp was incorrectly reporting
the sense of modifiers when the same token both asserted and negated
modifiers (e.g. '(?x-i:...)'). This release should correct the
problem.
Document policy when Perl changes in such a way that the proper parse
for a regular expression changes. In this case the more modern parse
is preferred.
0.042 2015-10-09 T. R. Wyant
Report error rather than failing when parsing a string consisting
wholly of white space.
Group types were not being recognized if they contained the delimiter
character for the regexp (e.g. in qr<(?\<=foo)> the look-behind
assertion was not recognized as such).
Correct mis-parse of ' s///'. Leading white space is supposed to be
acceptable, but the leading whitespace token caused
PPIx::Regexp::Lexer not to recognize the substitution as such.
Tokenizer was failing when the string to be parsed was so bad it was
trying to return the whole thing as a single
PPIx::Regexp::Token::Unknown.
PPIx::Regexp::Dumper now displays a message if a structure is missing
its end delimiter.
RT 107331 Produce parse error in the presence of trailing cruft.
Thanks to Klaus Rindfrey for catching this.
The tokenizer now does a preliminary scan for delimiting brackets
and modifiers. Anything after the modifiers except for white space
is now made into a PPIx::Regexp::Token::Unknown, resulting in a
parse failure being reported. The previous implementation simply
assumed a valid expression, and in the case of the expression in the
ticket blithely mismatched the delimiters and returned a parse
without failures, but which was manifestly bogus.
Tweak documentation in PPIx::Regexp.
0.041 2015-07-02 T. R. Wyant
Report \C (match octet) as removed in 5.23.0.
Accept non-ASCII whitespace under /x. The Whitespace object can be
multiple characters; the perl_version_introduced() becomes
'5.021001' if any of them is a code point above 127.
The perl_version_removed() method now returns '5.021001' when called
on a PPIx::Regexp object produced by parsing '?foo?' (match once
without explicit 'm'). The object produced by parsing 'm?foo?' still
returns the minimum Perl version.
0.040 2015-05-31 T. R. Wyant
Do not parse unadorned parentheses as capture groups when /n is in
effect. Instead, they are parsed as PPIx::Regexp::Structure. Named
captures appear to be unaffected by /n.
Made a verbose dump a little more so. Specifically, dump
max_capture_group where relevant, and display dumped values a bit
more informatively.
Report /n (no captures) as having been added in 5.21.8.
0.039 2015-04-02 T. R. Wyant
Recognize nested subscripts in interpolation.
Thanks to Andy Lester for finding this, which actually manifested in
Perl-Critic-Policy-Variables-ProhibitUnusedVarsStricter. The problem is
that the actual heuristics for finding the end of an interpolation are
undocumented, and I missed this rather-obvious case.
Add \b{g} (= \b{gcb})
0.038 2015-03-09 T. R. Wyant
Make \b{foo} into an unknown token (and therefore an error). This
applies to \b{anything}, where 'anything' is anything but 'gcb',
'wb', or 'sb'.
Handle the boundary assertions introduced in Perl 5.21.9: '\b{gcb}'
(grapheme cluster boundary), '\b{wb}' (word boundary), '\b{sb}'
(sentence boundary), and the corresponding '\B{...}' constructions.
Similar-looking things like '\b{foo}' are not recognized as
assertions, and end up being literals. This is less general than I
usually make things, but was done against the possibility that
(e.g.) '\b{foo}' might be introduced later, requiring
perl_version_introduced() to return a different number. Any of these
retracted prior to Perl 5.22.0 will simply be removed from
PPIx::Regexp.
0.037 2014-11-12 T. R. Wyant
Have PPIx::Regexp::Structure::RegexSet POD recognize that the Perl
docs (specifically perlrecharclass) now call this construction
Extended Bracketed Character Classes, not sets.
Correctly mark the replacement portion of s///ee as code. Prior to
this release it was parsed as though no /e were present.
Make available the number of times a given modifier is asserted
(except for the match semantics modifiers which get handled
differently). See PPIx::Regexp::Token::Modifier->asserted() and
PPIx::Regexp::Tokenizer->modifier() for details.
0.036 2014-01-04 T. R. Wyant
Retract the "Allow non-ASCII white space under /x" change introduced
in version 0.033. I misread perl5170delta, and implemented early.
Change to explicit character class to recognize white space under /x.
I was previously using \s, which matched too much. Thanks to Nobuo
Kumagai for finding and reporting this.
0.035 2013-11-15 T. R. Wyant
Properly handle multi-character modifiers like /ee. We now handle /eie
as being the same as /eei. Thanks to Anonymous Monk for finding
this.
Properly handle \g and \k back references that do not correspond to an
actual capture group. They are now reblessed into the unknown token,
and counted as errors. Thanks to Anonymous Monk for finding this.
Add method error() to PPIx::Regexp::Element. This should return an
error message when the element is in error -- normally when it has
been blessed into the unknown token or structure.
Add method modifier_asserted() to PPIx::Regexp::Element. This walks
the parse tree backward to determine if the given modifier is in
effect for the element.
0.034 2013-05-11 T. R. Wyant
Correct spelling and grammar errors in POD and comments. RT #85050.
Thanks David Steinbrunner for catching these.
Allow interpolation in regex sets. It implies Perl 5.17.9 or higher.
Allow non-ASCII white space under /x. It implies Perl 5.17.9 or
higher.
Fix problems with Regex Set functionality under Perl 5.6.2. CPAN
testers RULE!
0.031 2013-01-31 T. R. Wyant
Have PPIx::Regexp::Token::Code (and offspring) become
PPIx::Regexp::Token::Unknown inside a regex set.
0.030 2013-01-22 T. R. Wyant
Add Regex Sets, which were added to Perl as an experimental feature in
5.17.8. This is experimental in Perl, therefore the parse may
change.
Ditch PPIx::Regexp::Token::GroupType method __expect_after_match() in
favor of the more general __match_setup(). This is done without
deprecation because __expect_after_match() was documented as
package-private, but noted in the change log because it _was_
documented.
0.029 2013-01-14 T. R. Wyant
Add method unescaped_content() to PPIx::Regexp::Element.
Rewrite the tokenizing code in PPIx::Regexp::Token::GroupType and
offspring to use regular expressions specific to the regexp
delimiter, and escaping only that delimiter. Thanks again to
Alexandr Ciornii for finding more of these.
Fix mis-parse of /(\?|I)/ as a branch reset (it's really an
alternation). There may be more of these lurking. Thanks to Alexandr
Ciornii for finding this one.
Add options -files and -objectify to eg/predump.
0.028 2012-06-06 T. R. Wyant
Replace all uses of YAML::Any with YAML, since they come in the same
distro, and YAML does not suffer from deprecation warnings.
0.027 2012-05-28 T. R. Wyant
Eliminate unescaped literal "{" characters in regexps in
PPIx::Regexp::Token::Backreference and
PPIx::Regexp::Token::CharClass::Simple. These are deprecated in 5.17.0.
0.026 2012-02-24 T. R. Wyant
Add support for \F (fold case), added in 5.15.8.
0.025 2012-01-04 T. R. Wyant
Tolerate leading and trailing white space around the regular
expression. These are still round-trip safe, since the white space
is tokenized.
Make Changes file conform to CPAN::Changes, and add
xt/author/changes.t to ensure continued compliance.
0.024 2011-12-17 T. R. Wyant
Reinstate author test xt/author/manifest.t, which was clobbered
shortly before the release of 0.021_10.
0.023 2011-12-08 T. R. Wyant
Correct address of FSF in the version of the GPL distributed in
LICENSES/Copying. Thanks to Petr Pisar for picking this up.
0.022 2011-11-24 T. R. Wyant
Correct various documentation errors.
Don't initialize effective modifiers with '^', since that wrongly
asserts that /d has been seen somewhere along the line.
Implement negation of match-semantic modifiers (e.g. 'no re /u;') by
setting the relevant datum to undef.
Support for default modifiers. This includes:
* default_modifiers argument to new() in PPIx::Regexp,
PPIx::Regexp::Tokenizer, and PPIx::Regexp::Dumper
* Public method modifier_asserted() on PPIx::Regexp, to return
whether a given modifier is actually in effect. The results of the
modifier() method are unchanged.
Require Test::More 0.88 for installation. Eliminate all the 'eval
{ require ... }' logic in favor of 'use Test::More 0.88'.
Have Makefile.PL make use of {BUILD_REQUIRES} if it is available.
Fix PPIx::Regexp::Token::Whitespace->can_be_quantified() to return
false.
0.021 2011-07-22 T. R. Wyant
Modified tokenizer to correctly handle a back slash used as a
delimiter. I believe.
PPIx::Regexp::Dumper now dumps the results of ppi() if that method is
present and -verbose is asserted.
0.020 2011-04-02 T. R. Wyant
Corrected perl_version_introduced():
* \R is now 5.009005 (was 5.000).
0.019 2011-03-01 T. R. Wyant
Various corrections to perl_version_introduced():
* \X is now 5.006 (was 5.000);
* \N{name} is now 5.006001 (was 5.006);
* \N{U+xxxx} is now 5.008 (was 5.006).
The \C is now parsed as a PPIx::Regexp::Token::CharClass::Simple. It
was previously considered a PPIx::Regexp::Token::Literal.
Ensure that \N{$foo} parses as a Unicode literal, not a quantified \N.
The ordinal() method returns undef for this.
Understand the /aa modifier, introduced with 5.13.10.
Report perl_version_introduced() of 5.013010 for the new semantic
modifiers when modifying the entire expression.
Correct handling of interpolations like ${^foo} and $#{foo}.
0.018 2011-02-16 T. R. Wyant
Override ppi() in PPIx::Regexp::Token::Interpolation to provide the
proper PPI when variable names are bracketed.
Properly parse bracketed variable names (I hope!), which may not be
subscripted.
Take account of possible '$' or '@' casts before a symbol in an
interpolation (e.g. $$foo{bar}, which is equivalent to $foo->{bar}).
Add the /a modifier to PPIx::Regexp::Token::Modifier, legal only in
the (?:...) construction. This was introduced in Perl 5.13.9.
When parsing an interpolation from a replacement string (rather than a
regular expression), take subscripts at face value rather than
trying to disambiguate them from quantifiers and character classes,
which they can't be in this context.
0.016 2011-01-05 T. R. Wyant
The PPIx::Regexp::Token::Code perl_version_introduced() method now
returns the minimum Perl version (currently set to 5.000) if it is
used to represent the substitution portion of s///e.
0.015 2010-10-25 T. R. Wyant
Documented intent to revoke support for features introduced in a
development Perl which do not make it to a production release. This
is necessary because in this case the syntax could be reused with
different semantics.
Added support for Perl 5.13.6 (?^...) construction.
Added support for Perl 5.13.6 d, l, and u modifiers.
Fixed inconsistency in perl_version_introduced() results between
PPIx::Regexp::Token::Modifier and
PPIx::Regexp::Token::GroupType::Modifier.
Corrected PPIx::Regexp::Constant RE_CAPTURE_NAME docs, somehow missed
back at 0.010_01.
0.014 2010-10-14 T. R. Wyant
Recognize \o{...} as a PPIx::Regexp::Token::Literal, with
perl_version_introduced() of 5.013003.
Terminate \0.. through \7.. after three characters, as Perl does.
These two were brought to my attention by Brian D. Foy's "The
Effective Perler" for October 11 2010,
http://www.effectiveperlprogramming.com/blog/697
Correct the PPIx::Regexp::Token::Literal ordinal() method for '\b'. As
a literal, this is a backspace.
0.013 2010-10-10 T. R. Wyant
Declare a parse failure if characters are found between the '}' and
the ')' of (?{...}) and (??{...}), and rebless the tokens to
::Unknown. Perl does not accept anything here, so I think I should
not either.
Whitespace tweak in the PPIx::Regexp::Dumper test output for the
failures test.
Replace the PPI logic in PPIx::Regexp::Token::Code with a call to
$tokenizer->find_matching_delimiter(). This is actually the way Perl
works, as a look at toke.c and regcomp.c makes clear.
Push the perl_version_introduced() back to 5.0 at the request of
Alexandr Ciornii, for the potential benefit of Perl::MinimumVersion.
This was done mostly by reading the various perlre, perldelta, and
perlop documents, so these should be taken with a HUGE grain of
salt.
0.012 2010-09-26 T. R. Wyant
Track all the features reported as introduced (or removed) in Perl
5.010 back to Perl 5.009005, and report them as such.
Report modifier /r as having been introduced in Perl 5.013002, rather
than the default of 5.006.
0.011 2010-09-16 T. R. Wyant
Remove dependencies on Params::Util and Readonly. The latter was
requested by ADAMK for the benefit of Padre. It involved changing
the symbols exported from PPIx::Regexp::Constant, but these were
documented as private, so ...
Parse POSIX character classes [=a=] and [.a.] as
PPIx::Regexp::Token::CharClass::POSIX::Unknown, which counts as a
parse failure since these are not supported by Perl.
Make the PPI::Document created by PPIx::Regexp::Token::Code->ppi() be
read only. This means we need PPI 1.116. Cache the document, and
ensure the cached result is returned on subsequent calls.
0.010 2010-08-06 T. R. Wyant
Fix fatal error in PPIx::Regexp::Token::Code->ppi().
Move author tests from xt/ to xt/author/.
0.009 2010-08-03 T. R. Wyant
Recognize s/.../.../ee as being different from s/.../.../e. In
particular, the replacement portion of the former is _not_ a Perl
expression: it's an interpolatable string, which later gets
eval{}'ed.
0.008 2010-07-01 T. R. Wyant
Promote methods can_be_quantified() and is_quantifier() from
PPIx::Regexp::Token to PPIx::Regexp::Element, so all classes inherit
them. They still return true and false respectively.
Override can_be_quantified() to return false on PPIx::Regexp,
PPIx::Regexp::Structure::Quantifier,
PPIx::Regexp::Structure::Regexp, and
PPIx::Regexp::Structure::Replacement.
Override is_quantifier() to return true on
PPIx::Regexp::Structure::Quantifier.
Modify PPIx::Regexp::Dumper to be able to display can_be_quantified
and is_quantifier for PPIx::Regexp::Node objects when dumping
verbosely.
Convert internal data to Readonly in PPIx::Regexp::Lexer,
PPIx::Regexp::Token::CharClass::Simple,
PPIx::Regexp::Token::Structure, and PPIx::Regexp::Tokenizer.
Remove leftover boilerplate in PPIx::Regexp::Token::CharClass::Simple.
Explicitly require a minimum Perl of 5.006.
Centralized dependencies in inc/PPIx/Regexp/Meta.pm.
Removed claim that PPIx::Regexp is alpha code. Docs still say that the
interface can be changed, but now it will go through a deprecation
cycle.
0.007 2010-04-28 T. R. Wyant
PPIx::Regexp::Lexer no longer fails when encountering expressions like
m{)}. Instead, it marks the right parenthesis as an unmatched
delimiter.
Fixed RT 56864 - PPIx::Regexp::Lexer fails in Perl::Critic under Perl
5.13.0. This was due to the value of a returned $+[0] getting
transmogrified before the caller saw it. I never did isolate what
triggered the bug.
You can now get a tokenizer trace by setting environment variable
PPIX_REGEXP_TOKENIZER_TRACE to a non-zero numeric value. This is
unsupported, though.
0.006 2010-02-26 T. R. Wyant
Parse \N{...} in accordance with perl5115delta. The curlys must
contain an alpha followed by alphanumerics, spaces, parens, colons,
or dashes. \N{ without a matching } is a character class (if legal)
followed by a literal '{'.
Parse \N inside a character class as PPIx::Regexp::Token::Unknown,
since Perl 5.11.5 considers this a compile error. A \N{...} inside a
character class is still OK.
Add method match() to PPIx::Regexp::Tokenizer. This is analogous to
capture(), but returns the entire matched string.
0.005 2009-12-26 T. R. Wyant
Recognize \N (without curlys), back-ported from Perl 6 into 5.11.
Recognize Unicode characters as \N{[[:alpha:]] ... rather than
\N{[\w\s:] ... This is per the 5.11 documentation, but I think Perl
always worked this way.
Recognize loose matching of Unicode character classes, and allow '='
in lieu of a single ':' in a Unicode character class (this from Perl
5.11.3).
PPIx::Regexp::Dumper now produces the proper output when called with
perl_version => 1, test => 1.
Describe the typical content of the object in the documentation for
PPIx::Regexp::Structure::NamedCapture and
PPIx::Regexp::Token::GroupType::NamedCapture.
0.004 2009-11-09 T. R. Wyant
Have PPIx::Regexp::Token::Literal correctly recognize when
charnames::vianame() is unavailable, and decouple this from the
handling of \N{U+hhhh}.
Add dependency on Task::Weaken, since depending on Scalar::Util
appears not to cut it.
Correct the assignment of the license type in Makefile.PL.
0.003 2009-11-05 T. R. Wyant
Have PPIx::Regexp::Token::Literal recognize \N{U+hhhh} (where hhhh
represents hex digits), and provide its ordinal (hhhh). Remove
recognition of \N. (. = any character), which Perl does not do.
Fix $re->flush_cache() so that it actually removes $re and only
$re from the cache.
Add delimiters() method to PPIx::Regexp::Main and PPIx::Regexp.
Support this in eg/prenav.
Increase test coverage and remove dead code.
Count tests in t/parse.t and t/unit.t.
0.002 2009-10-28 T. R. Wyant
In verbose mode, have PPIx::Regexp::Dumper dump the absolute capture
number referred to by a numbered reference.
Have eg/preslurp pass its -verbose option to PPIx::Regexp::Dumper.
Don't use Test::More::isa_ok for the t/basic.t class heritage tests,
since some versions of Test::More require a reference for the first
argument of isa_ok().
0.001 2009-10-21 T. R. Wyant
Initial release.
# ex: set textwidth=72 autoindent :