package Pandoc::Selector;
use strict;
use warnings;
use 5.010001;

use Pandoc::Elements;

my $IDENTIFIER = qr{[\p{L}\p{N}_-]+};
my $NAME       = qr{[A-Za-z]+};

sub new {
    my ($class, $selector) = @_;
    # TODO: compile selector
    bless { selector => $selector }, $class;
}

sub match {
    my ($self, $element) = @_;

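    # a selector is a list of alternatives separated by '|';
    # the element matches if any single alternative matches
    foreach my $selector ( split /\|/, $self->{selector} ) {
        return 1 if _match_expression($selector, $element);
    }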

    return 0;
}

sub _match_expression {
    my ( $selector, $elem ) = @_;
    $selector =~ s/^\s+|\s+$//g;

    # name
    return 0
      if $selector =~ s/^($NAME)\s*//i and lc($1) ne lc( $elem->name );
    return 1 if $selector eq '';

    # type
    if ( $selector =~ s/^:(document|block|inline|meta)\s*// ) {
        my $method = "is_$1";
        return 0 unless $elem->$method;
        return 1 if $selector eq '';
    }

    # TODO: :method (e.g. :url)

    # TODO: [:level=1]

    # TODO [<number>]

    # TODO [@<attr>]

    # id and/or classes
    return 0 unless $elem->isa('Pandoc::Document::AttributesRole');
    return _match_attributes($selector, $elem);
}

# check #id and .class
sub _match_attributes {
    my ( $selector, $elem ) = @_;

    $selector =~ s/^\s+|\s+$//g; # trim

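    while ( $selector ne '' ) {
        if ( $selector =~ s/^#($IDENTIFIER)\s*// ) {
            my $id = $1;       # copy the capture before using it
            return 0 unless $elem->id eq $id;
        }
        elsif ( $selector =~ s/^\.($IDENTIFIER)\s*// ) {
            my $class = $1;    # copy the capture before using it
            return 0 unless grep { $class eq $_ } @{ $elem->attr->[1] };
        }
        else {
            return 0;          # remainder is not a valid #id or .class
        }
    }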

    return 1;
}

1;
__END__

=head1 NAME

Pandoc::Selector - Pandoc document selector language

=head1 SYNOPSIS

  use Pandoc::Selector;

  my $selector = Pandoc::Selector->new('Code.perl|CodeBlock.perl');

  # check whether an element matches (returns 1 or 0)
  $selector->match($element);

  # use as element method
  $element->match('Code.perl|CodeBlock.perl');

=head1 DESCRIPTION

Pandoc::Selector provides a language to select elements of a Pandoc document.
It borrows ideas from L<CSS Selectors|https://www.w3.org/TR/selectors-3/>,
L<XPath|https://www.w3.org/TR/xpath/> and similar languages.

The language is still being developed together with this implementation, so
its syntax may change.

=head1 EXAMPLES

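  Header#main      Header element with id "main"
  Code.perl        Code element with class "perl"
  Code.perl.raw    Code element with both classes "perl" and "raw"
  :inline          any inline element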

=head1 SELECTOR GRAMMAR

Whitespace between parts of the syntax is optional and not included in the
following grammar. A B<Selector> is a list of one or more B<expression lists>
separated by pipes (C<|>). For instance the selector C<Subscript|Superscript>
selects both Subscript elements and Superscript elements.

  Selector        ::= ExpressionList ( '|' ExpressionList )*

An B<expression list> is a list of one or more B<expressions>:

  ExpressionList  ::= Expression ( Expression )*

An B<expression> is one of B<name expression>, B<id expression>, B<class
expression>, or B<type expression>:

  Expression      ::= NameExpression
                      | IdExpression
                      | ClassExpression
                      | TypeExpression

  NameExpression  ::= Name

  Name            ::= [A-Za-z]+

  IdExpression    ::= '#' [\p{L}\p{N}_-]+

  ClassExpression ::= '.' [\p{L}\p{N}_-]+

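  TypeExpression  ::= ':' Name

The current implementation is stricter than this grammar. It recognizes only
the types C<document>, C<block>, C<inline>, and C<meta>, and it expects the
parts of an expression list in a fixed order: an optional name, then an
optional type, then any number of id and class expressions. Names are compared
case-insensitively. An id or class expression never matches an element that
has no attributes.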

=head1 SEE ALSO

See the example filter C<select> for selecting parts of a document.

=cut