=head1 NAME

XML::Parser::EasyTree - Easier tree style for XML::Parser

=head1 SYNOPSIS

  use XML::Parser;
  use XML::Parser::EasyTree;
  $XML::Parser::EasyTree::Noempty=1;
  my $p=XML::Parser->new(Style=>'EasyTree');
  my $tree=$p->parsefile('something.xml');

=head1 DESCRIPTION

XML::Parser::EasyTree adds a new "built-in" style called "EasyTree" to 
XML::Parser.  Like XML::Parser's "Tree" style, setting this style causes 
the parser to build a lightweight tree structure representing the XML 
document.  This structure is, at least in this author's opinion, easier to 
work with than the one created by the built-in style.

When the parser is invoked with the EasyTree style, it returns a reference 
to an array of tree nodes, each of which is a hash reference.  All nodes 
have a 'type' key whose value is the type of the node: 'e' for element 
nodes, 't' for text nodes, and 'p' for processing instruction nodes.  All 
nodes also have a 'content' key whose value depends on the node type: a 
reference to an array holding the child nodes for element nodes, the 
string value for text nodes, and the data value for processing instruction 
nodes.  Element nodes also have a 'name' key whose value is the element's 
name and an 'attrib' key whose value is a reference to a hash of attribute 
names and values.  Processing instruction nodes also have a 'target' key 
whose value is the PI's target.
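
Because the nodes are plain hashes, a tree can be walked with ordinary Perl. 
The following is a minimal sketch, not part of the module: it uses a 
hand-built tree in the shape described above (so it runs without a parser), 
and the helper name text_of is hypothetical.  The 'name' key on element 
nodes is taken from the example output later in this document.

```perl
use strict;
use warnings;

# Hand-built tree in the documented shape; a real one would come from the parser.
my $tree = [
    { type => 'p', target => 'xml-stylesheet', content => 'href="s.css"' },
    { type => 'e', name => 'greeting', attrib => { lang => 'en' },
      content => [
          { type => 't', content => 'Hello ' },
          { type => 'e', name => 'em', attrib => {},
            content => [ { type => 't', content => 'there' } ] },
      ] },
];

# Hypothetical helper: concatenate all text beneath a list of nodes,
# descending into elements and skipping processing instructions.
sub text_of {
    my ($nodes) = @_;
    my $text = '';
    for my $node (@$nodes) {
        if    ($node->{type} eq 't') { $text .= $node->{content} }
        elsif ($node->{type} eq 'e') { $text .= text_of($node->{content}) }
    }
    return $text;
}

my $all_text = text_of($tree);
print "$all_text\n";
```

The only dispatch needed is on the 'type' key, since 'content' is an array 
reference for elements and a plain string for text nodes.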

EasyTree nodes are ordinary Perl hashes and are not objects.  Contiguous 
runs of text are always returned in a single node.

The reason the parser returns an array reference rather than the root 
element's node is that an XML document can legally contain processing 
instructions outside the root element (the xml-stylesheet PI is commonly 
used this way).
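
One way to pick the root element out of the returned array is to filter the 
top-level nodes by type.  This is a sketch on a hand-built top-level list; 
the variable names and the sample PI data are illustrative assumptions.

```perl
use strict;
use warnings;

# Top-level node list as the parser might return it for a document
# with an xml-stylesheet PI before the root element (hand-built here).
my $top_level = [
    { type => 'p', target => 'xml-stylesheet',
      content => 'type="text/css" href="style.css"' },
    { type => 'e', name => 'doc', attrib => {}, content => [] },
];

# A well-formed document has exactly one top-level element node.
my ($root) = grep { $_->{type} eq 'e' } @$top_level;
my @pis    = grep { $_->{type} eq 'p' } @$top_level;

print "root element: $root->{name}\n";
print "PI target: $_->{target}\n" for @pis;
```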

If the parser's Namespaces option is set, element and attribute names will 
be prefixed with their (possibly empty) namespace URI enclosed in curly 
brackets.
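
Code that needs the URI and the local name separately can split such a name 
with a regular expression.  The sketch below assumes the form "{uri}local" 
implied by the paragraph above; split_name is a hypothetical helper, not 
something the module provides.

```perl
use strict;
use warnings;

# Hypothetical helper: split "{uri}local" into (uri, local).
# Assumes the URI is wrapped in curly brackets at the start of the name.
sub split_name {
    my ($name) = @_;
    if ($name =~ /\A\{([^}]*)\}(.*)\z/s) {
        return ($1, $2);
    }
    return (undef, $name);    # name carries no namespace prefix at all
}

my ($uri, $local) = split_name('{http://www.w3.org/1999/xhtml}p');
print "uri=$uri local=$local\n";

# An empty namespace URI still produces the brackets.
my ($empty_uri, $plain) = split_name('{}item');
print "uri is empty, local=$plain\n" if $empty_uri eq '';
```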

=head1 SPECIAL VARIABLES

Two package global variables control special behaviors:

=over 4

=item XML::Parser::EasyTree::Latin

If this is set to a nonzero value, all text, names, and values will be 
returned in ISO-8859-1 (Latin-1) encoding rather than UTF-8.

=item XML::Parser::EasyTree::Noempty

If this is set to a nonzero value, text nodes containing nothing but 
whitespace (such as those generated by line breaks and indentation between 
tags) will be omitted from the parse tree.

=back
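
To illustrate the effect of Noempty, the sketch below removes 
whitespace-only text nodes from a tree that was built without it.  This is 
not the module's implementation; prune_blank_text is a hypothetical helper 
working on a hand-built tree, and it returns a pruned copy rather than 
modifying the input.

```perl
use strict;
use warnings;

# Hypothetical helper: return a new node list without text nodes
# that contain nothing but whitespace.
sub prune_blank_text {
    my ($nodes) = @_;
    my @kept;
    for my $node (@$nodes) {
        next if $node->{type} eq 't' && $node->{content} !~ /\S/;
        # Shallow-copy elements so the original tree is left untouched.
        my $copy = $node->{type} eq 'e'
            ? { %$node, content => prune_blank_text($node->{content}) }
            : $node;
        push @kept, $copy;
    }
    return \@kept;
}

my $raw = [
    { type => 'e', name => 'list', attrib => {}, content => [
        { type => 't', content => "\n  " },
        { type => 'e', name => 'item', attrib => {}, content => [
            { type => 't', content => 'one' } ] },
        { type => 't', content => "\n" },
    ] },
];

my $pruned = prune_blank_text($raw);
print scalar(@{ $pruned->[0]{content} }), " child node(s) left\n";
```

Setting the variable before parsing avoids building these nodes in the 
first place, which is simpler when the whitespace is never needed.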

=head1 EXAMPLE

Parse a pretty-printed version of the XML shown in the example for the 
built-in "Tree" style:

  #!perl -w
  use strict;
  use XML::Parser;
  use XML::Parser::EasyTree;
  use Data::Dumper;
  
  $XML::Parser::EasyTree::Noempty=1;
  my $xml=<<'EOF';
  <foo>
    <head id="a">Hello <em>there</em>
    </head>
    <bar>Howdy<ref/>
    </bar>
    do
  </foo>
  EOF
  my $p=XML::Parser->new(Style=>'EasyTree');
  my $tree=$p->parse($xml);
  print Dumper($tree);

This prints:

  $VAR1 = [
          { 'name' => 'foo',
            'type' => 'e',
            'content' => [
                           { 'name' => 'head',
                             'type' => 'e',
                             'content' => [
                                            { 'type' => 't',
                                              'content' => 'Hello '
                                            },
                                            { 'name' => 'em',
                                              'type' => 'e',
                                              'content' => [
                                                             { 'type' => 't',
                                                               'content' => 'there'
                                                             }
                                                           ],
                                              'attrib' => {}
                                            }
                                          ],
                             'attrib' => { 'id' => 'a'
                                         }
                           },
                           { 'name' => 'bar',
                             'type' => 'e',
                             'content' => [
                                            { 'type' => 't',
                                              'content' => 'Howdy'
                                            },
                                            { 'name' => 'ref',
                                              'type' => 'e',
                                              'content' => [],
                                              'attrib' => {}
                                            }
                                          ],
                             'attrib' => {}
                           },
                           { 'type' => 't',
                             'content' => '
  do
 '
                           }
                         ],
            'attrib' => {}
          }
        ];
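
A structure like this can be searched with a short recursive function.  The 
sketch below uses an abbreviated hand-written copy of the dump (the trailing 
text node is omitted) so that it runs on its own; find_elements is a 
hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Abbreviated copy of the dumped structure.
my $example = [
    { name => 'foo', type => 'e', attrib => {}, content => [
        { name => 'head', type => 'e', attrib => { id => 'a' }, content => [
            { type => 't', content => 'Hello ' },
            { name => 'em', type => 'e', attrib => {}, content => [
                { type => 't', content => 'there' } ] },
        ] },
        { name => 'bar', type => 'e', attrib => {}, content => [
            { type => 't', content => 'Howdy' },
            { name => 'ref', type => 'e', attrib => {}, content => [] },
        ] },
    ] },
];

# Hypothetical helper: depth-first search for all elements with a given name.
sub find_elements {
    my ($nodes, $name) = @_;
    my @found;
    for my $node (@$nodes) {
        next unless $node->{type} eq 'e';
        push @found, $node if $node->{name} eq $name;
        push @found, find_elements($node->{content}, $name);
    }
    return @found;
}

my ($head)      = find_elements($example, 'head');
my $head_id     = $head->{attrib}{id};
my @child_names = map  { $_->{name} }
                  grep { $_->{type} eq 'e' } @{ $example->[0]{content} };

print "head id: $head_id\n";
print "children of foo: @child_names\n";
```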

=head1 AUTHOR

Eric Bohlman (ebohlman@omsdev.com)

Copyright (c) 2001 Eric Bohlman. All rights reserved. This program
is free software; you can redistribute it and/or modify it under the same
terms as Perl itself.

=head1 SEE ALSO

  XML::Parser