xmlformat API
Notation is Ruby-like. API for Perl version is similar.
CHANGES TO MAKE:
- Hide the option hash representation by replacing with accessor methods.
module XMLFormat
Module methods:
warn(arg, ...)
Print message given by arguments to stderr.
die(arg, ...)
Print message given by arguments to stderr and exit(1).
class XMLFormatter
Class Methods:
obj = new
Generate new XMLFormatter object, and set up initial formatting options
hash.
read_config(filename)
Read configuration file containing formatting options.
val, err_msg = check_option(opt_name, opt_val)
(private)
Check an option name/value for legality, return possibly type-converted
option value and error message. If err_msg is nil, the option is legal.
If err_msg is not nil, the option is illegal and err_msg contains a
string indicating the problem.
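The two-value return convention can be sketched as follows. This is a minimal illustration, not the actual implementation: the option names and types in option_types are assumptions chosen for the example, and the real option table may differ.

```ruby
# Hypothetical option table (names and types are assumptions for this sketch).
def option_types
  {
    "format"      => %w[block inline verbatim], # enumerated values
    "normalize"   => :boolean,                  # "yes" or "no"
    "subindent"   => :integer,
    "wrap-length" => :integer
  }
end

# Return [value, err_msg]. err_msg is nil when the option is legal.
def check_option(opt_name, opt_val)
  type = option_types[opt_name]
  return [nil, "unknown option name: #{opt_name}"] if type.nil?

  case type
  when Array
    return [opt_val, nil] if type.include?(opt_val)
    [nil, "illegal value for #{opt_name}: #{opt_val}"]
  when :boolean
    return [true, nil] if opt_val == "yes"
    return [false, nil] if opt_val == "no"
    [nil, "#{opt_name} must be yes or no"]
  when :integer
    # Type conversion: the string "2" becomes the integer 2.
    return [opt_val.to_i, nil] if opt_val =~ /\A\d+\z/
    [nil, "#{opt_name} must be a non-negative integer"]
  end
end

val, err_msg = check_option("subindent", "2")
warn(err_msg) unless err_msg.nil?
```

The caller tests err_msg first and uses val only when err_msg is nil.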
opts = get_opts(elt_name)
(private)
Look up formatting options for element and return them. This never
fails, because if no options are known for the given element name,
it returns the default options, which are guaranteed to be defined.
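The never-fails lookup can be sketched like this. The table layout and the "*DEFAULT" key are assumptions made for the example, and the table is passed as an argument only to keep the sketch self-contained.

```ruby
# Look up options for elt_name, falling back to the default entry.
# elt_opts is a hypothetical hash keyed by element name; the "*DEFAULT"
# key is an assumed name for the always-present default options.
def get_opts(elt_opts, elt_name)
  elt_opts.fetch(elt_name) { elt_opts.fetch("*DEFAULT") }
end

elt_opts = {
  "*DEFAULT" => { "format" => "block", "subindent" => 1 },
  "para"     => { "format" => "block", "subindent" => 2 }
}
opts = get_opts(elt_opts, "unknown-element") # yields the default options
```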
display_config
Display the configuration (formatting options).
display_unconfigured_elements
Produce a report of which elements are named in the input document
but for which no formatting options were given in the configuration
file.
shallow_parse(xml_document)
Parse an XML document (specified in the form of a string) into
an array of tokens and store the array internally.
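A shallow parse splits the document into markup and text tokens without checking nesting. The sketch below uses a deliberately simplified regular expression and returns the array rather than storing it; the pattern is an assumption for illustration and does not handle a DOCTYPE with an internal subset or a ">" inside an attribute value.

```ruby
# Split an XML string into an array of tokens (simplified sketch).
def shallow_parse(xml_document)
  pattern = /
      <!--.*?-->            # comment
    | <!\[CDATA\[.*?\]\]>   # CDATA section
    | <\?.*?\?>             # processing instruction
    | <![^>]*>              # DOCTYPE or other declaration
    | <[^>]*>               # start, end, or empty-element tag
    | [^<]+                 # character data
  /mx
  xml_document.scan(pattern)
end

tokens = shallow_parse("<a>hi<!-- note --><b/></a>")
```

Because every character of well-formed input lands in exactly one token, joining the tokens reproduces the input string.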
array = tokens
Accessor method that returns the token list.
name = extract_tag_name(tag)
(private)
Given a tag (an angle-bracket string), extract the tag name and return it.
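One way to extract the name, shown as a sketch under the assumption that the tag is a start, end, or empty-element tag:

```ruby
# Return the element name from a tag such as "<para id='x'>",
# "</para>", or "<br/>". Returns nil if no name is found.
def extract_tag_name(tag)
  match = tag.match(%r{\A</?\s*([^\s/>]+)})
  match && match[1]
end

name = extract_tag_name("<para id='x'>")
```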
assign_line_numbers
(private)
Assign an input line number to each token (for use in error messages).
err_count = report_errors
Check the internal token list for errors, print information on bad
tokens, and return an error count. The count is zero if no errors are
found.
tokens_to_tree
Convert the internal token list to tree form and store the tree.
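The conversion from a flat token list to a tree is typically done with a stack of open elements. The sketch below takes the tokens as an argument and returns the tree instead of using internal state; the node layout (hashes with :type, :content, :open_tag, :close_tag, :children) and the "markup" type name are assumptions for the example. It does not verify that close tag names match their open tags.

```ruby
# Build a tree (array of node hashes) from an array of token strings.
def tokens_to_tree(tokens)
  root = []
  lists = [root]   # lists.last receives the next node
  open_elts = []   # elements whose close tag has not been seen yet
  tokens.each do |tok|
    if tok.start_with?("</")
      elt = open_elts.pop
      raise ArgumentError, "unexpected close tag #{tok}" if elt.nil?
      elt[:close_tag] = tok
      lists.pop
    elsif tok.start_with?("<!", "<?")
      # Comments, CDATA, DOCTYPE, and processing instructions are leaves.
      lists.last << { type: "markup", content: tok }
    elsif tok.start_with?("<")
      elt = { type: "elt", open_tag: tok, close_tag: "", children: [] }
      lists.last << elt
      unless tok.end_with?("/>")
        open_elts.push(elt)
        lists.push(elt[:children])
      end
    else
      lists.last << { type: "text", content: tok }
    end
  end
  unless open_elts.empty?
    raise ArgumentError, "unclosed element #{open_elts.last[:open_tag]}"
  end
  root
end

tree = tokens_to_tree(["<a>", "hi", "<b/>", "</a>"])
```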
hash = node(type, content)
hash = text_node(content)
hash = comment_node(content)
hash = pi_node(content)
hash = doctype_node(content)
hash = cdata_node(content)
hash = element_node(open_tag, close_tag, children)
(private)
Tree node generators.
str = tree_stringify(children = @tree)
Convert the node list back to a string and return the string.
If the argument is missing, use the entire tree. In this
case, you get back the original input document.
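The round trip from tree back to text can be sketched as a recursive walk. The node layout used here is an assumption for illustration, and the children argument is required rather than defaulting to @tree so that the sketch runs outside the class.

```ruby
# Assumed node layout: text nodes carry :content; element nodes carry
# :open_tag, :close_tag, and :children.
def text_node(content)
  { type: "text", content: content }
end

def element_node(open_tag, close_tag, children)
  { type: "elt", open_tag: open_tag, close_tag: close_tag, children: children }
end

# Concatenate the nodes in document order, recursing into elements.
def tree_stringify(children)
  children.map do |node|
    if node[:type] == "elt"
      node[:open_tag] + tree_stringify(node[:children]) + node[:close_tag]
    else
      node[:content]
    end
  end.join
end

tree = [element_node("<a>", "</a>", [text_node("hi "), element_node("<b/>", "", [])])]
str = tree_stringify(tree)
```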
tree_canonize
Canonize the document tree to remove extraneous all-whitespace
nodes and normalize text nodes.
tree_canonize2(children, par_name = "*DOCUMENT")
(private)
Helper function for tree_canonize.
Canonize a document subtree and return the modified subtree.
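A much-simplified sketch of canonization follows. It assumes the hash node layout used for illustration (:type, :content, :children) and a single normalize flag; the actual behavior depends on per-element formatting options, which this sketch ignores.

```ruby
# Return a copy of the subtree with all-whitespace text nodes removed and,
# when normalize is true, whitespace runs in text collapsed to one space.
def tree_canonize(children, normalize: true)
  children.each_with_object([]) do |node, result|
    case node[:type]
    when "elt"
      kids = tree_canonize(node[:children], normalize: normalize)
      result << node.merge(children: kids)
    when "text"
      next if node[:content] =~ /\A\s*\z/ # drop all-whitespace text
      content = normalize ? node[:content].gsub(/\s+/, " ") : node[:content]
      result << node.merge(content: content)
    else
      result << node
    end
  end
end

tree = [{ type: "elt", open_tag: "<a>", close_tag: "</a>",
          children: [{ type: "text", content: "\n  " },
                     { type: "text", content: "x \n  y" }] }]
canonized = tree_canonize(tree)
```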
bool = is_normalized_elt(node)
(private)
Return true/false to indicate whether the node is a normalized element.
tree_format(par_name = "*DOCUMENT", children = @tree, indent = 0)
Format the tree or a subtree to produce a string representing the
reformatted XML document. Store the string in the @out_doc instance variable.
If the children argument is missing, use the entire tree.
flush_pending(indent)
(private)
Flush pending text, using indent if text is line-wrapped.
Side-effect: advances the break type to element-break.
array = line_wrap(str, first_indent, rest_indent, max_len)
(private)
Perform line-wrapping on a string and return the result as an
array of lines.
str = the string to wrap
first_indent = indent for first line
rest_indent = indent for any subsequent lines
max_len = maximum allowed length of lines (including indent)
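A greedy word-wrap matching this signature might look like the sketch below. It assumes the indent arguments are counts of spaces and that the string can be broken at any whitespace; the real method may treat some spans as unbreakable.

```ruby
# Wrap str into lines no longer than max_len (indent included) where
# possible. A single word longer than the limit is left on its own line.
def line_wrap(str, first_indent, rest_indent, max_len)
  lines = []
  indent = first_indent
  current = ""
  str.split.each do |word|
    candidate = current.empty? ? word : "#{current} #{word}"
    if !current.empty? && indent + candidate.length > max_len
      lines << (" " * indent) + current
      indent = rest_indent  # later lines use the second indent
      current = word
    else
      current = candidate
    end
  end
  lines << (" " * indent) + current unless current.empty?
  lines
end

lines = line_wrap("aaa bbb ccc ddd", 2, 4, 10)
# lines is ["  aaa bbb", "    ccc", "    ddd"]
```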
emit_break(indent)
(private)
Put out a break -- the number of newlines appropriate for the current
break type (entry-break, element-break, or exit-break).
If the break count is greater than zero and indent is greater than zero,
put out that many spaces as well.
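The break rule can be sketched as follows. Here the output buffer and the break count are passed as arguments, which is an assumption made to keep the example self-contained; the method described above takes only indent and presumably reads the current break type from object state.

```ruby
# Append break_count newlines to out, then indent spaces if a break
# was actually emitted. Returns out.
def emit_break(out, break_count, indent)
  out << ("\n" * break_count)
  out << (" " * indent) if break_count > 0 && indent > 0
  out
end

out = String.new("<a>")
emit_break(out, 1, 2)
out << "<b/>"
```

With a break count of zero nothing is appended, so the following output continues on the same line without added indentation.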