File: README.md

package info (click to toggle)
golang-github-tdewolff-parse 2.7.15-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 836 kB
  • sloc: makefile: 2
file content (101 lines) | stat: -rw-r--r-- 2,104 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
# XML [![API reference](https://img.shields.io/badge/godoc-reference-5272B4)](https://pkg.go.dev/github.com/tdewolff/parse/v2/xml?tab=doc)

This package is an XML lexer written in [Go][1]. It follows the specification at [Extensible Markup Language (XML) 1.0 (Fifth Edition)](http://www.w3.org/TR/REC-xml/). The lexer takes an io.Reader and converts it into tokens until the EOF.

## Installation
Run the following command

	go get -u github.com/tdewolff/parse/v2/xml

or add the following import and run project with `go get`

	import "github.com/tdewolff/parse/v2/xml"

## Lexer
### Usage
The following initializes a new Lexer with io.Reader `r`:
``` go
l := xml.NewLexer(parse.NewInput(r))
```

To tokenize until EOF an error, use:
``` go
for {
	tt, data := l.Next()
	switch tt {
	case xml.ErrorToken:
		// error or EOF set in l.Err()
		return
	case xml.StartTagToken:
		// ...
		for {
			ttAttr, dataAttr := l.Next()
			if ttAttr != xml.AttributeToken {
				// handle StartTagCloseToken/StartTagCloseVoidToken/StartTagClosePIToken
				break
			}
			// ...
		}
	case xml.EndTagToken:
		// ...
	}
}
```

All tokens:
``` go
ErrorToken TokenType = iota // extra token when errors occur
CommentToken
CDATAToken
StartTagToken
StartTagCloseToken
StartTagCloseVoidToken
StartTagClosePIToken
EndTagToken
AttributeToken
TextToken
```

### Examples
``` go
package main

import (
	"os"

	"github.com/tdewolff/parse/v2/xml"
)

// Tokenize XML from stdin.
func main() {
	l := xml.NewLexer(parse.NewInput(os.Stdin))
	for {
		tt, data := l.Next()
		switch tt {
		case xml.ErrorToken:
			if l.Err() != io.EOF {
				fmt.Println("Error on line", l.Line(), ":", l.Err())
			}
			return
		case xml.StartTagToken:
			fmt.Println("Tag", string(data))
			for {
				ttAttr, dataAttr := l.Next()
				if ttAttr != xml.AttributeToken {
					break
				}

				key := dataAttr
				val := l.AttrVal()
				fmt.Println("Attribute", string(key), "=", string(val))
			}
		// ...
		}
	}
}
```

## License
Released under the [MIT license](https://github.com/tdewolff/parse/blob/master/LICENSE.md).

[1]: http://golang.org/ "Go Language"