1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123
|
$ fq -h xml
xml: Extensible Markup Language decoder
Options
=======
array=false Decode as nested arrays
attribute_prefix="@" Prefix for attribute keys
seq=false Use seq attribute to preserve element order
Decode examples
===============
# Decode file as xml
$ fq -d xml . file
# Decode value as xml
... | xml
# Decode file using xml options
$ fq -d xml -o array=false -o attribute_prefix="@" -o seq=false . file
# Decode value as xml
... | xml({array:false,attribute_prefix:"@",seq:false})
XML can be decoded and encoded into jq values in two ways, elements as object or array. Which variant to use depends a bit what you
want to do. The object variant might be easier to query for a specific value but array might be easier to use to generate xml or to
query after all elements of some kind etc.
Encoding is done using the to_xml function and it will figure what variant that is used based on the input value. Is has two optional
options indent and attribute_prefix.
Elements as object
==================
Element can have different shapes depending on body text, attributes and children:
- <a key="value">text</a> is {"a":{"#text":"text","@key":"value"}}, has text (#text) and attributes (@key)
- <a>text</a> is {"a":"text"}
- <a><b>text</b></a> is {"a":{"b":"text"}} one child with only text and no attributes
- <a><b/><b>text</b></a> is {"a":{"b":["","text"]}} two children with same name end up in an array
- <a><b/><b key="value">text</b></a> is {"a":{"b":["",{"#text":"text","@key":"value"}]}}
If there is #seq attribute it encodes the child element order. Use -o seq=true to include sequence number when decoding, otherwise
order might be lost.
# decode as object is the default
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -d xml -o seq=true
{
"a": {
"b": [
{
"#seq": 0
},
{
"#seq": 1,
"#text": "bbb"
}
],
"c": {
"#seq": 2,
"#text": "ccc",
"@attr": "value"
}
}
}
# access text of the <c> element
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq '.a.c["#text"]'
"ccc"
# decode to object and encode to xml
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -r -d xml -o seq=true 'to_xml({indent:2})'
<a>
<b></b>
<b>bbb</b>
<c attr="value">ccc</c>
</a>
Elements as array
=================
Elements are arrays of the shape ["#text": "body text", "attr_name", {key: "attr value"}|null, [<child element>, ...]].
# decode as array
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -d xml -o array=true
[
"a",
null,
[
[
"b",
null,
[]
],
[
"b",
{
"#text": "bbb"
},
[]
],
[
"c",
{
"#text": "ccc",
"attr": "value"
},
[]
]
]
]
# decode to array and encode to xml
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -r -d xml -o array=true -o seq=true 'to_xml({indent:2})'
<a>
<b></b>
<b>bbb</b>
<c attr="value">ccc</c>
</a>
# access text of the <c> element, the object variant above is probably easier to use
$ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -o array=true '.[2][2][1]["#text"]'
"ccc"
References
==========
- xml.com's Converting Between XML and JSON (https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html)
|