File: help_xml.fqtest

package info (click to toggle)
fq 0.9.0-2
links: PTS, VCS
area: main
in suites: forky, sid, trixie
size: 106,624 kB
sloc: xml: 2,835; makefile: 250; sh: 241; exp: 57; ansic: 21
file content (123 lines) | stat: -rw-r--r-- 3,522 bytes
$ fq -h xml
xml: Extensible Markup Language decoder

Options
=======

  array=false           Decode as nested arrays
  attribute_prefix="@"  Prefix for attribute keys
  seq=false             Use seq attribute to preserve element order

Decode examples
===============

  # Decode file as xml
  $ fq -d xml . file
  # Decode value as xml
  ... | xml
  # Decode file using xml options
  $ fq -d xml -o array=false -o attribute_prefix="@" -o seq=false . file
  # Decode value as xml
  ... | xml({array:false,attribute_prefix:"@",seq:false})

XML can be decoded and encoded into jq values in two ways, elements as object or array. Which variant to use depends a bit what you
want to do. The object variant might be easier to query for a specific value but array might be easier to use to generate xml or to
query after all elements of some kind etc.

Encoding is done using the to_xml function and it will figure what variant that is used based on the input value. Is has two optional
options indent and attribute_prefix.

Elements as object
==================
Element can have different shapes depending on body text, attributes and children:

- <a key="value">text</a> is {"a":{"#text":"text","@key":"value"}}, has text (#text) and attributes (@key)
- <a>text</a> is {"a":"text"}
- <a><b>text</b></a> is {"a":{"b":"text"}} one child with only text and no attributes
- <a><b/><b>text</b></a> is {"a":{"b":["","text"]}} two children with same name end up in an array
- <a><b/><b key="value">text</b></a> is {"a":{"b":["",{"#text":"text","@key":"value"}]}}

If there is #seq attribute it encodes the child element order. Use -o seq=true to include sequence number when decoding, otherwise
order might be lost.

  # decode as object is the default
  $ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -d xml -o seq=true
  {
    "a": {
      "b": [
        {
          "#seq": 0
        },
        {
          "#seq": 1,
          "#text": "bbb"
        }
      ],
      "c": {
        "#seq": 2,
        "#text": "ccc",
        "@attr": "value"
      }
    }
  }
  
  # access text of the <c> element
  $ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq '.a.c["#text"]'
  "ccc"
  
  # decode to object and encode to xml
  $ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -r -d xml -o seq=true 'to_xml({indent:2})'
  <a>
    <b></b>
    <b>bbb</b>
    <c attr="value">ccc</c>
  </a>

Elements as array
=================
Elements are arrays of the shape ["#text": "body text", "attr_name", {key: "attr value"}|null, [<child element>, ...]].

  # decode as array
  $ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -d xml -o array=true
  [
    "a",
    null,
    [
      [
        "b",
        null,
        []
      ],
      [
        "b",
        {
          "#text": "bbb"
        },
        []
      ],
      [
        "c",
        {
          "#text": "ccc",
          "attr": "value"
        },
        []
      ]
    ]
  ]
  
  # decode to array and encode to xml
  $ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -r -d xml -o array=true -o seq=true 'to_xml({indent:2})'
  <a>
    <b></b>
    <b>bbb</b>
    <c attr="value">ccc</c>
  </a>
  
  # access text of the <c> element, the object variant above is probably easier to use
  $ echo '<a><b/><b>bbb</b><c attr="value">ccc</c></a>' | fq -o array=true '.[2][2][1]["#text"]'
  "ccc"

References
==========
- xml.com's Converting Between XML and JSON (https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html)