---
headline: jq 1.3 Manual
history: |
*The manual for the development version of jq can be found
[here](/jq/manual).*
body: |
A jq program is a "filter": it takes an input, and produces an
output. There are a lot of builtin filters for extracting a
particular field of an object, or converting a number to a string,
or various other standard tasks.
Filters can be combined in various ways - you can pipe the output of
one filter into another filter, or collect the output of a filter
into an array.
Some filters produce multiple results; for instance, there's one that
produces all the elements of its input array. Piping that filter
into a second runs the second filter for each element of the
array. Generally, things that would be done with loops and iteration
in other languages are just done by gluing filters together in jq.
It's important to remember that every filter has an input and an
output. Even literals like "hello" or 42 are filters - they take an
input but always produce the same literal as output. Operations that
combine two filters, like addition, generally feed the same input to
both and combine the results. So, you can implement an averaging
filter as `add / length` - feeding the input array both to the `add`
filter and the `length` filter and dividing the results.
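As a rough sketch of that averaging filter (assuming a POSIX shell with `jq` on the `PATH`; the sample array is made up for illustration), the same input array reaches both `add` and `length`:

```shell
# The two halves on their own, each given the same input array.
echo '[1, 2, 3, 4]' | jq 'add'            # prints 10
echo '[1, 2, 3, 4]' | jq 'length'         # prints 4

# Combined with division: both sides see the whole array.
echo '[1, 2, 3, 4]' | jq 'add / length'   # prints 2.5
```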
But that's getting ahead of ourselves. :) Let's start with something
simpler:
manpage_intro: |
jq(1) -- Command-line JSON processor
====================================
## SYNOPSIS
`jq` [<options>...] <filter> [<files>...]
`jq` can transform JSON in various ways, by selecting, iterating,
reducing and otherwise mangling JSON documents. For instance,
running the command `jq 'map(.price) | add'` will take an array of
JSON objects as input and return the sum of their "price" fields.
By default, `jq` reads a stream of JSON objects (whitespace
separated) from `stdin`. One or more <files> may be specified, in
which case `jq` will read input from those instead.
The <options> are described in the [INVOKING JQ] section; they
mostly concern input and output formatting. The <filter> is written
in the jq language and specifies how to transform the input
document.
## FILTERS
manpage_epilogue: |
## BUGS
Presumably. Report them or discuss them at:
https://github.com/stedolan/jq/issues
## AUTHOR
Stephen Dolan `<mu@netsoc.tcd.ie>`
sections:
- title: Invoking jq
body: |
jq filters run on a stream of JSON data. The input to jq is
parsed as a sequence of whitespace-separated JSON values which
are passed through the provided filter one at a time. The
output(s) of the filter are written to standard out, again as a
sequence of whitespace-separated JSON data.
You can affect how jq reads and writes its input and output
using some command-line options:
* `--slurp`/`-s`:
Instead of running the filter for each JSON object in the
input, read the entire input stream into a large array and run
the filter just once.
* `--raw-input`/`-R`:
Don't parse the input as JSON. Instead, each line of text is
passed to the filter as a string. If combined with `--slurp`,
then the entire input is passed to the filter as a single long
string.
* `--null-input`/`-n`:
Don't read any input at all! Instead, the filter is run once
using `null` as the input. This is useful when using jq as a
simple calculator or to construct JSON data from scratch.
* `--compact-output` / `-c`:
By default, jq pretty-prints JSON output. Using this option
will result in more compact output by instead putting each
JSON object on a single line.
* `--colour-output` / `-C` and `--monochrome-output` / `-M`:
By default, jq outputs colored JSON if writing to a
terminal. You can force it to produce color even if writing to
a pipe or a file using `-C`, and disable color with `-M`.
* `--ascii-output` / `-a`:
jq usually outputs non-ASCII Unicode codepoints as UTF-8, even
if the input specified them as escape sequences (like
"\u03bc"). Using this option, you can force jq to produce pure
ASCII output with every non-ASCII character replaced with the
equivalent escape sequence.
* `--raw-output` / `-r`:
With this option, if the filter's result is a string then it
will be written directly to standard output rather than being
formatted as a JSON string with quotes. This can be useful for
making jq filters talk to non-JSON-based systems.
* `--arg name value`:
This option passes a value to the jq program as a predefined
variable. If you run jq with `--arg foo bar`, then `$foo` is
available in the program and has the value `"bar"`.
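The options above can be tried out with a few one-liners. This is only a sketch, assuming a POSIX shell with `jq` on the `PATH`; the inputs are invented for illustration, and the outputs in the comments are what a recent jq prints.

```shell
# --null-input: no input is read, the filter runs once with null as input.
jq -n '1 + 2'                        # prints 3

# --slurp: the whole input stream becomes one array before filtering.
printf '1 2 3' | jq -s 'add'         # prints 6

# --raw-input: each line of text is passed to the filter as a string.
printf 'a\nb\n' | jq -R '.'          # prints "a" then "b"

# --compact-output: one JSON value per line instead of pretty-printing.
echo '{"a": [1, 2]}' | jq -c '.'     # prints {"a":[1,2]}

# --raw-output: string results are written without JSON quoting.
jq -n -r '"hello"'                   # prints hello

# --arg: the value is available in the program as the string $foo.
jq -n --arg foo bar '$foo'           # prints "bar"
```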
- title: Basic filters
entries:
- title: "`.`"
body: |
The absolute simplest (and least interesting) filter
is `.`. This is a filter that takes its input and
produces it unchanged as output.
Since jq by default pretty-prints all output, this trivial
program can be a useful way of formatting JSON output from,
say, `curl`.
examples:
- program: '.'
input: '"Hello, world!"'
output: ['"Hello, world!"']
- title: "`.foo`"
body: |
The simplest *useful* filter is `.foo`. When given a
JSON object (aka dictionary or hash) as input, it produces
the value at the key "foo", or null if there's none present.
examples:
- program: '.foo'
input: '{"foo": 42, "bar": "less interesting data"}'
output: [42]
- program: '.foo'
input: '{"notfoo": true, "alsonotfoo": false}'
output: ['null']
- title: "`.[foo]`, `.[2]`, `.[10:15]`"
body: |
You can also look up fields of an object using syntax like
`.["foo"]` (`.foo` above is a shorthand version of this). This
one works for arrays as well, if the key is an
integer. Arrays are zero-based (like JavaScript), so `.[2]`
returns the third element of the array.
The `.[10:15]` syntax can be used to return a subarray of an
array. The array returned by `.[10:15]` will be of length 5,
containing the elements from index 10 (inclusive) to index
15 (exclusive). Either index may be negative (in which case
it counts backwards from the end of the array), or omitted
(in which case it refers to the start or end of the array).
examples:
- program: '.[0]'
input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
output: ['{"name":"JSON", "good":true}']
- program: '.[2]'
input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
output: ['null']
- program: '.[2:4]'
input: '["a","b","c","d","e"]'
output: ['["c", "d"]']
- program: '.[:3]'
input: '["a","b","c","d","e"]'
output: ['["a", "b", "c"]']
- program: '.[-2:]'
input: '["a","b","c","d","e"]'
output: ['["d", "e"]']
- title: "`.[]`"
body: |
If you use the `.[foo]` syntax, but omit the index
entirely, it will return *all* of the elements of an
array. Running `.[]` with the input `[1,2,3]` will produce the
numbers as three separate results, rather than as a single
array.
You can also use this on an object, and it will return all
the values of the object.
examples:
- program: '.[]'
input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
output:
- '{"name":"JSON", "good":true}'
- '{"name":"XML", "good":false}'
- program: '.[]'
input: '[]'
output: []
- program: '.[]'
input: '{"a": 1, "b": 1}'
output: ['1', '1']
- title: "`,`"
body: |
If two filters are separated by a comma, then the
input will be fed into both and there will be multiple
outputs: first, all of the outputs produced by the left
expression, and then all of the outputs produced by the
right. For instance, the filter `.foo, .bar` produces
both the "foo" fields and "bar" fields as separate outputs.
examples:
- program: '.foo, .bar'
input: '{"foo": 42, "bar": "something else", "baz": true}'
output: ['42', '"something else"']
- program: ".user, .projects[]"
input: '{"user":"stedolan", "projects": ["jq", "wikiflow"]}'
output: ['"stedolan"', '"jq"', '"wikiflow"']
- program: '.[4,2]'
input: '["a","b","c","d","e"]'
output: ['"e"', '"c"']
- title: "`|`"
body: |
The | operator combines two filters by feeding the output(s) of
the one on the left into the input of the one on the right. It's
pretty much the same as the Unix shell's pipe, if you're used to
that.
If the one on the left produces multiple results, the one on
the right will be run for each of those results. So, the
expression `.[] | .foo` retrieves the "foo" field of each
element of the input array.
examples:
- program: '.[] | .name'
input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
output: ['"JSON"', '"XML"']
- title: Types and Values
body: |
jq supports the same set of datatypes as JSON - numbers,
strings, booleans, arrays, objects (which in JSON-speak are
hashes with only string keys), and "null".
Booleans, null, strings and numbers are written the same way as
in JavaScript. Just like everything else in jq, these simple
values take an input and produce an output - `42` is a valid jq
expression that takes an input, ignores it, and returns 42
instead.
entries:
- title: Array construction - `[]`
body: |
As in JSON, `[]` is used to construct arrays, as in
`[1,2,3]`. The elements of the arrays can be any jq
expression. All of the results produced by all of the
expressions are collected into one big array. You can use it
to construct an array out of a known quantity of values (as
in `[.foo, .bar, .baz]`) or to "collect" all the results of a
filter into an array (as in `[.items[].name]`).
Once you understand the "," operator, you can look at jq's array
syntax in a different light: the expression `[1,2,3]` is not using a
built-in syntax for comma-separated arrays, but is instead applying
the `[]` operator (collect results) to the expression `1,2,3` (which
produces three different results).
If you have a filter `X` that produces four results,
then the expression `[X]` will produce a single result, an
array of four elements.
examples:
- program: "[.user, .projects[]]"
input: '{"user":"stedolan", "projects": ["jq", "wikiflow"]}'
output: ['["stedolan", "jq", "wikiflow"]']
- title: Objects - `{}`
body: |
Like JSON, `{}` is for constructing objects (aka
dictionaries or hashes), as in: `{"a": 42, "b": 17}`.
If the keys are "sensible" (all alphabetic characters), then
the quotes can be left off. The value can be any expression
(although you may need to wrap it in parentheses if it's a
complicated one), which gets applied to the {} expression's
input (remember, all filters have an input and an
output).
{foo: .bar}
will produce the JSON object `{"foo": 42}` if given the JSON
object `{"bar":42, "baz":43}`. You can use this to select
particular fields of an object: if the input is an object
with "user", "title", "id", and "content" fields and you
just want "user" and "title", you can write
{user: .user, title: .title}
Because that's so common, there's a shortcut syntax: `{user, title}`.
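A small sketch of the shortcut, assuming a POSIX shell with `jq` on the `PATH`; the field values here are invented for illustration:

```shell
# Keep only "user" and "title"; the other fields are dropped.
echo '{"user": "stedolan", "title": "JQ Primer", "id": 1, "content": "text"}' |
  jq -c '{user, title}'
# prints {"user":"stedolan","title":"JQ Primer"}
```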
If one of the expressions produces multiple results,
multiple dictionaries will be produced. If the input's
{"user":"stedolan","titles":["JQ Primer", "More JQ"]}
then the expression
{user, title: .titles[]}
will produce two outputs:
{"user":"stedolan", "title": "JQ Primer"}
{"user":"stedolan", "title": "More JQ"}
Putting parentheses around the key means it will be evaluated as an
expression. With the same input as above,
{(.user): .titles}
produces
{"stedolan": ["JQ Primer", "More JQ"]}
examples:
- program: '{user, title: .titles[]}'
input: '{"user":"stedolan","titles":["JQ Primer", "More JQ"]}'
output:
- '{"user":"stedolan", "title": "JQ Primer"}'
- '{"user":"stedolan", "title": "More JQ"}'
- program: '{(.user): .titles}'
input: '{"user":"stedolan","titles":["JQ Primer", "More JQ"]}'
output: ['{"stedolan": ["JQ Primer", "More JQ"]}']
- title: Builtin operators and functions
body: |
Some jq operators (for instance, `+`) do different things
depending on the type of their arguments (arrays, numbers,
etc.). However, jq never does implicit type conversions. If you
try to add a string to an object you'll get an error message and
no result.
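The following sketch shows the lack of implicit conversion, assuming a POSIX shell with `jq` on the `PATH`. The exact error text and exit status depend on the jq version, so neither is shown; the `|| echo` fallback is only there to make the failure visible.

```shell
# Adding a string to an object is a type error: jq reports it on stderr
# and writes no result to stdout.
jq -n '"a" + {}' || echo "jq reported an error"

# Converting explicitly (here with `tostring`) makes the addition valid.
jq -n '"a" + ({} | tostring)'   # prints "a{}"
```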
entries:
- title: Addition - `+`
body: |
The operator `+` takes two filters, applies them both
to the same input, and adds the results together. What
"adding" means depends on the types involved:
- **Numbers** are added by normal arithmetic.
- **Arrays** are added by being concatenated into a larger array.
- **Strings** are added by being joined into a larger string.
- **Objects** are added by merging, that is, inserting all
the key-value pairs from both objects into a single
combined object. If both objects contain a value for the
same key, the object on the right of the `+` wins.
`null` can be added to any value, and returns the other
value unchanged.
examples:
- program: '.a + 1'
input: '{"a": 7}'
output: ['8']
- program: '.a + .b'
input: '{"a": [1,2], "b": [3,4]}'
output: ['[1,2,3,4]']
- program: '.a + null'
input: '{"a": 1}'
output: ['1']
- program: '.a + 1'
input: '{}'
output: ['1']
- program: '{a: 1} + {b: 2} + {c: 3} + {a: 42}'
input: 'null'
output: ['{"a": 42, "b": 2, "c": 3}']
- title: Subtraction - `-`
body: |
As well as normal arithmetic subtraction on numbers, the `-`
operator can be used on arrays to remove all occurrences of
the second array's elements from the first array.
examples:
- program: '4 - .a'
input: '{"a":3}'
output: ['1']
- program: . - ["xml", "yaml"]
input: '["xml", "yaml", "json"]'
output: ['["json"]']
- title: Multiplication, division - `*` and `/`
body: |
These operators only work on numbers, and do what you would expect.
examples:
- program: '10 / . * 3'
input: 5
output: [6]
- title: `length`
body: |
The builtin function `length` gets the length of various
different types of value:
- The length of a **string** is the number of Unicode
codepoints it contains (which will be the same as its
JSON-encoded length in bytes if it's pure ASCII).
- The length of an **array** is the number of elements.
- The length of an **object** is the number of key-value pairs.
- The length of **null** is zero.
examples:
- program: '.[] | length'
input: '[[1,2], "string", {"a":2}, null]'
output: [2, 6, 1, 0]
- title: `keys`
body: |
The builtin function `keys`, when given an object, returns
its keys in an array.
The keys are sorted "alphabetically", by Unicode codepoint
order. This is not an order that makes particular sense in
any particular language, but you can count on it being the
same for any two objects with the same set of keys,
regardless of locale settings.
When `keys` is given an array, it returns the valid indices
for that array: the integers from 0 to length-1.
examples:
- program: 'keys'
input: '{"abc": 1, "abcd": 2, "Foo": 3}'
output: ['["Foo", "abc", "abcd"]']
- program: 'keys'
input: '[42,3,35]'
output: ['[0,1,2]']
- title: `has`
body: |
The builtin function `has` returns whether the input object
has the given key, or the input array has an element at the
given index.
`has($key)` has the same effect as checking whether `$key`
is a member of the array returned by `keys`, although `has`
will be faster.
examples:
- program: 'map(has("foo"))'
input: '[{"foo": 42}, {}]'
output: ['[true, false]']
- program: 'map(has(2))'
input: '[[0,1], ["a","b","c"]]'
output: ['[false, true]']
- title: `to_entries`, `from_entries`, `with_entries`
body: |
These functions convert between an object and an array of
key-value pairs. If `to_entries` is passed an object, then
for each `k: v` entry in the input, the output array
includes `{"key": k, "value": v}`.
`from_entries` does the opposite conversion, and
`with_entries(foo)` is a shorthand for `to_entries |
map(foo) | from_entries`, useful for doing some operation to
all keys and values of an object.
examples:
- program: 'to_entries'
input: '{"a": 1, "b": 2}'
output: ['[{"key":"a", "value":1}, {"key":"b", "value":2}]']
- program: 'from_entries'
input: '[{"key":"a", "value":1}, {"key":"b", "value":2}]'
output: ['{"a": 1, "b": 2}']
- program: 'with_entries(.key |= "KEY_" + .)'
input: '{"a": 1, "b": 2}'
output: ['{"KEY_a": 1, "KEY_b": 2}']
- title: `select`
body: |
The function `select(foo)` produces its input unchanged if
`foo` returns true for that input, and produces no output
otherwise.
It's useful for filtering lists: `[1,2,3] | map(select(. >= 2))`
will give you `[2,3]`.
examples:
- program: 'map(select(. >= 2))'
input: '[1,5,3,0,7]'
output: ['[5,3,7]']
- title: `empty`
body: |
`empty` returns no results. None at all. Not even `null`.
It's useful on occasion. You'll know if you need it :)
examples:
- program: '1, empty, 2'
input: 'null'
output: [1, 2]
- program: '[1,2,empty,3]'
input: 'null'
output: ['[1,2,3]']
- title: `map(x)`
body: |
For any filter `x`, `map(x)` will run that filter for each
element of the input array, and produce the outputs as a new
array. `map(.+1)` will increment each element of an array of numbers.
`map(x)` is equivalent to `[.[] | x]`. In fact, this is how
it's defined.
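To see the equivalence side by side, here is a brief sketch (assuming a POSIX shell with `jq` on the `PATH`):

```shell
# `map(x)` and `[.[] | x]` give the same array for the same input.
echo '[1, 2, 3]' | jq -c 'map(. + 1)'      # prints [2,3,4]
echo '[1, 2, 3]' | jq -c '[.[] | . + 1]'   # prints [2,3,4]
```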
examples:
- program: 'map(.+1)'
input: '[1,2,3]'
output: ['[2,3,4]']
- title: `add`
body: |
The filter `add` takes as input an array, and produces as
output the elements of the array added together. This might
mean summed, concatenated or merged depending on the types
of the elements of the input array - the rules are the same
as those for the `+` operator (described above).
If the input is an empty array, `add` returns `null`.
examples:
- program: add
input: '["a","b","c"]'
output: ['"abc"']
- program: add
input: '[1, 2, 3]'
output: [6]
- program: add
input: '[]'
output: ["null"]
- title: `range`
body: |
The `range` function produces a range of numbers. `range(4;10)`
produces 6 numbers, from 4 (inclusive) to 10 (exclusive). The numbers
are produced as separate outputs. Use `[range(4;10)]` to get a range as
an array.
examples:
- program: 'range(2;4)'
input: 'null'
output: ['2', '3']
- program: '[range(2;4)]'
input: 'null'
output: ['[2,3]']
- title: `tonumber`
body: |
The `tonumber` function parses its input as a number. It
will convert correctly-formatted strings to their numeric
equivalent, leave numbers alone, and give an error on all other input.
examples:
- program: '.[] | tonumber'
input: '[1, "1"]'
output: [1, 1]
- title: `tostring`
body: |
The `tostring` function prints its input as a
string. Strings are left unchanged, and all other values are
JSON-encoded.
examples:
- program: '.[] | tostring'
input: '[1, "1", [1]]'
output: ['"1"', '"1"', '"[1]"']
- title: `type`
body: |
The `type` function returns the type of its argument as a
string, which is one of null, boolean, number, string, array
or object.
examples:
- program: 'map(type)'
input: '[0, false, [], {}, null, "hello"]'
output: ['["number", "boolean", "array", "object", "null", "string"]']
- title: `sort, sort_by`
body: |
The `sort` function sorts its input, which must be an
array. Values are sorted in the following order:
* `null`
* `false`
* `true`
* numbers
* strings, in alphabetical order (by Unicode codepoint value)
* arrays, in lexical order
* objects
The ordering for objects is a little complex: first they're
compared by comparing their sets of keys (as arrays in
sorted order), and if their keys are equal then the values
are compared key by key.
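As an illustration of the cross-type ordering listed above, this sketch sorts an array holding one value of each kind (assuming a POSIX shell with `jq` on the `PATH`; the input is made up):

```shell
# null sorts first, then false, true, numbers, strings, arrays, objects.
echo '[{"a": 1}, [1], "b", 3, true, false, null]' | jq -c 'sort'
# prints [null,false,true,3,"b",[1],{"a":1}]
```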
`sort_by` may be used to sort by a particular field of an
object, or by applying any jq filter. `sort_by(foo)`
compares two elements by comparing the result of `foo` on
each element.
examples:
- program: 'sort'
input: '[8,3,null,6]'
output: ['[null,3,6,8]']
- program: 'sort_by(.foo)'
input: '[{"foo":4, "bar":10}, {"foo":3, "bar":100}, {"foo":2, "bar":1}]'
output: ['[{"foo":2, "bar":1}, {"foo":3, "bar":100}, {"foo":4, "bar":10}]']
- title: `group_by`
body: |
`group_by(.foo)` takes as input an array, groups the
elements having the same `.foo` field into separate arrays,
and produces all of these arrays as elements of a larger
array, sorted by the value of the `.foo` field.
Any jq expression, not just a field access, may be used in
place of `.foo`. The sorting order is the same as described
in the `sort` function above.
examples:
- program: 'group_by(.foo)'
input: '[{"foo":1, "bar":10}, {"foo":3, "bar":100}, {"foo":1, "bar":1}]'
output: ['[[{"foo":1, "bar":10}, {"foo":1, "bar":1}], [{"foo":3, "bar":100}]]']
- title: `min`, `max`, `min_by`, `max_by`
body: |
Find the minimum or maximum element of the input array. The
`_by` versions allow you to specify a particular field or
property to examine, e.g. `min_by(.foo)` finds the object
with the smallest `foo` field.
examples:
- program: 'min'
input: '[5,4,2,7]'
output: ['2']
- program: 'max_by(.foo)'
input: '[{"foo":1, "bar":14}, {"foo":2, "bar":3}]'
output: ['{"foo":2, "bar":3}']
- title: `unique`
body: |
The `unique` function takes as input an array and produces
an array of the same elements, in sorted order, with
duplicates removed.
examples:
- program: 'unique'
input: '[1,2,5,3,5,3,1,3]'
output: ['[1,2,3,5]']
- title: `reverse`
body: |
This function reverses an array.
examples:
- program: 'reverse'
input: '[1,2,3,4]'
output: ['[4,3,2,1]']
- title: `contains`
body: |
The filter `contains(b)` will produce true if b is
completely contained within the input. A string B is
contained in a string A if B is a substring of A. An array B
is contained in an array A if all elements in B are
contained in any element in A. An object B is contained in
object A if all of the values in B are contained in the
value in A with the same key. All other types are assumed to
be contained in each other if they are equal.
examples:
- program: 'contains("bar")'
input: '"foobar"'
output: ['true']
- program: 'contains(["baz", "bar"])'
input: '["foobar", "foobaz", "blarp"]'
output: ['true']
- program: 'contains(["bazzzzz", "bar"])'
input: '["foobar", "foobaz", "blarp"]'
output: ['false']
- program: 'contains({foo: 12, bar: [{barp: 12}]})'
input: '{"foo": 12, "bar":[1,2,{"barp":12, "blip":13}]}'
output: ['true']
- program: 'contains({foo: 12, bar: [{barp: 15}]})'
input: '{"foo": 12, "bar":[1,2,{"barp":12, "blip":13}]}'
output: ['false']
- title: `recurse`
body: |
The `recurse` function allows you to search through a
recursive structure, and extract interesting data from all
levels. Suppose your input represents a filesystem:
{"name": "/", "children": [
{"name": "/bin", "children": [
{"name": "/bin/ls", "children": []},
{"name": "/bin/sh", "children": []}]},
{"name": "/home", "children": [
{"name": "/home/stephen", "children": [
{"name": "/home/stephen/jq", "children": []}]}]}]}
Now suppose you want to extract all of the filenames
present. You need to retrieve `.name`, `.children[].name`,
`.children[].children[].name`, and so on. You can do this
with:
recurse(.children[]) | .name
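A runnable sketch of this on a cut-down version of the tree above (assuming a POSIX shell with `jq` on the `PATH`; the smaller input is chosen only to keep the output short):

```shell
# Visit the root, then every node reachable through .children[],
# and print each node's name as raw text.
echo '{"name": "/", "children": [
        {"name": "/bin", "children": [
          {"name": "/bin/ls", "children": []}]},
        {"name": "/home", "children": []}]}' |
  jq -r 'recurse(.children[]) | .name'
# prints four lines: /, /bin, /bin/ls, /home
```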
examples:
- program: 'recurse(.foo[])'
input: '{"foo":[{"foo": []}, {"foo":[{"foo":[]}]}]}'
output:
- '{"foo":[{"foo":[]},{"foo":[{"foo":[]}]}]}'
- '{"foo":[]}'
- '{"foo":[{"foo":[]}]}'
- '{"foo":[]}'
- title: "String interpolation - `\(foo)`"
body: |
Inside a string, you can put an expression inside parens
after a backslash. Whatever the expression returns will be
interpolated into the string.
examples:
- program: '"The input was \(.), which is one less than \(.+1)"'
input: '42'
output: ['"The input was 42, which is one less than 43"']
- title: "Format strings and escaping"
body: |
The `@foo` syntax is used to format and escape strings,
which is useful for building URLs, documents in a language
like HTML or XML, and so forth. `@foo` can be used as a
filter on its own. The possible escapings are:
* `@text`:
Calls `tostring`, see that function for details.
* `@json`:
Serialises the input as JSON.
* `@html`:
Applies HTML/XML escaping, by mapping the characters
`<>&'"` to their entity equivalents `&lt;`, `&gt;`,
`&amp;`, `&#39;`, `&quot;`.
* `@uri`:
Applies percent-encoding, by mapping all reserved URI
characters to a `%xx` sequence.
* `@csv`:
The input must be an array, and it is rendered as CSV
with double quotes for strings, and quotes escaped by
repetition.
* `@sh`:
The input is escaped so that it is suitable for use in a
command line for a POSIX shell. If the input is an array, the
output will be a series of space-separated strings.
* `@base64`:
The input is converted to base64 as specified by RFC 4648.
This syntax can be combined with string interpolation in a
useful way. You can follow a `@foo` token with a string
literal. The contents of the string literal will *not* be
escaped. However, all interpolations made inside that string
literal will be escaped. For instance,
@uri "https://www.google.com/search?q=\(.search)"
will produce the following output for the input
`{"search":"jq!"}`:
https://www.google.com/search?q=jq%21
Note that the slashes, question mark, etc. in the URL are
not escaped, as they were part of the string literal.
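A few of the other formats can be tried the same way. This is a sketch assuming a POSIX shell with `jq` on the `PATH`; the inputs are invented, and `-r` is used so that the formatted string is printed without JSON quoting.

```shell
# @html replaces markup characters with entities.
jq -n -r '"x < y" | @html'                   # prints x &lt; y

# @csv quotes strings and doubles any embedded quotes.
jq -n -r '[1, "a,b", "say \"hi\""] | @csv'   # prints 1,"a,b","say ""hi"""

# @base64 encodes the input string.
jq -n -r '"hello" | @base64'                 # prints aGVsbG8=
```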
examples:
- program: '@html'
input: '"This works if x < y"'
output: ['"This works if x &lt; y"']
# - program: '@html "<span>Anonymous said: \(.)</span>"'
# input: '"<script>alert(\"lol hax\");</script>"'
# output: ["<span>Anonymous said: &lt;script&gt;alert(&quot;lol hax&quot;);&lt;/script&gt;</span>"]
- program: '@sh "echo \(.)"'
input: "\"O'Hara's Ale\""
output: ["\"echo 'O'\\\\''Hara'\\\\''s Ale'\""]
- title: Conditionals and Comparisons
entries:
- title: `==`, `!=`
body: |
The expression `a == b` will produce `true` if the results of `a` and `b`
are equal (that is, if they represent equivalent JSON documents) and
`false` otherwise. In particular, strings are never considered equal
to numbers. If you're coming from JavaScript, jq's `==` is like
JavaScript's `===`: it considers values equal only when they have the
same type as well as the same value.
`!=` is "not equal", and `a != b` returns the opposite value of `a == b`.
examples:
- program: '.[] == 1'
input: '[1, 1.0, "1", "banana"]'
output: ['true', 'true', 'false', 'false']
- title: if-then-else
body: |
`if A then B else C end` will act the same as `B` if `A`
produces a value other than false or null, but act the same
as `C` otherwise.
Checking for false or null is a simpler notion of
"truthiness" than is found in JavaScript or Python, but it
means that you'll sometimes have to be more explicit about
the condition you want: you can't test whether, e.g., a
string is empty using `if .name then A else B end`; you'll
need something more like `if (.name | length) > 0 then A else
B end` instead.
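The difference shows up with an empty string. A sketch, assuming a POSIX shell with `jq` on the `PATH` and using the `length` builtin described earlier; the literal results `"A"` and `"B"` stand in for real branches:

```shell
# An empty string is neither false nor null, so the "then" branch runs.
echo '{"name": ""}' | jq 'if .name then "A" else "B" end'                  # prints "A"

# Testing the length explicitly selects the "else" branch instead.
echo '{"name": ""}' | jq 'if (.name | length) > 0 then "A" else "B" end'   # prints "B"
```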
If the condition A produces multiple results, it is
considered "true" if any of those results is not false or
null. If it produces zero results, it's considered false.
More cases can be added to an if using `elif A then B` syntax.
examples:
- program: |-
if . == 0 then
"zero"
elif . == 1 then
"one"
else
"many"
end
input: 2
output: ['"many"']
- title: `>, >=, <=, <`
body: |
The comparison operators `>`, `>=`, `<=`, `<` return whether
their left argument is greater than, greater than or equal
to, less than or equal to or less than their right argument
(respectively).
The ordering is the same as that described for `sort`, above.
examples:
- program: '. < 5'
input: 2
output: ['true']
- title: and/or/not
body: |
jq supports the normal Boolean operators and/or/not. They have the
same standard of truth as if expressions - false and null are
considered "false values", and anything else is a "true value".
If an operand of one of these operators produces multiple
results, the operator itself will produce a result for each input.
`not` is in fact a builtin function rather than an operator,
so it is called as a filter to which things can be piped
rather than with special syntax, as in `.foo and .bar |
not`.
These three only produce the values "true" and "false", and
so are only useful for genuine Boolean operations, rather
than the common Perl/Python/Ruby idiom of
"value_that_may_be_null or default". If you want to use this
form of "or", picking between two values rather than
evaluating a condition, see the "//" operator below.
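The multiple-results behaviour can be modelled in Python by treating each operand as a list of results. This is a sketch only: `jq_and`, `jq_or` and `jq_not` are hypothetical names, and the short-circuiting (the right operand is skipped when the left result already decides the answer) is an assumption of this model.

```python
def is_truthy(value):
    # false and null are the only "false values".
    return value is not None and value is not False

def jq_and(left, right):
    results = []
    for a in left:
        if not is_truthy(a):
            # Assumed short-circuit: a false left result decides the answer.
            results.append(False)
        else:
            results.extend(is_truthy(b) for b in right)
    return results

def jq_or(left, right):
    results = []
    for a in left:
        if is_truthy(a):
            # Assumed short-circuit: a true left result decides the answer.
            results.append(True)
        else:
            results.extend(is_truthy(b) for b in right)
    return results

def jq_not(values):
    # `not` is applied to each value piped into it.
    return [not is_truthy(v) for v in values]

print(jq_and([42], ["a string"]))             # [True]
print(jq_or([True, False], [False]))          # [True, False]
print(jq_and([True, True], [True, False]))    # [True, False, True, False]
print(jq_not([True, False]))                  # [False, True]
```

Every result is a plain Boolean, which is why these operators can't be used to pick a default value.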
examples:
- program: '42 and "a string"'
input: 'null'
output: ['true']
- program: '(true, false) or false'
input: 'null'
output: ['true', 'false']
# - program: '(true, false) and (true, false)'
# input: 'null'
# output: ['true', 'false', 'false', 'false']
- program: '(true, true) and (true, false)'
input: 'null'
output: ['true', 'false', 'true', 'false']
- program: '[true, false | not]'
input: 'null'
output: ['[false, true]']
- title: Alternative operator - `//`
body: |
A filter of the form `a // b` produces the same
results as `a`, if `a` produces results other than `false`
and `null`. Otherwise, `a // b` produces the same results as `b`.
This is useful for providing defaults: `.foo // 1` will
evaluate to `1` if there's no `.foo` element in the
input. It's similar to how `or` is sometimes used in Python
(jq's `or` operator is reserved for strictly Boolean
operations).
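A rough Python model of `//`, again treating each side as a list of results. The names `alternative` and `foo_or_default` are hypothetical, and a missing key is assumed to behave like null.

```python
def is_truthy(value):
    return value is not None and value is not False

def alternative(a_results, b_results):
    # Keep the results of `a` that are neither false nor null;
    # fall back to the results of `b` only if nothing is left.
    kept = [v for v in a_results if is_truthy(v)]
    return kept if kept else list(b_results)

def foo_or_default(doc, default=42):
    # Models `.foo // 42`.
    return alternative([doc.get("foo")], [default])

print(foo_or_default({"foo": 19}))     # [19]
print(foo_or_default({}))              # [42]
print(foo_or_default({"foo": False}))  # [42]
print(foo_or_default({"foo": 0}))      # [0]
```

The last two calls show the difference from Python's `or`: `false` triggers the default, but `0` does not.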
examples:
- program: '.foo // 42'
input: '{"foo": 19}'
output: [19]
- program: '.foo // 42'
input: '{}'
output: [42]
- title: Advanced features
body: |
Variables are an absolute necessity in most programming languages, but
they're relegated to an "advanced feature" in jq.
In most languages, variables are the only means of passing around
data. If you calculate a value, and you want to use it more than once,
you'll need to store it in a variable. To pass a value to another part
of the program, you'll need that part of the program to define a
variable (as a function parameter, object member, or whatever) in
which to place the data.
It is also possible to define functions in jq, although this
is a feature whose biggest use is defining jq's standard library
(many jq functions such as `map` and `select` are in fact written
in jq).
Finally, jq has a `reduce` operation, which is very powerful but a
bit tricky. Again, it's mostly used internally, to define some
useful bits of jq's standard library.
entries:
- title: Variables
body: |
In jq, all filters have an input and an output, so manual
plumbing is not necessary to pass a value from one part of a program
to the next. Many expressions, for instance `a + b`, pass their input
to two distinct subexpressions (here `a` and `b` are both passed the
same input), so variables aren't usually necessary in order to use a
value twice.
For instance, calculating the average value of an array of numbers
requires a few variables in most languages - at least one to hold the
array, perhaps one for each element or for a loop counter. In jq, it's
simply `add / length` - the `add` expression is given the array and
produces its sum, and the `length` expression is given the array and
produces its length.
So, there's generally a cleaner way to solve most problems in jq than
defining variables. Still, sometimes they do make things easier, so jq
lets you define variables using `expression as $variable`. All
variable names start with `$`. Here's a slightly uglier version of the
array-averaging example:
length as $array_length | add / $array_length
We'll need a more complicated problem to find a situation where using
variables actually makes our lives easier.
Suppose we have an array of blog posts, with "author" and "title"
fields, and another object which is used to map author usernames to
real names. Our input looks like:
{"posts": [{"title": "Frist psot", "author": "anon"},
{"title": "A well-written article", "author": "person1"}],
"realnames": {"anon": "Anonymous Coward",
"person1": "Person McPherson"}}
We want to produce the posts with the author field containing a real
name, as in:
{"title": "Frist psot", "author": "Anonymous Coward"}
{"title": "A well-written article", "author": "Person McPherson"}
We use a variable, $names, to store the realnames object, so that we
can refer to it later when looking up author usernames:
.realnames as $names | .posts[] | {title, author: $names[.author]}
The expression `exp as $x | ...` means: for each value of expression
`exp`, run the rest of the pipeline with the entire original input, and
with `$x` set to that value. Thus `as` functions as something of a
foreach loop.
Variables are scoped over the rest of the expression that defines
them, so
.realnames as $names | (.posts[] | {title, author: $names[.author]})
will work, but
(.realnames as $names | .posts[]) | {title, author: $names[.author]}
won't.
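For comparison, here is roughly what the real-names lookup does, written as a Python generator. It is a sketch of the behaviour, not of how jq works internally, and `posts_with_real_names` is a made-up name.

```python
import json

blog = json.loads("""
{"posts": [{"title": "Frist psot", "author": "anon"},
           {"title": "A well-written article", "author": "person1"}],
 "realnames": {"anon": "Anonymous Coward",
               "person1": "Person McPherson"}}
""")

def posts_with_real_names(doc):
    names = doc["realnames"]      # .realnames as $names
    for post in doc["posts"]:     # .posts[]
        # {title, author: $names[.author]}
        yield {"title": post["title"], "author": names.get(post["author"])}

results = list(posts_with_real_names(blog))
for post in results:
    print(json.dumps(post))
```

The variable `names` stays visible inside the loop body, which mirrors the scoping rule: `$names` is only available in the part of the pipeline that follows its definition.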
examples:
- program: '.bar as $x | .foo | . + $x'
input: '{"foo":10, "bar":200}'
output: ['210']
- title: 'Defining Functions'
body: |
You can give a filter a name using "def" syntax:
def increment: . + 1;
From then on, `increment` is usable as a filter just like a
builtin function (in fact, this is how some of the builtins
are defined). A function may take arguments:
def map(f): [.[] | f];
Arguments are passed as filters, not as values. The
same argument may be referenced multiple times with
different inputs (here `f` is run for each element of the
input array). Arguments to a function work more like
callbacks than like value arguments.
If you want the value-argument behaviour for defining simple
functions, you can just use a variable:
def addvalue(f): f as $value | map(. + $value);
With that definition, `addvalue(.foo)` will add the current
input's `.foo` field to each element of the array.
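The callback-like nature of arguments can be sketched in Python by modelling a filter as a function from one input to a list of outputs. All names here (`first`, `jq_map`, `addvalue_filter`, `addvalue_variable`) are hypothetical stand-ins for the jq definitions quoted in the comments.

```python
def first(value):
    # Models the filter `.[0]`.
    return [value[0]]

def jq_map(f):
    # def map(f): [.[] | f];
    return lambda value: [[out for item in value for out in f(item)]]

def addvalue_filter(f):
    # def addvalue(f): . + [f];
    # `f` is run against whatever the current input is.
    return lambda value: [value + f(value)]

def addvalue_variable(f):
    # def addvalue(f): f as $x | map(. + $x);
    # `f` is run once against the whole input, then reused as a value.
    return lambda value: [[item + x for item in value] for x in f(value)]

data = [[1, 2], [10, 20]]

per_element = jq_map(addvalue_filter(first))(data)
print(per_element)  # [[[1, 2, 1], [10, 20, 10]]]

whole_input = addvalue_variable(first)(data)
print(whole_input)  # [[[1, 2, 1, 2], [10, 20, 1, 2]]]
```

Each call returns a list with one output. In the first case `.[0]` is evaluated separately for every inner array; in the second it is evaluated once against the outer array and the result is captured like a value argument.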
examples:
- program: 'def addvalue(f): . + [f]; map(addvalue(.[0]))'
input: '[[1,2],[10,20]]'
output: ['[[1,2,1], [10,20,10]]']
- program: 'def addvalue(f): f as $x | map(. + $x); addvalue(.[0])'
input: '[[1,2],[10,20]]'
output: ['[[1,2,1,2], [10,20,1,2]]']
- title: Reduce
body: |
The `reduce` syntax in jq allows you to combine all of the
results of an expression by accumulating them into a single
answer. As an example, we'll pass `[3,2,1]` to this expression:
reduce .[] as $item (0; . + $item)
For each result that `.[]` produces, `. + $item` is run to
accumulate a running total, starting from 0. In this
example, `.[]` produces the results 3, 2, and 1, so the
effect is similar to running something like this:
0 | (3 as $item | . + $item) |
(2 as $item | . + $item) |
(1 as $item | . + $item)
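Python programmers may recognise this as a fold. A minimal sketch using `functools.reduce`, where `jq_reduce_sum` is a made-up name for illustration:

```python
from functools import reduce

def jq_reduce_sum(items):
    # Models `reduce .[] as $item (0; . + $item)`:
    # start from 0 and add each item to the running total.
    return reduce(lambda total, item: total + item, items, 0)

print(jq_reduce_sum([3, 2, 1]))      # 6
print(jq_reduce_sum([10, 2, 5, 3]))  # 20
```

The lambda's first parameter plays the role of `.` in the update expression, and the second plays the role of `$item`.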
examples:
- program: 'reduce .[] as $item (0; . + $item)'
input: '[10,2,5,3]'
output: ['20']
- title: Assignment
body: |
Assignment works a little differently in jq than in most
programming languages. jq doesn't distinguish between references
to and copies of something - two objects or arrays are either
equal or not equal, without any further notion of being "the
same object" or "not the same object".
If an object has two fields which are arrays, `.foo` and `.bar`,
and you append something to `.foo`, then `.bar` will not get
bigger, even if you've just set `.bar = .foo`. If you're used to
programming in languages like Python, Java, Ruby, JavaScript,
etc. then you can think of it as though jq does a full deep copy
of every object before it does the assignment (for performance,
it doesn't actually do that, but that's the general idea).
entries:
- title: "`=`"
body: |
The filter `.foo = 1` will take as input an object
and produce as output an object with the "foo" field set to
1. There is no notion of "modifying" or "changing" something
in jq - all jq values are immutable. For instance,
.foo = .bar | .foo.baz = 1
will not have the side-effect of setting .bar.baz
to 1, as the similar-looking program in JavaScript, Python,
Ruby or other languages would. Unlike these languages (but
like Haskell and some other functional languages), there is
no notion of two arrays or objects being "the same array" or
"the same object". They can be equal, or not equal, but if
we change one of them in no circumstances will the other
change behind our backs.
This means that it's impossible to build circular values in
jq (such as an array whose first element is itself). This is
quite intentional, and ensures that anything a jq program
can produce can be represented in JSON.
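To get the same no-aliasing behaviour for `.foo = .bar | .foo.baz = 1` in Python, you have to copy explicitly. The sketch below uses the "deep copy before assignment" mental model; `set_path` is a hypothetical helper and the sample input is invented.

```python
import copy

def set_path(doc, path, value):
    """Return a copy of doc with the value at `path` replaced; doc is untouched."""
    result = copy.deepcopy(doc)
    target = result
    for key in path[:-1]:
        target = target[key]
    # Copy the value too, so the result never shares structure with it.
    target[path[-1]] = copy.deepcopy(value)
    return result

original = {"bar": {"qux": 0}}
step1 = set_path(original, ["foo"], original["bar"])  # .foo = .bar
step2 = set_path(step1, ["foo", "baz"], 1)            # .foo.baz = 1

print(step2)     # {'bar': {'qux': 0}, 'foo': {'qux': 0, 'baz': 1}}
print(original)  # {'bar': {'qux': 0}}
```

Without the copies, `foo` and `bar` would be the same Python dict and `.bar.baz` would change as well.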
- title: "`|=`"
body: |
As well as the assignment operator '=', jq provides the "update"
operator '|=', which takes a filter on the right-hand side and
works out the new value for the property being assigned to by running
the old value through this expression. For instance, .foo |= .+1 will
build an object with the "foo" field set to the input's "foo" plus 1.
This example should show the difference between '=' and '|=':
Provide input '{"a": {"b": 10}, "b": 20}' to the programs:
.a = .b
.a |= .b
The former will set the "a" field of the input to the "b" field of the
input, and produce the output {"a": 20, "b": 20}. The latter will set the "a"
field of the input to the "a" field's "b" field, producing {"a": 10, "b": 20}.
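The difference comes down to what the right-hand side is given as its input. A small Python sketch of the two programs, with hypothetical function names:

```python
def assign_a(doc):
    # `.a = .b`: the right-hand side is run against the whole input.
    return {**doc, "a": doc["b"]}

def update_a(doc):
    # `.a |= .b`: the right-hand side is run against the old value of `.a`.
    return {**doc, "a": doc["a"]["b"]}

doc = {"a": {"b": 10}, "b": 20}
print(assign_a(doc))  # {'a': 20, 'b': 20}
print(update_a(doc))  # {'a': 10, 'b': 20}
```

In both cases the top-level "b" field is carried over unchanged and the input itself is not modified.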
- title: "`+=`, `-=`, `*=`, `/=`, `//=`"
body: |
jq has a few operators of the form `a op= b`, which are all
equivalent to `a |= . op b`. So, `+= 1` can be used to increment values.
examples:
- program: .foo += 1
input: '{"foo": 42}'
output: ['{"foo": 43}']
- title: Complex assignments
body: |
Lots more things are allowed on the left-hand side of a jq assignment
than in most languages. We've already seen simple field accesses on
the left hand side, and it's no surprise that array accesses work just
as well:
.posts[0].title = "JQ Manual"
What may come as a surprise is that the expression on the left may
produce multiple results, referring to different points in the input
document:
.posts[].comments |= . + ["this is great"]
That example appends the string "this is great" to the "comments"
array of each post in the input (where the input is an object with a
field "posts" which is an array of posts).
When jq encounters an assignment like 'a = b', it records the "path"
taken to select a part of the input document while executing a. This
path is then used to find which part of the input to change while
executing the assignment. Any filter may be used on the
left-hand side of an equals - whichever paths it selects from the
input will be where the assignment is performed.
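The path mechanism can be sketched in Python as two steps: collect the paths, then rebuild the document with each path updated. This only models the idea. The helper names and the sample input (posts with a "comments" array) are assumptions for illustration, not jq's implementation.

```python
import copy

def comment_paths(doc):
    # Models the paths selected by `.posts[].comments`.
    return [["posts", i, "comments"] for i in range(len(doc["posts"]))]

def get_path(doc, path):
    for key in path:
        doc = doc[key]
    return doc

def update_paths(doc, paths, f):
    # Models `paths |= f`: work on a copy, so the input is never changed.
    result = copy.deepcopy(doc)
    for path in paths:
        parent = get_path(result, path[:-1])
        parent[path[-1]] = f(parent[path[-1]])
    return result

blog = {"posts": [{"title": "a", "comments": []},
                  {"title": "b", "comments": ["first"]}]}

# .posts[].comments |= . + ["this is great"]
updated = update_paths(blog, comment_paths(blog), lambda c: c + ["this is great"])
print(updated["posts"][0]["comments"])  # ['this is great']
print(updated["posts"][1]["comments"])  # ['first', 'this is great']
```

Swapping in a different path-producing function changes where the update lands, while the update step stays the same.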
This is a very powerful operation. Suppose we wanted to add a comment
to blog posts, using the same "blog" input as above. This time, we only
want to comment on the posts written by "stedolan". We can find those
posts using the "select" function described earlier:
.posts[] | select(.author == "stedolan")
The paths provided by this operation point to each of the posts that
"stedolan" wrote, and we can comment on each of them in the same way
that we did before:
(.posts[] | select(.author == "stedolan") | .comments) |=
. + ["terrible."]