File: README.md

package info (click to toggle)
miller 6.16.0-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 87,928 kB
  • sloc: ruby: 162; sh: 119; makefile: 87
file content (39 lines) | stat: -rw-r--r-- 4,350 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
This contains the implementation of the [`types.Mlrval`](./mlrval.go) datatype which is used for record values, as well as expression/variable values in the Miller `put`/`filter` DSL.

## Mlrval

The [`types.Mlrval`](./mlrval.go) structure includes **string, int, float, boolean, array-of-mlrval, map-string-to-mlrval, void, absent, and error** types as well as type-conversion logic for various operators.

* Miller's `absent` type is like Javascript's `undefined` -- it's for times when there is no such key, as in a DSL expression `$out = $foo` when the input record is `$x=3,y=4` -- there is no `$foo` so `$foo` has `absent` type. Nothing is written to the `$out` field in this case. See also [here](https://miller.readthedocs.io/en/latest/reference-main-null-data) for more information.
* Miller's `void` type is like Javascript's `null` -- it's for times when there is a key with no value, as in `$out = $x` when the input record is `$x=,$y=4`. This is an overlap with `string` type, since a void value looks like an empty string. I've gone back and forth on this (including when I was writing the C implementation) -- whether to retain `void` as a distinct type from empty-string, or not. I ended up keeping it as it made the `Mlrval` logic easier to understand.
* Miller's `error` type is for things like doing type-uncoerced addition of strings. Data-dependent errors are intended to result in `(error)`-valued output, rather than crashing Miller. See also [here](https://miller.readthedocs.io/en/latest/reference-main-data-types) for more information.
* Miller's number handling makes auto-overflow from int to float transparent, while preserving the possibility of 64-bit bitwise arithmetic.
  * This is different from JavaScript, which has only double-precision floats and thus no support for 64-bit numbers (note however that there is now [`BigInt`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt)).
  * This is also different from C and Go, wherein casts are necessary -- without which int arithmetic overflows.
  * Using `$a * $b` in Miller will auto-overflow to float. Using `$a .* $b` will stick with 64-bit integers (if `$a` and `$b` are already 64-bit integers).
  * More generally:
    * Bitwise operators such as `|`, `&`, and `^` map ints to ints.
    * The auto-overflowing math operators `+`, `*`, etc. map ints to ints unless they overflow in which case float is produced.
    * The int-preserving math operators `.+`, `.*`, etc. map ints to ints even if they overflow.
  * See also [here](https://miller.readthedocs.io/en/latest/reference-main-arithmetic) for the semantics of Miller arithmetic, which the `Mlrval` class implements.
* Since a Mlrval can be of type array-of-mlrval or map-string-to-mlrval, a Mlrval is suited for JSON decoding/encoding.

# Mlrmap

[`types.Mlrmap`](./mlrmap.go) is the sequence of key-value pairs which represents a Miller record. The key-lookup mechanism is optimized for Miller read/write usage patterns -- please see `mlrmap.go` for more details.

It's also an ordered map structure, with string keys and Mlrval values. This is used within Mlrval itself.

# Context

[`types.Context`](./context.go) supports AWK-like variables such as `FILENAME`, `NF`, `NR`, and so on.

# A note on JSON

* The code for JSON I/O is mixed between `Mlrval` and `Mlrmap. This is unsurprising since JSON is a mutually recursive data structure -- arrays can contain maps and vice versa.
* JSON has non-collection types (string, int, float, etc) as well as collection types (array and object).  Support for objects is principally in [./mlrmap_json.go](mlrmap_json.go); support for non-collection types as well as arrays is in [./mlrval_json.go](mlrval_json.go).
* Both multi-line and single-line formats are supported.
* Callsites for JSON output are record-writing (e.g. `--ojson`), the `dump` and `print` DSL routines, and the `json_stringify` DSL function.
  * The choice between single-line and multi-line for JSON record-writing is controlled by `--jvstack` and `--no-jvstack`, the former (multiline) being the default.
  * The `dump` and `print` DSL routines produce multi-line output without a way for the user to choose single-line output.
  * The `json_stringify` DSL function lets the user specify multi-line or single-line, with the former being the default,