File: design.html

---
layout: default
title: Language Design
---

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h1 id="language_design">Language Design</h1>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>An exposition of the basic thinking that underpinned Jsonnet's design.</p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="objectives">Objectives</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        <a href="http://www.json.org">JSON</a> has emerged as the de facto standard for communication
        of structured data, both between machines and at the human / machine boundary.  However, in
        large quantities JSON can be unwieldy for humans, especially when duplication needs to be
        kept in sync between different parts of the data structure.  Many people address this by
        writing scripts that generate JSON.  Typically these are written in general purpose
        programming languages like Python.  However, maintaining these scripts can be non-trivial,
        especially for persons unfamiliar with the generation code.
      </p>

      <p>
        Jsonnet attempts to solve this problem in a specialized and principled way.  Its design was
        guided by the following criteria:
      </p>

      <ul>
        <li>
          <b>Hermeticity</b>:  Code may be treated as data.  The same JSON should be generated
          regardless of the environment, i.e. without non-deterministic or system-dependent
          behaviors.
        </li>
        <li>
          <b>Templating language</b>:  Code should be interleaved with verbatim data so that it is
          easy to maintain the two in synchrony.  The success of templating languages proves their
          effectiveness.
        </li>
        <li>
          <b>Variants</b>: It should be simple and intuitive to derive variants of any existing
          configurations that override attributes for special or ad-hoc purposes.
        </li>
        <li>
          <b>Modularity</b>:  As configurations grow, it must be possible to manage the complexity
          using standard techniques for <a
          href="http://dl.acm.org/citation.cfm?id=808431">programming in the large</a>.
          Configurations may span many files and many developers.  Errors should provide stack
          traces that describe the nested context.
        </li>
        <li>
          <b>Familiarity</b>:  Raw data should be specified as JSON.  Computation constructs should
          behave in standard ways and compose predictably.
        </li>
        <li>
          <b>Powerful yet simple</b>: Trivial cases should be trivial.  More complex cases should be
          approachable, with localized additional complexity.  Everything must be <i>possible</i>,
          yet the language must still have a small footprint for learning and future tooling.
        </li>
        <li>
          <b>Wide scope</b>:  The same language should be able to configure anything.  Ideally, all
          components of a system should be managed via a well-maintained centralized configuration,
          written in a common language.
        </li>
        <li>
          <b>Formal rigor</b>:  There must be an authoritative specification that is complete and
          simple enough to understand, and a comprehensive set of tests.  This allows new
          implementations to be developed without compatibility hurdles.  The language should
          <i>not</i> be defined by its first implementation.
        </li>
      </ul>
      <p>
        Before Jsonnet, no existing configuration languages or DSLs satisfied these criteria.  In
        fact, the vast majority only satisfied one or two of them.
      </p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="json">Rationale for extending JSON</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        Extending JSON means building a language that fits within the existing constraints of the
        JSON specification.  This means that all JSON data must behave the same in Jsonnet (i.e. be
        emitted unchanged), with additional behaviors being expressible using constructs that would
        be syntax errors in standard JSON.  This choice did constrain us substantially, but we
        believe it was the correct one for the following reasons:
      </p>

      <p>
        A common kind of Jsonnet program is mostly JSON data, with only a few occurrences of
        Jsonnet constructs.  In this case, someone who knows JSON can maintain the file with no
        additional knowledge.
      </p>
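
      <p>
        As a minimal sketch of such a file (the field names are invented for illustration),
        everything below is plain JSON except the last field, which uses <code>self</code> to
        avoid repeating the port number:
      </p>
      <pre class="large">{
    "name": "frontend",
    "port": 8080,
    "replicas": 3,
    "health_check_port": self.port + 1
}</pre>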

      <p>
        Systems that accept JSON can be transparently modified to accept Jsonnet instead by
        inserting a step to convert the Jsonnet to JSON.  If you want to be extra paranoid, you can
        invoke Jsonnet only if the JSON parsing failed, but this is not actually necessary.
      </p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="reflection">Rationale for Reflection</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        In JSON, the object construct is used both as a dictionary and as a conventional object.
        The same is true in Jsonnet.  This matters because dictionaries need to support iteration
        over their keys, tests for the existence of a key, and construction with computed key
        names.  Applied to objects, these features amount to reflection.  So Jsonnet naturally has
        reflection, because objects and dictionaries are the same thing.
      </p>
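      <p>
        A small sketch of these three operations, using the standard library functions
        <code>std.objectFields</code> and <code>std.objectHas</code> together with an object
        comprehension.  The <code>ports</code> object and its field names are made up for
        illustration:
      </p>
      <pre class="large">local ports = { http: 80, https: 443 };
{
    // Iterate over the keys of the object.
    names: std.objectFields(ports),
    // Test for the existence of a key.
    has_ssh: std.objectHas(ports, "ssh"),
    // Build a new object with computed key names.
    shifted: { [k + "_alt"]: ports[k] + 8000 for k in std.objectFields(ports) },
}</pre>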
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="turing_completeness">Rationale for Turing-Completeness</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        Jsonnet is Turing-complete as it is possible to write non-terminating programs.
        Configurations should always terminate, so ideally we would consider non-termination as an
        error.  However, doing so in general is <a
        href="http://en.wikipedia.org/wiki/Halting_problem">impossible</a> and even when constrained
        to practical use cases, it remains impractical.  Typical approaches for enforcing
        termination either restrict the language (e.g. to primitive recursion), which makes some
        programs impossible / difficult to write, or, alternatively, require the programmer to
        provide evidence that the program terminates, via some sort of annotation / <a
        href="http://www.turingarchive.org/browse.php/B/8">energy function</a>.  Both of these make
        the programmer's life more difficult.
      </p>
      <p>
        Furthermore, even languages that are not Turing-complete can consume arbitrary CPU or RAM
        by running intensive algorithms on large inputs.  So enforcing termination is firstly not practical, and
        secondly does not actually solve the more practical problem of bounding resource consumption
        during execution.
      </p>
      <p>
        For Jsonnet, we decided that enforcing termination would create more problems than it would
        solve.
      </p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="dynamic_typing">Rationale for Dynamic Typing</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        Modern <a href="http://en.wikipedia.org/wiki/Type_system">type systems</a> typically either
        require annotations, use type inference, or are dynamic.  The first two approaches have two
        advantages.  Firstly, some errors can be detected before execution (although some correct
        programs are also rejected).  Secondly, they can be implemented more efficiently, since the
        additional knowledge about the program means special memory representations can be used and
        some instructions can be elided.  However, annotations are additional bureaucracy, and type
        inference produces unification errors that the programmer has to understand and fix.
      </p>
      <p>
        Dynamic typing means checking types at run-time, and raising errors via the language's
        existing error reporting mechanism.  This is conceptually much simpler for the programmer.
        In a configuration language, runtime performance is not as important as it is in most other
        contexts, and additionally programs tend to execute quickly so testing is much easier.  On
        the other hand, simplicity is of the utmost importance.
      </p>
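      <p>
        As a rough illustration, a function can inspect the type of its argument at run-time with
        <code>std.type</code> and report a problem through the ordinary <code>error</code>
        construct.  The function <code>double</code> is a hypothetical example, not part of the
        language:
      </p>
      <pre class="large">local double(x) =
    if std.type(x) == "number" then
        x * 2
    else
        error "double expects a number, got " + std.type(x);
{
    ok: double(21),
    // Uncommenting the next field makes evaluation fail with the error above.
    // bad: double("21"),
}</pre>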
      <p>
        For Jsonnet, i.e. in the context of configuration languages, we decided that simplicity was
        far more important than the minimal benefits of runtime efficiency and static error
        detection.
      </p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="oo">Rationale for Object-Oriented Semantics</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        Object-oriented semantics (specifically, late binding) are ideal for deriving variants from
        existing data.  Most modern object-oriented languages distinguish between classes and
        objects.  Classes can be extended, or instantiated into objects, and objects cannot be
        extended.  However, some are <a
        href="http://en.wikipedia.org/wiki/Prototype-based_programming">prototype-based</a>, where
        the concepts of class and object are essentially fused.  In Jsonnet, we decided to use the
        latter form because JSON does not have classes, and it's simpler to only have one concept
        instead of two.
      </p>
      <p>
        Furthermore, we decided to use mixin semantics for inheritance, because they are the most
        powerful and flexible form of inheritance that has been widely studied, yet remain
        remarkably simple to define and use.
      </p>
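      <p>
        The following sketch shows late binding and mixins together.  The object names and fields
        are invented for illustration.  <code>address</code> is defined once in <code>Base</code>
        but, because <code>self</code> is late-bound, it picks up whatever values the mixins
        supply:
      </p>
      <pre class="large">local Base = {
    name: "server",
    port: 80,
    address: self.name + ":" + self.port,  // late-bound via self
};
// Mixin that overrides a single field.
local Testing = { port: 8080 };
// Mixin that builds on the value it overrides, via super.
local Debug = { name: super.name + "-debug" };
Base + Testing + Debug</pre>
      <p>
        Here <code>address</code> evaluates to <code>"server-debug:8080"</code>, even though
        neither mixin mentions that field.
      </p>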
      <p>
        One novelty is our notion of a first-class <code>super</code> construct, which can be used
        on its own rather than only in a field lookup such as <code>super.f</code>.  The behavior of
        <code>super</code> by itself is given formally in the <a
        href="/language/spec.html">specification</a>.  We decided not to restrict <code>super</code>
        in order to keep the language simple (fewer rules).
      </p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="functional">Rationale for Pure Functional Semantics</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        There are ideological and practical reasons for designing Jsonnet as a pure functional
        language.  On the practical side, pure functional languages have no side-effects, which
        (together with determinism) gives us the property of hermeticity.  This allows code to be
        treated as data, because the code will always evaluate to the same thing, and it is not
        allowed to make changes to its environment (e.g. by writing to files).  By analogy,
        compressed data can be decompressed on demand because it will always produce the same data
        and the decompression process doesn't have any external side-effects.
      </p>
      <p>
        Furthermore, the property of <a
        href="http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)">referential
        transparency</a> essentially extends the hermeticity property into submodules of the Jsonnet
        code itself, which means that different parts of the code cannot interfere with each other
        and everything composes in a very predictable way.  In a language whose purpose is simply to
        define data, the property of referential transparency is very natural.  Ideologically
        speaking, functional programming language advocates have always claimed that functional
        languages are excellent at defining, translating, and refining structured data.
      </p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="lazy">Rationale for Lazy Semantics</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        The late binding from the object-oriented semantics already embodies some of the features of
        a lazy language.  Errors do not occur unless a field is actually dereferenced, and cyclic
        structures can be created.  For example, the following is valid even in an eager version of
        the language:
      </p>
<pre class="large">local x = {a: "apple", b: y.b},
      y = {a: x.a, b: "banana"};
x</pre>
      <p>
        It would therefore be confusing if the following was not also valid, which leads us to lazy
        semantics for arrays.
      </p>
      <pre class="large">local x = ["apple", y[1]],
      y = [x[0], "banana"];
x</pre>
      <p>
        Therefore, for consistency, the whole language is lazy.  It does not harm the language to be
        lazy: Performance is not significantly affected, stack traces are still possible, and it
        doesn't interfere with I/O (because there is no I/O).  There is also a precedent for
        laziness, e.g. in Makefiles and the Nix expression language.
      </p>
      <p>
        Arguably, laziness brings real benefits in terms of abstraction and modularity.  It is
        possible to build infinite data-structures, and there is unrestricted beta expansion.  For
        example, the following two snippets of code are equivalent only in a lazy language, because
        an eager language would evaluate <code>1 / x</code> in the second snippet even when
        <code>x</code> is zero.
      </p>
      <pre class="large">if x == 0 then 0 else if x &gt; 0 then 1 / x else -1/x</pre>
      <pre class="large">local r = 1 / x;
if x == 0 then 0 else if x &gt; 0 then r else -r
</pre>
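      <p>
        The earlier remark that errors do not occur unless a field is actually dereferenced can be
        sketched in the same spirit.  The <code>settings</code> object and its fields are
        hypothetical.  Only <code>port</code> is ever read, so the <code>error</code> in
        <code>password</code> is never raised:
      </p>
      <pre class="large">local settings = {
    port: 8080,
    // Never evaluated below, so this error is never raised.
    password: error "password must be overridden",
};
{ port: settings.port }</pre>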
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="modularity">Modularity and Encapsulation</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        In Jsonnet, a <dfn>module</dfn> is typically a Jsonnet file that defines an object whose
        fields contain useful values, such as functions or objects that can be specialized for a
        particular purpose via extension.  Using an object at the top level of the module allows
        adding other fields later on, without having to alter user code.  When writing such a
        module, it is advisable to expose only the interface to the module, and not its
        implementation.  This is called encapsulation, and it allows changing the implementation
        later, despite the module being imported by many other Jsonnet files.
      </p>
      <p>
        Jsonnet's primary feature for encapsulation is the <code>local</code> keyword.  This makes
        it possible to define variables that are visible only to the module, and impossible to
        access from outside.  The following is a simple example.  Other code can import
        <tt>util.jsonnet</tt> but will not be able to see the <code>internal</code> object, and
        therefore not the function <code>square</code>.
      </p>
      <pre class="large"><code>// util.jsonnet
local internal = {
    square(x):: x*x,
};
{
    euclidianDistance(x1, y1, x2, y2)::
        std.sqrt(internal.square(x2-x1) + internal.square(y2-y1)),
}
</code></pre>
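      <p>
        A sketch of a file that uses this module might look as follows.  The file name
        <tt>main.jsonnet</tt> is an assumption, and it is assumed to sit in the same directory as
        <tt>util.jsonnet</tt>:
      </p>
      <pre class="large">// main.jsonnet (hypothetical)
local util = import "util.jsonnet";
{
    distance: util.euclidianDistance(0, 0, 3, 4),
}</pre>
      <p>
        Here <code>distance</code> evaluates to 5.  An attempt to reach
        <code>util.internal</code> from this file would fail, because the imported object has no
        such field.
      </p>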
      <p>
        It is also possible to store <code>square</code> in a field, which exposes it to those
        importing the module:
      </p>
      <pre class="large">// util2.jsonnet
{
    square(x):: x*x,
    euclidianDistance(x1, y1, x2, y2)::
        std.sqrt(self.square(x2-x1) + self.square(y2-y1)),
}</pre>
      <p>
        This allows users to redefine the square function, as shown in the very strange code below.
        In some cases, this is actually what you want and is very useful.  But it does make it
        harder to maintain backwards compatibility.  For example, if you later change the
        implementation of <code>euclidianDistance</code> to inline the square call, then user code
        will behave differently.
      </p>
      <pre class="large">// myfile.jsonnet
local util2 = (import "util2.jsonnet") { square(x):: x*x*x };
{
    crazy: util2.euclidianDistance(1,2,3,4)
}</pre>
      <p>
        In conclusion, Jsonnet allows you to either expose these details or hide them.  So choose
        wisely, and know that everything you expose will potentially be used in ways that you didn't
        expect.  Keep your interface small if possible.
      </p>
      <p>
        A common belief is that languages should make <code>local</code> the default state, with an
        explicit construct to allow outside access.  This ensures that things are not accidentally
        (or apathetically) exposed.  In the case of Jsonnet, backwards compatibility with JSON
        prohibits that design since in JSON everything is a visible field.
      </p>
    </div>
    <div style="clear: both"></div>
  </div>
</div>


<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <h2 id="misc">Miscellaneous Design Choices</h2>
    </div>
    <div style="clear: both"></div>
  </div>
</div>

<div class="hgroup">
  <div class="hgroup-inline">
    <div class="panel">
      <p>
        The language is designed to be implementable via desugaring, i.e. there is a simple core
        language and the other constructs are translated down to this core language before
        interpretation.  This technique allows a language to have considerable expressive power,
        while remaining easy to implement.
      </p>
      <p>
        The string formatting operator <code>%</code> is defined to behave identically to the
        Python equivalent, and therefore very similarly to the C <tt>printf</tt> micro-language.
        We decided it would be better to be compatible with an existing popular language wherever
        possible, and Python is often the language of choice for operations people.
      </p>
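      <p>
        For instance, the <code>%</code> operator accepts either an array of positional values or
        an object of named values, much as Python accepts a tuple or a dictionary.  The strings
        and values below are made up for illustration:
      </p>
      <pre class="large">{
    positional: "%s has %d items (%.1f%% full)" % ["cart", 3, 37.5],
    named: "%(host)s:%(port)d" % { host: "localhost", port: 8080 },
}</pre>
      <p>
        These evaluate to <code>"cart has 3 items (37.5% full)"</code> and
        <code>"localhost:8080"</code> respectively.
      </p>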
      <p>
        In contrast, we decided that the semantics of Python's <tt>and</tt> and <tt>or</tt>
        operators, which have meaning beyond boolean logic, might be confusing for those without
        prior knowledge of such languages.  We therefore implemented the simpler
        <code>&amp;&amp;</code> and <code>||</code> operators, which only operate on booleans.  More
        complex cases can be implemented explicitly with the <code>if</code> construct.
      </p>
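      <p>
        As a sketch of the difference, the Python idiom of using <tt>or</tt> to supply a default
        value is written out with <code>if</code> in Jsonnet.  The variable <code>name</code> and
        the field names are hypothetical:
      </p>
      <pre class="large">local name = null;
{
    // Python would allow: name or "anonymous"
    display_name: if name != null then name else "anonymous",
    // Both operands are booleans, so the operator applies directly.
    enabled: true &amp;&amp; !false,
}</pre>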
    </div>
    <div style="clear: both"></div>
  </div>
</div>