File: implicit-typing-removed.md

---
title: The Norway Problem - why StrictYAML refuses to do implicit typing and so should you
---

A while back I met an old coworker and he started telling me about this
interesting bug he faced:

"So, we started internationalizing the website by creating a config
file. We added the UK, Ireland, France and Germany at first."

```yaml
countries:
- GB
- IE
- FR
- DE
```

"This was all fine. However, one day after a quick configuration change
all hell broke loose. It turned out that while the UK, France and
Germany were all fine, *Norway* was *not*..."

"While the website went down and we were losing money we chased down
a number of loose ends until finally finding the root cause."

"It turned out that if you feed this configuration file into
[pyyaml](http://pyyaml.org):"

```yaml
countries:
- GB
- IE
- FR
- DE
- NO
```

"This is what you got in return:"

```python
>>> from yaml import safe_load    # PyYAML is imported as "yaml"
>>> safe_load(the_configuration)
{'countries': ['GB', 'IE', 'FR', 'DE', False]}
```

It snows a *lot* in False.

When this is fed to code that expects a string of the form 'NO',
the code will usually break, often with a cryptic error.
Typically it would be a KeyError raised when the boolean `False`
is used as a key in a dict that only has string keys.
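
A minimal sketch of that failure mode, assuming a hypothetical `COUNTRY_NAMES` lookup table keyed by country code (the table and the `lookup_names` helper are illustrative and not part of the original story):

```python
# Hypothetical lookup table keyed by ISO country code (an assumption for illustration).
COUNTRY_NAMES = {
    "GB": "United Kingdom",
    "IE": "Ireland",
    "FR": "France",
    "DE": "Germany",
    "NO": "Norway",
}

# What the parser handed back: the last entry is the boolean False, not "NO".
countries = ["GB", "IE", "FR", "DE", False]


def lookup_names(codes):
    """Map each country code to its display name."""
    return [COUNTRY_NAMES[code] for code in codes]


try:
    lookup_names(countries)
    error = None
except KeyError as exc:
    error = exc

print(repr(error))  # KeyError(False)
```

The error message mentions `False`, which appears nowhere in the configuration file, and that is what makes the bug hard to trace.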

It can be "quick fixed" by using quotes - a fix for sure, but
kind of a hack - and by that time the damage is done:

```yaml
countries:
- GB
- IE
- FR
- DE
- 'NO'
```

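The most tragic aspect of this bug, however, is that it is
*intended* behavior according to the YAML 1.1 specification's
[boolean type](https://yaml.org/type/bool.html), which pyyaml follows.
The [YAML 1.2 specification](https://yaml.org/spec/1.2.2/) narrowed
booleans to `true` and `false`, but it still types unquoted values implicitly.
The real fix requires explicitly disregarding the spec - which
is why most YAML parsers have it.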
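
To make the mechanism concrete, here is a simplified sketch of implicit typing written in plain Python. It is not pyyaml's actual code: the `resolve_scalar` function and its patterns are assumptions modelled on a subset of the YAML 1.1 bool, null and float forms, and real parsers recognise many more (integers, hex, timestamps and so on).

```python
import re

# Simplified patterns covering a subset of the YAML 1.1 implicit types.
# Illustration only: not the resolver of any real library.
BOOL_TRUE = re.compile(r"^(?:yes|Yes|YES|true|True|TRUE|on|On|ON)$")
BOOL_FALSE = re.compile(r"^(?:no|No|NO|false|False|FALSE|off|Off|OFF)$")
NULL = re.compile(r"^(?:~|null|Null|NULL)$")
FLOAT = re.compile(r"^[-+]?[0-9]+\.[0-9]+$")


def resolve_scalar(text):
    """Guess a Python value for an unquoted scalar, the way implicit typing does."""
    if BOOL_TRUE.match(text):
        return True
    if BOOL_FALSE.match(text):
        return False
    if NULL.match(text):
        return None
    if FLOAT.match(text):
        return float(text)
    # Anything that matches no pattern falls through as a string.
    return text


for scalar in ["GB", "NO", "9.3", "3.5.3", "Null"]:
    print(repr(scalar), "->", repr(resolve_scalar(scalar)))
```

The author of the file never asked for a boolean, a float or a null. The type is decided purely by what the text happens to look like, so `NO` and `GB` end up with different types even though both were written as country codes.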

StrictYAML sidesteps this problem by ignoring key parts of the
spec, in an attempt to create a "zero surprises" parser.

*Everything* is a string by default:

```python
>>> from strictyaml import load
>>> load(the_configuration).data
{'countries': ['GB', 'IE', 'FR', 'DE', 'NO']}
```


## String or float?

Norway is just the tip of the iceberg. The first time this problem hit me
I was maintaining a configuration file of application versions. I had
a file like this initially - which caused no issues:

```yaml
python: 3.5.3
postgres: 9.3.0
```

However, if I changed it *very* slightly:

```yaml
python: 3.5.3
postgres: 9.3
```

I started getting type errors because it was parsed like this:

```python
>>> from ruamel.yaml import YAML
>>> YAML(typ="safe").load(versions) == {"python": "3.5.3", "postgres": 9.3}    # oops those *both* should have been strings
```

Again, this led to type errors in my code. Again, I 'quick fixed' it with quotes.
However, the solution I really wanted was:

```python
>>> from strictyaml import load
>>> load(versions).data == {"python": "3.5.3", "postgres": "9.3"}    # that's better
```
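
The type error is only the visible symptom. A quieter one is that a float cannot round-trip a version string. A small sketch in plain Python, using made-up version values rather than anything from the configuration above:

```python
# Version strings as they might appear unquoted in a config file (made-up values).
raw_versions = ["9.3", "9.30", "9.10"]

# What implicit typing effectively does to each of them.
as_floats = [float(v) for v in raw_versions]

# Converting back to text no longer recovers what was written.
round_tripped = [str(f) for f in as_floats]

print(as_floats)       # [9.3, 9.3, 9.1]
print(round_tripped)   # ['9.3', '9.3', '9.1']

# Ordering breaks too: version 9.10 is newer than 9.3, but not as a float.
print(float("9.10") < float("9.3"))  # True
```

Quoting the values avoids this, but only if everyone who edits the file remembers to do it every time.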


## The world's most buggy name

[Christopher Null](http://www.wired.com/2015/11/null) has a name that is
notorious for breaking software code - airlines, banks, every bug caused
by a programmer who didn't know a type from their elbow has hit him.

YAML, sadly, is no exception:

```yaml
first name: Christopher
surname: Null
```

```python
# Is it okay if we just call you Christopher None instead?
>>> from yaml import safe_load
>>> safe_load(name) == {"first name": "Christopher", "surname": None}
```

With StrictYAML:

```python
>>> from strictyaml import load
>>> load(name).data == {"first name": "Christopher", "surname": "Null"}
```


## Type theoretical concerns

Type theory is a popular topic with regard to programming languages,
where a well designed type system is regarded (rightly) as a constraint
that catches bugs at an early stage of development, while *poorly*
designed type systems provide fertile breeding ground for edge case
bugs.

(It's equally true that extremely strict type systems require a lot
more upfront effort and that the law of diminishing returns applies to
type strictness - a cogent answer to the question "why is so little
software written in Haskell?")

A less popular, though equally true, idea is that markup
languages like YAML have the same issues with types - as demonstrated
above.


## User Experience

In a way, type systems can be considered both a mathematical concern
and a UX device.

In the above, and in most cases, implicit typing represents a major violation
of the UX [principle of least astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment).