1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279
|
---
tags: intro, python, technical, properties
date: 2016-06-05 16:00
title: Evolving toward property-based testing with Hypothesis
author: jml
---
Many people are quite comfortable writing ordinary unit tests, but feel a bit
confused when they start with property-based testing. This post shows how two
ordinary programmers started with normal Python unit tests and nudged them
incrementally toward property-based tests, gaining many advantages on the way.
<!--more-->
Background
----------
I used to work on a command-line tool with an interface much like git's. It
had a repository, and within that repository you could create branches and
switch between them. Let's call the tool `tlr`.
It was supposed to behave something like this:
List branches:
$ tlr branch
foo
* master
Switch to an existing branch:
$ tlr checkout foo
* foo
master
Create a branch and switch to it:
$ tlr checkout -b new-branch
$ tlr branch
foo
master
* new-branch
Early on, my colleague and I found a bug: when you created a new branch with
`checkout -b` it wouldn't switch to it. The behavior looked something like
this:
$ tlr checkout -b new-branch
$ tlr branch
foo
* master
new-branch
The previously active branch (in this case, `master`) stayed active, rather
than switching to the newly-created branch (`new-branch`).
Before we fixed the bug, we decided to write a test. I thought this would be a
good chance to start using Hypothesis.
Writing a simple test
---------------------
My colleague was less familiar with Hypothesis than I was, so we started with
a plain old Python unit test:
```python
def test_checkout_new_branch(self):
"""Checking out a new branch makes it the current active branch."""
tmpdir = FilePath(self.mktemp())
tmpdir.makedirs()
repo = Repository.initialize(tmpdir.path)
repo.checkout("new-branch", create=True)
self.assertEqual("new-branch", repo.get_active_branch())
```
The first thing to notice here is that the string `"new-branch"` is not
actually relevant to the test. It's just a value we picked to exercise the
buggy code. The test should be able to pass with *any valid branch name*.
Even before we started to use Hypothesis, we made this more explicit by making
the branch name a parameter to the test:
```python
def test_checkout_new_branch(self, branch_name="new-branch"):
tmpdir = FilePath(self.mktemp())
tmpdir.makedirs()
repo = Repository.initialize(tmpdir.path)
repo.checkout(branch_name, create=True)
self.assertEqual(branch_name, repo.get_active_branch())
```
(For brevity, I'll elide the docstring from the rest of the code examples)
We never manually provided the `branch_name` parameter, but this change made
it more clear that the test ought to pass regardless of the branch name.
Introducing Hypothesis
----------------------
Once we had a parameter, the next thing was to use Hypothesis to provide the
parameter for us. First, we imported Hypothesis:
```python
from hypothesis import given, strategies as st
```
And then made the simplest change to our test to actually use it:
```python
@given(branch_name=st.just("new-branch"))
def test_checkout_new_branch(self, branch_name):
tmpdir = FilePath(self.mktemp())
tmpdir.makedirs()
repo = Repository.initialize(tmpdir.path)
repo.checkout(branch_name, create=True)
self.assertEqual(branch_name, repo.get_active_branch())
```
Here, rather than providing the branch name as a default argument value, we
are telling Hypothesis to come up with a branch name for us using the
`just("new-branch")`
[strategy](https://hypothesis.readthedocs.io/en/latest/data.html). This
strategy will always come up with `"new-branch"`, so it's actually no
different from what we had before.
What we actually wanted to test is that any valid branch name worked. We
didn't yet know how to generate any valid branch name, but using a
time-honored tradition we pretended that we did:
```python
def valid_branch_names():
"""Hypothesis strategy to generate arbitrary valid branch names."""
# TODO: Improve this strategy.
return st.just("new-branch")
@given(branch_name=valid_branch_names())
def test_checkout_new_branch(self, branch_name):
tmpdir = FilePath(self.mktemp())
tmpdir.makedirs()
repo = Repository.initialize(tmpdir.path)
repo.checkout(branch_name, create=True)
self.assertEqual(branch_name, repo.get_active_branch())
```
Even if we had stopped here, this would have been an improvement. Although the
Hypothesis version of the test doesn't have any extra power over the vanilla
version, it is more explicit about what it's testing, and the
`valid_branch_names()` strategy can be reused by future tests, giving us a
single point for improving the coverage of many tests at once.
Expanding the strategy
----------------------
It's only when we get Hypothesis to start generating our data for us that we
really get to take advantage of its bug finding power.
The first thing my colleague and I tried was:
```python
def valid_branch_names():
return st.text()
```
But that failed pretty hard-core.
Turns out branch names were implemented as symlinks on disk, so valid branch
name has to be a valid file name on whatever filesystem the tests are running
on. This at least rules out empty names, `"."`, `".."`, very long names, names
with slashes in them, and probably others (it's actually
[really complicated](https://en.wikipedia.org/wiki/Filename#Comparison_of_filename_limitations)).
Hypothesis had made something very clear to us: neither my colleague nor I
actually knew what a valid branch name should be. None of our interfaces
documented it, we had no validators, no clear ideas for rendering & display,
nothing. We had just been assuming that people would pick good, normal,
sensible names.
It was as if we had suddenly gained the benefit of extensive real-world
end-user testing, just by calling the right function. This was:
1. Awesome. We've found bugs that our users won't.
2. Annoying. We really didn't want to fix this bug right now.
In the end, we compromised and implemented a relatively conservative strategy
to simulate the good, normal, sensible branch names that we expected:
```python
from string import ascii_lowercase
VALID_BRANCH_CHARS = ascii_lowercase + "_-."
def valid_branch_names():
# TODO: Handle unicode / weird branch names by rejecting them early, raising nice errors
# TODO: How do we handle case-insensitive file systems?
return st.text(alphabet=VALID_BRANCH_CHARS, min_size=1, max_size=112)
```
Not ideal, but *much* more extensive than just hard-coding `"new-branch"`, and
much clearer communication of intent.
Adding edge cases
-----------------
There's one valid branch name that this strategy *could* generate, but
probably won't: `master`. If we left the test just as it is, then one time in
a hojillion the strategy would generate `"master"` and the test would fail.
Rather than waiting on chance, we encoded this in the `valid_branch_names`
strategy, to make it more likely:
```python
def valid_branch_names():
return st.text(alphabet=letters, min_size=1, max_size=112).map(
lambda t: t.lower()
) | st.just("master")
```
When we ran the tests now, they failed with an exception due to the branch
`master` already existing. To fix this, we used `assume`:
```python
from hypothesis import assume
@given(branch_name=valid_branch_names())
def test_checkout_new_branch(self, branch_name):
assume(branch_name != "master")
tmpdir = FilePath(self.mktemp())
tmpdir.makedirs()
repo = Repository.initialize(tmpdir.path)
repo.checkout(branch_name, create=True)
self.assertEqual(branch_name, repo.get_active_branch())
```
Why did we add `master` to the valid branch names if we were just going to
exclude it anyway? Because when other tests say "give me a valid branch name",
we want *them* to make the decision about whether `master` is appropriate or
not. Any future test author will be compelled to actually think about whether
handling `master` is a thing that they want to do. That's one of the great
benefits of Hypothesis: it's like having a rigorous design critic in your
team.
Going forward
-------------
We stopped there, but we need not have. Just as the test should have held for
any branch, it should also hold for any repository. We were just creating an
empty repository because it was convenient for us.
If we were to continue, the next step would be to [write a `repositories()`
function to generate repositories]({% post_url 2016-05-11-generating-the-right-data/)
with more varied contents, commit histories, and existing branches.
The test might then look something like this:
```python
@given(repo=repositories(), branch_name=valid_branch_names())
def test_checkout_new_branch(self, repo, branch_name):
"""
Checking out a new branch results in it being the current active
branch.
"""
assume(branch_name not in repo.get_branches())
repo.checkout(branch_name, create=True)
self.assertEqual(branch_name, repo.get_active_branch())
```
This is about as close to a bona fide "property" as you're likely to get in
code that isn't a straight-up computer science problem: if you create and
switch to a branch that doesn't already exist, the new active branch is the
newly created branch.
We got there not by sitting down and thinking about the properties of our
software in the abstract, nor by necessarily knowing much about property-based
testing, but rather by incrementally taking advantage of features of Python
and Hypothesis. On the way, we discovered and, umm, contained a whole class of
bugs, and we made sure that all future tests would be heaps more powerful.
Win.
|