File: 2016-06-05-incremental-property-based-testing.md

package info (click to toggle)
python-hypothesis 6.138.0-1
links: PTS, VCS
area: main
in suites: sid
size: 15,272 kB
sloc: python: 62,853; ruby: 1,107; sh: 253; makefile: 41; javascript: 6
file content (279 lines) | stat: -rw-r--r-- 9,784 bytes
parent folder | download | duplicates (2)
---
tags: intro, python, technical, properties
date: 2016-06-05 16:00
title: Evolving toward property-based testing with Hypothesis
author: jml
---

Many people are quite comfortable writing ordinary unit tests, but feel a bit
confused when they start with property-based testing. This post shows how two
ordinary programmers started with normal Python unit tests and nudged them
incrementally toward property-based tests, gaining many advantages on the way.

<!--more-->

Background
----------

I used to work on a command-line tool with an interface much like git's. It
had a repository, and within that repository you could create branches and
switch between them. Let's call the tool `tlr`.

It was supposed to behave something like this:

List branches:

    $ tlr branch
    foo
    * master

Switch to an existing branch:

    $ tlr checkout foo
    * foo
    master

Create a branch and switch to it:

    $ tlr checkout -b new-branch
    $ tlr branch
    foo
    master
    * new-branch

Early on, my colleague and I found a bug: when you created a new branch with
`checkout -b` it wouldn't switch to it. The behavior looked something like
this:

    $ tlr checkout -b new-branch
    $ tlr branch
    foo
    * master
    new-branch

The previously active branch (in this case, `master`) stayed active, rather
than switching to the newly-created branch (`new-branch`).

Before we fixed the bug, we decided to write a test. I thought this would be a
good chance to start using Hypothesis.

Writing a simple test
---------------------

My colleague was less familiar with Hypothesis than I was, so we started with
a plain old Python unit test:

```python
def test_checkout_new_branch(self):
    """Checking out a new branch makes it the current active branch."""
    tmpdir = FilePath(self.mktemp())
    tmpdir.makedirs()
    repo = Repository.initialize(tmpdir.path)
    repo.checkout("new-branch", create=True)
    self.assertEqual("new-branch", repo.get_active_branch())
```

The first thing to notice here is that the string `"new-branch"` is not
actually relevant to the test. It's just a value we picked to exercise the
buggy code. The test should be able to pass with *any valid branch name*.

Even before we started to use Hypothesis, we made this more explicit by making
the branch name a parameter to the test:

```python
def test_checkout_new_branch(self, branch_name="new-branch"):
    tmpdir = FilePath(self.mktemp())
    tmpdir.makedirs()
    repo = Repository.initialize(tmpdir.path)
    repo.checkout(branch_name, create=True)
    self.assertEqual(branch_name, repo.get_active_branch())
```

(For brevity, I'll elide the docstring from the rest of the code examples)

We never manually provided the `branch_name` parameter, but this change made
it more clear that the test ought to pass regardless of the branch name.

Introducing Hypothesis
----------------------

Once we had a parameter, the next thing was to use Hypothesis to provide the
parameter for us. First, we imported Hypothesis:

```python
from hypothesis import given, strategies as st
```

And then made the simplest change to our test to actually use it:

```python
@given(branch_name=st.just("new-branch"))
def test_checkout_new_branch(self, branch_name):
    tmpdir = FilePath(self.mktemp())
    tmpdir.makedirs()
    repo = Repository.initialize(tmpdir.path)
    repo.checkout(branch_name, create=True)
    self.assertEqual(branch_name, repo.get_active_branch())
```

Here, rather than providing the branch name as a default argument value, we
are telling Hypothesis to come up with a branch name for us using the
`just("new-branch")`
[strategy](https://hypothesis.readthedocs.io/en/latest/data.html). This
strategy will always come up with `"new-branch"`, so it's actually no
different from what we had before.

What we actually wanted to test is that any valid branch name worked. We
didn't yet know how to generate any valid branch name, but using a
time-honored tradition we pretended that we did:

```python
def valid_branch_names():
    """Hypothesis strategy to generate arbitrary valid branch names."""
    # TODO: Improve this strategy.
    return st.just("new-branch")


@given(branch_name=valid_branch_names())
def test_checkout_new_branch(self, branch_name):
    tmpdir = FilePath(self.mktemp())
    tmpdir.makedirs()
    repo = Repository.initialize(tmpdir.path)
    repo.checkout(branch_name, create=True)
    self.assertEqual(branch_name, repo.get_active_branch())
```

Even if we had stopped here, this would have been an improvement. Although the
Hypothesis version of the test doesn't have any extra power over the vanilla
version, it is more explicit about what it's testing, and the
`valid_branch_names()` strategy can be reused by future tests, giving us a
single point for improving the coverage of many tests at once.

Expanding the strategy
----------------------

It's only when we get Hypothesis to start generating our data for us that we
really get to take advantage of its bug finding power.

The first thing my colleague and I tried was:

```python
def valid_branch_names():
    return st.text()
```

But that failed pretty hard-core.

Turns out branch names were implemented as symlinks on disk, so valid branch
name has to be a valid file name on whatever filesystem the tests are running
on. This at least rules out empty names, `"."`, `".."`, very long names, names
with slashes in them, and probably others (it's actually
[really complicated](https://en.wikipedia.org/wiki/Filename#Comparison_of_filename_limitations)).

Hypothesis had made something very clear to us: neither my colleague nor I
actually knew what a valid branch name should be. None of our interfaces
documented it, we had no validators, no clear ideas for rendering & display,
nothing. We had just been assuming that people would pick good, normal,
sensible names.

It was as if we had suddenly gained the benefit of extensive real-world
end-user testing, just by calling the right function. This was:

 1. Awesome. We've found bugs that our users won't.
 2. Annoying. We really didn't want to fix this bug right now.

In the end, we compromised and implemented a relatively conservative strategy
to simulate the good, normal, sensible branch names that we expected:

```python
from string import ascii_lowercase

VALID_BRANCH_CHARS = ascii_lowercase + "_-."


def valid_branch_names():
    # TODO: Handle unicode / weird branch names by rejecting them early, raising nice errors
    # TODO: How do we handle case-insensitive file systems?
    return st.text(alphabet=VALID_BRANCH_CHARS, min_size=1, max_size=112)
```

Not ideal, but *much* more extensive than just hard-coding `"new-branch"`, and
much clearer communication of intent.

Adding edge cases
-----------------

There's one valid branch name that this strategy *could* generate, but
probably won't: `master`. If we left the test just as it is, then one time in
a hojillion the strategy would generate `"master"` and the test would fail.

Rather than waiting on chance, we encoded this in the `valid_branch_names`
strategy, to make it more likely:

```python
def valid_branch_names():
    return st.text(alphabet=letters, min_size=1, max_size=112).map(
        lambda t: t.lower()
    ) | st.just("master")
```

When we ran the tests now, they failed with an exception due to the branch
`master` already existing. To fix this, we used `assume`:

```python
from hypothesis import assume


@given(branch_name=valid_branch_names())
def test_checkout_new_branch(self, branch_name):
    assume(branch_name != "master")
    tmpdir = FilePath(self.mktemp())
    tmpdir.makedirs()
    repo = Repository.initialize(tmpdir.path)
    repo.checkout(branch_name, create=True)
    self.assertEqual(branch_name, repo.get_active_branch())
```

Why did we add `master` to the valid branch names if we were just going to
exclude it anyway? Because when other tests say "give me a valid branch name",
we want *them* to make the decision about whether `master` is appropriate or
not. Any future test author will be compelled to actually think about whether
handling `master` is a thing that they want to do. That's one of the great
benefits of Hypothesis: it's like having a rigorous design critic in your
team.

Going forward
-------------

We stopped there, but we need not have. Just as the test should have held for
any branch, it should also hold for any repository. We were just creating an
empty repository because it was convenient for us.

If we were to continue, the next step would be to [write a `repositories()`
function to generate repositories]({% post_url 2016-05-11-generating-the-right-data/)
with more varied contents, commit histories, and existing branches.
The test might then look something like this:

```python
@given(repo=repositories(), branch_name=valid_branch_names())
def test_checkout_new_branch(self, repo, branch_name):
    """
    Checking out a new branch results in it being the current active
    branch.
    """
    assume(branch_name not in repo.get_branches())
    repo.checkout(branch_name, create=True)
    self.assertEqual(branch_name, repo.get_active_branch())
```

This is about as close to a bona fide "property" as you're likely to get in
code that isn't a straight-up computer science problem: if you create and
switch to a branch that doesn't already exist, the new active branch is the
newly created branch.

We got there not by sitting down and thinking about the properties of our
software in the abstract, nor by necessarily knowing much about property-based
testing, but rather by incrementally taking advantage of features of Python
and Hypothesis. On the way, we discovered and, umm, contained a whole class of
bugs, and we made sure that all future tests would be heaps more powerful.
Win.