Adapting strategies
===================

This page discusses ways to adapt strategies to your needs, either by transforming them inline with |.map|, or filtering out unwanted inputs with |.filter| and |assume|.

Mapping strategy inputs
-----------------------

Sometimes you want to apply a simple transformation to a strategy. For instance, we know that we can generate lists of integers with ``lists(integers())``. Suppose we want sorted lists instead. We can use an inline |.map| to achieve this:

.. code-block:: pycon

    >>> lists(integers()).map(sorted).example()
    [-25527, -24245, -93, -70, -7, 0, 39, 65, 112, 6189, 19469, 32526, 1566924430]

In general, ``strategy.map(f)`` returns a new strategy which transforms all the examples generated by ``strategy`` by calling ``f`` on them.
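The same idea works inside a test. The following is a minimal sketch: it maps ``integers()`` to even numbers and checks that mapped lists really are sorted. The names ``even_integers``, ``test_mapped_integers_are_even``, and ``test_mapped_lists_are_sorted`` are illustrative and not part of Hypothesis.

```python
from hypothesis import given, strategies as st

# Every generated integer is doubled, so this strategy only yields even numbers.
# (even_integers is an illustrative name, not a Hypothesis built-in.)
even_integers = st.integers().map(lambda n: n * 2)


@given(even_integers)
def test_mapped_integers_are_even(n):
    assert n % 2 == 0


@given(st.lists(st.integers()).map(sorted))
def test_mapped_lists_are_sorted(xs):
    # sorted() returns a list in ascending order, so each element
    # is less than or equal to its successor.
    assert all(a <= b for a, b in zip(xs, xs[1:]))


# Calling a @given-decorated function with no arguments runs the property.
test_mapped_integers_are_even()
test_mapped_lists_are_sorted()
```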

Filtering strategy inputs
-------------------------

Many strategies in Hypothesis offer some control over the kinds of values that get generated. For instance, ``integers(min_value=0)`` generates non-negative integers, and ``integers(100, 200)`` generates integers between ``100`` and ``200``, inclusive.

Sometimes, you need more control than this. The inputs from a strategy may not match exactly what you need, and you just need to filter out a few bad cases.

For instance, suppose we have written a simple test involving the modulo operator ``%``:

.. code-block:: python

    from hypothesis import given, strategies as st

    @given(st.integers(), st.integers())
    def test_remainder_magnitude(a, b):
        # the magnitude of the remainder is always less than
        # the magnitude of the divisor
        assert abs(a % b) < abs(b)

Hypothesis will quickly report a failure for this test: ``ZeroDivisionError: integer modulo by zero``. Just like division, modulo isn't defined when the divisor is 0. The case of ``b == 0`` isn't interesting for the test, and we would like to get rid of it.

The best way to do this is with the |.filter| method:

.. code-block:: python

    from hypothesis import given, strategies as st

    @given(st.integers(), st.integers().filter(lambda n: n != 0))
    def test_remainder_magnitude(a, b):
        # b is guaranteed to be nonzero here, thanks to the filter
        assert abs(a % b) < abs(b)

This test now passes cleanly.

Calling |.filter| on a strategy creates a new strategy with that filter applied at generation-time. For instance, ``integers().filter(lambda n: n != 0)`` is a strategy which generates nonzero integers.
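Because a filtered strategy is just another strategy, it can be stored in a variable, reused across tests, and combined with |.map|. The sketch below illustrates this; ``nonzero_integers`` and ``odd_squares`` are hypothetical names chosen for the example.

```python
from hypothesis import given, strategies as st

# A reusable strategy: integers with zero filtered out.
# (nonzero_integers and odd_squares are illustrative names.)
nonzero_integers = st.integers().filter(lambda n: n != 0)

# .map and .filter can be chained; here the filter sees the mapped value.
odd_squares = st.integers().map(lambda n: n * n).filter(lambda n: n % 2 == 1)


@given(st.integers(), nonzero_integers)
def test_quotient_and_remainder_reconstruct_a(a, b):
    # b is never zero, so floor division and modulo are both defined.
    assert (a // b) * b + (a % b) == a


@given(odd_squares)
def test_odd_squares_are_odd_and_non_negative(n):
    assert n % 2 == 1
    assert n >= 0


test_quotient_and_remainder_reconstruct_a()
test_odd_squares_are_odd_and_non_negative()
```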

Assuming away test cases
------------------------

|.filter| lets you filter test inputs from a single strategy. Hypothesis also provides an |assume| function for when you need to filter an entire test case, based on an arbitrary condition.

The |assume| function takes a condition and skips the current test case when that condition evaluates to ``False``. You can use it anywhere in your test. We could have expressed our |.filter| example above using |assume| as well:

.. code-block:: python

    from hypothesis import assume, given, strategies as st

    @given(st.integers(), st.integers())
    def test_remainder_magnitude(a, b):
        assume(b != 0)
        # b will be nonzero here
        assert abs(a % b) < abs(b)

|assume| vs |.filter|
~~~~~~~~~~~~~~~~~~~~~

Where possible, you should use |.filter|. Hypothesis can often rewrite simple filters into more efficient sampling methods than rejection sampling, and will retry filters several times instead of aborting the entire test case (as with |assume|).
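To make the difference between rejection sampling and direct sampling concrete, here is a conceptual sketch in plain Python. It is not how Hypothesis is implemented; ``rejection_sample`` and ``direct_sample`` are hypothetical helpers written only for this illustration.

```python
import random


def rejection_sample(draw, predicate, max_tries=1000):
    """Draw values until one satisfies predicate, giving up after max_tries."""
    for _ in range(max_tries):
        value = draw()
        if predicate(value):
            return value
    raise RuntimeError("no value satisfied the predicate")


def direct_sample(rng, lower, upper):
    """Sample straight from the allowed range, so nothing is ever rejected."""
    return rng.randint(lower, upper)


rng = random.Random(0)


def draw_candidate():
    # Candidates come from a much wider range than the one we want.
    return rng.randint(-100, 100)


# Rejection sampling: most candidates are thrown away before one passes.
value_a = rejection_sample(draw_candidate, lambda n: n >= 90)

# Direct sampling: the condition n >= 90 is folded into the bounds instead.
value_b = direct_sample(rng, 90, 100)
```

Both values land in the range 90 to 100, but the second approach never wastes a draw. Rewriting a simple filter amounts to moving from the first style to the second.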

For more complex relationships that can't be expressed with |.filter|, use |assume|.

Here's an example of a test where we want to filter out two different types of examples:

.. code-block:: python

    from hypothesis import assume, given, strategies as st

    @given(st.integers(), st.integers())
    def test_floor_division_lossless_when_b_divides_a(a, b):
        # we want to assume that:
        # * b is nonzero, and
        # * b divides a
        assert (a // b) * b == a

We could start by using |assume| for both:

.. code-block:: python

    from hypothesis import assume, given, strategies as st

    @given(st.integers(), st.integers())
    def test_floor_division_lossless_when_b_divides_a(a, b):
        assume(b != 0)
        assume(a % b == 0)
        assert (a // b) * b == a

And then notice that the ``b != 0`` condition can be moved into the strategy definition as a |.filter| call:

.. code-block:: python

    from hypothesis import assume, given, strategies as st

    @given(st.integers(), st.integers().filter(lambda n: n != 0))
    def test_floor_division_lossless_when_b_divides_a(a, b):
        assume(a % b == 0)
        assert (a // b) * b == a

However, the ``a % b == 0`` condition has to stay as an |assume|, because it depends on both ``a`` and ``b``, while |.filter| only sees values from a single strategy.

|assume| vs early-returning
~~~~~~~~~~~~~~~~~~~~~~~~~~~

One other way we could have avoided the divide-by-zero error inside the test function is to early-return when ``b == 0``:

.. code-block:: python

    from hypothesis import given, strategies as st

    @given(st.integers(), st.integers())
    def test_remainder_magnitude(a, b):
        if b == 0:
            # bad plan - test "passes" without checking anything!
            return
        assert abs(a % b) < abs(b)

While this would have avoided the divide-by-zero, early-returning is not the same as using |assume|. With |assume|, Hypothesis knows that a test case has been filtered out, and will not count it towards the |max_examples| limit. In contrast, an early return is counted as a passing test case, even though the assertions didn't run! In more complicated cases, this could end up testing your code less than you expect, because many test cases get discarded without Hypothesis knowing about it.
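One way to see this effect is to count how many calls actually reach the assertion. The sketch below is an illustration only: the ``calls`` dictionary and the test name are hypothetical, and the narrow range for ``b`` is chosen so that ``0`` is likely to come up.

```python
from hypothesis import given, settings, strategies as st

# Illustrative counters, not part of Hypothesis.
calls = {"total": 0, "asserted": 0}


# derandomize=True and database=None keep this sketch reproducible.
@settings(max_examples=50, derandomize=True, database=None)
@given(st.integers(), st.integers(min_value=-2, max_value=2))
def test_remainder_magnitude_early_return(a, b):
    calls["total"] += 1
    if b == 0:
        # Hypothesis sees this as a passing test case, but nothing was checked.
        return
    calls["asserted"] += 1
    assert abs(a % b) < abs(b)


test_remainder_magnitude_early_return()
print(f"{calls['asserted']} of {calls['total']} calls reached the assertion")
```

Any gap between the two counters is budget spent on test cases that checked nothing.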

In addition, |assume| lets you skip the test case at any point in the test, even inside arbitrarily deep nestings of functions.
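As a small sketch of that last point, the test below calls |assume| from inside a helper function rather than from the test body. ``checked_remainder`` is a hypothetical helper introduced for this example.

```python
from hypothesis import assume, given, strategies as st


def checked_remainder(a, b):
    # assume() is called from a helper, not from the test body itself.
    # When b == 0 the whole test case is rejected from here.
    assume(b != 0)
    return a % b


@given(st.integers(), st.integers())
def test_remainder_magnitude(a, b):
    # This line only completes for test cases where b is nonzero.
    assert abs(checked_remainder(a, b)) < abs(b)


test_remainder_magnitude()
```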