=====
Usage
=====

This plugin provides a ``benchmark`` fixture. This fixture is a callable object that will benchmark any function passed
to it.

Example:

.. code-block:: python

    def something(duration=0.000001):
        """
        Function that needs some serious benchmarking.
        """
        time.sleep(duration)
        # You may return anything you want, like the result of a computation
        return 123


    def test_my_stuff(benchmark):
        # benchmark something
        result = benchmark(something)

        # Extra code, to verify that the run completed correctly.
        # Sometimes you may want to check the result, fast functions
        # are no good if they return incorrect results :-)
        assert result == 123

You can also pass extra arguments:

.. code-block:: python

    def test_my_stuff(benchmark):
        benchmark(time.sleep, 0.02)

Or even keyword arguments:

.. code-block:: python

    def test_my_stuff(benchmark):
        benchmark(something, duration=0.02)

Another pattern seen in the wild, which is not recommended for micro-benchmarks (very fast code) but may be convenient:

.. code-block:: python

    def test_my_stuff(benchmark):
        @benchmark
        def something():  # unnecessary function call
            time.sleep(0.000001)

A better way is to just benchmark the final function:

.. code-block:: python

    def test_my_stuff(benchmark):
        benchmark(time.sleep, 0.000001)  # way more accurate results!

If you need fine control over how the benchmark is run (like a ``setup`` function, or exact control of ``iterations``
and ``rounds``) there's a special mode - pedantic_:

.. code-block:: python

    def my_special_setup():
        ...


    def test_with_setup(benchmark):
        benchmark.pedantic(something, setup=my_special_setup, args=(1, 2, 3), kwargs={'foo': 'bar'}, iterations=10, rounds=100)
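
If the ``setup`` function needs to build fresh arguments for every round, the pedantic_ docs also describe letting it
return them. A small sketch, assuming the documented behaviour that a ``setup`` returning an ``(args, kwargs)`` pair
supplies the arguments to the benchmarked function (``make_args`` and the values below are only illustrative):

.. code-block:: python

    def make_args():
        # Build the positional and keyword arguments used for each round.
        return (1, 2, 3), {'foo': 'bar'}


    def test_with_generated_args(benchmark):
        # The arguments come from setup, so args/kwargs are not passed here.
        benchmark.pedantic(something, setup=make_args, rounds=10)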

Commandline options
===================

``py.test`` command-line options::

  --benchmark-min-time=SECONDS
                        Minimum time per round in seconds. Default: '0.000005'
  --benchmark-max-time=SECONDS
                        Maximum run time per test - it will be repeated until
                        this total time is reached. It may be exceeded if test
                        function is very slow or --benchmark-min-rounds is large
                        (it takes precedence). Default: '1.0'
  --benchmark-min-rounds=NUM
                        Minimum rounds, even if total time would exceed
                        `--max-time`. Default: 5
  --benchmark-timer=FUNC
                        Timer to use when measuring time. Default:
                        'time.perf_counter'
  --benchmark-calibration-precision=NUM
                        Precision to use when calibrating number of iterations.
                        Precision of 10 will make the timer look 10 times more
                        accurate, at a cost of less precise measure of
                        deviations. Default: 10
  --benchmark-warmup=[KIND]
                        Activates warmup. Will run the test function up to
                        number of times in the calibration phase. See
                        `--benchmark-warmup-iterations`. Note: Even the warmup
                        phase obeys --benchmark-max-time. Available KIND:
                        'auto', 'off', 'on'. Default: 'auto' (automatically
                        activate on PyPy).
  --benchmark-warmup-iterations=NUM
                        Max number of iterations to run in the warmup phase.
                        Default: 100000
  --benchmark-disable-gc
                        Disable GC during benchmarks.
  --benchmark-skip      Skip running any tests that contain benchmarks.
  --benchmark-disable   Disable benchmarks. Benchmarked functions are only run
                        once and no stats are reported. Use this if you want to
                        run the test but don't do any benchmarking.
  --benchmark-enable    Forcibly enable benchmarks. Use this option to override
                        --benchmark-disable (in case you have it in pytest
                        configuration).
  --benchmark-only      Only run benchmarks. This overrides --benchmark-skip.
  --benchmark-save=NAME
                        Save the current run into
                        'STORAGE-PATH/counter-NAME.json'. Default:
                        '<commitid>_<date>_<time>_<isdirty>', example:
                        'e689af57e7439b9005749d806248897ad550eab5_20150811_041632_uncommitted-changes'.
  --benchmark-autosave  Autosave the current run into
                        'STORAGE-PATH/<counter>_<commitid>_<date>_<time>_<isdirty>',
                        example:
                        'STORAGE-PATH/0123_525685bcd6a51d1ade0be75e2892e713e02dfd19_20151028_221708_uncommitted-changes.json'
  --benchmark-save-data
                        Use this to make --benchmark-save and
                        --benchmark-autosave include all the timing data, not
                        just the stats.
  --benchmark-json=PATH
                        Dump a JSON report into PATH. Note that this will
                        include the complete data (all the timings, not just the
                        stats).
  --benchmark-compare=[NUM|_ID]
                        Compare the current run against run NUM (or prefix of
                        _id in elasticsearch) or the latest saved run if
                        unspecified.
  --benchmark-compare-fail=EXPR [EXPR ...]
                        Fail test if performance regresses according to given
                        EXPR (eg: min:5% or mean:0.001 for number of seconds).
                        Can be used multiple times.
  --benchmark-cprofile=COLUMN
                        If specified cProfile will be enabled. Top functions
                        will be stored for the given column. Available columns:
                        'ncalls_recursion', 'ncalls', 'tottime', 'tottime_per',
                        'cumtime', 'cumtime_per', 'function_name'.
  --benchmark-cprofile-loops=LOOPS
                        How many times to run the function in cprofile.
                        Available options: 'auto', or an integer.
  --benchmark-cprofile-top=COUNT
                        How many rows to display.
  --benchmark-cprofile-dump=[FILENAME-PREFIX]
                        Save cprofile dumps as FILENAME-PREFIX-test_name.prof.
                        If FILENAME-PREFIX contains slashes ('/') then
                        directories will be created. Default:
                        'benchmark_20241028_160327'
  --benchmark-time-unit=COLUMN
                        Unit to scale the results to. Available units: 'ns',
                        'us', 'ms', 's'. Default: 'auto'.
  --benchmark-storage=URI
                        Specify a path to store the runs as uri in form
                        file://path or
                        elasticsearch+http[s]://host1,host2/[index/doctype?project_name=Project]
                        (when --benchmark-save or --benchmark-autosave are
                        used). For backwards compatibility unexpected values
                        are converted to file://<value>. Default:
                        'file://./.benchmarks'.
  --benchmark-netrc=[BENCHMARK_NETRC]
                        Load elasticsearch credentials from a netrc file.
                        Default: ''.
  --benchmark-verbose   Dump diagnostic and progress information.
  --benchmark-quiet     Disable reporting. Verbose mode takes precedence.
  --benchmark-sort=COL  Column to sort on. Can be one of: 'min', 'max', 'mean',
                        'stddev', 'name', 'fullname'. Default: 'min'
  --benchmark-group-by=LABEL
                        How to group tests. Can be one of: 'group', 'name',
                        'fullname', 'func', 'fullfunc', 'param' or 'param:NAME',
                        where NAME is the name passed to @pytest.parametrize.
                        Default: 'group'
  --benchmark-columns=LABELS
                        Comma-separated list of columns to show in the result
                        table. Default: 'min, max, mean, stddev, median, iqr,
                        outliers, ops, rounds, iterations'
  --benchmark-name=FORMAT
                        How to format names in results. Can be one of 'short',
                        'normal', 'long', or 'trial'. Default: 'normal'
  --benchmark-histogram=[FILENAME-PREFIX]
                        Plot graphs of min/max/avg/stddev over time in
                        FILENAME-PREFIX-test_name.svg. If FILENAME-PREFIX
                        contains slashes ('/') then directories will be
                        created. Default: 'benchmark_<date>_<time>'
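
For example, a run that saves its results and fails if the minimum times regress by more than 5% compared to the last
saved run could look like this (the test path and the threshold are placeholders, adjust them to your project)::

    pytest tests/ --benchmark-autosave --benchmark-compare --benchmark-compare-fail=min:5%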

.. _comparison-cli:

Comparison CLI
--------------

An extra ``py.test-benchmark`` bin is available for inspecting previous benchmark data::

    py.test-benchmark [-h [COMMAND]] [--storage URI] [--netrc [NETRC]]
                      [--verbose]
                      {help,list,compare} ...

    Commands:
        help        Display help and exit.
        list        List saved runs.
        compare     Compare saved runs.

The ``compare`` command takes almost all the ``--benchmark`` options, minus the prefix::

    positional arguments:
      glob_or_file          Glob or exact path for json files. If not specified
                            all runs are loaded.

    options:
      -h, --help            show this help message and exit
      --sort=COL            Column to sort on. Can be one of: 'min', 'max',
                            'mean', 'stddev', 'name', 'fullname'. Default: 'min'
      --group-by=LABELS     Comma-separated list of categories by which to
                            group tests. Can be one or more of: 'group', 'name',
                            'fullname', 'func', 'fullfunc', 'param' or
                            'param:NAME', where NAME is the name passed to
                            @pytest.parametrize. Default: 'group'
      --columns=LABELS      Comma-separated list of columns to show in the result
                            table. Default: 'min, max, mean, stddev, median, iqr,
                            outliers, rounds, iterations'
      --name=FORMAT         How to format names in results. Can be one of 'short',
                            'normal', 'long', or 'trial'. Default: 'normal'
      --histogram=FILENAME-PREFIX
                            Plot graphs of min/max/avg/stddev over time in
                            FILENAME-PREFIX-test_name.svg. If FILENAME-PREFIX
                            contains slashes ('/') then directories will be
                            created. Default: 'benchmark_<date>_<time>'
      --csv=FILENAME        Save a csv report. If FILENAME contains slashes ('/')
                            then directories will be created. Default:
                            'benchmark_<date>_<time>'

    examples:

        pytest-benchmark compare 'Linux-CPython-3.5-64bit/*'

            Loads all benchmarks run with that interpreter. Note the special quoting that disables your shell's glob
            expansion.

        pytest-benchmark compare 0001

            Loads first run from all the interpreters.

        pytest-benchmark compare /foo/bar/0001_abc.json /lorem/ipsum/0001_sir_dolor.json

            Loads runs from exactly those files.

Markers
=======

You can set per-test options with the ``benchmark`` marker:

.. code-block:: python

    @pytest.mark.benchmark(
        group="group-name",
        min_time=0.1,
        max_time=0.5,
        min_rounds=5,
        timer=time.time,
        disable_gc=True,
        warmup=False
    )
    def test_my_stuff(benchmark):
        @benchmark
        def result():
            # Code to be measured
            return time.sleep(0.000001)

        # Extra code, to verify that the run
        # completed correctly.
        # Note: this code is not measured.
        assert result is None
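
Since ``benchmark`` is a regular pytest marker, it can also be applied to a whole module via ``pytestmark``. A small
sketch, assuming standard pytest marker semantics (the group name and test names below are just examples):

.. code-block:: python

    import time

    import pytest

    # Every test in this module gets the same benchmark options.
    pytestmark = pytest.mark.benchmark(group="sleep-calls", disable_gc=True)


    def test_fast(benchmark):
        benchmark(time.sleep, 0.000001)


    def test_slow(benchmark):
        benchmark(time.sleep, 0.001)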

Extra info
==========

You can set arbitrary values in the ``benchmark.extra_info`` dictionary, which
will be saved in the JSON if you use ``--benchmark-autosave`` or similar:

.. code-block:: python

    def test_my_stuff(benchmark):
        benchmark.extra_info['foo'] = 'bar'
        benchmark(time.sleep, 0.02)
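
Anything JSON-serializable works there; for instance, you can record interpreter details next to the timings
(``test_with_env_info`` below is just an illustrative name):

.. code-block:: python

    import platform
    import time


    def test_with_env_info(benchmark):
        # Stored alongside the stats when the run is saved as JSON.
        benchmark.extra_info['python'] = platform.python_version()
        benchmark(time.sleep, 0.02)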

Patch utilities
===============

Suppose you want to benchmark an ``internal`` function from a class:

.. sourcecode:: python

    class Foo(object):
        def __init__(self, arg=0.01):
            self.arg = arg

        def run(self):
            self.internal(self.arg)

        def internal(self, duration):
            time.sleep(duration)

With the ``benchmark`` fixture this is quite hard to test if you don't control the ``Foo`` code or if it has a very
complicated construction.

For this there's an experimental ``benchmark_weave`` fixture that can patch stuff using `aspectlib
<https://github.com/ionelmc/python-aspectlib>`_ (make sure you ``pip install aspectlib`` or ``pip install
pytest-benchmark[aspect]``):

.. sourcecode:: python

    def test_foo(benchmark):
        benchmark.weave(Foo.internal, lazy=True)
        f = Foo()
        f.run()
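
If you prefer not to depend on ``aspectlib``, roughly the same effect can be had with pytest's built-in
``monkeypatch`` fixture. A minimal sketch (the wrapper below is illustrative, not part of this plugin's API):

.. sourcecode:: python

    def test_foo_monkeypatched(benchmark, monkeypatch):
        original = Foo.internal

        def wrapper(self, duration):
            # Benchmark the original method. The ``benchmark`` fixture may only
            # be used once per test; ``Foo.run`` calls ``internal`` exactly once here.
            return benchmark(original, self, duration)

        monkeypatch.setattr(Foo, "internal", wrapper)
        Foo().run()
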
.. _pedantic: http://pytest-benchmark.readthedocs.org/en/latest/pedantic.html