Benchmarks on PyTables Undo/Redo
================================
This is a small report on the performance of the Undo/Redo feature in
PyTables.
A small script (see undo_redo.py) has been written to check different
scenarios for Undo/Redo: creating single nodes, copying children from
one group to another, and setting attributes.
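For reference, here is a minimal sketch of the kind of round trip the
script performs. It is written with the modern PyTables 3.x method
names (enable_undo, mark, undo, redo, ...) rather than the older
camelCase API used by the original script, and the file name, group
names and sizes are arbitrary:

import numpy as np
import tables

with tables.open_file("undo_redo_demo.h5", mode="w") as f:
    f.enable_undo()
    src = f.create_group(f.root, "src")
    dst = f.create_group(f.root, "dst")

    # Scenario 1: creating single nodes.
    m = f.mark()
    for i in range(10):
        f.create_array(src, "array%d" % i, np.arange(10))
    f.undo(m)   # removes the freshly created arrays
    f.redo()    # and brings them back

    # Scenario 2: copying children from one group to another.
    m = f.mark()
    f.copy_children(src, dst)
    f.undo(m)
    f.redo()

    # Scenario 3: setting attributes.
    m = f.mark()
    f.set_node_attr(src, "benchattr", "some value")
    f.undo(m)
    f.redo()

    f.disable_undo()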
Undo/Redo is independent of object size
---------------------------------------
First, note that the cost of the Undo/Redo feature is independent of
the size of the objects involved. For example, the times for 10 objects
(flag -n), each with 10 elements (flag -s), are:
$ time python2.4 undo_redo.py -n 10 -i 2 -s 10 data.nobackup/undo_redo.h5
Time for Undo, Redo (createNode): 0.213686943054 s, 0.0727670192719 s
Time for Undo, Redo (createNode): 0.271666049957 s, 0.0740389823914 s
Time for Undo, Redo (copy_children): 0.296227931976 s, 0.161941051483 s
Time for Undo, Redo (copy_children): 0.363519906998 s, 0.162662982941 s
Time for Undo, Redo (set_attr): 0.208750009537 s, 0.0732419490814 s
Time for Undo, Redo (set_attr): 0.27628993988 s, 0.0736088752747 s
real 0m5.557s
user 0m4.354s
sys 0m0.729s
Note how all tests take more or less the same amount of time. This is
because a move operation is the central tool used to implement the
Undo/Redo feature, and such a move has a constant cost, independent of
the size of the objects. For example, using objects with 1000 elements,
we can see that the Undo/Redo speed is unaffected:
$ time python2.4 undo_redo.py -n 10 -i 2 -s 1000 data.nobackup/undo_redo.h5
Time for Undo, Redo (createNode): 0.213760137558 s, 0.0717759132385 s
Time for Undo, Redo (createNode): 0.276151895523 s, 0.0724079608917 s
Time for Undo, Redo (copy_children): 0.308417797089 s, 0.168260812759 s
Time for Undo, Redo (copy_children): 0.382102966309 s, 0.168042898178 s
Time for Undo, Redo (set_attr): 0.209735155106 s, 0.0740969181061 s
Time for Undo, Redo (set_attr): 0.279798984528 s, 0.0770981311798 s
real 0m5.835s
user 0m4.585s
sys 0m0.736s
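This size independence is easy to check directly. The snippet below is
a rough sketch (not the benchmark script itself) that times one
undo/redo cycle for increasingly large arrays; assuming the behaviour
described above, the printed times should stay roughly flat:

import time
import numpy as np
import tables

for size in (10, 1000, 100000):
    with tables.open_file("undo_size_check.h5", mode="w") as f:
        f.enable_undo()
        m = f.mark()
        for i in range(10):
            f.create_array(f.root, "a%d" % i, np.arange(size))
        t0 = time.time(); f.undo(m); t_undo = time.time() - t0
        t0 = time.time(); f.redo(); t_redo = time.time() - t0
        print("size=%d: undo %.3f s, redo %.3f s" % (size, t_undo, t_redo))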
Undo/Redo times grow linearly with the number of objects involved
-----------------------------------------------------------------
Second, the time for undoing/redoing is, as expected, linearly
proportional to the number of objects involved in the process (set by
-n):
$ time python2.4 undo_redo.py -n 100 -i 2 -s 10 data.nobackup/undo_redo.h5
Time for Undo, Redo (createNode): 2.27267885208 s, 0.779091119766 s
Time for Undo, Redo (createNode): 2.31264209747 s, 0.766252040863 s
Time for Undo, Redo (copy_children): 3.01871585846 s, 1.63346219063 s
Time for Undo, Redo (copy_children): 3.07704997063 s, 1.62615203857 s
Time for Undo, Redo (set_attr): 2.18017196655 s, 0.809293985367 s
Time for Undo, Redo (set_attr): 2.23039293289 s, 0.809432029724 s
real 0m48.395s
user 0m40.385s
sys 0m6.914s
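Assuming the same API names as in the earlier sketches, this linear
scaling can be illustrated by timing a single undo for a growing number
of freshly created nodes; the total time grows with the node count
while the time per node stays roughly constant:

import time
import numpy as np
import tables

for nobjects in (10, 100, 1000):
    with tables.open_file("undo_scaling_check.h5", mode="w") as f:
        f.enable_undo()
        m = f.mark()
        for i in range(nobjects):
            f.create_array(f.root, "a%d" % i, np.arange(10))
        t0 = time.time(); f.undo(m); total = time.time() - t0
        print("n=%d: undo %.2f s (%.1f ms per node)"
              % (nobjects, total, 1000.0 * total / nobjects))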
A note on actual performance and room for improvement
-----------------------------------------------------
Finally, note that the Undo/Redo capability of PyTables is pretty
fast. The next benchmark undoes and redoes the creation of 1000 nodes
(create_array):
$ time python2.4 undo_redo.py -n 1000 -i 2 -t createNode -s 1000 data.nobackup/undo_redo.h5
Time for Undo, Redo (createNode): 22.7840828896 s, 7.9872610569 s
Time for Undo, Redo (createNode): 22.2799329758 s, 7.95833396912 s
real 1m32.307s
user 1m16.598s
sys 0m15.105s
i.e. an undo takes approximately 23 milliseconds per node, while a redo
takes approximately 8 milliseconds.
The fact that undo operations take about three times longer than redo
operations is probably due to how the action log is implemented. The
action log is a Table object, and PyTables is optimized to read table
rows in the *forward* direction (the one needed for redo operations).
When reading in the *backward* direction (needed for undo operations),
however, the internal cache of PyTables is counterproductive and makes
look-ups quite slow compared with forward access. The Undo/Redo code
has been optimized quite a bit to smooth this kind of access, but with
only partial success. A more definitive optimization would require much
better performance when reading tables in the backward direction. That
would be a major task, and may eventually be addressed in the future.
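To get a feel for this asymmetry, here is an illustrative sketch (it
does not reproduce the internal action-log code) that reads the same
Table forward via normal iteration and backward one row at a time; the
file name, table layout and row count are arbitrary:

import time
import tables

with tables.open_file("actionlog_demo.h5", mode="w") as f:
    table = f.create_table(f.root, "log", {"opcode": tables.Int32Col()})
    table.append([(i,) for i in range(20000)])
    table.flush()

    t0 = time.time()
    for row in table:            # forward: benefits from buffered reads
        _ = row["opcode"]
    print("forward : %.3f s" % (time.time() - t0))

    t0 = time.time()
    for i in range(table.nrows - 1, -1, -1):   # backward: one row per read
        _ = table[i]["opcode"]
    print("backward: %.3f s" % (time.time() - t0))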
Francesc Alted
2005-03-10