File: history.py

package info (click to toggle)
python-bumps 1.0.3-3
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 6,200 kB
  • sloc: python: 24,517; xml: 493; ansic: 373; makefile: 211; javascript: 99; sh: 94
file content (297 lines) | stat: -rw-r--r-- 10,087 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
# This program is in the public domain
# Author: Paul Kienzle
"""
Log of progress through a computation.

Each cycle through a computation, a process can update its history,
adding information about the number of function evaluations, the
total time taken, the set of points evaluated and their values, the
current best value and so on.  The process can use this history
when computing the next set of points to evaluate and when checking
if the termination conditions are met.  Any values that may be
useful outside the computation, e.g., for logging or for updating
the user, should be recorded.  In the ideal case, the history
is all that is needed to restart the process in case of a system
crash.

History consists of a set of traces.  The content of the traces
themselves is provided by the computation, but various stake holders
can use them.  For example, the user may wish to log the set of points
that have been evaluated and their values using the system logger
and an optimizer may require a certain amount of history to calculate
the next set of values.

New traces are defined using :meth:`History.provides`.  For example,
the following adds traces for 'value' and 'point' to the history, and
requires the value on the two previous cycles in order to do its work:

    >>> from bumps.history import History
    >>> h = History(value=2, point=0) # keep two values and zero points

Initially the history is empty:

    >>> print(len(h.value))
    0

After three updates we see that only two values are kept.

    >>> h.update(value=2.6, point=[1,1,1])
    >>> h.update(value=1, point=[1,0.5,1])
    >>> h.update(value=0.5, point=[1,0.5,0.9])
    >>> print(h.value)
    Trace value: 0.5, 1
    >>> print(len(h.value))
    2

Since the required length of 'point' is zero no values are kept:

    >>> print(h.point[0])
    Traceback (most recent call last):
        ...
    IndexError: point has not accumulated enough history

A history consumer can override this, and require a certain length of a trace.
Then future values will be preserved:

    >>> h.requires(point=1)
    >>> h.update(value=0.25, point=[1,0.5,0.92])
    >>> print(h.point[0])
    [1, 0.5, 0.92]

Traces are independent of each other. A new trace can be added to the history
and updated separately from the existing traces. This can be handy if there
are separate sources of history though it may be difficult to keep the in sync.
The following adds a 'step' to the existing history, initialized to 15, without
changing 'value' or 'point':

    >>> h.provides(step=2) # keep two steps
    >>> h.update(step=15) # initialize step to 15
    >>> print(h.step)
    Trace step: 15

Traces may be used as accumulators, with the delta added to the existing value
before being stored in the trace. For example:

    >>> h.accumulate(step=1)
    >>> print(h.step)
    Trace step: 16, 15

Within bumps, history is used by monitors, with bumps.fitters.MonitorRunner
managing updates to history and feeding them to the fit progress monitors.
"""

# Design questions:
# 1. Can optimizer function evaluators add traces?  Can they use traces?
# 2. Do we want to support a skip option on traces, so that only every nth
#    item is preserved?  This is probably too hard.


class History(object):
    """
    Collection of traces.

    Provided traces can be specified as key word arguments, name=length.
    """

    def __init__(self, **kw):
        self.provides(**kw)

    def provides(self, **kw):
        """
        Specify additional provided fields.

        Raises AttributeError if trace is already provided or if the trace
        name matches the name of one of the history methods.
        """
        for k, v in kw.items():
            # Make sure the additional trait is not already provided.
            # This test will also catch methods such as provides/requires
            # and static properties such as bounds that are set from outside.
            if hasattr(self, k):
                raise AttributeError("History already provides " + k)
            else:
                mon = self._new_trace(keep=v, name=k)
                setattr(self, k, mon)

    def requires(self, **kw):
        """
        Specify required fields, and their history length.
        """
        for k, v in kw.items():
            try:
                mon = getattr(self, k)
                mon.requires(v)
            except AttributeError:
                raise AttributeError("history does not provide " + k + "\nuse one of " + self._trace_names())

    def accumulate(self, **kw):
        """
        Extend the given traces with the provided values.  The traced
        value will be the old value plus the new value.
        """
        for k, v in kw.items():
            try:
                getattr(self, k).accumulate(v)
            except AttributeError:
                raise AttributeError(k + " is not being traced")

    def update(self, **kw):
        """
        Extend the given traces with the provided values.  The traced
        values are independent.  Use accumulate if you want to add the
        new value to the previous value in the trace.
        """
        for k, v in kw.items():
            try:
                getattr(self, k).put(v)
            except AttributeError:
                raise AttributeError(k + " is not being traced")

    def clear(self):
        """
        Clear history, removing all traces
        """
        self.__dict__.clear()

    def _new_trace(self, keep=None, name=None):
        """
        Create a new trace.  We use a factory method here so that
        the History subclass can control the kind of trace created.
        The returned trace must be a subclass of history.Trace.
        """
        return Trace(keep=keep, name=name)

    def _traces(self):
        return [trace for trace in self.__dict__.values() if isinstance(trace, Trace)]

    def _trace_names(self):
        traces = [trace.name for trace in self._traces()]
        return ", ".join(l for l in sorted(traces))

    def __str__(self):
        traces = sorted(self._traces(), lambda x, y: cmp(x.name, y.name))
        return "\n".join(str(l) for l in traces)

    def snapshot(self):
        """
        Return a dictionary of traces { 'name':  [v[n], v[n-1], ..., v[0]] }
        """
        # print("history snapshot", self._traces(), self.__dict__)
        return dict((trace.name, trace.snapshot()) for trace in self._traces())

    def restore(self, state):
        """
        Restore history to the state returned by a call to snapshot
        """
        for k, v in state.items():
            try:
                getattr(self, k).restore(v)
            except KeyError:
                pass


class Trace(object):
    """
    Value trace.

    This is a stack-like object with items inserted at the beginning, and
    removed from the end once the maximum length *keep* is reached.

    len(trace) returns the number of items in the trace
    trace[i] returns the ith previous element in the history
    trace.requires(n) says how much history to keep
    trace.put(value) stores value
    trace.accumulate(value) adds value to the previous value before storing
    state = trace.snapeshot() returns the values as a stack, most recent last
    trace.restore(state) restores a snapshot

    Note that snapshot/restore uses lists to represent numpy arrays, which
    may cause problems if the trace is capturing lists.
    """

    # Implementation note:
    # Traces are stored in reverse order because append is faster than insert.
    # This detail is hidden from the caller since __getitem__ returns the
    # appropriate value.
    # TODO: convert to circular buffer unless keeping the full trace
    # TODO: use numpy arrays for history

    def __init__(self, keep=1, name="trace"):
        self.keep = keep
        self._storage = []
        self.name = name

    def requires(self, n):
        """
        Set the trace length to be at least n.
        """
        # Note: never shorten the trace since another algorithm, condition,
        # or monitor may require the longer trace.
        if n > self.keep:
            self.keep = n

    def accumulate(self, value):
        if self.keep < 1:
            return
        try:
            value = self._storage[-1] + value
        except IndexError:
            pass  # Value is 0 + value => 0
        self.put(value)

    def put(self, value):
        """
        Add an item to the trace, shifting off from the beginning
        when the trace is full.
        """
        if self.keep < 1:
            return
        if len(self._storage) == self.keep:
            self._storage = self._storage[1:]
        self._storage.append(value)

    def __len__(self):
        return len(self._storage)

    def __getitem__(self, key):
        if key < 0:
            raise IndexError(self.name + " can only be accessed from the beginning")
        try:
            return self._storage[-key - 1]
        except IndexError:
            raise IndexError(self.name + " has not accumulated enough history")

    def __setitem__(self, key, value):
        raise TypeError("cannot write directly to a trace; use put instead")

    def __str__(self):
        return "Trace " + self.name + ": " + ", ".join([str(k) for k in reversed(self._storage)])

    def snapshot(self):
        """
        Capture state of the trace.

        Numpy arrays are converted to lists so that the trace can be easily
        converted to json.
        """
        import numpy as np

        if isinstance(self._storage[0], np.ndarray):
            return [v.tolist() for v in self._storage]
        else:
            return self._storage[:]

    def restore(self, state):
        """
        Restore a trace from a captured snapshot.

        Lists are converted to numpy arrays.
        """
        import numpy as np

        if isinstance(state[0], list):
            state = [np.asarray(v) for v in state]
        if len(state) > self.keep:
            self._storage = state[-self.keep :]
        else:
            self._storage = state[:]