==========================
Running a ZEO Server HOWTO
==========================

Introduction
------------

ZEO (Zope Enterprise Objects) is a client-server system for sharing a
single storage among many clients.  Normally, a ZODB storage can only
be used by a single process.  When you use ZEO, the storage is opened
in the ZEO server process.  Client programs connect to this process
using a ZEO ClientStorage.  ZEO provides a consistent view of the
database to all clients.  The ZEO client and server communicate using
a custom RPC protocol layered on top of TCP.

There are several configuration options that affect the behavior of a
ZEO server.  This section describes how a few of these features
work.  Subsequent sections describe how to configure every option.

Client cache
~~~~~~~~~~~~

Each ZEO client keeps an on-disk cache of recently used objects to
avoid fetching those objects from the server each time they are
requested.  It is usually faster to read the objects from disk than it
is to fetch them over the network.  The cache can also provide
read-only copies of objects during server outages.

The cache may be persistent or transient. If the cache is persistent,
then the cache files are retained for use after process restarts. A
non-persistent cache uses temporary files that are removed when the
client storage is closed.

The client cache size is configured when the ClientStorage is created.
The default size is 20MB, but the right size depends entirely on the
particular database.  Setting the cache size too small can hurt
performance, but in most cases making it too big just wastes disk
space.  The document "Client cache tracing" describes how to collect a
cache trace that can be used to determine a good cache size.

ZEO uses invalidations for cache consistency.  Every time an object is
modified, the server sends a message to each client informing it of
the change.  The client will discard the object from its cache when it
receives an invalidation.  These invalidations are often batched.
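
As a rough illustration of the invalidation mechanism, the sketch below
models a client cache as a dictionary keyed by object id.  The class and
method names are hypothetical and are not part of ZEO; the real cache is
an on-disk file with its own format.

```python
class ToyClientCache(object):
    """Toy stand-in for a ZEO client cache (not ZEO's real implementation)."""

    def __init__(self):
        self._data = {}          # oid -> (tid, object state)
        self.last_tid = None     # id of the newest invalidation seen

    def store(self, oid, tid, state):
        self._data[oid] = (tid, state)

    def load(self, oid):
        """Return the cached state, or None on a miss."""
        entry = self._data.get(oid)
        return None if entry is None else entry[1]

    def invalidate(self, tid, oids):
        """Apply one (possibly batched) invalidation message."""
        for oid in oids:
            self._data.pop(oid, None)   # discard the stale copy, if any
        self.last_tid = tid


cache = ToyClientCache()
cache.store("oid-1", 10, b"old state")
cache.store("oid-2", 10, b"other state")
cache.invalidate(11, ["oid-1"])         # transaction 11 modified oid-1
```

After the invalidation, a load of "oid-1" misses and would have to go
to the server, while "oid-2" is still served from the cache.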

Each time a client connects to a server, it must verify that its cache
contents are still valid.  (It cannot receive invalidation messages
while it is disconnected.)  There are several mechanisms
used to perform cache verification.  In the worst case, the client
sends the server a list of all objects in its cache along with their
timestamps; the server sends back an invalidation message for each
stale object.  The cost of verification is one drawback to making the
cache too large.

Note that every time a client crashes or disconnects, it must verify
its cache.  Every time a server crashes, all of its clients must
verify their caches.

The cache verification process is optimized in two ways to eliminate
costs when restarting clients and servers.  Each client keeps the
timestamp of the last invalidation message it has seen.  When it
connects to the server, it checks to see if any invalidation messages
were sent after that timestamp.  If not, then the cache is up-to-date
and no further verification occurs.  The other optimization is the
invalidation queue, described below.

Invalidation queue
~~~~~~~~~~~~~~~~~~

The ZEO server keeps a queue of recent invalidation messages in
memory.  When a client connects to the server, it sends the timestamp
of the most recent invalidation message it has received.  If that
message is still in the invalidation queue, then the server sends the
client all the missing invalidations.  This is often cheaper than
performing full cache verification.

The default size of the invalidation queue is 100.  If the
invalidation queue is larger, it will be more likely that a client
that reconnects will be able to verify its cache using the queue.  On
the other hand, a large queue uses more memory on the server to store
the messages.  Invalidation messages tend to be small, perhaps a few
hundred bytes each on average; it depends on the number of objects
modified by a transaction.
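
The reconnect logic described above can be sketched with a bounded
queue.  This is a toy model under stated assumptions, not the server's
actual code: the class name, method names, and the use of small integers
as transaction ids are all made up for illustration.

```python
from collections import deque


class ToyInvalidationQueue(object):
    """Toy model of a server-side invalidation queue (hypothetical names)."""

    def __init__(self, size=100):
        # Oldest entries fall off automatically once the queue is full.
        self._queue = deque(maxlen=size)

    def committed(self, tid, oids):
        """Record the objects modified by the transaction with id tid."""
        self._queue.append((tid, list(oids)))

    def missed_since(self, last_tid):
        """Return the oids invalidated after last_tid.

        Returns None when last_tid is no longer in the queue, meaning
        the client must fall back to full cache verification.
        """
        if last_tid not in [tid for tid, _ in self._queue]:
            return None
        missed = []
        for tid, oids in self._queue:
            if tid > last_tid:
                missed.extend(oids)
        return missed


queue = ToyInvalidationQueue(size=3)
for tid in range(1, 6):          # transactions 1..5; only 3, 4, 5 remain
    queue.committed(tid, ["oid-%d" % tid])
```

A client that last saw transaction 5 gets an empty list (its cache is
current), one that last saw transaction 3 gets the two newer
invalidations, and one that last saw transaction 1 gets None because
that entry has already been pushed out of the queue.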

Transaction timeouts
~~~~~~~~~~~~~~~~~~~~

A ZEO server can be configured to time out a transaction if it takes
too long to complete.  Only a single transaction can commit at a time;
so if one transaction takes too long, all other clients will be
delayed waiting for it.  In the extreme, a client can hang during the
commit process.  If the client hangs, the server will be unable to
commit other transactions until it restarts.  A well-behaved client
will not hang, but the server can be configured with a transaction
timeout to guard against bugs that cause a client to hang.

If any transaction exceeds the timeout threshold, the client's
connection to the server will be closed and the transaction aborted.
Once the transaction is aborted, the server can start processing other
clients' requests.  Most transactions should take very little time to
commit.  The timer begins for a transaction after all the data has
been sent to the server.  At this point, the cost of commit should be
dominated by the cost of writing data to disk; it should be unusual
for a commit to take longer than 1 second.  A transaction timeout of
30 seconds should tolerate heavy load and slow communications between
client and server, while guarding against hung clients.
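
The interaction between the single commit lock and the timeout can be
modelled in a few lines.  This is only a sketch: the class is
hypothetical, time is passed in explicitly to keep the example
deterministic, and the real server does this work with its own
threads and connection objects.

```python
class ToyCommitLock(object):
    """Toy model of a commit lock with a timeout (not ZEO's implementation)."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds, like transaction-timeout
        self.owner = None           # client currently committing
        self.started = None         # time at which the timer began

    def begin(self, client, now):
        if self.owner is not None:
            return False            # another client is committing; wait
        self.owner, self.started = client, now
        return True

    def finish(self, client):
        if self.owner == client:
            self.owner = self.started = None

    def check(self, now):
        """Return the client to disconnect if it exceeded the timeout."""
        if self.owner is not None and now - self.started > self.timeout:
            timed_out = self.owner
            self.owner = self.started = None   # abort; lock is free again
            return timed_out
        return None


lock = ToyCommitLock(timeout=30)
lock.begin("client-a", now=0)
blocked = lock.begin("client-b", now=5)     # client-a still holds the lock
still_ok = lock.check(now=20)               # within the timeout
victim = lock.check(now=31)                 # client-a exceeded 30 seconds
retry = lock.begin("client-b", now=32)      # the lock has been released
```

Here client-b is blocked until client-a's transaction is aborted at
the 31 second mark, after which client-b can acquire the lock.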

When a transaction times out, the client can be left in an awkward
position.  If the timeout occurs during the second phase of the two
phase commit, the client will log a panic message.  This should only
cause problems if the client transaction involved multiple storages.
If it did, it is possible that some storages committed the client
changes and others did not.

Connection management
~~~~~~~~~~~~~~~~~~~~~

A ZEO client manages its connection to the ZEO server.  If it loses
the connection, it attempts to reconnect.  While
it is disconnected, it can satisfy some reads by using its cache.

The client can be configured to wait for a connection when it is created
or to return immediately and provide data from its persistent cache.
It usually simplifies programming to have the client wait for a
connection on startup.

When the client is disconnected, it polls periodically to see if the
server is available.  The rate at which it polls is configurable.
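
One plausible polling schedule is sketched below.  The bounds of 5 and
300 seconds are the documented defaults for min-disconnect-poll and
max-disconnect-poll; the doubling between attempts is an assumption made
for this example, since this document does not say how the delay grows.

```python
from itertools import islice


def reconnect_delays(min_delay=5, max_delay=300):
    """Yield successive delays (in seconds) between reconnect attempts.

    Assumption: the delay doubles after each failed attempt and is
    capped at max_delay.
    """
    delay = min_delay
    while True:
        yield delay
        delay = min(delay * 2, max_delay)


first_delays = list(islice(reconnect_delays(), 8))
# first_delays == [5, 10, 20, 40, 80, 160, 300, 300]
```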

The client can be configured with multiple server addresses.  In this
case, it assumes that each server has identical content and will use
any server that is available.  It is possible to configure the client
to accept a read-only connection to one of these servers if no
read-write connection is available.  If it has a read-only connection,
it will continue to poll for a read-write connection.  This feature
supports the Zope Replication Services product,
http://www.zope.com/Products/ZopeProducts/ZRS.  In general, it could
be used with a system that arranges to provide hot backups of
servers in the case of failure.

If a single address resolves to multiple IPv4 or IPv6 addresses,
the client will connect to an arbitrary one of these addresses.

Authentication
~~~~~~~~~~~~~~

ZEO supports optional authentication of client and server using a
password scheme similar to HTTP digest authentication (RFC 2069).  It
is a simple challenge-response protocol that does not send passwords
in the clear, but does not offer strong security.  The RFC discusses
many of the limitations of this kind of protocol.  Note that this
feature provides authentication only.  It does not provide encryption
or confidentiality.

The challenge-response also produces a session key that is used to
generate message authentication codes for each ZEO message.  This
should prevent session hijacking.

Guard the password database as if it contained plaintext passwords.
It stores the hash of a username and password.  This does not expose
the plaintext password, but it is sensitive nonetheless.  An attacker
with the hash can impersonate the real user.  This is a limitation of
the simple digest scheme.
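
The reason a stolen hash is as good as a password can be shown with a
generic digest-style exchange.  This is not ZEO's wire protocol: the
use of SHA-256, the "user:realm:password" layout, and the function
names are assumptions chosen only to illustrate the idea.

```python
import hashlib
import os


def stored_hash(username, realm, password):
    """What a digest-style password database keeps instead of the password."""
    text = "%s:%s:%s" % (username, realm, password)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def response(secret_hash, nonce):
    """Answer to a server challenge, computed from the stored hash alone."""
    text = "%s:%s" % (secret_hash, nonce)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


db_entry = stored_hash("alice", "example", "s3cret")   # server-side record
nonce = os.urandom(16).hex()                           # server's challenge

# The legitimate client derives the hash from the password it knows...
client_answer = response(stored_hash("alice", "example", "s3cret"), nonce)
# ...but someone holding db_entry never needs the password at all.
attacker_answer = response(db_entry, nonce)
```

Both answers are identical, so the server cannot tell the two apart.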

The authentication framework allows third-party developers to provide
new authentication modules.

Installing software
-------------------

ZEO is distributed as part of the ZODB3 package and with Zope,
starting with Zope 2.7.  You can download it from
http://pypi.python.org/pypi/ZODB3.

Configuring server
------------------

The script runzeo.py runs the ZEO server.  The server can be
configured using command-line arguments or a config file.  This
document only describes the config file.  Run runzeo.py
-h to see the list of command-line arguments.

The runzeo.py script imports the ZEO package.  ZEO must either be
installed in Python's site-packages directory or be in a directory on
PYTHONPATH.  

The configuration file specifies the underlying storage the server
uses, the address it binds, and a few other optional parameters.
An example is::

    <zeo>
    address zeo.example.com:8090
    monitor-address zeo.example.com:8091
    </zeo>

    <filestorage 1>
    path /var/tmp/Data.fs
    </filestorage>

    <eventlog>
    <logfile>
    path /var/tmp/zeo.log
    format %(asctime)s %(message)s
    </logfile>
    </eventlog>

This file configures a server to use a FileStorage from
/var/tmp/Data.fs.  The server listens on port 8090 of zeo.example.com.
It also starts a monitor server that listens on port 8091.  The ZEO
server writes its log file to /var/tmp/zeo.log and uses a custom
format for each line.  Assuming the example configuration is stored in
zeo.config, you can run a server by typing::

    python /usr/local/bin/runzeo.py -C zeo.config

A configuration file consists of a <zeo> section and a storage
section, where the storage section can use any of the valid ZODB
storage types.  It may also contain an eventlog configuration.  See
the document "Configuring a ZODB database" for more information about
configuring storages and eventlogs.

The zeo section must list the address.  All the other keys are
optional.

address
        The address at which the server should listen.  This can be in
        the form 'host:port' to signify a TCP/IP connection or a
        pathname string to signify a Unix domain socket connection (at
        least one '/' is required).  A hostname may be a DNS name or a
        dotted IP address.  If the hostname is omitted, the platform's
        default behavior is used when binding the listening socket (''
        is passed to socket.bind() as the hostname portion of the
        address).

read-only
        Flag indicating whether the server should operate in read-only
        mode.  Defaults to false.  Note that even if the server is
        operating in writable mode, individual storages may still be
        read-only.  But if the server is in read-only mode, no write
        operations are allowed, even if the storages are writable.  Note
        that pack() is considered a read-only operation.

invalidation-queue-size
        The storage server keeps a queue of the objects modified by the
        last N transactions, where N is the value of
        invalidation-queue-size (100 by default).  This queue is used to
        speed client cache verification when a client disconnects for a
        short period of time.

monitor-address
        The address at which the monitor server should listen.  If
        specified, a monitor server is started.  The monitor server
        provides server statistics in a simple text format.  This can
        be in the form 'host:port' to signify a TCP/IP connection or a
        pathname string to signify a Unix domain socket connection (at
        least one '/' is required).  A hostname may be a DNS name or a
        dotted IP address.  If the hostname is omitted, the platform's
        default behavior is used when binding the listening socket (''
        is passed to socket.bind() as the hostname portion of the
        address).

transaction-timeout
        The maximum amount of time to wait for a transaction to commit
        after acquiring the storage lock, specified in seconds.  If the
        transaction takes too long, the client connection will be closed
        and the transaction aborted.

authentication-protocol
        The name of the protocol used for authentication.  The
        only protocol provided with ZEO is "digest", but extensions
        may provide other protocols.

authentication-database
        The path of the database containing authentication credentials.

authentication-realm
        The authentication realm of the server.  Some authentication
        schemes use a realm to identify the logical set of usernames
        that are accepted by this server.
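
The address rules given for address and monitor-address can be
sketched as a small parser.  The helper below is hypothetical and is
not the parser that runzeo.py uses; it only mirrors the rules stated
above (a '/' means a Unix domain socket, otherwise 'host:port' with an
optional host).  The socket path in the usage lines is likewise made up.

```python
def parse_address(value):
    """Split an address string into something socket.bind() accepts.

    Returns a path string for a Unix domain socket, or a (host, port)
    tuple for TCP.  An omitted host becomes ''.
    """
    if "/" in value:
        return value                       # Unix domain socket path
    host, _, port = value.rpartition(":")  # no ':' leaves host as ''
    return (host, int(port))


tcp = parse_address("zeo.example.com:8090")
any_host = parse_address(":8090")
unix = parse_address("/var/run/zeo.sock")
```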

Configuring clients
-------------------

The ZEO client can also be configured using ZConfig.  The ZODB.config
module provides several functions for opening a storage based on its
configuration.

- ZODB.config.storageFromString()
- ZODB.config.storageFromFile()
- ZODB.config.storageFromURL()

The ZEO client configuration requires the server address be
specified.  Everything else is optional.  An example configuration is::

    <zeoclient>
    server zeo.example.com:8090
    </zeoclient>

The other configuration options are listed below.

storage
        The name of the storage that the client wants to use.  If the
        ZEO server serves more than one storage, the client selects
        the storage it wants to use by name.  The default name is '1',
        which is also the default name for the ZEO server.

cache-size
        The maximum size of the client cache, in bytes.

name
        The storage name.  If unspecified, the address of the server
        will be used as the name.

client
        Enables persistent cache files.  The string passed here is
        used to construct the cache filenames.  If it is not
        specified, the client creates a temporary cache that will
        only be used by the current object.

var
        The directory where persistent cache files are stored.  By
        default cache files, if they are persistent, are stored in 
        the current directory.

min-disconnect-poll
        The minimum delay between attempts to connect to the server,
        in seconds.  Defaults to 5 seconds.

max-disconnect-poll
        The maximum delay between attempts to connect to the server,
        in seconds.  Defaults to 300 seconds.

wait
        A boolean indicating whether the constructor should wait
        for the client to connect to the server and verify the cache
        before returning.  The default is true.

read-only
        A flag indicating whether this should be a read-only storage,
        defaulting to false (i.e. writing is allowed by default).

read-only-fallback
        A flag indicating whether a read-only remote storage should be
        acceptable as a fallback when no writable storages are
        available.  Defaults to false.  At most one of read-only and
        read-only-fallback should be true.

realm
        The authentication realm of the server.  Some authentication
        schemes use a realm to identify the logical set of usernames
        that are accepted by this server.
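
Several of these options can be combined in one section.  The sketch
below builds the configuration text with a small hypothetical helper
and hands it to ZODB.config.storageFromString(); the helper name, the
client name "app1", and the chosen values are assumptions made for
illustration.  Building the text needs no third-party packages, while
open_client() requires ZODB and ZEO to be installed and a server to be
reachable.

```python
def zeoclient_config(server, **options):
    """Render a <zeoclient> section; keyword names use '_' for '-'."""
    lines = ["<zeoclient>", "server %s" % server]
    for key in sorted(options):
        lines.append("%s %s" % (key.replace("_", "-"), options[key]))
    lines.append("</zeoclient>")
    return "\n".join(lines) + "\n"


def open_client(config_text):
    """Open the storage described by config_text (needs ZODB installed)."""
    from ZODB.config import storageFromString
    return storageFromString(config_text)


config_text = zeoclient_config(
    "zeo.example.com:8090",
    cache_size=50 * 1024 * 1024,   # 50 MB, expressed in bytes
    client="app1",                 # hypothetical; enables a persistent cache
    var="/var/tmp",                # directory for the cache files
    wait="true",
)
```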

A ZEO client can also be created by calling the ClientStorage
constructor explicitly.  For example::

    from ZEO.ClientStorage import ClientStorage
    storage = ClientStorage(("zeo.example.com", 8090))

Running the ZEO server as a daemon
----------------------------------

In an operational setting, you will want to run the ZEO server as a
daemon process that is restarted when it dies.  The zdaemon package
provides two tools for running daemons: zdrun.py and zdctl.py. You can
find zdaemon and its documentation at
http://pypi.python.org/pypi/zdaemon.

Rotating log files
~~~~~~~~~~~~~~~~~~

ZEO will re-initialize its logging subsystem when it receives a
SIGUSR2 signal.  If you are using the standard event logger, you
should first rename the log file and then send the signal to the
server.  The server will continue writing to the renamed log file
until it receives the signal.  After it receives the signal, the
server will create a new file with the old name and write to it.
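
The rename-then-signal sequence might be scripted as follows on a Unix
system.  This is a minimal sketch: the function name and the ".1"
suffix are assumptions, and how you obtain the server's process id
depends on how the server is started.

```python
import os
import signal


def rotate_log(log_path, server_pid, kill=os.kill):
    """Rename the current log, then ask the server to reopen its log."""
    rotated = log_path + ".1"           # hypothetical naming scheme
    os.rename(log_path, rotated)        # server keeps writing to this file...
    kill(server_pid, signal.SIGUSR2)    # ...until it handles the signal
    return rotated
```

The kill argument exists only so that the function can be exercised
without a running server; in normal use it is left at its default.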

Tools
-----

There are a few scripts that may help with running a ZEO server.  The
zeopack.py script connects to a server and packs the storage.  It can
be run as a cron job.  The zeoup.py script attempts to connect to a
ZEO server and verify that it is functioning.  The zeopasswd.py script
manages a ZEO server's password database.

Diagnosing problems
-------------------

If an exception occurs on the server, the server will log a traceback
and send an exception to the client.  The traceback on the client will
show a ZEO protocol library as the source of the error.  If you need
to diagnose the problem, you will have to look in the server log for
the rest of the traceback.