File: permmessages.html

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Permanent Message Handling</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Getting Started with Replicated Berkeley DB Applications" />
    <link rel="up" href="introduction.html" title="Chapter 1. Introduction" />
    <link rel="prev" href="elections.html" title="Holding Elections" />
    <link rel="next" href="txnapp.html" title="Chapter 2. Transactional Application" />
  </head>
  <body>
    <div xmlns="" class="navheader">
      <div class="libver">
        <p>Library Version 11.2.5.3</p>
      </div>
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Permanent Message Handling</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="elections.html">Prev</a> </td>
          <th width="60%" align="center">Chapter 1. Introduction</th>
          <td width="20%" align="right"> <a accesskey="n" href="txnapp.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="permmessages"></a>Permanent Message Handling</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <dl>
          <dt>
            <span class="sect2">
              <a href="permmessages.html#permmessagenot">When Not to Manage
                            Permanent Messages</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="permmessages.html#permmanage">Managing Permanent Messages</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="permmessages.html#permimplement">Implementing Permanent
                    Message Handling</a>
            </span>
          </dt>
        </dl>
      </div>
      <p>
                Messages received by a replica may be marked with a
                special flag that indicates the message is permanent. 
                Custom replicated applications will receive notification of
                this flag via the <code class="literal">DB_REP_ISPERM</code> return value
                from the <code class="methodname">rep_process_message()</code>
                method.
                
                There is no hard requirement that a replication application look for, or
                respond to, this return code. However, because robust replicated
                applications typically do manage permanent messages, we introduce 
                the concept here. 
            </p>
      <p>
                    A message is marked as being permanent if the message
                    affects transactional integrity. For example,
                    transaction commit messages are marked permanent.
                    What the application does
                    about the permanent message is driven by the durability
                    guarantees required by the application.
            </p>
      <p>
                    For example, consider what the Replication Manager does when it
                    has permanent message handling turned on and a
                    transactional commit record is sent to the replicas.
                    First, the replicas must commit the data
                    modifications identified by the message. Then, upon
                    a successful commit, the Replication Manager sends the master a
                    message acknowledgment.
            </p>
      <p>
                    For the master (again, using the Replication Manager), things are a little more complicated than
                simple message acknowledgment.  Usually in a replicated
                application, the master commits transactions
                asynchronously; that is, the commit operation does not
                block waiting for log data to be flushed to disk before
                returning. So when a master is managing permanent
                messages, it typically blocks the committing thread
                immediately before <code class="methodname">commit()</code>
                returns. The thread then waits for acknowledgments from
                its replicas. If it receives enough acknowledgments, it
                continues to operate as normal. 
            </p>
      <p>
                If the master does not
                receive message acknowledgments — or, more likely, it does not receive
                <span class="emphasis"><em>enough</em></span> acknowledgments — the
                committing thread flushes its log data to disk and then
                continues operations as normal. The master application can
                do this because replicas that fail to handle a message, for
                whatever reason, will eventually catch up to the master. So
                by flushing the transaction logs to disk, the master is
                ensuring that the data modifications have made it to
                stable storage in one location (its own hard drive).
            </p>
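      <p>
                The fallback described above can be sketched as follows.
                This is a minimal illustration in Python, not Berkeley DB
                code: the names <code class="literal">finish_commit</code> and
                <code class="literal">flush_log</code> are hypothetical, and the
                acknowledgment counts are assumed to have been gathered already.
            </p>
<pre class="programlisting">
```python
def finish_commit(acks_received, acks_needed, flush_log):
    """Decide what the committing thread does once waiting is over.

    Returns "acknowledged" when enough replicas answered. Otherwise
    calls flush_log() and returns "flushed".
    """
    if acks_received >= acks_needed:
        return "acknowledged"
    # Too few replicas confirmed the commit, so make it durable locally.
    flush_log()
    return "flushed"


# Hypothetical usage: record each flush request in a list.
flushes = []
print(finish_commit(2, 2, lambda: flushes.append("flush")))
print(finish_commit(1, 2, lambda: flushes.append("flush")))
print(flushes)
```
</pre>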
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="permmessagenot"></a>When Not to Manage
                            Permanent Messages</h3>
            </div>
          </div>
        </div>
        <p>
                            There are two reasons why you might
                            choose to not implement permanent messages.
                            In part, these go to why you are using
                            replication in the first place.
                    </p>
        <p>
                        One class of applications uses replication so that
                        the application can improve transaction
                        throughput. Essentially, the application chooses a
                        reduced transactional durability guarantee so as to
                        avoid the overhead forced by the disk I/O required
                        to flush transaction logs to disk. However, the
                        application can then regain that durability
                        guarantee to a certain degree by replicating the
                        commit to some number of replicas.
                    </p>
        <p>
                        Using replication to improve an application's
                        transactional commit guarantee is called
                        <span class="emphasis"><em>replicating to the network.</em></span>
                    </p>
        <p>
                        In extreme cases where performance is of critical
                        importance to the application, the master might
                        choose to both use asynchronous commits
                        <span class="emphasis"><em>and</em></span> decide not to wait for
                        message acknowledgments. In this case the master
                        is simply broadcasting its commit activities to its
                        replicas without waiting for any sort of a reply. An
                        application like this might also choose to use
                        something other than TCP/IP for its network
                        communications since that protocol involves a fair
                        amount of packet acknowledgment all on its own. Of
                        course, this sort of an application should also be
                        very sure about the reliability of both its network and
                        the machines that are hosting its replicas.
                    </p>
        <p>
                            At the other extreme, there is a
                            class of applications that use replication
                            purely to improve read performance. This sort
                            of application might choose to use synchronous
                            commits on the master because write
                            performance there is not of critical
                            importance. In any case, this kind of an
                            application might not care to know whether its
                            replicas have received and successfully handled
                            permanent messages because the primary storage
                            location is assumed to be on the master, not
                            the replicas.
                    </p>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="permmanage"></a>Managing Permanent Messages</h3>
            </div>
          </div>
        </div>
        <p>
                            With the exception of a rare breed of
                            replicated applications, most masters need some
                            view as to whether commits are occurring on
                            replicas as expected. At a minimum, this is because
                            masters will not flush their log buffers unless
                            they have reason to believe that permanent
                            messages have not been committed on the
                            replicas. 
                    </p>
        <p>
                        That said, it is important to remember that
                        managing permanent messages involves a fair amount
                        of network traffic. The messages must be sent to
                        the replicas and the replicas must acknowledge
                        them. This represents a performance overhead
                        that can be worsened by congested networks or
                        outright outages.
                    </p>
        <p>
                        Therefore, when managing permanent messages, you
                        must first decide on how many of your replicas must
                        send acknowledgments before your master decides
                        that all is well and it can continue normal
                        operations. When making this decision, you could
                        decide that <span class="emphasis"><em>all</em></span> replicas must
                        send acknowledgments. But unless you have only one
                        or two replicas, or you are replicating over a very
                        fast and reliable network, this policy could prove
                        very harmful to your application's performance.
                    </p>
        <p>
                        Therefore, a common strategy is to wait for an
                        acknowledgment from a simple majority of replicas.
                        This ensures that commit activity has occurred on
                        enough machines that you can be reliably certain
                        that data writes are preserved across your network.
                    </p>
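        <p>
                        As a small worked example of the majority rule, the
                        following Python sketch computes how many
                        acknowledgments a master would wait for. The function
                        name <code class="literal">majority_acks</code> is
                        hypothetical, and the sketch assumes that "simple
                        majority" means more than half of the replicas.
                    </p>
<pre class="programlisting">
```python
def majority_acks(replica_count):
    """Smallest number of replicas that is more than half of them."""
    if replica_count == 0:
        # No replicas exist, so there is nothing to wait for.
        return 0
    return replica_count // 2 + 1


for replicas in (1, 2, 5, 8):
    print(replicas, "replicas, wait for", majority_acks(replicas), "acks")
```
</pre>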
        <p>
                        Remember that replicas that do not acknowledge a
                        permanent message are not necessarily unable to
                        perform the commit; it might be that network
                        problems have simply resulted in a delay at the
                        replica. In any case, the underlying DB
                        replication code is written such that a replica that
                        falls behind the master will eventually take action
                        to catch up.
                    </p>
        <p>
                            Depending on your application, it may be
                            possible for you to code your permanent message
                            handling such that acknowledgment must come
                            from only one or two replicas. This is a
                            particularly attractive strategy if you are
                            closely managing which machines are eligible to
                            become masters. Assuming that you have one or
                            two machines designated to be a master in the
                            event that the current master goes down, you
                            may only want to receive acknowledgments from
                            those specific machines.
                    </p>
        <p>
                        Finally, beyond simple message acknowledgment, you
                        also need to implement an acknowledgment timeout
                        for your application. This timeout value is simply
                        meant to ensure that your master does not hang
                        indefinitely waiting for responses that will never
                        come because a machine or router is down.
                    </p>
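        <p>
                        One way to bound the wait is a condition variable
                        with a timeout. The Python sketch below is only an
                        illustration under that assumption: the
                        <code class="literal">AckWaiter</code> class and its
                        methods are hypothetical names, and a real application
                        would call <code class="literal">record_ack()</code> from
                        whatever thread reads acknowledgments off the network.
                    </p>
<pre class="programlisting">
```python
import threading


class AckWaiter:
    """Counts acknowledgments and lets one thread wait for enough of them."""

    def __init__(self, acks_needed):
        self._needed = acks_needed
        self._received = 0
        self._cond = threading.Condition()

    def record_ack(self):
        # Called once per acknowledgment that arrives from a replica.
        with self._cond:
            self._received += 1
            self._cond.notify_all()

    def wait(self, timeout_seconds):
        # True if enough acks arrived, False if the timeout expired first.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._received >= self._needed,
                timeout=timeout_seconds)


# Hypothetical usage: two acks are needed but only one ever arrives.
waiter = AckWaiter(acks_needed=2)
threading.Timer(0.05, waiter.record_ack).start()
print(waiter.wait(0.2))
```
</pre>
        <p>
                        Because the wait returns after at most the timeout,
                        the master cannot hang on a replica that never answers.
                    </p>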
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="permimplement"></a>Implementing Permanent
                    Message Handling</h3>
            </div>
          </div>
        </div>
        <p>
                            How you implement permanent message handling
                            depends on which API you are using to implement
                            replication. If you are using the Replication Manager, then
                            permanent message handling is configured using
                            policies that you specify to the framework. In
                            this case, you can configure your application
                            to:
                   </p>
        <div class="itemizedlist">
          <ul type="disc">
            <li>
              <p>
                                    Ignore permanent messages (the master
                                    does not wait for acknowledgments). 
                                   </p>
            </li>
            <li>
              <p>
                                           Require acknowledgments from a
                                           quorum. A quorum is reached when
                                           acknowledgments are received from the
                                           minimum number of electable
                                           peers needed to ensure that
                                           the record remains durable if
                                           an election is held. 
                                   </p>
              <p>
                                       An <span class="emphasis"><em>electable peer</em></span> is any other
                                       site that potentially can be elected master.
                                   </p>
              <p>
                                           The goal here is to be
                                           absolutely sure the record is
                                           durable. The master wants to
                                           hear from enough electable
                                           peers that they have
                                           committed the record so that if
                                           an election is held, the master
                                           knows the record will exist even
                                           if a new master is selected.
                                   </p>
              <p>
                                           This is the default policy.
                                   </p>
            </li>
            <li>
              <p>
                                     Require an acknowledgment from at least one replica. 
                                   </p>
            </li>
            <li>
              <p>
                                           Require acknowledgments from
                                           all replicas.
                                   </p>
            </li>
            <li>
              <p>
                                      Require an acknowledgment from at least one electable peer.
                                   </p>
            </li>
            <li>
              <p>
                                           Require acknowledgments from all electable peers.
                                   </p>
            </li>
          </ul>
        </div>
        <p>
                        Note that the Replication Manager simply flushes its transaction
                        logs and moves on if a permanent message is not
                        sufficiently acknowledged.
                   </p>
        <p>
                        For details on permanent message handling with the
                        Replication Manager, see <a class="xref" href="fwrkpermmessage.html" title="Permanent Message Handling">Permanent Message Handling</a>.
                   </p>
        <p>
                        If these policies are not sufficient for your
                        needs, or if you want your application to take more
                        corrective action than simply flushing log buffers
                        in the event of an unsuccessful commit, then you
                        must implement replication using the Base APIs.
                   </p>
        <p>
                        When using the Base APIs, messages are
                        sent from the master to its replicas using a
                        <code class="function">send()</code> callback that you
                        implement.  Note, however, that DB's replication 
                        code automatically sets the permanent 
                        flag for you where appropriate. 
                   </p>
        <p>
                        If the <code class="function">send()</code> callback returns with a
                        non-zero status, DB flushes the transaction log 
                        buffers for you. Therefore, you must cause your
                        <code class="function">send()</code> callback to block waiting
                        for acknowledgments from your replicas. 
                        As a part of implementing the
                        <code class="function">send()</code> callback, you implement
                        your permanent message handling policies. This
                        means that you identify how many replicas must
                        acknowledge the message before the callback can
                        return <code class="literal">0</code>.  You must also
                        implement the acknowledgment timeout, if any.
                   </p>
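        <p>
                        The return-code contract described above can be
                        sketched in Python as follows. This is not the DB
                        callback signature: <code class="literal">send_permanent</code>,
                        <code class="literal">transports</code>, and
                        <code class="literal">ack_queue</code> are hypothetical names,
                        and the sketch assumes that acknowledgments arrive on a
                        queue filled by your own communications channel.
                    </p>
<pre class="programlisting">
```python
import queue
import time


def send_permanent(message, transports, ack_queue, acks_needed,
                   timeout_seconds):
    """Send a permanent message, then block until enough acks arrive.

    Returns 0 when acks_needed acknowledgments were received and 1 when
    the timeout expired first.
    """
    for transmit in transports:
        transmit(message)
    deadline = time.monotonic() + timeout_seconds
    acks = 0
    while acks_needed > acks:
        remaining = deadline - time.monotonic()
        if not remaining > 0:
            return 1
        try:
            ack_queue.get(timeout=remaining)
        except queue.Empty:
            return 1
        acks += 1
    return 0


# Hypothetical usage: two replicas, both of which have already answered.
acks = queue.Queue()
acks.put("replica-1")
acks.put("replica-2")
outbox = []
print(send_permanent("commit", [outbox.append, outbox.append], acks, 2, 0.5))
```
</pre>
        <p>
                        In this sketch the policy is reduced to a single
                        count, <code class="literal">acks_needed</code>; a policy that
                        cares about which replicas answered would need to
                        inspect each acknowledgment as well.
                    </p>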
        <p>
                        Further, message acknowledgments are sent from the
                        replicas to the master using a communications
                        channel that you implement (the replication code
                        does not provide a channel for acknowledgments).
                        So when you write your replication communications
                        channel, you must write it so that it also
                        handles permanent message acknowledgments.
                   </p>
        <p>
                        For more information on implementing permanent
                        message handling using a custom replication layer,
                        see the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>.
                   </p>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="elections.html">Prev</a> </td>
          <td width="20%" align="center">
            <a accesskey="u" href="introduction.html">Up</a>
          </td>
          <td width="40%" align="right"> <a accesskey="n" href="txnapp.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Holding Elections </td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top"> Chapter 2. Transactional Application</td>
        </tr>
      </table>
    </div>
  </body>
</html>