<?xml version="1.0" encoding="ISO-8859-1"?>
<!--
 -  
 -  This file is part of the OpenLink Software Virtuoso Open-Source (VOS)
 -  project.
 -  
 -  Copyright (C) 1998-2018 OpenLink Software
 -  
 -  This project is free software; you can redistribute it and/or modify it
 -  under the terms of the GNU General Public License as published by the
 -  Free Software Foundation; only version 2 of the License, dated June 1991.
 -  
 -  This program is distributed in the hope that it will be useful, but
 -  WITHOUT ANY WARRANTY; without even the implied warranty of
 -  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 -  General Public License for more details.
 -  
 -  You should have received a copy of the GNU General Public License along
 -  with this program; if not, write to the Free Software Foundation, Inc.,
 -  51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
 -  
 -  
-->
<sect2 id="bidirtransrepl"><title>Bi-Directional Transactional Replication</title>

 <para>Virtuoso supports bi-directional transactional replication via a mechanism 
 of updateable subscriptions.  The following rules and conditions must be observed:</para>

<simplelist>
  <member>Every table has only one publisher.</member>
  <member>Only direct subscribers are considered.</member>
  <member>Only replication of tables is allowed.</member>
</simplelist>

 <para>It is assumed that all the tables within a publication have primary keys 
 and that the primary key columns are never modified.</para>

 <para>Every transaction has an origin, i.e. the originating server on which 
 the transaction was performed.</para>

 <para>Modifications reach a subscriber only from the publisher, using the 
 ordinary transactional replication technique: the subscriber initiates an update 
 and pulls (requests) replication logs from the publisher.  The publisher sends 
 the data from its replication log files and then places the subscriber 
 into "synced" (or "online") mode.  In this mode the replication logs are sent 
 to the subscriber immediately after each COMMIT.</para>

 <para>Data flow from a subscriber to the publisher is very similar: the 
 subscriber initiates the update and pushes its replication logs to the publisher.  
 After all replication log data has been sent, the publisher is put into 
 "synced" mode and receives modifications immediately after each COMMIT 
 on the subscriber.</para>
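
 <para>As a rough sketch of the pull side, a subscriber is typically attached 
 with the ordinary transactional replication calls. The server name 'pub_srv', 
 the DSN 'pub_dsn', the address 'pubhost:1111' and the 'dba' credentials are 
 placeholders, and the argument lists reflect our reading of the function 
 reference; check them against your server version before use.</para>

 <programlisting><![CDATA[
-- run on the subscriber; all names, addresses and credentials are assumptions
repl_server ('pub_srv', 'pub_dsn', 'pubhost:1111');           -- register the publisher
repl_subscribe ('pub_srv', 'foo', null, null, 'dba', 'dba');  -- subscribe to publication 'foo'
repl_sync ('pub_srv', 'foo', 'dba', 'dba');                   -- request pending replication logs
]]></programlisting>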

 <sect3 id="bidirtransreplcreate"><title>Creating Publications for Updateable Subscriptions</title>

  <para>To create a publication that allows transactional replication
  with updateable subscriptions, call
  <function>repl_publish()</function> with a non-zero third argument.
  Replication feeds from subscribers can be replayed as a user other than
  'dba'; as the example shows, that user is named in the fourth argument.</para>

  <programlisting>repl_publish('foo', 'foo.log', 1, 'demo');</programlisting>

  <para>This creates the updateable publication 'foo'. Replication feeds
  from subscribers are replayed as user 'demo'.</para>
 </sect3>

 <sect3 id="bidirtransrepladdtable"><title>Adding Tables to a Publication</title>

  <para>When a table is added to an updateable publication, a new 
  'ROWGUID varchar' column is automatically added to the table.
  This column is used for conflict resolution (described later).
  If the table already has a column with that name, the existing column
  is used, after a check that its data type and width are appropriate.
  A ROWGUID column holds a globally unique identifier for each row, and 
  the value changes on each UPDATE of the row.  ROWGUID column values 
  are OSF DCE 1.1 compliant Universally Unique Identifiers (UUID).</para>

  <para>ROWGUID columns are used for conflict resolution for 
  INSERT/UPDATE/DELETE DML operations.  Basically, if the ROWGUID value 
  that came from a subscriber does not differ from the ROWGUID value 
  in the publisher's table then it is assumed that there is no conflict; 
  otherwise conflict resolution must take place.</para>
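
  <para>As a minimal sketch, a table can be added to the publication 'foo' 
  created earlier with <function>repl_pub_add()</function>. The table 
  name 'DB.DBA.items' and its item_id column are assumptions used for 
  illustration, and the item type 2 (table) and the remaining arguments 
  follow our reading of the function reference.</para>

  <programlisting><![CDATA[
-- add the (assumed) table DB.DBA.items to publication 'foo'; 2 = table item
repl_pub_add ('foo', 'DB.DBA.items', 2, 0, null);

-- the automatically added ROWGUID column can then be read like any other column
select item_id, ROWGUID from DB.DBA.items;
]]></programlisting>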
 </sect3>

 <sect3 id="bidirtransreplconflictres"><title>Conflict Resolution</title>
  
  <para>Since every table may have only one publisher, conflict resolution 
  always takes place on the publisher.</para>

  <para>Assume that a DML operation that occurred on a subscriber is being replayed
  on the publisher.  Three types of conflict are possible:</para>

<orderedlist>
  <listitem><formalpara><title>uniqueness conflict (insert conflict)</title>
   <para>occurs when a row with the same primary key &lt;PK&gt; already exists 
   in the publisher's table.</para></formalpara></listitem>

  <listitem><formalpara><title>update conflict</title>
   <para>occurs when an UPDATE modifies a row which has already been 
   modified on the publisher (by the publisher or another subscriber).</para></formalpara></listitem>

  <listitem><formalpara><title>delete conflict</title>
   <para>occurs when an UPDATE modifies a row or a DELETE removes a row
   that no longer exists on the publisher.</para></formalpara></listitem>
</orderedlist>

  <para>Every table has a number of conflict resolvers that are used for 
  conflict resolution.  These are stored in the DB.DBA.SYS_REPL_CR system table. 
  Each conflict resolver has a type ('I', 'U', or 'D') and an order.  Conflict 
  resolvers of a given type are applied in ascending order.</para>

<para>A conflict resolver is a Virtuoso/PL procedure that receives the 
conflicting row from a subscriber and some other arguments.  The conflict 
resolver can modify the row, which is passed as 'inout' arguments.
The conflict resolver should return an integer value, which determines 
how the conflict is resolved.</para>

<para>Conflict resolvers of different types have different signatures:</para>

<simplelist>
 <member>
   <para><emphasis>'I' - Insert conflict resolvers</emphasis></para>
   <para>(&lt;ALLCOLS&gt;, inout _origin varchar)</para></member>

 <member>
   <para><emphasis>'U' - Update conflict resolvers</emphasis></para>
   <para>(&lt;ALLCOLS&gt;, &lt;ALLOLDCOLS&gt;, inout _origin varchar)</para></member>

 <member>
   <para><emphasis>'D' - Deletion conflict resolvers</emphasis></para>
   <para>(&lt;ALLOLDCOLS&gt;, inout _origin varchar)</para></member>
   </simplelist>

<para>where</para>

<para>&lt;ALLCOLS&gt; are the new values of all columns, including the ROWGUID 
column; &lt;ALLOLDCOLS&gt; are the old values of all columns; and _origin 
is the transaction originator.</para>

<para>Conflict resolvers can return the following integer values. 
The conflict resolver types to which each value applies are listed in parentheses:</para>

<itemizedlist>
  <listitem><formalpara><title>0 - can't decide (I, U, D)</title>
	<para>The next conflict resolver is fired.</para></formalpara></listitem>

  <listitem><formalpara><title>1 - subscriber wins (I, U, D)</title>
	<para>The DML operation is applied with &lt;ALLCOLS&gt;.
	All subscribers except the originator receive the modifications
	(the originator already has them).</para></formalpara></listitem>

  <listitem><formalpara><title>2 - subscriber wins, change origin (I, U)</title>
	<para>The DML operation is applied with &lt;ALLCOLS&gt; and the origin
	of the transaction is changed to the publisher's server name.
	All subscribers (including the originator) receive the modifications.
	This return value is useful when the conflict resolver has changed some of
	the column values that were passed in.
	Although all parameters of a conflict resolver are inout,
	only changing the &lt;ALLCOLS&gt; parameters for non-PK columns
	makes sense.</para></formalpara></listitem>

  <listitem><formalpara><title>3 - publisher wins (U)</title>
	<para>The DML operation is applied with &lt;ALLCOLS&gt; taken from
	the publisher's table. All subscribers receive the
	modifications.</para></formalpara></listitem>

  <listitem><formalpara><title>4 - reserved</title><para /></formalpara></listitem>

  <listitem><formalpara><title>5 - ignore (D)</title>
	<para>DML operation is ignored.</para></formalpara></listitem>
</itemizedlist>

<para>Conflict resolution stops as soon as a conflict resolver returns a 
non-zero value, meaning that it has made a decision.</para>

<example id="ex_conflictreslntrans"><title>Conflict Resolution</title>
<para>Suppose we have the following table:</para>

<programlisting><![CDATA[
create table items(
  item_id integer primary key,

  name varchar,
  price decimal
);
]]></programlisting>

<para>A "publisher wins" 'I' conflict resolver looks like this:</para>

<programlisting><![CDATA[
create procedure items_cr(
    inout _item_id integer,
    inout _name varchar,
    inout _price decimal,
    inout _rowguid varchar,	-- <ALLCOLS> includes the ROWGUID column
    inout _origin varchar)
  returns integer
{
  return 3;			-- publisher wins
}
]]></programlisting>

<para>A 'U' conflict resolver that decides in favor of whichever side 
has the lower price looks like this:</para>

<programlisting><![CDATA[
create procedure items_cr(
    inout _item_id integer,
    inout _name varchar,
    inout _price decimal,
    inout _rowguid varchar,
    inout _old_item_id integer,
    inout _old_name varchar,
    inout _old_price decimal,
    inout _old_rowguid varchar,
    inout _origin varchar)
  returns integer
{
  declare p decimal;
  -- get current price value
  select price into p from items where item_id = _item_id;
  if (p < _price)
    return 3;			-- publisher wins
  else if (p > _price)
    return 1;			-- subscriber wins
  return 0;			-- can't decide
}
]]></programlisting>

<para>A 'U' conflict resolver that changes the price to the lower of 
the two values looks like this:</para>

<programlisting><![CDATA[
create procedure items_cr(
    inout _item_id integer,
    inout _name varchar,
    inout _price decimal,
    inout _rowguid varchar,
    inout _old_item_id integer,
    inout _old_name varchar,
    inout _old_price decimal,
    inout _old_rowguid varchar,
    inout _origin varchar)
  returns integer
{
  declare p decimal;
  -- get current price value
  select price into p from items where item_id = _item_id;
  if (p < _price)
    {
      _price := p;
      return 2;			-- subscriber wins, change origin
    }
  return 1;			-- subscriber wins
}
]]></programlisting>
</example>
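
<para>For completeness, a 'D' conflict resolver for the same table could be 
sketched as follows. The procedure name items_cr_d is hypothetical; the 
parameter list follows the 'D' signature given above, with the old ROWGUID 
value included as in the 'U' examples.</para>

<programlisting><![CDATA[
create procedure items_cr_d(
    inout _old_item_id integer,
    inout _old_name varchar,
    inout _old_price decimal,
    inout _old_rowguid varchar,
    inout _origin varchar)
  returns integer
{
  return 5;			-- ignore the conflicting operation
}
]]></programlisting>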

<para>Conflict resolution occurs differently for each kind of DML operation:</para>

<itemizedlist>
  <listitem><formalpara><title>INSERT</title>
	<para>When an INSERT of some row with primary key &lt;PK&gt; is replayed,
	the row with that &lt;PK&gt; is looked up in the publisher's table.
	If the row does not exist then there is no conflict; conflict 
	resolution stops and the INSERT is replayed.
	If the row exists then we have a "uniqueness conflict".  In this case the 'I'
	conflict resolvers are fired.
	If none of the 'I' conflict resolvers is able to make a decision
	(return a non-zero value), the default action is 'publisher wins'.</para>
	</formalpara></listitem>

  <listitem><formalpara><title>UPDATE</title>

	<para>When an UPDATE of some row with primary 
	key &lt;PK&gt; is replayed, the row (and its ROWGUID) with that 
	&lt;PK&gt; is looked up in the publisher's table.
	If the row does not exist then we have a "delete conflict" and the 
	'D' conflict resolvers are fired.  If none of the 'D' conflict 
	resolvers is able to make a decision, the default action is 
	'ignore'.
	If the row exists in the publisher's table and its ROWGUID is the same
	as that from the subscriber then there is no conflict.  Conflict
	resolution stops and the UPDATE is replayed.
	If the row exists and its ROWGUID differs from the one that came
	from the subscriber then we have an "update conflict".  In this case the 
	'U' conflict resolvers are fired.
	If none of the 'U' conflict resolvers is able to make a decision 
	(return a non-zero value), the default action is 'publisher wins'.</para>
	</formalpara></listitem>

  <listitem><formalpara><title>DELETE</title>

	<para>When a DELETE of some row with primary key &lt;PK&gt; is replayed,
	the row with that &lt;PK&gt; is looked up in the publisher's table.  
	If the row exists and its ROWGUID matches the one that came from the 
	subscriber, there is no conflict and the DELETE statement is replayed.
	If the row does not exist, or if the row exists but its
	ROWGUID differs from the one that came from the subscriber, then
	we have a "delete conflict" and the 'D' conflict resolvers are fired.  
	If none of the 'D' conflict resolvers is able to make a decision, the 
	default action is 'ignore'.</para>
	</formalpara></listitem>
</itemizedlist>
</sect3>

<sect3 id="bidirtransreplautoconres"><title>Automatically Generated Conflict Resolvers</title>

<para>Simple conflict resolvers can be generated automatically
by calling the <function>REPL_ADD_CR()</function> function.</para>

<tip><title>See Also:</title>
  <para><link linkend="fn_REPL_ADD_CR"><function>REPL_ADD_CR()</function></link></para></tip>

</sect3>

<sect3 id="bidirtransrepllogdata"><title>Replication Log Data</title>

 <para>Replication log data is different for each kind of DML operation:</para>

<itemizedlist>
  <listitem><formalpara><title>INSERT</title>
	<para>(stmt, &lt;ALLCOLS&gt;)</para></formalpara></listitem>

  <listitem><formalpara><title>UPDATE</title>
	<para>(stmt, &lt;ALLCOLS&gt;, &lt;OLDPK&gt;, &lt;ALLOLDCOLS&gt;, ncols)</para></formalpara></listitem>

  <listitem><formalpara><title>DELETE</title>
	<para>(stmt, &lt;OLDPK&gt;, &lt;ALLOLDCOLS&gt;, ncols)</para></formalpara></listitem>
</itemizedlist>

<para>where</para>

<para>stmt is the DML statement (varchar), &lt;ALLCOLS&gt; are the new values of 
all columns, &lt;OLDPK&gt; is the primary key identifying the row on which 
the UPDATE or DELETE statement is executed, &lt;ALLOLDCOLS&gt; are the old 
values of all columns, and ncols is the number of columns in the table (integer).</para>

<para>The format of the replication log data is the same as in simple transactional
replication, with the addition of &lt;ALLOLDCOLS&gt; and ncols for logging UPDATE and
DELETE statements.</para>
</sect3>
</sect2>