= londiste(1) =
== NAME ==
londiste - PostgreSQL replication engine written in Python
== SYNOPSIS ==
londiste.py [option] config.ini command [arguments]
== DESCRIPTION ==
Londiste is the PostgreSQL replication engine portion of the SkyTools suite,
by Skype. This suite includes packages implementing specific replication
tasks and/or solutions in layers, building upon each other.
PgQ is a generic queue implementation based on ideas from Slony-I's
snapshot based event batching. Londiste uses PgQ as its transport
mechanism to implement a robust and easy to use replication solution.
Londiste is an asynchronous master-slave(s) replication
system. Asynchronous means that a transaction committed on the master is
not guaranteed to have reached any slave at the master's commit time.
Master-slave means that data changes flow from the master to the slaves
only; changes made on slaves are not reported back to the master.
The replication is trigger based: you choose a set of tables to
replicate from the provider to the subscriber(s). Any data change
occurring on the provider (in a replicated table) fires the
londiste trigger, which adds events to a queue for the subscriber(s) to
consume.
A replay process consumes the queue in batches and applies the
changes to the subscriber(s). The initial replication step uses
PostgreSQL's COPY command for efficient data loading.
== QUICK-START ==
Basic londiste setup and usage can be summarized by the following
steps:
1. create the subscriber database, with tables to replicate
2. edit a londiste configuration file, say conf.ini, and a PgQ ticker
configuration file, say ticker.ini
3. install londiste on the provider and subscriber nodes. This step
requires admin privileges on both provider and subscriber sides,
and both install commands can be run remotely:
$ londiste.py conf.ini provider install
$ londiste.py conf.ini subscriber install
4. launch the PgQ ticker on the provider machine:
$ pgqadm.py -d ticker.ini ticker
5. launch the londiste replay process:
$ londiste.py -d conf.ini replay
6. add tables to replicate from the provider database:
$ londiste.py conf.ini provider add table1 table2 ...
7. add tables to replicate to the subscriber database:
$ londiste.py conf.ini subscriber add table1 table2 ...
To replicate to more than one subscriber database just repeat each of the
described subscriber steps for each subscriber.
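The command sequence above can be sketched as a small Python helper that only builds the argument lists, in the order the steps give them. The function name `quickstart_commands` and its defaults are illustrative assumptions; nothing here is part of londiste itself, and no command is executed.

```python
# Sketch only: assemble the quick-start command lines in documented order.
# File names and table names are placeholders taken from the steps above.

def quickstart_commands(conf="conf.ini", ticker_conf="ticker.ini",
                        tables=("table1", "table2")):
    """Return the quick-start commands as a list of argv lists."""
    tables = list(tables)
    return [
        ["londiste.py", conf, "provider", "install"],      # step 3
        ["londiste.py", conf, "subscriber", "install"],    # step 3
        ["pgqadm.py", "-d", ticker_conf, "ticker"],        # step 4
        ["londiste.py", "-d", conf, "replay"],             # step 5
        ["londiste.py", conf, "provider", "add"] + tables,    # step 6
        ["londiste.py", conf, "subscriber", "add"] + tables,  # step 7
    ]


if __name__ == "__main__":
    for argv in quickstart_commands():
        print("$ " + " ".join(argv))
```

Each list could be passed to `subprocess.run` once the configuration files exist; for several subscribers, the subscriber entries would be repeated with each subscriber's own configuration file.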
== COMMANDS ==
The londiste command is parsed globally, and has both options and
subcommands. Some options are reserved to a subset of the commands,
and others should be used without any command at all.
== GENERAL OPTIONS ==
This section presents options available to all londiste
commands.
-h, --help::
show this help message and exit
-q, --quiet::
make program silent
-v, --verbose::
make program more verbose
== PROVIDER COMMANDS ==
$ londiste.py config.ini provider <command>
Where command is one of:
=== provider install ===
Installs code into the provider and subscriber databases and creates
the queue. Equivalent to doing the following by hand:
CREATE LANGUAGE plpgsql;
CREATE LANGUAGE plpython;
\i .../contrib/txid.sql
\i .../contrib/pgq.sql
\i .../contrib/londiste.sql
select pgq.create_queue('<queue name>');
=== provider add <table name> ... ===
Registers table(s) on the provider database and adds the londiste trigger to
the table(s) which will send events to the queue. Table names can be schema
qualified with the schema name defaulting to public if not supplied.
--all::
Register all tables in provider database, except those that are
under schemas 'pgq', 'londiste', 'information_schema' or 'pg_*'.
=== provider remove <table name> ... ===
Unregisters table(s) on the provider side and removes the londiste triggers
from the table(s). The table removal event is also sent to the queue, so all
subscribers unregister the table(s) on their end as well. Table names can be
schema qualified with the schema name defaulting to public if not supplied.
=== provider add-seq <sequence name> ... ===
Registers a sequence on provider.
=== provider remove-seq <sequence name> ... ===
Unregisters a sequence on provider.
=== provider tables ===
Shows registered tables on provider side.
=== provider seqs ===
Shows registered sequences on provider side.
== SUBSCRIBER COMMANDS ==
$ londiste.py config.ini subscriber <command>
Where command is one of:
=== subscriber install ===
Installs code into the subscriber database. Equivalent to doing the
following by hand:
CREATE LANGUAGE plpgsql;
\i .../contrib/londiste.sql
This will be done under the Postgres Londiste user. If the tables should
be owned by someone else, it needs to be done by hand.
=== subscriber add <table name> ... ===
Registers table(s) on subscriber side. Table names can be schema qualified
with the schema name defaulting to `public` if not supplied.
Switches (optional):
--all::
Add all tables that are registered on provider to subscriber database
--force::
Ignore table structure differences.
--expect-sync::
Table is already synced by external means so initial COPY is unnecessary.
--skip-truncate::
When doing initial COPY, don't remove old data.
=== subscriber remove <table name> ... ===
Unregisters table(s) from the subscriber. No events will be applied to
the table anymore. The actual table will not be touched. Table names can be
schema qualified with the schema name defaulting to public if not supplied.
=== subscriber add-seq <sequence name> ... ===
Registers a sequence on subscriber.
=== subscriber remove-seq <sequence name> ... ===
Unregisters a sequence on subscriber.
=== subscriber resync <table name> ... ===
Tags table(s) as "not synced". Later the replay process will notice this
and launch copy process(es) to sync the table(s) again.
=== subscriber tables ===
Shows registered tables on the subscriber side, and the current state of
each table. Possible state values are:
NEW::
The table has not yet been considered by londiste.
in-copy::
Full-table copy is in progress.
catching-up::
The table has been copied; missing events are being replayed onto it.
wanna-sync:<tick-id>::
The "copy" process has caught up and wants to hand the table over to
"replay".
do-sync:<tick-id>::
The "replay" process is ready to accept it.
ok::
The table is in sync.
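As a rough illustration of how a monitoring script might interpret these state values, the sketch below splits a state string into its name and optional tick id. It assumes the tick id is a decimal integer and that the state has already been extracted from the command output; the exact output layout of `subscriber tables` is not described here, and the helper names are hypothetical.

```python
# Sketch only: interpret a table state value such as "ok" or "wanna-sync:1234".
# Assumption: the part after ":" is a decimal tick id.

KNOWN_STATES = ("NEW", "in-copy", "catching-up", "wanna-sync", "do-sync", "ok")


def parse_state(state):
    """Return (name, tick_id); tick_id is None when the state carries none."""
    name, sep, tick = state.strip().partition(":")
    if name not in KNOWN_STATES:
        raise ValueError("unknown table state: %r" % state)
    return name, (int(tick) if sep else None)


def is_synced(state):
    """True only for the final 'ok' state."""
    return parse_state(state)[0] == "ok"


if __name__ == "__main__":
    for value in ("NEW", "in-copy", "wanna-sync:1234", "ok"):
        print(value, parse_state(value), is_synced(value))
```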
=== subscriber fkeys ===
Show pending and active foreign keys on tables. Takes optional
type argument - `pending` or `active`. If no argument is given,
both types are shown.
Pending foreign keys are those that were removed during COPY time
but have not been restored yet. The restore happens automatically once
both tables are synced.
=== subscriber triggers ===
Show pending and active triggers on tables. Takes optional type
argument - `pending` or `active`. If no argument is given, both
types are shown.
Pending triggers are those that were removed during COPY time
but have not been restored yet. The restore of triggers does not happen
automatically; it needs to be done manually with the `restore-triggers`
command.
=== subscriber restore-triggers <table name> ===
Restores all pending triggers for a single table.
Optionally, a trigger name can be given as an extra
argument; then only that trigger is restored.
=== subscriber register ===
Register consumer on queue. This usually happens
automatically when `replay` is launched, but
=== subscriber unregister ===
Unregister consumer from provider's queue. This should be
done if you want to shut replication down.
== REPLICATION COMMANDS ==
=== replay ===
The actual replication process. It should be run as a daemon with the -d
switch, because it needs to be always running.
Its main task is to get batches of events from PgQ and apply
them to the subscriber database.
Switches:
-d, --daemon::
go background
-r, --reload::
reload config (send SIGHUP)
-s, --stop::
stop program safely (send SIGINT)
-k, --kill::
kill program immediately (send SIGTERM)
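The switch-to-signal mapping listed above can be written down as a small table. The sketch below is an illustration, not londiste's own code: it assumes a POSIX system and that the pidfile named in the configuration contains the daemon's process id as plain decimal text. The names `signal_for` and `send_to_daemon` are hypothetical.

```python
# Sketch only: map the documented control switches to the signals they send.
# Assumption: the pidfile holds the daemon's PID as decimal text (POSIX only).
import os
import signal

SWITCH_SIGNALS = {
    "-r": signal.SIGHUP, "--reload": signal.SIGHUP,   # reload config
    "-s": signal.SIGINT, "--stop": signal.SIGINT,     # stop safely
    "-k": signal.SIGTERM, "--kill": signal.SIGTERM,   # kill immediately
}


def signal_for(switch):
    """Return the signal documented for a control switch."""
    return SWITCH_SIGNALS[switch]


def send_to_daemon(pidfile, switch):
    """Read the PID from pidfile and deliver the matching signal."""
    with open(pidfile) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal_for(switch))
    return pid
```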
== UTILITY COMMANDS ==
=== repair <table name> ... ===
Attempts to achieve a state where the table(s) is/are in sync, compares
them, and writes out SQL statements that would fix differences.
Syncing happens by locking provider tables against updates and then
waiting until the replay process has applied all pending changes to the
subscriber database. As this is a dangerous operation, it has a hardwired
limit of 10 seconds for locking. If the replay process does not catch up
in that time, the locks are released and the repair operation is cancelled.
Comparing happens by dumping out the table contents of both sides,
sorting them and then comparing line-by-line. As this is a CPU and
memory-hungry operation, good practice is to run the repair command on a
third machine to avoid consuming resources on either the provider or the
subscriber.
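The sort-then-compare idea can be illustrated with a short Python sketch that walks two sorted row dumps in step. This is not londiste's implementation: it treats each row as an opaque text line and only reports which lines appear on one side and not the other, whereas the real command also has to turn such differences into SQL statements. The function name `diff_sorted` is hypothetical.

```python
# Sketch only: line-by-line comparison of two sorted table dumps.
# Rows are treated as opaque strings; both inputs must already be sorted.

def diff_sorted(provider_rows, subscriber_rows):
    """Return (missing, extra): rows only on the provider, rows only on the subscriber."""
    missing, extra = [], []
    i = j = 0
    while i < len(provider_rows) and j < len(subscriber_rows):
        p, s = provider_rows[i], subscriber_rows[j]
        if p == s:
            i += 1
            j += 1
        elif p < s:
            missing.append(p)   # subscriber lacks this provider row
            i += 1
        else:
            extra.append(s)     # subscriber has a row the provider lacks
            j += 1
    missing.extend(provider_rows[i:])
    extra.extend(subscriber_rows[j:])
    return missing, extra


if __name__ == "__main__":
    provider = sorted(["1\talice", "2\tbob", "3\tcarol"])
    subscriber = sorted(["1\talice", "3\tcarol", "4\tdave"])
    print(diff_sorted(provider, subscriber))
```

Because both inputs are sorted, the walk needs only one pass and never holds more than the current pair of rows in view, which is why the expensive part in practice is the dump and sort rather than the comparison.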
=== compare <table name> ... ===
Syncs tables like repair, but just runs SELECT count(*) on both
sides. This is a cheaper, but also less precise, way of
checking whether the tables are in sync.
== CONFIGURATION ==
Londiste and PgQ both use INI configuration files; your distribution of
skytools includes examples. You often just have to edit the database
connection strings, namely db in the PgQ ticker.ini and provider_db and
subscriber_db in the londiste conf.ini, as well as logfile and pidfile to
adapt them to your system paths.
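A quick way to check that a londiste configuration sets the four keys named above is sketched below with Python's standard `configparser`. The section name `[londiste]`, the connection strings, and the paths in the sample are assumptions made for illustration; consult the shipped example files for the real layout.

```python
# Sketch only: verify that a londiste-style INI text defines the keys
# mentioned above. Section name and sample values are hypothetical.
import configparser

SAMPLE = """
[londiste]
provider_db = dbname=provider_example host=provider.example
subscriber_db = dbname=subscriber_example host=subscriber.example
logfile = /var/log/londiste/conf.log
pidfile = /var/run/londiste/conf.pid
"""

REQUIRED = ("provider_db", "subscriber_db", "logfile", "pidfile")


def missing_keys(text, section="londiste"):
    """Return the required keys that the given INI text does not define."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(section):
        return list(REQUIRED)
    return [key for key in REQUIRED if not parser.has_option(section, key)]


if __name__ == "__main__":
    print(missing_keys(SAMPLE))
```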
See `londiste(5)`.
== SEE ALSO ==
`londiste(5)`
https://developer.skype.com/SkypeGarage/DbProjects/SkyTools/[]
http://skytools.projects.postgresql.org/doc/londiste.ref.html[Reference guide]