File: pgq-admin.txt

skytools 2.1.8-2.2
= pgqadm(1) =

== NAME ==

pgqadm - PgQ ticker and administration interface

== SYNOPSIS ==

  pgqadm.py [option] config.ini command [arguments]

== DESCRIPTION ==

PgQ is a Postgres-based event processing system. It is part of the SkyTools
package, which contains several useful implementations built on this engine.
The main function of pgqadm is to maintain and keep healthy both the PgQ
internal tables and the tables that store events.

SkyTools is a scripting framework for Postgres databases, written in Python,
that provides several utilities and implements common database handling logic.

Event - an atomic piece of data created by producers. In PgQ an event is one
record in one of the tables that service the queue. An event record contains some
system fields for PgQ and several data fields filled in by producers. PgQ neither
checks nor enforces the event type; the event type is something the consumer and
producer must agree on. PgQ guarantees that each event is seen at least once, but
it is up to the consumer to make sure that an event is processed no more than
once if that is needed.
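To make the producer side concrete, an event is created by calling PgQ's
`pgq.insert_event()` database function. The Python sketch below assumes a
DB-API style cursor (`cur`) already connected to the queue database; the queue
name, event type, and payload shown are made up for the example.

```python
def insert_event(cur, queue_name, ev_type, ev_data):
    """Insert one event into a PgQ queue; returns the new event id."""
    # pgq.insert_event() writes one event record into the queue's
    # event tables and returns its id.
    cur.execute("SELECT pgq.insert_event(%s, %s, %s)",
                (queue_name, ev_type, ev_data))
    return cur.fetchone()[0]
```

In plain SQL the same call would be, for example,
`SELECT pgq.insert_event('myqueue', 'user_created', '42');`.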

Batch - PgQ is designed for efficiency and high throughput, so events are grouped
into batches for bulk processing. Creating these batches is one of the main tasks
of pgqadm, and there are several parameters for each queue that can be used to
tune the size and frequency of batches. Consumers receive events in these batches
and, depending on business requirements, process the events separately or also in
batches.

Queue - events are stored in queue tables, i.e. queues. Several producers can
write into the same queue and several consumers can read from the queue. Events
are kept in the queue until all the consumers have seen them. Table rotation is
used to decrease hard disk I/O. A queue can contain any number of event types;
it is up to the producer and consumer to agree on what types of events are passed
and how they are encoded. For example, the Londiste producer side can produce
events for more tables than the consumer side needs, so the consumer subscribes
only to those tables it needs and events for the other tables are ignored.

Producer - an application that pushes events into a queue. A producer can be
written in any language that is able to run stored procedures in Postgres.

Consumer - an application that reads events from a queue. Consumers can be
written in any language that can interact with Postgres. The SkyTools package
contains several useful consumers written in Python that can be used as they
are, or as good starting points for writing more complex consumers.
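One pass of a consumer's batch-processing loop, built on PgQ's SQL API
(`pgq.next_batch()`, `pgq.get_batch_events()`, `pgq.finish_batch()`), can be
sketched as follows. This is a minimal illustration, not a production consumer;
it assumes a DB-API style cursor connected to the queue database, and the
`handle` callback is a placeholder for application logic.

```python
def process_next_batch(cur, queue_name, consumer_name, handle):
    """Fetch, process, and finish one batch; returns number of events handled."""
    # Ask PgQ for the next completed batch; NULL means nothing to do yet.
    cur.execute("SELECT pgq.next_batch(%s, %s)", (queue_name, consumer_name))
    batch_id = cur.fetchone()[0]
    if batch_id is None:
        return 0

    # Fetch all events in the batch and hand them to the application.
    cur.execute("SELECT * FROM pgq.get_batch_events(%s)", (batch_id,))
    events = cur.fetchall()
    for ev in events:
        handle(ev)

    # Tell PgQ the batch is done, so its events can eventually be rotated away.
    cur.execute("SELECT pgq.finish_batch(%s)", (batch_id,))
    return len(events)
```

A real consumer would run this in a loop, sleeping briefly whenever no batch is
available, and would add error handling so a failed batch is retried rather
than finished.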

== QUICK-START ==

Basic PgQ setup and usage can be summarized by the following
steps:

 1. create the database

 2. edit a PgQ ticker configuration file, say ticker.ini

 3. install PgQ internal tables

      $ pgqadm.py ticker.ini install

 4. launch the PgQ ticker on the database machine as a daemon

      $ pgqadm.py -d ticker.ini ticker

 5. create queue
 
      $ pgqadm.py ticker.ini create <queue>

 6. register the consumer, or run it so that it registers automatically

      $ pgqadm.py ticker.ini register <queue> <consumer>
      
 7. start producing events 

== CONFIG ==

  [pgqadm]
  job_name = pgqadm_somedb
  
  db = dbname=somedb
  
  # how often to run maintenance [seconds]
  maint_delay = 600
  
  # how often to check for activity [seconds]
  loop_delay = 0.1
  
  logfile = ~/log/%(job_name)s.log
  pidfile = ~/pid/%(job_name)s.pid

== COMMANDS ==

=== ticker ===

Start the ticking & maintenance process. Usually run as a daemon with the -d
option. Must be running for PgQ to be functional and for consumers to see any
events.

=== status ===

Show an overview of registered queues and consumers, and of queue health.
Use this command when you want to know what is happening inside PgQ.

=== install ===

Installs the PgQ schema into the database specified in the config file.

=== create <queue> ===

Creates queue tables in the pgq schema. As soon as the queue is created,
producers can start inserting events into it. Be aware, though, that if there
are no consumers on the queue, the events are lost until a consumer is
registered.

=== drop <queue> ===

Drops the queue and all its consumers from PgQ. The queue tables are dropped
and all their contents are lost forever, so use with care, as with most drop
commands.

=== register <queue> <consumer> ===

Registers the given consumer to listen to the given queue. The first batch seen
by this consumer is the one completed after registration. Registration happens
automatically the first time a consumer is run, so using this command is
optional, but it may be needed when producers start producing events before the
consumer can be run.

=== unregister <queue> <consumer> ===

Removes the consumer from the given queue. Note that the consumer must be
stopped before issuing this command, otherwise it automatically registers again.

=== config [<queue> [<variable>=<value> ... ]] ===

Show or change the queue configuration. There are several parameters that can
be set for each queue, shown here with their default values:

queue_ticker_max_lag (2)::
	If no tick has happened during the given number of seconds, one
	is generated just to keep the queue lag under control. It may be increased
	if there is no need to deliver events fast. There is not much room to decrease it :)
  
queue_ticker_max_count (200)::
	Threshold number of events in the filling batch that triggers a tick.
	Can be increased to encourage PgQ to create larger batches, or decreased
	to encourage faster ticking with smaller batches.
  
queue_ticker_idle_period (60)::
	Number of seconds that may pass without ticking if no events are coming to the queue.
	These empty ticks are used as keep-alive signals for batch jobs and monitoring.
  
queue_rotation_period (2 hours)::
	Interval of time that may pass before PgQ tries to rotate tables to free up space.
	Note that PgQ cannot rotate tables if there are long-running transactions in the
	database, such as VACUUM or pg_dump. May be decreased if low on disk space, or
	increased to keep a longer history of old events. Too small a value might affect
	performance badly, because Postgres tends to do sequential scans on small tables;
	too big a value may waste disk space.
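The interplay of the three ticker parameters above can be summarized by a
simple decision rule: tick when the filling batch is large enough, when events
are pending and the lag limit has been reached, or when the idle period has
elapsed without any events. The function below is an illustrative
approximation of that rule, not the actual ticker code.

```python
def should_tick(pending_events, seconds_since_tick,
                max_lag=2, max_count=200, idle_period=60):
    """Decide whether the ticker should emit a tick now (illustrative)."""
    # Enough events have accumulated in the filling batch: cut it now.
    if pending_events >= max_count:
        return True
    # Events are waiting and the lag limit has been reached.
    if pending_events > 0 and seconds_since_tick >= max_lag:
        return True
    # No events at all: emit an empty keep-alive tick after the idle period.
    if seconds_since_tick >= idle_period:
        return True
    return False
```

Raising max_count or max_lag therefore trades latency for larger, more
efficient batches; lowering them does the opposite.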

Looking at and changing the queue configuration:

  $ pgqadm.py mydb.ini config
  testqueue
      queue_ticker_max_lag        =     3
      queue_ticker_max_count      =   500
      queue_ticker_idle_period    =    60
      queue_rotation_period       =  7200
  $ pgqadm.py conf/pgqadm_myprovider.ini config testqueue queue_ticker_max_lag=10 queue_ticker_max_count=300
  Change queue testqueue config to: queue_ticker_max_lag='10', queue_ticker_max_count='300'
  $ 

== COMMON OPTIONS ==

  -h, --help::
        show help message
  -q, --quiet::
        make program silent
  -v, --verbose::
        make program verbose
  -d, --daemon::
        go background
  -r, --reload::
        reload config (send SIGHUP)
  -s, --stop::
        stop program safely (send SIGINT)
  -k, --kill::
        kill program immediately (send SIGTERM)

// vim:sw=2 et smarttab sts=2: