# pgagroal architecture

## Overview

`pgagroal` uses a process model (`fork()`), where each process handles one connection to [PostgreSQL](https://www.postgresql.org).
This was done so that a potential crash on one connection won't take the entire pool down.

The main process is defined in [main.c](../src/main.c). When a client connects it is processed in its own process, which
is handled in [worker.h](../src/include/worker.h) ([worker.c](../src/libpgagroal/worker.c)).

Once the client disconnects the connection is put back in the pool, and the child process is terminated.
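The process model can be illustrated with a minimal sketch. It is not the code in `main.c` or `worker.c`; the names `client_handler`, `serve_in_child`, `handler_ok` and `handler_crash` are hypothetical, and the parent simply waits for the child instead of going back to accepting clients.

```C
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler type: serves one client and returns an exit code. */
typedef int (*client_handler)(int client_fd);

/*
 * Run `handler` for one client in its own process.
 * Returns the child's exit code, or -1 if the child crashed or fork() failed.
 */
int
serve_in_child(client_handler handler, int client_fd)
{
   int status = 0;
   pid_t pid;

   assert(handler != NULL);

   pid = fork();
   if (pid == -1)
   {
      return -1;
   }

   if (pid == 0)
   {
      /* Child: handle exactly one client, then terminate. */
      _exit(handler(client_fd));
   }

   /* Parent: a real pool would keep accepting clients; here we just wait. */
   if (waitpid(pid, &status, 0) == -1)
   {
      return -1;
   }

   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Example handlers: one that finishes normally and one that crashes. */
int
handler_ok(int client_fd)
{
   (void)client_fd;
   return 0;
}

int
handler_crash(int client_fd)
{
   (void)client_fd;
   abort();
}
```

Calling `serve_in_child(handler_crash, -1)` reports the failure to the parent, which keeps running. This is the isolation property that the process model is meant to provide.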

## Shared memory

A memory segment ([shmem.h](../src/include/shmem.h)) is shared among all processes. It contains the `pgagroal`
state: the configuration of the pool, the list of servers and the state of each connection.

The configuration of `pgagroal` (`struct configuration`), the configuration of the servers (`struct server`) and
the state of each connection (`struct connection`) are initialized in this shared memory segment.
These structs are all defined in [pgagroal.h](../src/include/pgagroal.h).

The shared memory segment is created using the `mmap()` call.
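As a rough illustration of why `mmap()` fits the process model, the sketch below creates an anonymous shared mapping and shows that a write made by a forked child is visible to the parent. The struct `shared_state` and both function names are hypothetical, and the flags are an assumption rather than the ones used in `shmem.c`.

```C
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the pool state kept in shared memory. */
struct shared_state
{
   int active_connections;
};

/* Create a zero-filled segment that stays shared across fork(); NULL on error. */
void*
create_shared_segment(size_t size)
{
   void* p;

   assert(size > 0);

   p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
   return p == MAP_FAILED ? NULL : p;
}

/* Let a child update the segment; return the value the parent reads, or -1. */
int
child_update_seen_by_parent(void)
{
   struct shared_state* state = create_shared_segment(sizeof(struct shared_state));
   int seen;
   pid_t pid;

   if (state == NULL)
   {
      return -1;
   }

   pid = fork();
   if (pid == -1)
   {
      munmap(state, sizeof(struct shared_state));
      return -1;
   }

   if (pid == 0)
   {
      /* Child: the write lands in the shared pages, not in a private copy. */
      state->active_connections = 42;
      _exit(0);
   }

   waitpid(pid, NULL, 0);
   seen = state->active_connections;
   munmap(state, sizeof(struct shared_state));
   return seen;
}
```

With `MAP_PRIVATE` instead of `MAP_SHARED` the parent would still read `0`, since the child would then modify its own copy of the page.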

## Atomic operations

The [atomic operation library](https://en.cppreference.com/w/c/atomic) is used to define the state of each
connection, and to move connections around in the connection state diagram. The state diagram has the following states:

| State name | Description |
|------------|-------------|
| `STATE_NOTINIT` | The connection has not been initialized |
| `STATE_INIT` | The connection is being initialized |
| `STATE_FREE` | The connection is free |
| `STATE_IN_USE` | The connection is in use |
| `STATE_GRACEFULLY` | The connection will be killed upon return to the pool |
| `STATE_FLUSH` | The connection is being flushed |
| `STATE_IDLE_CHECK` | The connection is being idle timeout checked |
| `STATE_MAX_CONNECTION_AGE` | The connection is being max connection age checked |
| `STATE_VALIDATION` | The connection is being validated |
| `STATE_REMOVE` | The connection is being removed |

These states are defined in [pgagroal.h](../src/include/pgagroal.h).

## Pool

The `pgagroal` pool API is defined in [pool.h](../src/include/pool.h) ([pool.c](../src/libpgagroal/pool.c)).

This API defines the functionality of the pool such as getting a connection from the pool, and returning it.
There is no ordering among processes, so a newly created process can obtain a connection before an older process.

The pool operates on the `struct connection` data type defined in [pgagroal.h](../src/include/pgagroal.h).
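The lack of ordering follows naturally if obtaining a connection is a scan over the slots in which the first successful atomic claim wins. The following is a simplified sketch under that assumption; `struct pool_sketch`, `POOL_SIZE` and the three functions are hypothetical and much smaller than the API in `pool.h`.

```C
#include <assert.h>
#include <stdatomic.h>

#define POOL_SIZE    4 /* hypothetical pool size */
#define STATE_FREE   0 /* numeric values are assumptions */
#define STATE_IN_USE 1

struct pool_sketch
{
   atomic_schar states[POOL_SIZE];
};

/* Mark every slot as free. */
void
pool_init(struct pool_sketch* pool)
{
   for (int i = 0; i < POOL_SIZE; i++)
   {
      atomic_init(&pool->states[i], STATE_FREE);
   }
}

/* Claim the first free slot; returns its index, or -1 if the pool is full. */
int
pool_get(struct pool_sketch* pool)
{
   for (int i = 0; i < POOL_SIZE; i++)
   {
      signed char expected = STATE_FREE;

      if (atomic_compare_exchange_strong(&pool->states[i], &expected, STATE_IN_USE))
      {
         return i;
      }
   }

   return -1;
}

/* Hand a slot back so that any process can claim it again. */
void
pool_return(struct pool_sketch* pool, int slot)
{
   assert(slot >= 0 && slot < POOL_SIZE);

   atomic_store(&pool->states[slot], STATE_FREE);
}
```

Nothing in `pool_get` records who asked first, so whichever process wins the compare-and-exchange gets the slot.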

## Network and messages

All communication is abstracted using the `struct message` data type defined in [message.h](../src/include/message.h).

Reading and writing messages is handled in the [message.h](../src/include/message.h) ([message.c](../src/libpgagroal/message.c))
files.

Network operations are defined in [network.h](../src/include/network.h) ([network.c](../src/libpgagroal/network.c)).

## Memory

Each process uses a fixed memory block for its network communication, which is allocated upon startup of the worker.

That way memory doesn't have to be allocated for each network message and, more importantly, freed after use.

The memory interface is defined in [memory.h](../src/include/memory.h) ([memory.c](../src/libpgagroal/memory.c)).
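The idea of a fixed block can be sketched as follows. The function names and `BUFFER_SIZE` are hypothetical and do not mirror the interface in `memory.h`. The point is only that the block is allocated once and every message reuses it.

```C
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIZE 8192 /* hypothetical size of the per-process block */

static char* buffer = NULL;

/* Allocate the block once, at worker startup. Returns 0 on success. */
int
memory_sketch_init(void)
{
   buffer = malloc(BUFFER_SIZE);
   return buffer != NULL ? 0 : 1;
}

/* Hand out the same block, cleared, for every network message. */
char*
memory_sketch_message(void)
{
   assert(buffer != NULL);

   memset(buffer, 0, BUFFER_SIZE);
   return buffer;
}

/* Release the block once, when the worker exits. */
void
memory_sketch_destroy(void)
{
   free(buffer);
   buffer = NULL;
}
```

Two consecutive calls to `memory_sketch_message()` return the same address, so the steady-state message path performs no `malloc()` or `free()`.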

## Management

`pgagroal` has a management interface which serves two purposes.

First, it defines the administrator abilities that can be performed on the pool when it is running. This includes,
for example, flushing the pool. The `pgagroal-cli` program is used for these operations ([cli.c](../src/cli.c)).

Second, the interface is used internally to transfer the connection (socket descriptor) from the child process
to the main `pgagroal` process after a new connection has been created. This is necessary since the socket descriptor
needs to be available to subsequent clients, and hence to their processes.

The management interface uses a Unix Domain Socket for communication.
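Passing a descriptor between processes over a Unix Domain Socket is commonly done with `SCM_RIGHTS` ancillary data. The sketch below shows that general technique. It is an assumption that `pgagroal` does the transfer in this way, and `send_fd` / `recv_fd` are hypothetical names.

```C
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_control
{
   char buf[CMSG_SPACE(sizeof(int))];
   struct cmsghdr align;
};

/* Send descriptor `fd` over the Unix Domain Socket `sock`. Returns 0 on success. */
int
send_fd(int sock, int fd)
{
   char byte = 0; /* at least one byte of regular data must accompany the fd */
   struct iovec iov = {.iov_base = &byte, .iov_len = 1};
   union fd_control control;
   struct msghdr msg;
   struct cmsghdr* cmsg;

   memset(&control, 0, sizeof(control));
   memset(&msg, 0, sizeof(msg));
   msg.msg_iov = &iov;
   msg.msg_iovlen = 1;
   msg.msg_control = control.buf;
   msg.msg_controllen = sizeof(control.buf);

   cmsg = CMSG_FIRSTHDR(&msg);
   cmsg->cmsg_level = SOL_SOCKET;
   cmsg->cmsg_type = SCM_RIGHTS;
   cmsg->cmsg_len = CMSG_LEN(sizeof(int));
   memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

   return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from `sock`. Returns the new descriptor, or -1. */
int
recv_fd(int sock)
{
   char byte = 0;
   int fd = -1;
   struct iovec iov = {.iov_base = &byte, .iov_len = 1};
   union fd_control control;
   struct msghdr msg;
   struct cmsghdr* cmsg;

   memset(&control, 0, sizeof(control));
   memset(&msg, 0, sizeof(msg));
   msg.msg_iov = &iov;
   msg.msg_iovlen = 1;
   msg.msg_control = control.buf;
   msg.msg_controllen = sizeof(control.buf);

   if (recvmsg(sock, &msg, 0) != 1)
   {
      return -1;
   }

   cmsg = CMSG_FIRSTHDR(&msg);
   if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
   {
      return -1;
   }

   memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
   return fd;
}
```

The received descriptor is a new number in the receiving process that refers to the same open socket, which is what allows a later worker to reuse a connection that an earlier worker created.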

The management interface is defined in [management.h](../src/include/management.h). The management interface
uses its own protocol, which always consists of a header:

| Field      | Type | Description |
|------------|------|-------------|
| `id` | Byte | The identifier of the message type |
| `slot` | Int | The slot that the message is for |

The rest of the message depends on the message type.
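A header of this shape could be serialized as shown below. The wire layout is an assumption (one byte for `id` followed by a 4-byte `slot` in network byte order), and `header_write` / `header_read` are hypothetical helpers, not functions from `management.h`.

```C
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEADER_SIZE 5 /* assumed layout: 1 byte id + 4 byte slot */

/* Write the header into `buf`, which must hold HEADER_SIZE bytes. */
void
header_write(unsigned char* buf, signed char id, int32_t slot)
{
   uint32_t n = htonl((uint32_t)slot);

   assert(buf != NULL);

   buf[0] = (unsigned char)id;
   memcpy(buf + 1, &n, sizeof(n));
}

/* Read the header back from `buf`. */
void
header_read(const unsigned char* buf, signed char* id, int32_t* slot)
{
   uint32_t n;

   assert(buf != NULL && id != NULL && slot != NULL);

   *id = (signed char)buf[0];
   memcpy(&n, buf + 1, sizeof(n));
   *slot = (int32_t)ntohl(n);
}
```

A receiver would read `HEADER_SIZE` bytes first, then use `id` to decide how many more bytes to read and how to interpret them.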

### Remote management

The remote management functionality uses the same protocol as the standard management method.

However, before the management packet is sent the client has to authenticate with SCRAM-SHA-256, using the
same message formats that PostgreSQL uses, e.g. StartupMessage, AuthenticationSASL, AuthenticationSASLContinue,
AuthenticationSASLFinal and AuthenticationOk. The SSLRequest message is supported.

The remote management interface is defined in [remote.h](../src/include/remote.h) ([remote.c](../src/libpgagroal/remote.c)).

## libev usage

[libev](http://software.schmorp.de/pkg/libev.html) is used to handle network interactions, which is "activated"
upon an `EV_READ` event.

Each process has its own event loop, so a process is only notified when data for that process
is ready. The main loop handles the system-wide "services" such as idle timeout checks.

## Pipeline

`pgagroal` has the concept of a pipeline that defines how communication is routed from the client through `pgagroal` to
[PostgreSQL](https://www.postgresql.org), and likewise in the other direction.

A pipeline is defined by

```C
struct pipeline
{
   initialize initialize;
   start start;
   callback client;
   callback server;
   stop stop;
   destroy destroy;
   periodic periodic;
};
```

in [pipeline.h](../src/include/pipeline.h).

The functions in the pipeline are defined as

| Function | Description |
|----------|-------------|
| `initialize` | Global initialization of the pipeline, may return a pointer to a shared memory segment |
| `start` | Called when the pipeline instance is started |
| `client` | Client to `pgagroal` communication |
| `server` | [PostgreSQL](https://www.postgresql.org) to `pgagroal` communication |
| `stop` | Called when the pipeline instance is stopped |
| `destroy` | Global destruction of the pipeline |
| `periodic` | Called periodically |

The functions `start`, `client`, `server` and `stop` have access to the following information

```C
struct worker_io
{
   struct ev_io io;      /* The libev base type */
   int client_fd;        /* The client descriptor */
   int server_fd;        /* The server descriptor */
   int slot;             /* The slot */
   SSL* client_ssl;      /* The client SSL context */
   SSL* server_ssl;      /* The server SSL context */
};
```
defined in [worker.h](../src/include/worker.h).
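To show how a struct of function pointers lets the worker stay independent of the pipeline in use, here is a reduced sketch. The function pointer signatures are hypothetical simplifications (the real typedefs are in `pipeline.h`), and `struct worker_io_sketch` keeps only a few fields plus counters added for the example.

```C
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for struct worker_io, with counters added for the example. */
struct worker_io_sketch
{
   int client_fd;
   int server_fd;
   int slot;
   int client_messages;
   int server_messages;
};

/* Hypothetical, simplified signature shared by all four callbacks. */
typedef void (*pipeline_fn)(struct worker_io_sketch*);

struct pipeline_sketch
{
   pipeline_fn start;
   pipeline_fn client;
   pipeline_fn server;
   pipeline_fn stop;
};

static void
counting_start(struct worker_io_sketch* w)
{
   w->client_messages = 0;
   w->server_messages = 0;
}

static void
counting_client(struct worker_io_sketch* w)
{
   w->client_messages++;
}

static void
counting_server(struct worker_io_sketch* w)
{
   w->server_messages++;
}

static void
counting_stop(struct worker_io_sketch* w)
{
   (void)w;
}

/* A pipeline is selected by filling in the function pointers. */
struct pipeline_sketch
counting_pipeline(void)
{
   struct pipeline_sketch p = {counting_start, counting_client, counting_server, counting_stop};

   return p;
}

/* Drive one session: start, two client messages, one server message, stop. */
void
run_session(const struct pipeline_sketch* p, struct worker_io_sketch* w)
{
   assert(p != NULL && w != NULL);

   p->start(w);
   p->client(w);
   p->client(w);
   p->server(w);
   p->stop(w);
}
```

`run_session` never names a concrete pipeline, so swapping in a different set of functions changes the behavior without touching the caller.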

### Performance pipeline

One of the goals for `pgagroal` is performance, so the performance pipeline will only look for the
[`Terminate`](https://www.postgresql.org/docs/11/protocol-message-formats.html) message from the client and act on that.
Likewise the performance pipeline will only look for `FATAL` errors from the server. This makes the pipeline very fast, since there
is minimal overhead in the interaction.
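The two checks are cheap because both can be decided from the message bytes alone. The sketch below is based on the PostgreSQL wire format (`Terminate` is the byte `X` followed by the length 4; an `ErrorResponse` starts with `E` and carries its severity in the `S` field). The function names are hypothetical, and it is an assumption that `pipeline_perf.c` performs its checks in this way.

```C
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `data` starts with a frontend Terminate message: 'X', Int32(4). */
bool
is_terminate(const unsigned char* data, size_t length)
{
   assert(data != NULL || length == 0);

   return length >= 5 && data[0] == 'X' &&
          data[1] == 0 && data[2] == 0 && data[3] == 0 && data[4] == 4;
}

/* True if `data` is an ErrorResponse whose severity field 'S' is "FATAL". */
bool
is_fatal_error(const unsigned char* data, size_t length)
{
   size_t i = 5; /* skip the kind byte and the Int32 length */

   assert(data != NULL || length == 0);

   if (length < 6 || data[0] != 'E')
   {
      return false;
   }

   /* Fields are: one code byte, a zero-terminated string; a zero byte ends the list. */
   while (i < length && data[i] != 0)
   {
      unsigned char field = data[i++];
      size_t start = i;

      while (i < length && data[i] != 0)
      {
         i++;
      }
      if (i == length)
      {
         return false; /* unterminated string */
      }
      if (field == 'S' && i - start == 5 && memcmp(data + start, "FATAL", 5) == 0)
      {
         return true;
      }
      i++; /* skip the string terminator */
   }

   return false;
}
```

Every other message is forwarded without being inspected further, which is where the low overhead comes from.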

The pipeline is defined in [pipeline_perf.c](../src/libpgagroal/pipeline_perf.c) in the functions

| Function | Description |
|----------|-------------|
| `performance_initialize` | Nothing |
| `performance_start` | Nothing |
| `performance_client` | Client to `pgagroal` communication |
| `performance_server` | [PostgreSQL](https://www.postgresql.org) to `pgagroal` communication |
| `performance_stop` | Nothing |
| `performance_destroy` | Nothing |
| `performance_periodic` | Nothing |

### Session pipeline

The session pipeline works like the performance pipeline with the exception that it checks if
a Transport Layer Security (TLS) transport should be used.

The pipeline is defined in [pipeline_session.c](../src/libpgagroal/pipeline_session.c) in the functions

| Function | Description |
|----------|-------------|
| `session_initialize` | Initialize memory segment if disconnect_client is active |
| `session_start` | Prepares the client segment if disconnect_client is active |
| `session_client` | Client to `pgagroal` communication |
| `session_server` | [PostgreSQL](https://www.postgresql.org) to `pgagroal` communication |
| `session_stop` | Updates the client segment if disconnect_client is active |
| `session_destroy` | Destroys memory segment if initialized |
| `session_periodic` | Checks if clients should be disconnected |

### Transaction pipeline

The transaction pipeline will return the connection to the pool after each transaction. The pipeline supports
Transport Layer Security (TLS).

The pipeline uses the [ReadyForQuery](https://www.postgresql.org/docs/current/protocol-message-formats.html) message
to check the status of the transaction, and therefore needs to keep track of the message headers.
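In the PostgreSQL protocol, `ReadyForQuery` is the byte `Z`, the length 5 and one status byte: `I` when idle, `T` inside a transaction block and `E` inside a failed transaction block. A minimal sketch of the check follows. The function names are hypothetical, and the sketch assumes `data` points at the start of a message, which is why the pipeline has to track message headers in the stream.

```C
#include <assert.h>
#include <stddef.h>

/*
 * Returns the transaction status byte ('I', 'T' or 'E') if `data` starts
 * with a complete ReadyForQuery message, otherwise -1.
 */
int
ready_for_query_status(const unsigned char* data, size_t length)
{
   assert(data != NULL || length == 0);

   if (length < 6 || data[0] != 'Z')
   {
      return -1;
   }
   if (data[1] != 0 || data[2] != 0 || data[3] != 0 || data[4] != 5)
   {
      return -1;
   }

   return data[5];
}

/* The connection can be given back once the server reports that it is idle. */
int
transaction_finished(const unsigned char* data, size_t length)
{
   return ready_for_query_status(data, length) == 'I';
}
```

A status of `T` or `E` means that the client still owns a transaction on that connection, so the connection must stay with the client.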

The pipeline has a management interface in order to receive the socket descriptors from the parent process when a new
connection is added to the pool. The pool will retry if the client in question doesn't consider the socket descriptor valid.

The pipeline is defined in [pipeline_transaction.c](../src/libpgagroal/pipeline_transaction.c) in the functions

| Function | Description |
|----------|-------------|
| `transaction_initialize` | Nothing |
| `transaction_start` | Sets up process variables and returns the connection to the pool |
| `transaction_client` | Client to `pgagroal` communication. Obtain connection if needed |
| `transaction_server` | [PostgreSQL](https://www.postgresql.org) to `pgagroal` communication. Keep track of message headers |
| `transaction_stop` | Return connection to the pool if needed. Possible rollback of active transaction |
| `transaction_destroy` | Nothing |
| `transaction_periodic` | Nothing |

## Signals

The main process of `pgagroal` supports the signals `SIGTERM`, `SIGINT` and `SIGALRM`
as a mechanism for shutting down. The `SIGTRAP` signal will put `pgagroal` into graceful shutdown, meaning that
existing connections are allowed to finish their session. The `SIGABRT` signal is used to request a core dump (`abort()`).
The `SIGHUP` signal will trigger a reload of the configuration.

The child processes support `SIGQUIT` as a mechanism to shut down. This will not shut down the pool itself.

Using `SIGKILL` for `pgagroal` should not be necessary. Please consider using `SIGABRT` instead, and share the
core dump and debug logs with the `pgagroal` community.
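The mapping from signals to actions can be sketched with plain `sigaction()`. This is a stand-in technique and not how `main.c` registers its handlers. The flag variables and function names are hypothetical. The handler only records the request, and the main loop would act on it later.

```C
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* exposes sigaction() in strict modes */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t reload_requested = 0;
static volatile sig_atomic_t shutdown_requested = 0;

/* Only set a flag; anything more is unsafe inside a signal handler. */
static void
on_signal(int signo)
{
   if (signo == SIGHUP)
   {
      reload_requested = 1;
   }
   else if (signo == SIGTERM || signo == SIGINT)
   {
      shutdown_requested = 1;
   }
}

/* Install the handler for SIGHUP, SIGTERM and SIGINT. Returns 0 on success. */
int
install_signal_handlers(void)
{
   const int signals[] = {SIGHUP, SIGTERM, SIGINT};
   struct sigaction sa;

   memset(&sa, 0, sizeof(sa));
   sa.sa_handler = on_signal;
   sigemptyset(&sa.sa_mask);

   for (size_t i = 0; i < sizeof(signals) / sizeof(signals[0]); i++)
   {
      if (sigaction(signals[i], &sa, NULL) != 0)
      {
         return -1;
      }
   }

   return 0;
}

int
reload_was_requested(void)
{
   return reload_requested;
}

int
shutdown_was_requested(void)
{
   return shutdown_requested;
}
```

After `install_signal_handlers()`, sending `SIGHUP` to the process sets the reload flag without terminating it, while `SIGTERM` and `SIGINT` set the shutdown flag.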

## Reload

The `SIGHUP` signal will trigger a reload of the configuration.

However, some configuration settings require a full restart of `pgagroal` in order to take effect. These are

* `hugepage`
* `libev`
* `log_path`
* `log_type`
* `max_connections`
* `pipeline`
* `unix_socket_dir`
* `pidfile`
* Limit rules defined by `pgagroal_databases.conf`

The configuration can also be reloaded using `pgagroal-cli -c pgagroal.conf reload`. The command is only supported
over the local interface, and hence doesn't work remotely.

## Prometheus

pgagroal has support for [Prometheus](https://prometheus.io/) when the `metrics` port is specified.

The module serves two endpoints

* `/` - Overview of the functionality (`text/html`)
* `/metrics` - The metrics (`text/plain`)

All other URLs will result in a 403 response.

The metrics endpoint supports `Transfer-Encoding: chunked` to account for a large amount of data.
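The routing rule and the chunk framing are simple enough to sketch. Both helpers below are hypothetical and not taken from `prometheus.c`. The status `200` for the two served paths is an assumption, while the chunk layout (hexadecimal size, CRLF, data, CRLF) is the standard HTTP/1.1 chunked format.

```C
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Only "/" and "/metrics" are served; everything else is rejected with 403. */
int
status_for_path(const char* path)
{
   assert(path != NULL);

   if (strcmp(path, "/") == 0 || strcmp(path, "/metrics") == 0)
   {
      return 200; /* assumed success status */
   }

   return 403;
}

/*
 * Frame `data` as one HTTP chunk: "<size in hex>\r\n<data>\r\n".
 * Returns the number of bytes written, or -1 if `out` is too small.
 */
int
format_chunk(char* out, size_t out_size, const char* data)
{
   int n;

   assert(out != NULL && data != NULL);

   n = snprintf(out, out_size, "%zx\r\n%s\r\n", strlen(data), data);
   return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

A chunked response is a sequence of such chunks followed by a terminating zero-size chunk (`0\r\n\r\n`), so the total size of the metrics does not need to be known before the first byte is sent.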

The implementation is done in [prometheus.h](../src/include/prometheus.h) and
[prometheus.c](../src/libpgagroal/prometheus.c).

## Failover support

`pgagroal` can fail over a PostgreSQL instance if clients can't write to it.

This is done using an external script provided by the user.

The implementation is done in [server.h](../src/include/server.h) and
[server.c](../src/libpgagroal/server.c).

## Logging

Logging uses a simple implementation based on an `atomic_schar` lock.

The implementation is done in [logging.h](../src/include/logging.h) and
[logging.c](../src/libpgagroal/logging.c).

## Protocol

The protocol interactions can be debugged using [Wireshark](https://www.wireshark.org/) or
[pgprtdbg](https://github.com/jesperpedersen/pgprtdbg).