1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207
|
# PL/Proxy Language Syntax
The language is similar to plpgsql - string quoting, comments,
semicolon at the statements end.
It contains only 4 statements: `CONNECT`, `CLUSTER`, `RUN` and `SELECT`.
Each function needs to have either `CONNECT` or pair of `CLUSTER` + `RUN` statements
to specify where to run the function.
The `SELECT` statement is optional, if it is missing, there will be default
query generated based on proxy function signature.
The `RUN` statment is also optional, it defaults to `RUN ON ANY`
which means the query will be run random partition.
## CONNECT
CONNECT 'libpq connstr';
Specifies exact location where to connect and execute the query.
If several functions have same connstr, they will use same connection.
CONNECT connect_func(...);
CONNECT argname;
CONNECT $argnr;
Connect string is taken from function result or directly from argument.
If several functions have same connstr, they will use same connection.
_(New in 2.0.9)_
**NB**: giving untrusted users ability to specify full connect string creates
security hole. Eg it can used to read cleartext passwords from `~/.pgpass`
or `pg_service.conf`. If such function cannot be avoided, it's access rights
need to be restricted.
## CLUSTER
CLUSTER 'cluster_name';
Specifies exact cluster name to be run on. The cluster name will
be passed to `plproxy.get_cluster_*` functions.
CLUSTER cluster_func(..);
Cluster name can be dynamically decided upon proxy function arguments.
`cluster_func` should return text value of final cluster name.
## RUN ON ...
RUN ON ALL;
Query will be run on all partitions in cluster in parallel.
RUN ON ANY;
Query will be run on random partition.
RUN ON <NR>;
Run on partition number `<NR>`.
RUN ON partition_func(..);
Run `partition_func()` which should return one or more hash values. (int4)
Query will be run on tagged partitions. If more than one partition was
tagged, query will be sent in parallel to them.
RUN ON argname;
RUN ON $1;
Take hash value directly from function argument. _(New in 2.0.8)_
## SPLIT
SPLIT array_arg_1 [ , array_arg_2 ... ] ;
SPLIT ALL ;
Split the input arrays based on RUN ON statement into per-partition arrays.
This is done by evaluating RUN ON condition for each array element and building
per-partition parameter arrays for each matching partition. During execution
each tagged partition then gets its own subset of the array to process.
_(New in 2.1)_
The semantics of RUN ON statement is slightly changed with SPLIT arrays:
RUN ON partition_func(..);
The array is split between the partitions matching `partition_func()`. Any
SPLIT parameters passed to the function are actually replaced with the
individual array elements.
RUN ON argname;
RUN ON $1;
An array of partition numbers (or hashes) can be passed as `argname`. The function
shall be run on the partitions specified in the array.
RUN ON ANY;
Each element is assigned to random partition.
RUN ON ALL;
RUN ON <NR>;
Unaffected, except for the added overhead of array copying.
Example:
CREATE FUNCTION set_profiles(i_users text[], i_profiles text[])
RETURNS SETOF text AS $$
CLUSTER 'userdb';
SPLIT i_users, i_profiles;
RUN ON hashtext(i_users);
$$ LANGUAGE plproxy;
Given query:
SELECT * FROM set_profiles(ARRAY['foo', 'bar'], ARRAY['a', 'b']);
The hash function is called 2 times:
SELECT * FROM hashtext('foo');
SELECT * FROM hashtext('bar');
And target partitions get queries:
SELECT * FROM set_profiles(ARRAY['foo'], ARRAY['a']);
SELECT * FROM set_profiles(ARRAY['bar'], ARRAY['b']);
## TARGET
Specify function name on remote side to be called. By default
PL/Proxy uses current function name. _(New in 2.2)_
Following function:
CREATE FUNCTION some_function(username text, num int4)
RETURNS SETOF text AS $$
CLUSTER 'userdb';
RUN ON hashtext(username);
TARGET other_function;
$$ LANGUAGE plproxy;
will run following query on remote side:
SELECT * FROM other_function(username, num);
## SELECT
SELECT .... ;
By default, PL/Proxy generates query based on its own signature.
But this can be overrided by giving explicit `SELECT` statement to run.
Everything after `SELECT` until semicolon is taken as SQL to be passed on.
Only argument substitution is done on the contents, otherwise the text
is unparsed. To avoid a table column to be parsed as function argument,
table aliases should be used.
Query result should have same number of columns as function result
and same names too.
## Argument substitution
Proxy function arguments can be referenced using name or `$n` syntax.
Everything that is not argument reference is just passed on.
## Dynamic records
PL/Proxy supports function returning plain RECORD type. Such functions
need the result type specified at call site. Main use-case is to run
random queries on partitions. _(New in 2.0.6)_
Very simple example:
CREATE OR REPLACE FUNCTION dynamic_query(q text)
RETURNS SETOF RECORD AS $$
CLUSTER 'mycluster';
RUN ON ALL;
$$ LANGUAGE plproxy;
Corresponding function in partitions:
CREATE OR REPLACE FUNCTION dynamic_query(sql text)
RETURNS SETOF RECORD AS $$
DECLARE
rec RECORD;
BEGIN
FOR rec IN EXECUTE sql
LOOP
RETURN NEXT rec;
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
Sample request:
SELECT * FROM dynamic_query('SELECT id, username FROM sometable')
AS (id integer, username text);
The types given in AS clause must match actual types from query.
|