1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124
|
# Tests
This document present some ways to test the [README.md](README.md)
PostgreSQL module, which aims to speed-up longest prefix searches.
## Preparing
The `prefixes.sql` creates a table which fits `prefixes.fr.csv`
content. This file contains all the official French Telephony prefixes
used as of early 2008, as found on the
[French telecom regulation authority](http://www.art-telecom.fr/)
website (see http://www.art-telecom.fr/fileadmin/wopnum.rtf).
create table prefixes (
prefix text primary key,
name text not null,
shortname text,
state char default 'S',
check( state in ('S', 'R') )
);
comment on column prefixes.state is 'S: - R: reserved';
\copy prefixes from 'prefixes.fr.csv' with delimiter ';' csv quote '"'
## Creating the prefix_range table
We create the `ranges` table from the previous `prefixes` one, allowing to
quickly have data again after reinstalling the `prefix.sql` module.
drop table ranges;
create table ranges as select prefix::prefix_range, name, shortname, state from prefixes ;
create index idx_prefix on ranges using gist(prefix gist_prefix_range_ops);
## Using Gevel to inspect the index
For information about the Gevel project, see
http://www.sai.msu.su/~megera/oddmuse/index.cgi/Gevel and
http://www.sigaev.ru/cvsweb/cvsweb.cgi/gevel/.
dim=# select gist_stat('idx_prefix');
gist_stat
-----------------------------------------
Number of levels: 2
Number of pages: 71
Number of leaf pages: 70
Number of tuples: 12036
Number of invalid tuples: 0
Number of leaf tuples: 11966
Total size of tuples: 335728 bytes
Total size of leaf tuples: 334056 bytes
Total size of index: 581632 bytes
select * from gist_print('idx_prefix') as t(level int, valid bool, a prefix_range) where level =1;
select * from gist_print('idx_prefix') as t(level int, valid bool, a prefix_range) order by level;
## Testing the index content
Those queries should return the same line, but it fails with
`enable_seqscan to off` when the index is not properly build.
set enable_seqscan to on;
select * from ranges where prefix @> '0146640123';
select * from ranges where prefix @> '0100091234';
set enable_seqscan to off;
select * from ranges where prefix @> '0146640123';
select * from ranges where prefix @> '0100091234';
## Testing prefix_range GiST penalty code
We want `gpr_penalty` result to be the lower when its second argument
is the nearest of the first.
select a, b, pr_penalty(a::prefix_range, b::prefix_range)
from (values('095[4-5]', '0[8-9]'),
('095[4-5]', '0[0-9]'),
('095[4-5]', '[0-3]'),
('095[4-5]', '0'),
('095[4-5]', '[0-9]'),
('095[4-5]', '0[1-5]'),
('095[4-5]', '32'),
('095[4-5]', '[1-3]')) as t(a, b)
order by 3 asc;
a | b | gpr_penalty
----------+--------+-------------
095[4-5] | 0[8-9] | 1.52588e-05
095[4-5] | 0[0-9] | 1.52588e-05
095[4-5] | [0-3] | 0.00390625
095[4-5] | 0 | 0.00390625
095[4-5] | [0-9] | 0.00390625
095[4-5] | 0[1-5] | 0.0078125
095[4-5] | 32 | 1
095[4-5] | [1-3] | 1
(8 rows)
## Stress testing the index in an inner loop
We create a big telephone numbers table (with random entries) this way:
create table numbers(number text primary key);
insert into numbers
select '01' || to_char((random()*100)::int, 'FM09')
|| to_char((random()*100)::int, 'FM09')
|| to_char((random()*100)::int, 'FM09')
|| to_char((random()*100)::int, 'FM09')
from generate_series(1, 5000);
INSERT 0 5000
Repeat the step to obtain some more numbers, then:
dim=# explain analyze select * from numbers n join ranges r on r.prefix @> n.number;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..4868614.00 rows=149575000 width=45) (actual time=0.345..4994.296 rows=10213 loops=1)
-> Seq Scan on numbers n (cost=0.00..375.00 rows=25000 width=11) (actual time=0.015..12.917 rows=25000 loops=1)
-> Index Scan using idx_prefix on ranges r (cost=0.00..104.98 rows=5983 width=34) (actual time=0.182..0.197 rows=0 loops=25000)
Index Cond: (r.prefix @> (n.number)::prefix_range)
Total runtime: 4998.936 ms
(5 rows)
|