1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157
|
[Ab]using PostgreSQL to calculate :math:`\pi`
=============================================
Once the crime of binding the GMP_ multiple precision arithmetic library into
PostgreSQL_ has been perpetrated, using it to calculate one million digits of
:math:`\pi` is the least evil.
.. _GMP: https://www.gmplib.org/
.. _PostgreSQL: https://www.postgresql.org/
Calculation is performed using a `Machin-like formula`__, a generalization of
the Machin's formula:
.. math::
\frac{\pi}{4} = 4 \; \tan^{-1}\frac{1}{5} - \tan^{-1}\frac{1}{239}
known as one of the most efficient ways to compute a large number of digits of
:math:`\pi`. Specifically we will use the Chien-Lih Hwang 1997 formula (further
details and references in the above link).
Calculation is performed using the `mpz` data type, thus using integers:
to calculate an approximation of :math:`\tan^{-1}\frac{1}{x}` to :math:`n`
digits, we use a Taylor series scaled to :math:`10^n`. Full details are
provided in this nice `LiteratePrograms article`__.
.. __: https://en.wikipedia.org/wiki/Machin-like_formula
.. __: https://web.archive.org/web/20111211140154/http://en.literateprograms.org/Pi_with_Machin's_formula_(Python)
It is possible to implement the `!arccot()` function using a
`common table expression`__ from PostgreSQL 8.4:
.. __: https://www.postgresql.org/docs/current/queries-with.html
.. _PL/pgSQL: https://www.postgresql.org/docs/current/plpgsql.html
.. code-block:: postgresql
CREATE OR REPLACE FUNCTION arccot(x mpz, unity mpz)
RETURNS mpz AS
$$
WITH RECURSIVE t(n, xp) AS (
VALUES (3, $2 / -($1 ^ 3))
UNION ALL
SELECT n+2, xp / -($1 ^ 2)
FROM t
WHERE xp <> 0
)
SELECT sum(xp / n) + ($2 / $1) FROM t;
$$
LANGUAGE sql IMMUTABLE STRICT;
However using this method all the temporary terms in the calculation are kept
in memory, and the complex bookkeeping results in a non optimal use of the
CPU. The performance can be improved using `PL/pgSQL`_, which may be not the
fastest language around, but it's perfectly suitable for this task, as the
time spent in the function body is negligible respect to the time spent in the
arithmetic functions with huge operators.
.. code-block:: postgresql
CREATE OR REPLACE FUNCTION arccot(x mpz, unity mpz)
RETURNS mpz AS
$$
DECLARE
xp mpz := unity / x;
xp2 mpz := -(x ^ 2);
acc mpz := xp;
n mpz := 3;
term mpz;
BEGIN
LOOP
xp := xp / xp2;
term := xp / n;
EXIT WHEN term = 0;
acc := acc + term;
n := n + 2;
END LOOP;
RETURN acc;
END
$$
LANGUAGE plpgsql IMMUTABLE STRICT;
And the Machin-like formula using it:
.. code-block:: postgresql
CREATE OR REPLACE FUNCTION pi_hwang_97(ndigits int)
RETURNS mpz AS
$$
DECLARE
unity mpz = 10::mpz ^ (ndigits + 10);
BEGIN
RETURN 4 * (
183 * arccot(239, unity)
+ 32 * arccot(1023, unity)
- 68 * arccot(5832, unity)
+ 12 * arccot(110443, unity)
- 12 * arccot(4841182, unity)
- 100 * arccot(6826318, unity)
) / (10::mpz ^ 10);
END
$$
LANGUAGE plpgsql IMMUTABLE STRICT;
This function produces 200,000 digits of :math:`\pi` in 30 sec on my laptop
(Intel i5 2.53GHz). As a comparison, the equivalent Python implementation
(using the Python built-in long integer implementation) takes almost 4 minutes.
The function above has the shortcoming of using only one of the 4 cores
available on my machine. But the formula itself is easy to parallelize as it
contains 6 independent terms: we can produce each term in a separate backend
process, storing the computation into a table (it's a database, after all!)
and composing the result as a last step. The following Python script
implements this idea:
.. code-block:: python
import eventlet
eventlet.patcher.monkey_patch()
import sys
import psycopg2
dsn = 'dbname=regression'
nprocs = 4
ndigits = int(sys.argv[1])
cnn = psycopg2.connect(dsn)
cnn.set_isolation_level(0)
cur = cnn.cursor()
cur.execute("""
drop table if exists pi;
create table pi (mult int4, arccot mpz);
""")
def arccot((mult, arg)):
cnn = psycopg2.connect(dsn)
cnn.set_isolation_level(0)
cur = cnn.cursor()
cur.execute("""
insert into pi
values (%s, arccot(%s, 10::mpz ^ (%s + 10)));
""", (mult, arg, ndigits))
cnn.close()
pool = eventlet.GreenPool(nprocs)
list(pool.imap(arccot, [
(183, 239), (32, 1023), (-68, 5832),
(12, 110443), (-12, 4841182), (-100, 6826318)]))
cur.execute("select 4 * sum(mult * arccot) / (10::mpz ^ 10) from pi;")
print cur.fetchone()[0]
The script uses 4 cores concurrently to calculate the intermediate terms: it
produces 200,000 digits in about 13 seconds and 1 million digits in 8 minutes
and 30 seconds.
|