Re: How to run in parallel in Postgres, EXECUTE_PARALLEL

From: Joe Conway <mail(at)joeconway(dot)com>
To: Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: How to run in parallel in Postgres, EXECUTE_PARALLEL
Date: 2019-12-08 20:04:05
Message-ID: 880013b5-31bc-c266-5681-ac51118b4630@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 12/8/19 1:14 PM, Lars Aksel Opsahl wrote:
> Do you or anybody know if there are any plans for a function call that
> support the calling structure below or something like it and that then
> could finish in 1 second ? (If you are calling a void function, the
> return value should not be any problem.)
>
> DO
> $body$
> *DECLARE* 
> command_string_list text[3];
> *BEGIN*
> command_string_list[0] = 'SELECT pg_sleep(1)';
> command_string_list[1] = 'SELECT pg_sleep(1)';
> command_string_list[2] = 'SELECT pg_sleep(1)';
> EXECUTE_PARALLEL command_string_list;
> *END*
> $body$;
>
> The only way to this today as I understand it, is to open 3 new
> connections back to the database which you can be done in different ways. 

Yes, correct.

> If we had a parallel functions like the one above it's easier to
> make parallel sql without using complex scripts, java, python or other
> system.

It does require one connection per statement, but with dblink it is not
necessarily all that complex. For example (granted, this could use more
error checking, etc.):

8<----------------
CREATE OR REPLACE FUNCTION
execute_parallel(stmts text[])
RETURNS text AS
$$
declare
i int;
retv text;
conn text;
connstr text;
rv int;
db text := current_database();
begin
for i in 1..array_length(stmts,1) loop
conn := 'conn' || i::text;
connstr := 'dbname=' || db;
perform dblink_connect(conn, connstr);
rv := dblink_send_query(conn, stmts[i]);
end loop;
for i in 1..array_length(stmts,1) loop
conn := 'conn' || i::text;
select val into retv
from dblink_get_result(conn) as d(val text);
end loop;
for i in 1..array_length(stmts,1) loop
conn := 'conn' || i::text;
perform dblink_disconnect(conn);
end loop;
return 'OK';
end;
$$ language plpgsql;
8<----------------

And then:

8<----------------
\timing
DO $$
declare
stmts text[];
begin
stmts[1] = 'select pg_sleep(1)';
stmts[2] = 'select pg_sleep(1)';
stmts[3] = 'select pg_sleep(1)';
PERFORM execute_parallel(stmts);
end;
$$ LANGUAGE plpgsql;
DO
Time: 1010.831 ms (00:01.011)
8<----------------

HTH,

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Lars Aksel Opsahl 2019-12-08 21:59:51 Re: How to run in parallel in Postgres, EXECUTE_PARALLEL
Previous Message Lars Aksel Opsahl 2019-12-08 18:14:14 Re: How to run in parallel in Postgres, EXECUTE_PARALLEL