Re: How to run in parallel in Postgres

From: Ondrej Ivanič <ondrej(dot)ivanic(at)gmail(dot)com>
To:
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: How to run in parallel in Postgres
Date: 2019-12-07 01:23:15
Message-ID: CAM6mie+EibAnj47FeA6-S7YLGQ0mLGfRroS87qYKyWDMb8mFyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Lars,

I have two suggestions:

- `xargs` almost always present and it can run in parallel (-P) but script
needs to be changed:
for((i=1;i<=28;i++)); do echo "SELECT
find_overlap_gap_single_cell('test_data.overlap_gap_input_t1','geom',4258,'test_data.overlap_gap_input_t1_res',${I},28);";
done | xargs -n1 -P 10 psql ...

- `UNION ALL` might trigger parallel execution (you need to mess with the
cost of the function and perhaps other settings):
SELECT find_overlap_gap_single_cell('test_data.overlap_gap_input_t1','geom',
4258,'test_data.overlap_gap_input_t1_res',1,28) UNION ALL
SELECT find_overlap_gap_single_cell('test_data.overlap_gap_input_t1','geom',
4258,'test_data.overlap_gap_input_t1_res',2,28)
...

Cheers,

On Thu, 5 Dec 2019 at 23:11, Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no> wrote:

> Hi
>
> I have a function that prepares data, so the big job can be run it in
> parallel.
>
> Today I have solved this by using "Gnu parallel" like this.
> psql testdb -c"\! psql -t -q -o /tmp/run_cmd.sql testdb -c\"SELECT
> find_overlap_gap_make_run_cmd('sl_lop.overlap_gap_input_t1','geom',4258,'sl_lop.overlap_gap_input_t1_res',50);\";
> parallel -j 4 psql testdb -c :::: /tmp/run_cmd.sql" 2>> /tmp/analyze.log;
>
> The problem here is that I depend on external code which may not be
> installed.
>
> Since Postgres now supports parallel I was wondering if it's easy to
> trigger parallel dynamically created SQL calls.
>
> If you look at
> https://github.com/larsop/find-overlap-and-gap/blob/master/src/test/sql/regress/find_overlap_and_gap.sql
> you see that
>
> find_overlap_gap_make_run_cmd generates as set of 28 sql calls.
>
>
> So is it in a simple way possible to use Postgres parallel functionality
> to call this 28 functions i parallel so I don't have dependent
> on externally install programs ?
>
>
> When this 28 sql calls are done, the find_overlap_gap_make_run_cmd may continue
> to the next step of work. So the function that triggers parallel calls wait
> for them complete and then may start on the next step of work.
>
>
> Thanks .
>
>
> Lars
>
>
>
>
>
>

--
Ondrej

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2019-12-07 01:25:21 Re: How to run in parallel in Postgres
Previous Message Jeff Janes 2019-12-07 00:59:00 Re: autovacuum locking question