From: | Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: How to run in parallel in Postgres |
Date: | 2019-12-07 11:27:56 |
Message-ID: | HE1P189MB0266944A7C861C8711453F7B9D5E0@HE1P189MB0266.EURP189.PROD.OUTLOOK.COM |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
> From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
> Sent: Saturday, December 7, 2019 2:25 AM
> To: Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no>
> Cc: pgsql-performance(at)lists(dot)postgresql(dot)org <pgsql-performance(at)lists(dot)postgresql(dot)org>
> Subject: Re: How to run in parallel in Postgres
>
> On Thu, Dec 05, 2019 at 12:10:42PM +0000, Lars Aksel Opsahl wrote:
> > I have a function that prepares data, so the big job can be run it in parallel.
> >
> > Since Postgres now supports parallel I was wondering if it's easy to trigger parallel dynamically created SQL calls.
> >
> > If you look at https://github.com/larsop/find-overlap-and-gap/blob/master/src/test/sql/regress/find_overlap_and_gap.sql you see that
> >
> > find_overlap_gap_make_run_cmd generates as set of 28 sql calls.
> >
> > So is it in a simple way possible to use Postgres parallel functionality to call this 28 functions i parallel so I don't have dependent on externally install programs ?
>
> SELECT find_overlap_gap_single_cell('test_data.overlap_gap_input_t1','geom',4258,'test_data.overlap_gap_input_t1_res',1,28);
> SELECT find_overlap_gap_single_cell('test_data.overlap_gap_input_t1','geom',4258,'test_data.overlap_gap_input_t1_res',2,28);
> SELECT find_overlap_gap_single_cell('test_data.overlap_gap_input_t1','geom',4258,'test_data.overlap_gap_input_t1_res',3,28);
> ...
>
> I see that find_overlap_gap_single_cell creates tables, so cannot be run in parallel.
> Maybe you could consider rewriting it to return data to its caller instead.
> You'd also need to mark it as PARALLEL SAFE, of course.
> Your other functions involved should be PARALLEL SAFE too.
>
> Justin
Hi Justin
The reason why I don't return the results Is that on very bug tables I usually get memory problems if I return all the results to the master function. So I usually break thing up into small unlogged tables. Then I work on each table separately or in groups. When all steps are done i merge all the small tables together. I this case we only single step, but usually I work many more steps.
But I will keep mind that it may work i parallel if I don't create any child tables but returns the result.
Thanks.
Lars
From | Date | Subject | |
---|---|---|---|
Next Message | Jean-David Beyer | 2019-12-07 12:10:15 | Re: Legal disclaimers on emails to this group |
Previous Message | Lars Aksel Opsahl | 2019-12-07 11:17:25 | Re: How to run in parallel in Postgres |