From: | "Seamus Abshere" <seamus(at)abshere(dot)net> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: A reloption for partitioned tables - parallel_workers |
Date: | 2021-02-15 15:42:14 |
Message-ID: | 1139f8a8-c6ca-4168-b5dd-1314bae37340@www.fastmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
hi,
Here we go, my first patch... solves https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55@www.fastmail.com
Best,
Seamus
On Mon, Feb 15, 2021, at 3:53 AM, Amit Langote wrote:
> Hi Seamus,
>
> On Mon, Feb 15, 2021 at 5:28 PM Seamus Abshere <seamus(at)abshere(dot)net> wrote:
> > It turns out parallel_workers may be a useful reloption for certain uses of partitioned tables, at least if they're made up of fancy column store partitions (see https://www.postgresql.org/message-id/7d6fdc20-857c-4cbe-ae2e-c0ff9520ed55%40www.fastmail.com)
> >
> > Would somebody tell me what I'm doing wrong? I would love to submit a patch but I'm stuck:
> >
> > diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
> > index cd3fdd259c..f1ade035ac 100644
> > --- a/src/backend/optimizer/path/allpaths.c
> > +++ b/src/backend/optimizer/path/allpaths.c
> > @@ -3751,6 +3751,7 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
> > * If the user has set the parallel_workers reloption, use that; otherwise
> > * select a default number of workers.
> > */
> > + // I want to affect this
> > if (rel->rel_parallel_workers != -1)
> > parallel_workers = rel->rel_parallel_workers;
> > else
> >
> > so I do this
> >
> > diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
> > index c687d3ee9e..597b209bfb 100644
> > --- a/src/backend/access/common/reloptions.c
> > +++ b/src/backend/access/common/reloptions.c
> > @@ -1961,13 +1961,15 @@ build_local_reloptions(local_relopts *relopts, Datum options, bool validate)
> > bytea *
> > partitioned_table_reloptions(Datum reloptions, bool validate)
> > {
> > - /*
> > - * There are no options for partitioned tables yet, but this is able to do
> > - * some validation.
> > - */
> > + static const relopt_parse_elt tab[] = {
> > + {"parallel_workers", RELOPT_TYPE_INT,
> > + offsetof(StdRdOptions, parallel_workers)},
> > + };
> > +
> > return (bytea *) build_reloptions(reloptions, validate,
> > RELOPT_KIND_PARTITIONED,
> > - 0, NULL, 0);
> > + sizeof(StdRdOptions),
> > + tab, lengthof(tab));
> > }
> >
> > That "works":
> >
> > postgres=# alter table test_3pd_cstore_partitioned set (parallel_workers = 33);
> > ALTER TABLE
> > postgres=# select relname, relkind, reloptions from pg_class where relname = 'test_3pd_cstore_partitioned';
> > relname | relkind | reloptions
> > -----------------------------+---------+-----------------------
> > test_3pd_cstore_partitioned | p | {parallel_workers=33}
> > (1 row)
> >
> > But it seems to be ignored:
> >
> > diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
> > index cd3fdd259c..c68835ce38 100644
> > --- a/src/backend/optimizer/path/allpaths.c
> > +++ b/src/backend/optimizer/path/allpaths.c
> > @@ -3751,6 +3751,8 @@ compute_parallel_worker(RelOptInfo *rel, double heap_pages, double index_pages,
> > * If the user has set the parallel_workers reloption, use that; otherwise
> > * select a default number of workers.
> > */
> > + // I want to affect this, but this assertion always passes
> > + Assert(rel->rel_parallel_workers == -1)
> > if (rel->rel_parallel_workers != -1)
> > parallel_workers = rel->rel_parallel_workers;
> > else
>
> You may see by inspecting the callers of compute_parallel_worker()
> that it never gets called on a partitioned table, only its leaf
> partitions. Maybe you could try calling compute_parallel_worker()
> somewhere in add_paths_to_append_rel(), which has this code to figure
> out parallel_workers to use for a parallel Append path for a given
> partitioned table:
>
> /* Find the highest number of workers requested for any subpath. */
> foreach(lc, partial_subpaths)
> {
> Path *path = lfirst(lc);
>
> parallel_workers = Max(parallel_workers, path->parallel_workers);
> }
> Assert(parallel_workers > 0);
>
> /*
> * If the use of parallel append is permitted, always request at least
> * log2(# of children) workers. We assume it can be useful to have
> * extra workers in this case because they will be spread out across
> * the children. The precise formula is just a guess, but we don't
> * want to end up with a radically different answer for a table with N
> * partitions vs. an unpartitioned table with the same data, so the
> * use of some kind of log-scaling here seems to make some sense.
> */
> if (enable_parallel_append)
> {
> parallel_workers = Max(parallel_workers,
> fls(list_length(live_childrels)));
> parallel_workers = Min(parallel_workers,
> max_parallel_workers_per_gather);
> }
> Assert(parallel_workers > 0);
>
> /* Generate a partial append path. */
> appendpath = create_append_path(root, rel, NIL, partial_subpaths,
> NIL, NULL, parallel_workers,
> enable_parallel_append,
> -1);
>
> Note that the 'rel' in this code refers to the partitioned table for
> which an Append path is being considered, so compute_parallel_worker()
> using that 'rel' would use the partitioned table's
> rel_parallel_workers as you are trying to do.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
>
Attachment | Content-Type | Size |
---|---|---|
0001-Allow-setting-parallel_workers-on-partitioned-tables.patch | application/octet-stream | 2.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Finnerty, Jim | 2021-02-15 16:01:16 | Re: PostgreSQL <-> Babelfish integration |
Previous Message | Joel Jacobson | 2021-02-15 15:34:43 | Re: [HACKERS] GSoC 2017: Foreign Key Arrays |