From: | "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Parallel Inserts in CREATE TABLE AS |
Date: | 2020-11-26 06:44:40 |
Message-ID: | 30fa39b09b614f799df364f59307e57e@G08CNEXMBPEKD05.g08.fujitsu.local |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi ,
> On Thu, Nov 26, 2020 at 7:47 AM Hou, Zhijie <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
> wrote:
> >
> > Hi,
> >
> > I have an issue about the following code:
> >
> > econtext = node->ps.ps_ExprContext;
> > ResetExprContext(econtext);
> >
> > + if (ISCTAS(node->ps.intoclause))
> > + {
> > + ExecParallelInsertInCTAS(node);
> > + return NULL;
> > + }
> >
> > /* If no projection is required, we're done. */
> > if (node->ps.ps_ProjInfo == NULL)
> > return slot;
> >
> > /*
> > * Form the result tuple using ExecProject(), and return it.
> > */
> > econtext->ecxt_outertuple = slot;
> > return ExecProject(node->ps.ps_ProjInfo);
> >
> > It seems the projection will be skipped.
> > Is this because projection is not required in this case ?
> > (I'm not very familiar with where the projection will be.)
> >
>
> For parallel inserts in CTAS, I don't think we need to project the tuples
> being returned from the underlying plan nodes, and also we have nothing
> to project from the Gather node further up. The required projection will
> happen while the tuples are being returned from the underlying nodes and
> the projected tuples are being directly fed to CTAS's dest receiver
> intorel_receive(), from there into the created table. We don't need
> ExecProject again in ExecParallelInsertInCTAS().
>
> For instance, projection will always be done when the tuple is being returned
> from an underlying sequential scan node(see ExecScan() -->
> ExecProject() and this is true for both leader and workers. In both leader
> and workers, we are just calling CTAS's dest receiver intorel_receive().
>
> Thoughts?
I took a deep look at the projection logic.
In most cases, you are right that Gather node does not need projection.
In some rare cases, such as Subplan (or initplan I guess).
The projection will happen in Gather node.
The example:
Create table test(i int);
Create table test2(a int, b int);
insert into test values(generate_series(1,10000000,1));
insert into test2 values(generate_series(1,1000,1), generate_series(1,1000,1));
postgres=# explain(verbose, costs off) select test.i,(select i from (select * from test2) as tt limit 1) from test where test.i < 2000;
QUERY PLAN
----------------------------------------
Gather
Output: test.i, (SubPlan 1)
Workers Planned: 2
-> Parallel Seq Scan on public.test
Output: test.i
Filter: (test.i < 2000)
SubPlan 1
-> Limit
Output: (test.i)
-> Seq Scan on public.test2
Output: test.i
In this case, projection is necessary,
because the subplan will be executed in projection.
If skipped, the table created will loss some data.
Best regards,
houzj
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Khandekar | 2020-11-26 06:48:12 | Re: Improving spin-lock implementation on ARM. |
Previous Message | Pavel Stehule | 2020-11-26 06:42:33 | Re: proposal: possibility to read dumped table's name from file |