From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com> |
Cc: | Greg Nancarrow <gregn4422(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com> |
Subject: | Re: Parallel INSERT (INTO ... SELECT ...) |
Date: | 2021-02-05 05:14:59 |
Message-ID: | CA+HiwqEr4z_5iWd2L_nOxG=8S=imCYFum+_RmgE36UyK7grhKg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
While reviewing the v14 set of patches (will send my comments
shortly), I too had some reservations on how 0001 decided to go about
setting hasModifyingCTE.
On Fri, Feb 5, 2021 at 1:51 PM Hou, Zhijie <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com> wrote:
> > > I took a look into the hasModifyingCTE bugfix recently, and found a
> > > possible bug case without the parallel insert patch.
> > >
> > > ---------------------------------
> > > drop table if exists test_data1;
> > > create table test_data1(a int, b int) ; insert into test_data1 select
> > > generate_series(1,1000), generate_series(1,1000); set
> > > force_parallel_mode=on;
> > >
> > > CREATE TEMP TABLE bug6051 AS
> > > select i from generate_series(1,3) as i;
> > >
> > > SELECT * FROM bug6051;
> > > CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD select a as
> > > i from test_data1;
> > >
> > > WITH t1 AS ( DELETE FROM bug6051 RETURNING * ) INSERT INTO bug6051
> > > SELECT * FROM t1;
> > >
> > > *******
> > > ***ERROR: cannot assign XIDs during a parallel operation
> > > *******
> > > ---------------------------------
> > >
> > > I debugged it and it did have modifycte in the parsetree after rewrite.
> > > I think if we can properly set the hasModifyingCTE, we can avoid this
> > error by not consider parallel for this.
> > >
> >
> > Thanks. You've identified that the bug exists for SELECT too. I've verified
> > that the issue is fixed by the bugfix included in the Parallel INSERT patch.
> > Are you able to review my bugfix?
> > Since the problem exists for SELECT in the current Postgres code, I'd like
> > to pull that bugfix out and provide it as a separate fix.
+1, a separate patch for this seems better.
> > My concern is that there may well be a better way to fix the issue - for
> > example, during the re-writing, rather than after the query has been
> > re-written.
> Hi,
>
> I took a look at the fix and have some thoughts on it.
> (Please correct me if you have tried this idea and found something is wrong)
>
> IMO, the main reason for the hasModifyingCTE=false is that:
> the Rewriter did not update the hasModifyingCTE when copying the existsing 'cteList' to the rewrited one.
>
> It seems there is only one place where ctelist will be copied separately.
> -------
> static Query *
> rewriteRuleAction(Query *parsetree,
> ...
> /* OK, it's safe to combine the CTE lists */
> sub_action->cteList = list_concat(sub_action->cteList,
> copyObject(parsetree->cteList));
> + sub_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
> --------
>
> Based on the above, if we update the hasModifyingCTE here, we may solve this problem.
>
> And there is another point here, the sub_action may be not the final parsetree.
> If defined the rule like "DO INSTEAD insert into xx select xx from xx", Rewriter will
> Put the ctelist into subquery in parsetree's rtable.
> In this case, we may also need to update the final parsetree too.
> (I think you know this case, I found same logic in the latest patch)
>
> --------
> static Query *
> rewriteRuleAction(Query *parsetree,
> ...
> if (sub_action_ptr)
> + {
> *sub_action_ptr = sub_action;
> + rule_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
> + }
> --------
>
> And the Basic test passed.
> What do you think ?
That is very close to what I was going to suggest, which is this:
diff --git a/src/backend/rewrite/rewriteHandler.c
b/src/backend/rewrite/rewriteHandler.c
index 0672f497c6..3c4417af98 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -631,6 +631,8 @@ rewriteRuleAction(Query *parsetree,
checkExprHasSubLink((Node *) rule_action->returningList);
}
+ rule_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
+
return rule_action;
}
--
Amit Langote
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2021-02-05 05:22:54 | Re: Single transaction in the tablesync worker? |
Previous Message | Hou, Zhijie | 2021-02-05 04:51:19 | RE: Parallel INSERT (INTO ... SELECT ...) |