From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Amit Langote <amitlangote09(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, amitdkhan(dot)pg(at)gmail(dot)com |
Subject: | Re: partition routing layering in nodeModifyTable.c |
Date: | 2019-07-19 21:11:10 |
Message-ID: | 20190719211110.GA19074@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2019-Jul-19, Andres Freund wrote:
> On 2019-07-19 17:52:20 +0900, Amit Langote wrote:
> > The first one (0001) deals with reducing the core executor's reliance
> > on es_result_relation_info to access the currently active result
> > relation, in favor of receiving it from the caller as a function
> > argument. So no piece of core code relies on it being correctly set
> > anymore. It still needs to be set correctly for the third-party code
> > such as FDWs.
>
> I'm inclined to just remove it. There's not much code out there relying
> on it, as far as I can tell. Most FDWs don't support the direct modify
> API, and that's afaict the case where we one needs to use
> es_result_relation_info?
Yeah, I too agree with removing it; since our code doesn't use it, it
seems very likely that it will become slightly out of sync with reality
and we'd not notice until some FDW misbehaves weirdly.
> > - slot = ExecDelete(node, tupleid, oldtuple, planSlot,
> > - &node->mt_epqstate, estate,
> > + slot = ExecDelete(node, resultRelInfo, tupleid, oldtuple,
> > + planSlot, &node->mt_epqstate, estate,
> > true, node->canSetTag,
> > false /* changingPart */ , NULL, NULL);
> > break;
>
> This reminds me of another complaint: ExecDelete and ExecInsert() have
> gotten more boolean parameters for partition moving, but only one of
> them is explained with a comment (/* changingPart */) - think we should
> do that for all.
Maybe change the API to use a flags bitmask?
(IMO the placement of the comment inside the function call, making the
comma appear preceded with a space, looks ugly. If we want to add
comments, let's put each param on its own line with the comment beyond
the comma. That's what we do in other places where this pattern is
used.)
> > /* Initialize the executor state. */
> > - estate = create_estate_for_relation(rel);
> > + estate = create_estate_for_relation(rel, &resultRelInfo);
>
> Hm. It kinda seems cleaner if we were to instead return the relevant
> index, rather than the entire ResultRelInfo, as an output from
> create_estate_for_relation(). Makes it clearer that it's still in the
> EState.
Yeah.
> Or perhaps we ought to compute it in a separate step? Then that'd be
> more amenable to support replcating into partition roots.
I'm not quite seeing the shape that you're imagining this would take.
I vote not to mess with that for this patch; I bet that we'll have to
change a few other things in this code when we add better support for
partitioning in logical replication.
> > +/*
> > + * ExecMove
> > + * Move an updated tuple from the input result relation to the
> > + * new partition of its root parent table
> > + *
> > + * This works by first deleting the tuple from the input result relation
> > + * followed by inserting it into the root parent table, that is,
> > + * mtstate->rootResultRelInfo.
> > + *
> > + * Returns true if it's detected that the tuple we're trying to move has
> > + * been concurrently updated.
> > + */
> > +static bool
> > +ExecMove(ModifyTableState *mtstate, ResultRelInfo *resultRelInfo,
> > + ItemPointer tupleid, HeapTuple oldtuple, TupleTableSlot *planSlot,
> > + EPQState *epqstate, bool canSetTag, TupleTableSlot **slot,
> > + TupleTableSlot **inserted_tuple)
> > +{
>
> I know that it was one of the names I proposed, but now that I'm
> thinking about it again, it sounds too generic. Perhaps
> ExecCrossPartitionUpdate() wouldn't be a quite so generic name? Since
> there's only one reference the longer name wouldn't be painful.
That name sounds good. Isn't the return convention backwards? Sounds
like "true" should mean that it succeeded.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2019-07-19 21:17:47 | Re: partition routing layering in nodeModifyTable.c |
Previous Message | Konstantin Knizhnik | 2019-07-19 21:10:34 | Re: Built-in connection pooler |