Re: MERGE ... RETURNING

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Gurjeet Singh <gurjeet(at)singh(dot)im>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Vik Fearing <vik(at)postgresfriends(dot)org>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: MERGE ... RETURNING
Date: 2023-07-13 15:38:10
Message-ID: CAEZATCVjVNWjKqME6ZT1QsfwW9Ddq20D7wKCsbRO5W9fBmparw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 11 Jul 2023 at 21:43, Gurjeet Singh <gurjeet(at)singh(dot)im> wrote:
>
> > > I think the name of function pg_merge_when_clause() can be improved.
> > > How about pg_merge_when_clause_ordinal().
> >
> > That's a bit of a mouthful, but I don't have a better idea right now.
> > I've left the names alone for now, in case something better occurs to
> > anyone.
>
> +1. How do we make sure we don't forget that it needs to be named
> better. Perhaps a TODO item within the patch will help.
>

Thinking about that some more, I think the word "number" is more
familiar to most people than "ordinal". There's the row_number()
function, for example. So perhaps pg_merge_when_clause_number() would
be a better name. It's still quite long, but it's the best I can think
of.

> I believe this elog can be safely turned into an Assert.
>
> + switch (mergeActionCmdType)
> {
> - CmdType commandType = relaction->mas_action->commandType;
> ....
> + case CMD_INSERT:
> ....
> + default:
> + elog(ERROR, "unrecognized commandType: %d", (int)
> mergeActionCmdType);
>
> The same treatment can be applied to the elog(ERROR) in pg_merge_action().
>

Hmm, that's a very common code pattern used in dozens, if not hundreds
of places throughout the backend code, so I don't think this should be
different.

> > > +CREATE FUNCTION merge_into_sq_target(sid int, balance int, delta int,
> > > + OUT action text, OUT tid int,
> > > OUT new_balance int)
> > > +LANGUAGE sql AS
> > > +$$
> > > + MERGE INTO sq_target t
> > > + USING (VALUES ($1, $2, $3)) AS v(sid, balance, delta)
> > > + ON tid = v.sid
> > > + WHEN MATCHED AND tid > 2 THEN
> > >
> > > Again, for consistency, the comparison operator should be >=. There
> > > are a few more occurrences of this comparison in the rest of the file,
> > > that need the same treatment.
> > >
> >
> > I changed the new tests to use ">= 2" (and the COPY test now returns 3
> > rows, with an action of each type, which is easier to read), but I
> > don't think it's really necessary to change all the existing tests
> > from "> 2". There's nothing wrong with the "= 2" case having no
> > action, as long as the tests give decent coverage.
>
> I was just trying to drive these tests towards a consistent pattern.
> As a reader, if I see these differences, the first and the
> conservative thought I have is that these differences must be there
> for a reason, and then I have to work to find out what those reasons
> might be. And that's a lot of wasted effort, just in case someone
> intends to change something in these tests.
>

OK, I see what you're saying. I think it should be a follow-on patch
though, because I don't like the idea of this patch to be making
changes to tests not related to the feature being added.

> { oid => '9499', descr => 'command type of current MERGE action',
> - proname => 'pg_merge_action', provolatile => 'v',
> + proname => 'pg_merge_action', provolatile => 'v', proparallel => 'r',
> ....
> { oid => '9500', descr => 'index of current MERGE WHEN clause',
> - proname => 'pg_merge_when_clause', provolatile => 'v',
> + proname => 'pg_merge_when_clause', provolatile => 'v', proparallel => 'r',
> ....
>
> I see that you've now set proparallel of these functions to 'r'. I'd
> just like to understand how you got to that conclusion.
>

Now that these functions can appear in subqueries in the RETURNING
list, there exists the theoretical possibility that the subquery might
use a parallel plan (actually that can't happen today, for any query
that modifies data, but maybe someday it may become a possibility),
and it's possible to use these functions in a SELECT query inside a
PL/pgSQL function called from the RETURNING list, which might consider
a parallel plan. Since these functions rely on access to executor
state that isn't copied to parallel workers, they must be run on the
leader, hence I think PARALLEL RESTRICTED is the right level to use. A
similar example is pg_trigger_depth().

> --- error when using MERGE support functions outside MERGE
> -SELECT pg_merge_action() FROM sq_target;
>
> I believe it would be worthwhile to keep a record of the expected
> outputs of these invocations in the tests, just in case they change
> over time.
>

Yeah, that makes sense. I'll post an update soon.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2023-07-13 15:40:35 Re: psql: Add role's membership options to the \du+ command
Previous Message Jeff Davis 2023-07-13 15:30:03 Re: MERGE ... RETURNING