Quick Links

Re: MERGE SQL Statement for PG11

From:	Peter Geoghegan <pg(at)bowt(dot)ie>
To:	Serge Rielau <serge(at)rielau(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MERGE SQL Statement for PG11
Date:	2017-10-28 19:50:15
Message-ID:	CAH2-Wz=FT8jh305TT55D+=EReL4DfkaGOZR026Keyp_2JmLzJQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Oct 27, 2017 at 3:00 PM, Serge Rielau <serge(at)rielau(dot)com> wrote:
>> What other systems *do* have this restriction? I've never seen one that did.
>
> Not clear what you are leading up to here.
> When I did MERGE in DB2 there was also no limitation:
> "Each row in the target can only be operated on once. A row in the target can only be identified as MATCHED with one row in the result table of the table-reference”
> What there was however was a significant amount of code I had to write and test to enforce the above second sentence.

Then it seems that we were talking about two different things all along.

> Maybe in PG there is a trivial way to detect an expanding join and block it at runtime.

There is for ON CONFLICT. See the cardinality violation logic within
ExecOnConflictUpdate(). (There are esoteric cases where this error can
be raised due to a wCTE that does an insert "from afar", which is
theoretically undesirable but not actually a problem.)

The MERGE implementation that I have in mind would probably do almost
the same thing, and make the "HeapTupleSelfUpdated" case within
ExecUpdate() raise an error when the caller happened to be a MERGE,
rather than following the historic UPDATE behavior. (The behavior is
to silently suppress a second or subsequent UPDATE attempt from the
same command, a behavior that Simon's mock MERGE documentation
references.)

> So the whole point I’m trying to make is that I haven’t seen the need for the extra work I had to do once the feature appeared in the wild.

That seems pretty reasonable to me.

My whole point is that I think it's a mistake to do things like lock
rows ahead of evaluating any UPDATE predicate, in the style of ON
CONFLICT, in order to replicate the ON CONFLICT guarantees [1].

I'm arguing for implementation simplicity, too. Trying to implement
MERGE in a way that extends ON CONFLICT seems like a big mistake to
me, because ON CONFLICT updates rows on the basis of a would-be
duplicate violation, along with all the baggage that that carries.
This is actually enormously different to an equi-join that is fed by a
scan using an MVCC snapshot. The main difference is that there
actually is no MVCC snapshot in play in most cases [2]. If *no* row
with the PK value of 5 is visible to our MVCC snapshot, but an xact
committed having inserted such a row, that still counts as a CONFLICT
with READ COMMITTED.

[1] https://wiki.postgresql.org/wiki/UPSERT#Goals_for_implementation
[2] https://www.postgresql.org/docs/devel/static/transaction-iso.html#xact-read-committed
--
Peter Geoghegan

In response to

Re: MERGE SQL Statement for PG11 at 2017-10-27 22:00:47 from Serge Rielau

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Peter Geoghegan	2017-10-28 20:04:29	Re: MERGE SQL Statement for PG11
Previous Message	Simon Riggs	2017-10-28 19:49:12	Re: MERGE SQL Statement for PG11