Re: INSERT ... ON CONFLICT syntax issues

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: INSERT ... ON CONFLICT syntax issues
Date: 2015-04-25 18:05:49
Message-ID: CAM3SWZT758uPijDh5HYp9Sa8DNO8085M85+zSoDicR_j29uV7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 25, 2015 at 2:01 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> My problem with the WHERE being inside the parens in the above is that
> it's
> a) different from CREATE INDEX

I don't think that that's an important goal.

> b) unclear whether the WHERE belongs to colb or the whole index
> expression. The equivalent for aggregates, which I bet is going to be
> used less often, caused a fair amount of confusing.

I don't see those two situations as being comparable. The inference
specification does not accept aggregates. It seems obvious to me that
the predicate only ever applies to the entire table. And it's obvious
that it's part of the inference specification because it appears in
parentheses with everything else - otherwise, *users* might find this
phantom WHERE clause ambiguous/confusing.

> But I'm generally having some doubts about the syntax.
>
> Right now it's
> INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.
>
> A couple things:
>
> a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
> if we, at some later point, also want to handle other kind of
> violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...

I think that naming unique violations alone would be wrong (not to
mention ludicrously verbose). Heikki and I both feel that the CONFLICT
keyword captures the fact that this could be a dup violation, or an
exclusion violation. The syntax has been like this for some time, and
hasn't been a point of contention for a long time, so I thought this
was settled. Note that the syntax is quite similar to the SQLite
syntax of the same feature, that has ON CONFLICT IGNORE (it also has
ON CONFLICT REPLACE, but not ON CONFLICT UPDATE).

> b) For me there's a WITH before the index inference clause missing, to
> have it read in 'SQL' style.

I'm not seeing it. BTW, Robert was the one who initially proposed that
the unique index inference clause follow this exact style (albeit
before it accepted a WHERE clause to infer partial indexes, which was
only added a couple of months ago).

> c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
> 'EXCLUDED'. I think especially the latter doesn't fit anymore at
> all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?

NEW and OLD are terribly misleading, since surely the NEW tuple is the
one actually appended to the relation by the UPDATE, and the OLD one
is the existing one (not the excluded one). Plus they have all that
intellectual baggage from rules.

CONFLICTING, as Greg Stark pointed out many months ago, is something
that applies to both tuples that can be referenced, which is why I
*stopped* using it months ago. They conflict with *each other*. Any
conflict must pertain to both.

Dictionary.com defines "exclude" as:

"""
verb (used with object), excluded, excluding.
1.
to shut or keep out; prevent the entrance of.
2.
to shut out from consideration, privilege, etc.:
Employees and their relatives were excluded from participation in the contest.
3.
to expel and keep out; thrust out; eject:
He was excluded from the club for infractions of the rules.
"""

Seems pretty descriptive of the situation to me - I actually put a lot
of thought into this. Additionally, the word is widely understood by
non-native speakers. TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).

> So I guess it boils down to that I think we should switch the syntax to
> be:
>
> INSERT ... ON UNIQUE VIOLATION [WITH (cola, colb) WHERE ...] DO {NOTHING|UPDATE}

Beauty is in the eye of the beholder and all, but that seems pretty
ugly to me. Honestly, I think we should just accept that the predicate
appears in the parentheses on the odd occasion that it appears at all
- partial unique indexes are not all that common.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2015-04-25 18:19:10 Re: Fwd: [GENERAL] 4B row limit for CLOB tables
Previous Message Andrew Dunstan 2015-04-25 17:19:17 Re: forward vs backward slashes in msvc build code