From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Louis Tian <louis(dot)tian(at)aquamonix(dot)com(dot)au> |
Cc: | "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: UPSERT in Postgres |
Date: | 2023-04-06 20:58:32 |
Message-ID: | CAH2-WzkSGmA-ZyMFu_EqQJ8tSgToj6h2=w8znTUa0Nw4oQvVHA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Apr 6, 2023 at 1:21 PM Louis Tian <louis(dot)tian(at)aquamonix(dot)com(dot)au> wrote:
> An implicit assumption behind this definition is that table must have a primary key for the upsert operation to make sense since it's the primary key that uniquely identifies a row.
It could just be a unique index or a unique constraint. So you can
upsert on any individual unique constraint/index, or the primary key.
Of course there might be several on a given table, but you can only
use one as the "conflict arbiter" per statement.
> But can we achieve the same effect with "INSERT ... ON CONFLICT DO UPDATE" like a lot of references on the internet seems to suggest.
>
> insert into person (id, name) values (0, 'foo') on conflict ("id") do update set id=excluded.id, name=excluded.name
> insert into person (id, is_active) values (0, true) on conflict ("id") do update set id=excluded.id, is_active=excluded.is_active
>
> Unfortunately. the second statement will fail due to violation of the not null constraint on the "name" column.
> PostgreSQL will always try to insert the row into the table first. and only fallback to update when the uniqueness constraint is violated.
> Is this behavior wrong? maybe not, I think it is doing what it reads quite literally.
It sort of has to work that way, though. In general your example might
*not* fail, due to a row-level before trigger in the insert path.
Why doesn't your proposed upsert syntax have the same problem? I mean,
how could it not? I guess it doesn't if you assume that it'll never
take the insert path with your not NULL constraint example? But if you
know that for sure, why not just use a regular update statement? On
the other hand, if you're not sure if the insert path can be taken,
then why is it actually helpful to not just throw an error at the
earliest opportunity?
Surely upsert means "update or insert", so why wouldn't the user expect
to see an error like this, independent of the specifics of the row in question?
Isn't the user tacitly saying "I don't specifically know if the update or insert
path will be taken in respect of any given row" by using ON CONFLICT
DO UPDATE in the first place?
> That being said, I have never had a need for the ON CONFLICT DO UPDATE statement other than where I need upsert.
> But using it as "upsert" is only valid when the table is absent of any NOT NULL constraint on it's non primary key columns.
I don't know what you mean by that. "Valid"?
> The MERGE command introduced in PG15 in theory can be used to do UPSERT properly that is void of the aforementioned limitation.
> The downside is it is rather verbose.
The MERGE command has various race conditions that are particularly
relevant to UPSERT type use cases. See the wiki page you referenced
for a huge amount of information on this.
> *Feature Request*
> Given that UPSERT is an *idempotent* operator it is extremely useful.
In general UPSERT (or any definition of it that I can think of) does
not imply idempotency.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Adrian Klaver | 2023-04-06 21:00:56 | Re: UPSERT in Postgres |
Previous Message | Tom Lane | 2023-04-06 20:04:43 | Re: PostgreSQL Mailing list public archives : search not working ... |