From: | Louis Tian <louis(dot)tian(at)aquamonix(dot)com(dot)au> |
---|---|
To: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
Cc: | "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres |
Date: | 2023-04-09 02:46:33 |
Message-ID: | SY4P282MB1052B56685C1A94EAEB3682DA6949@SY4P282MB1052.AUSP282.PROD.OUTLOOK.COM |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi Adrian,
> No it is not as Israel Brewster pointed out.
I think we need make a distinction between an operation and a statement here
The examples Israel presented and yours are both non-idempotent statements because the different value being upserted each time the statement is executed. Not because upsert as an operation is not idempotent.
Likewise, In HTTP, PUT method is defined as idempotent. No one would expect different PUT requests with different payload to be idempotent.
Also, I don't think I am not the only one who think upsert is logically idempotent. Just a few seconds of searching on google.
- "The upsert action is a combination of insert and update. It allows us to write idempotent statements". (https://jonmeyers.io/blog/use-on-conflict-to-upsert-in-postgresql. )
-"One common example of an idempotent operation is an upsert (update or insert) statement in a database." (https://www.arecadata.com/core-data-engineering-concepts-idempotency)
- "A POST request means that the request is not idempotent, but an "upsert" request is idempotent." (https://softwareengineering.stackexchange.com/questions/426225/rest-how-to-upsert-a-resource-without-an-specific-resource-url#:~:text=A%20POST%20request%20means%20that,%22upsert%22%20request%20is%20idempotent.)
While no as direct as the article above, but the issue reported in those places are essentially stem from the same understanding.
- https://stackoverflow.com/questions/48816629/on-conflict-do-nothing-in-postgres-with-a-not-null-constraint
- https://www.postgresql.org/message-id/flat/1996438.1604952901%40sss.pgh.pa.us#ace8adc1354265aca5672028faea0c0f
So at least I am not only one imaging things?
- "a ready made statement for idempotent data like Mongo's upsert" (https://www.compose.com/articles/the-potency-of-idempotent-with-rabbitmq-and-mongodb-upsert/)
> And therein lies your problem, you are imagining something that does not
> exist and more to the point will most likely not exist as it would break all code that depends on above behavior.
I never argue the way postgresql does for insert ... on conflict is wrong and need to change.
I said "I think it is doing what it reads quite literally" just like in your words "It does what is advertised on the tin". So there is no disagreement here. So I am not saying "insert on conflict" need to change.
What I am point out here is, the "insert...on conflict do update" is not a true "upsert".
I am saying in my opinion it would be nice see UPSERT statement to be added to PostgreSQL so upsert can work "out of box" even when there're not null constraints on that table.
It might never happen judging from the replies I am getting but that's fine. The solution you have showed me is good enough as a workaround (to not depreciate it's value, just not as easy/bullet proof as it could be in my opinion).
It was just a question just to confirm my understanding so I got what I need, so thank you all for that.
Cheers,
Louis Tian
From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Sent: Sunday, April 9, 2023 7:51 AM
To: Louis Tian <louis(dot)tian(at)aquamonix(dot)com(dot)au>; Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres
CAUTION: This email originated from outside of Envirada. Do not click links or open attachments unless you recognize the sender and know the content is safe.
On 4/6/23 17:49, Louis Tian wrote:
> Hi Peter,
>
> Thanks for your reply. Appreciate the help and discussion.
>
>> In general UPSERT (or any definition of it that I can think of) does
>> not imply idempotency.
>
> "Idempotence is the property of certain operations in mathematics and computer science whereby they can be applied multiple times without changing the result beyond the initial application." from Wikipedia.
> the concept of Idempotence when applies to HTTP is consistent with the above. https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you going by a different defintion that I am not aware of?
> If you execute the same upsert multiple times, the state of the database remains the same as if only execute once.
> If a row already exists, the first statement will update the row so does any subsequent statements. executing the same update multiple time is the same as executing it only once.
> If the row doesn't exist, the first statement will insert that row and any subsequent will try to update, but the update has no real effect since it the value is exactly the same as the insert.
> So by defintion, upsert is idempotent.
No it is not as Israel Brewster pointed out.
To his example I would add:
alter some_table add column ts_upsert_update timestamptz;
insert into some_table values('foo', 'bar') on conflict(tbl_id) do
update set foo_fld = excluded.foo_fld, bar_fld = some_table.bar_fld,
ts_upsert_update = now();
You are substituting whatever definition you have in your head for the
definition as it actually exists.
>
>> It could just be a unique index or a unique constraint. So you can
>> upsert on any individual unique constraint/index, or the primary key.
>> Of course there might be several on a given table, but you can only
>> use one as the "conflict arbiter" per statement.
>
> Understand that I can use any unique constraint with on conflict.
> But semantically the only correct one is the primary key, since that's what identifies a row logically.
> In that sense, any unique column(s) is a potential candidate for primary key.
> It's more of a pedantic point rather than pragmatic one.
> It's less of a problem for PostgreSQL where the semantic importance of primary key is not manifested at implementation level, since all index points to the tuple directly
> Whereas it is more import for Databaes like MySQL where the secondary index points to the primary key index.
Again you are dealing with the imagined instead of the reality. Along
that line you left out that a 'exclusion constraint violation error' can
also trigger the ON CONFLICT.
> Use some pesudo code might be helpful here to explain the difference.
>
> How on conflict works at the moment.
>
> try {
> insert row
> } catch (duplicated key error) {
> update row
> }
>
> How I think it upsert should work
And therein lies your problem, you are imagining something that does not
exist and more to the point will most likely not exist as it would break
all code that depends on above behavior.
>
> if (new.id exists) {
> update row
> } else {
> insert row
> }
>
> I am not expecting an error here. The problem is with no conflict it always go down the insert path first and results in a not null constraint error.
To be expected, as after all the command is:
INSERT INTO <some_table> ...
> While I am expecting the insert is never executed in the first place when that row already exist (as identified by it primary key). So the update execute without error.
> I hope the pesudo code above is enough to clarify the difference?
>
> Cheers,
> Louis Tian
>
>
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
--
Message protected by MailGuard: e-mail anti-virus, anti-spam and content filtering.https://www.mailguard.com.au/mg
Click here to report this message as spam:
https://console.mailguard.com.au/ras/25am4N38I1/7rIyiJl23oANcTXYht5xW0/0
From | Date | Subject | |
---|---|---|---|
Next Message | Alban Hertroys | 2023-04-09 09:26:16 | Re: [EXTERNAL]: Re: UPSERT in Postgres |
Previous Message | Louis Tian | 2023-04-09 01:42:21 | Re: [EXTERNAL]: Re: [EXTERNAL]: Re: UPSERT in Postgres |