Re: Duplicate key violation on upsert

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Matt Magoffin <postgresql(dot)org(at)msqr(dot)us>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Duplicate key violation on upsert
Date: 2020-03-21 18:05:57
Message-ID: a7aecaad-5bb5-3daf-95e8-2c815039f66c@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 3/20/20 1:32 PM, Matt Magoffin wrote:
>
>> On 21/03/2020, at 8:10 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
>> <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>>
>>> The _hyper_1_1931_chunk_da_datum_x_acc_idx index has the same
>>> definition as the da_datum_x_acc_idx above (it is defined on a child
>>> table). That is, they are both essentially:
>>> UNIQUE, btree (node_id, source_id, ts DESC, jdata_a) WHERE jdata_a IS
>>> NOT NULL
>>> The da_datum_pkey index is what the ON CONFLICT cause refers to, so
>>> (node_id, ts, source_id) is UNIQUE as well.
>>
>> Hmm, wonder if you are getting bit by this?:
>>
>> https://www.postgresql.org/docs/12/sql-insert.html#SQL-ON-CONFLICT
>>
>> "INSERT with an ON CONFLICT DO UPDATE clause is a “deterministic”
>> statement. This means that the command will not be allowed to affect
>> any single existing row more than once; a cardinality violation error
>> will be raised when this situation arises. Rows proposed for insertion
>> should not duplicate each other in terms of attributes constrained by
>> an arbiter index or constraint.”
>
> I’m not sure I’m wrapping my head around this. The INSERT affects 1 row
> as the unique values (node_id, ts, source_id) are specified in the
> statement. Is it possible that da_datum_x_acc_idx is used as the arbiter
> index in this situation, rather than da_datum_pkey (that I intended),
> and you’re saying that the jdata_a column is getting updated twice,
> first in the INSERT and second in the DO UPDATE, triggering the
> duplicate key violation?

I was thinking more about this:

"INSERT INTO solardatum.da_datum(ts, node_id, source_id, posted,
jdata_i, jdata_a, jdata_s, jdata_t)
VALUES (…) ..."

from your OP. Namely whether it was:

VALUES (), (), (), ...

and if so there were values in the (),(),() that duplicated each other.

As to the second part of your response, ON CONFLICT does one of either
INSERT or UPDATE. If:

1) There is no conflict for ON CONFLICT (node_id, ts, source_id) then
the INSERT proceeds.

2) If there is a conflict then an UPDATE occurs using the SET values.

Now just me working through this:

da_datum_pkey = (node_id, ts, source_id)
da_datum_x_acc_idx = (node_id, source_id, ts DESC, jdata_a)

If 1) from above applies then da_datum_x_acc_idx will not be tripped as
the only way that could happen is if the node_id, ts, source_id was the
same as an existing row and that can't be true because the PK over the
same values passed.

If 2) from above happened then you are trying to UPDATE a row with
matching PK values(node_id, ts, source_id). Now it is entirely possible
that since you are not testing for constraint violation on (node_id,
source_id, ts DESC, jdata_a) that you be doing SET jdata_a =
EXCLUDED.jdata_a, using a value that would trip da_datum_x_acc_idx

>
> — m@
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message pabloa98 2020-03-21 18:08:26 Re: Could postgres12 support millions of sequences? (like 10 million)
Previous Message Alastair McKinley 2020-03-21 17:58:49 Re: Explain says 8 workers planned, only 1 executed