Re: Duplicate Key Values

From: mark bradley <markbradyju(at)outlook(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Duplicate Key Values
Date: 2025-03-07 14:34:49
Message-ID: SJ2PR22MB43286857EAC74D1EE332B08CBAD52@SJ2PR22MB4328.namprd22.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

This is what MS Copilot has to say about this apparent bug where Postgres inserts extra rows violating a primary keys uniqueness constraint:

Yes, this issue has been encountered by others. There are a few potential reasons why this might happen:

1.
Sequence Out of Sync: Sometimes, the sequence that generates unique values for the primary key can become out of sync, especially after a bulk import or a database restore. You can check if the sequence is out of sync and reset it if necessary.
2.
Index Corruption: Index corruption can occur due to various reasons, such as hardware failures or bugs in earlier versions of PostgreSQL. This can lead to duplicate primary keys being inserted.
3.
Table Inheritance: If you are using table inheritance, primary keys are not enforced among inherited tables. This can lead to duplicates if not handled correctly.
4.
Application Logic: Sometimes, the application logic might inadvertently insert duplicate records. Reviewing the application code and insert statements can help identify and resolve such issues.

To resolve the issue, you can:

*
Check and reset the sequence if it's out of sync.
*
Rebuild the index if it's corrupted.

Any of the first 3 could be involved. There isn't an application involved other than pgAdmin.

1.
Originally, the key in the node table was a sequence, but I changed it to a non-sequence.
2.
There is no index on the primary key node_id, and I understand there should be one.
3.
I didn't explicitly use Postgres inheritance but there are two tables that are subclasses of node. There are dataset nodes and processing_node [s] tables. Each is a type of node and have primary keys that are foreign keys from the node table. This key is node_id.

What to do? I hesitate to just delete my tables and start over because this error will reoccur.

Best regards,
Mark Brady
amazon.com/author/markjbrady<https://amazon.com/author/markjbrady>
________________________________
From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Sent: Thursday, March 6, 2025 3:34 PM
To: mark bradley <markbradyju(at)outlook(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Duplicate Key Values

On 3/6/25 12:06, mark bradley wrote:

My mistake I forgot to Cc list on my previous post, which was:

That would be an issue and also would mean it is not a PK.

In psql do:

\d dataset

and show the results as text in your reply.

Ccing list

The below shows there is an index("dataset_pkey") on node_id.

Note, reindexing will take a lock on the table that prevents changing
data while the operation is running. See the below for more information:

https://www.postgresql.org/docs/current/sql-reindex.html

If the table is not to big and you can interrupt access to it then the
simplest command to run would be:

REINDEX TABLE dataset;

> Universal Metadata Schema=# \d dataset
> Table "public.dataset"
> Column | Type | Collation |
> Nullable | Defau
> lt
> ---------------------------+---------------------------+-----------+----------+------
> ---
> node_id | integer | |
> not null |
> dataset_name | character varying(25) | |
> not null |
> notes | text | |
> |
> dataset_type | database_type | |
> not null |
> dataset_maturity | database_maturity_type | |
> not null |
> disposition | disposition_type | |
> not null |
> start_date | date | |
> |
> end_date | date | |
> |
> most_recent_update | date | |
> |
> update_periodicity | interval | |
> |
> system_of_record | text | |
> |
> point_of_contact | integer | |
> not null |
> dataset_url | text | |
> |
> classification_level | classification_level_type | |
> not null |
> physical_location | text | |
> |
> quality_control | yes_no_type | |
> not null |
> dataset_documentation_url | text | |
> not null |
> description | text | |
> |
> node_type | node_type | |
> |
> dummy | integer | |
> |
> Indexes:
> "dataset_pkey" PRIMARY KEY, btree (node_id)
> Foreign-key constraints:
> "node_id" FOREIGN KEY (node_id) REFERENCES node(node_id) NOT VALID
> "poc" FOREIGN KEY (point_of_contact) REFERENCES poc(poc_id) NOT VALID
> Referenced by:
> TABLE "dataset_table" CONSTRAINT "dataset" FOREIGN KEY (node_id)
> REFERENCES datas
> et(node_id) NOT VALID
> TABLE "dataset_subject" CONSTRAINT "dataset_subject_node_id_fkey"
> FOREIGN KEY (no
> de_id) REFERENCES dataset(node_id)
> TABLE "system_dataset" CONSTRAINT "system_dataset_node_id_fkey"
> FOREIGN KEY (node
> _id) REFERENCES dataset(node_id) NOT VALID
> Inherits: node
>
>
> Best regards,
> Mark Brady
> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
> *Sent:* Thursday, March 6, 2025 3:03 PM
> *To:* mark bradley <markbradyju(at)outlook(dot)com>
> *Subject:* Re: Duplicate Key Values
> On 3/6/25 10:51, mark bradley wrote:
> Reply to list alos.
> Ccing list.
>
>> Looks like there is no index on node_id at the moment
>
> That would be an issue and also would mean it is not a PK.
>
> In psql do:
>
> \d dataset
>
> and show the results as text in your reply.
>
>>
>>
>> Mark Brady, Ph.D.
>> Deputy Chief Data Officer, TRMC
>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>_
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>> *Sent:* Thursday, March 6, 2025 1:22 PM
>> *To:* mark bradley <markbradyju(at)outlook(dot)com>; Ron Johnson
>> <ronljohnsonjr(at)gmail(dot)com>; pgsql-general <pgsql-general(at)postgresql(dot)org>
>> *Subject:* Re: Duplicate Key Values
>> On 3/6/25 10:11, mark bradley wrote:
>>> Here is the table definition:
>>>
>>>
>>> And here is the error message I get when I try to delete a duplicate:
>>
>> Please answer the following:
>>
>> 1) Did you not see duplicates with the old version of pgAdmin4?
>>
>> 2) What do you see if you use psql?
>>
>> 3) Did you upgrade/move the Postgres server or the underlying OS?
>>
>> 4) Have you tried reindexing the node_id field?
>>
>>>
>>>
>>> Mark Brady,
>>> _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady
>> <https://amazon.com/author/markjbrady
> <https://amazon.com/author/markjbrady>>>_
>>> ------------------------------------------------------------------------
>>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)aklaver(dot)com
>>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message mark bradley 2025-03-07 14:55:18 Re: Duplicate Key Values
Previous Message Greg Sabino Mullane 2025-03-07 13:45:12 Re: Review my steps for rollback to restore point