From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | Richard van der Hoff <richard(at)matrix(dot)org>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Inexplicable duplicate rows with unique constraint |
Date: | 2020-01-16 17:05:06 |
Message-ID: | 16bd51ab-69ef-04f9-d47a-3e692b1ff595@aklaver.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 1/16/20 8:50 AM, Richard van der Hoff wrote:
> I'm trying to track down the cause of some duplicate rows in a table
> which I would expect to be impossible due to a unique constraint. I'm
> hoping that somebody here will be able to suggest something I might have
> missed.
>
> The problem relates to a bug filed against our application
> (https://github.com/matrix-org/synapse/issues/6696) At first I put this
> down to random data corruption on a single user's postgres instance, but
> I've now seen three separate reports in as many days and am wondering if
> there is more to it.
>
> We have a table whose schema is as follows:
>
> synapse=# \d current_state_events
> Table "public.current_state_events"
> Column | Type | Modifiers
> ------------+------+-----------
> event_id | text | not null
> room_id | text | not null
> type | text | not null
> state_key | text | not null
> membership | text |
> Indexes:
> "current_state_events_event_id_key" UNIQUE CONSTRAINT, btree
> (event_id)
> "current_state_events_room_id_type_state_key_key" UNIQUE
> CONSTRAINT, btree (room_id, type, state_key)
> "current_state_events_member_index" btree (state_key) WHERE type =
> 'm.room.member'::text
>
> Despite the presence of the
> current_state_events_room_id_type_state_key_key constraint, several
> users have reported seeing errors which suggest that their tables have
> duplicate rows for the same (room_id, type, state_key) triplet and
> indeed querying confirms that to be the case:
>
> synapse=> select count(*), room_id, type, state_key from
> current_state_events group by 2,3,4 order by count(*) DESC LIMIT 2;
> count | room_id | type | state_key
> -------+-----------------------------------+---------------+-------------------------------------
>
> 3 | !ueLfVrSWYGMnFnoCbT:darkfasel.net | m.room.member |
> @irc_ebi_:darkfasel.net
> 3 | !HwocBmCtBcHQhILtYQ:matrix.org | m.room.member |
> @freenode_AlmarShenwan_:matrix.org
> (2 rows)
>
I'm assuming the above are obfuscated?
>
> Further investigation suggests that these are genuinely separate rows
> rather than duplicate entries in an index.
If you use length() on the values are they the same?
>
> The index appears to consider itself valid:
>
> synapse=> select i.* from pg_class c join pg_index i on
> i.indexrelid=c.oid where
> relname='current_state_events_room_id_type_state_key_key';
> indexrelid | indrelid | indnatts | indisunique | indisprimary |
> indisexclusion | indimmediate | indisclustered | indisvalid |
> indcheckxmin | indisready | indislive | indisreplident | indkey |
> indcollation | indclass | indoption | indexprs | indpred
> ------------+----------+----------+-------------+--------------+----------------+--------------+----------------+------------+--------------+------------+-----------+----------------+--------+--------------+----------------+-----------+----------+---------
>
> 17023 | 16456 | 3 | t | f | f
> | t | f | t | f | t
> | t | f | 2 3 4 | 100 100 100 | 3126
> 3126 3126 | 0 0 0 | |
> (1 row)
>
> So, question: what could we be doing wrong to get ourselves into this
> situation?
>
> Some other datapoints which may be relevant:
> * this has been reported by one user on postgres 9.6.15 and one on
> 10.10, though it's hard to be certain of the version that was running
> when the duplication occurred
> * the constraint is added when the table is first created (before any
> data is added)
> * At least one user reports that he has recently migrated his database
> from one server to another via a `pg_dump -C` and later piping into psql.
>
>
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-01-16 17:08:38 | Re: Inexplicable duplicate rows with unique constraint |
Previous Message | Justin | 2020-01-16 16:55:18 | Re: Is there a GoTo ? |