Re: Inexplicable duplicate rows with unique constraint

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Richard van der Hoff" <richard(at)matrix(dot)org>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Inexplicable duplicate rows with unique constraint
Date: 2020-01-16 17:24:48
Message-ID: f2f4c44f-5673-496b-b42b-cf63ba483a5e@manitou-mail.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Richard van der Hoff wrote:

> synapse=> select count(*), room_id, type, state_key from
> current_state_events group by 2,3,4 order by count(*) DESC LIMIT 2;
> count | room_id | type | state_key
> -------+-----------------------------------+---------------+-------------------------------------
> 3 | !ueLfVrSWYGMnFnoCbT:darkfasel.net | m.room.member |
> @irc_ebi_:darkfasel.net
> 3 | !HwocBmCtBcHQhILtYQ:matrix.org | m.room.member |
> @freenode_AlmarShenwan_:matrix.org

Looking at these columns which are of type text but do not
contain words of any particular language, there's probably
no point in using a linguistic-aware collation for them.

If you maintain the database schema, what you could do to avoid
the dependency on the OS collation and stay clear of the particular
upgrade difficulty of collations is to use COLLATE "C" for this kind of
field, as opposed to the default collation of the database.
As a bonus, operations with the "C" collations tend to be faster,
sometimes even much faster.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2020-01-16 17:27:25 Re: Inexplicable duplicate rows with unique constraint
Previous Message Richard van der Hoff 2020-01-16 17:24:02 Re: Inexplicable duplicate rows with unique constraint