From: | Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Encoding/collation question |
Date: | 2019-12-12 09:37:13 |
Message-ID: | 20191212093713.GA3164@hermes.hilbert.loc |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Dec 12, 2019 at 05:03:59AM +0000, Andrew Gierth wrote:
> Rich> I doubt that my use will notice meaningful differences. Since
> Rich> there are only two or three databases in UTF8 and its collation
> Rich> perhaps I'll convert those to LATIN1 and C.
>
> Note that it's perfectly fine to use UTF8 encoding and C collation (this
> has the effect of sorting strings in Unicode codepoint order); this is
> as fast for comparisons as LATIN1/C is.
>
> For those cases where you need data to be sorted in a
> culturally-meaningful order rather than in codepoint order, you can set
> collations on specific columns or in individual queries.
Nice, thanks for pointing that out. One addition: while this
may seem like "the" magic bullet it should be noted that one
will need additional indexes for culturally-meaningful ORDER
BY sorts to be fast (while having a default non-C collation
one will get a by-default culturally-meaningful index for
that one non-C locale).
Question: is C collation expected to be future-proof /
rock-solid /stable -- like UTF8 for encoding choice -- or
could it end up like the SQL-ASCII encoding did: Yeah, we
support it, it's been in use a long time, it should work,
but, nah, one doesn't really want to choose it over UTF8 if
at all possible, or at least know *exactly* what one is doing
and BTW YMMV ?
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2019-12-12 13:35:53 | Re: Encoding/collation question |
Previous Message | Deepti Sharma S | 2019-12-12 05:34:12 | RE: PostgreSQL version compatibility with RHEL7.7 |