From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
Cc: | "Peter Eisentraut" <peter(at)eisentraut(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr> |
Subject: | Re: Does UCS_BASIC have the right CTYPE? |
Date: | 2023-10-26 21:32:14 |
Message-ID: | 1401159.1698355934@sss.pgh.pa.us |
Lists: | pgsql-hackers |
"Daniel Verite" <daniel(at)manitou-mail(dot)org> writes:
> To me the question of what we should put in pg_collation.collctype
> for the "ucs_basic" collation leads to another question which is:
> why do we even consider collctype in the first place?
For starters, the C locale should certainly act differently from the others.
I'm not sold that arguing from Unicode's behavior to other encodings
makes sense, either. Unicode can get away with defining that there's
only one case-folding rule because they have the luxury of inventing
new code points when the "same" glyph should act differently according
to different languages' rules. Encodings with a small number of code
points don't have that luxury. In particular see the mess around dotted
and dotless I/J in Turkish vs. everywhere else.
regards, tom lane
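
To illustrate the dotted/dotless I problem mentioned above, here is a minimal Python sketch. It assumes Python 3's built-in str.lower(), which applies Unicode's default, locale-independent case mapping. The helper turkish_lower is hypothetical and written only for this illustration; it is not a PostgreSQL or Python API, and it handles only the two letters in question.

```python
# Unicode gives Turkish its own code points for the two extra letters.
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def turkish_lower(s: str) -> str:
    """Hypothetical helper: lowercase with Turkish rules for I and dotted I.

    Plain 'I' maps to dotless small i, dotted capital I maps to plain 'i'.
    Everything else falls through to the default mapping.
    """
    s = s.replace("I", DOTLESS_SMALL_I)
    s = s.replace(DOTTED_CAPITAL_I, "i")
    return s.lower()


# Default mapping: the same input 'I' always lowercases to 'i'.
default_result = "I".lower()

# Turkish mapping: the same input 'I' lowercases to dotless i instead.
turkish_result = turkish_lower("I")

# The default mapping of dotted capital I is two code points:
# 'i' followed by U+0307 COMBINING DOT ABOVE.
default_dotted = DOTTED_CAPITAL_I.lower()

# In ISO-8859-9 the two Turkish letters are single bytes, and plain I/i
# keep their ASCII positions, so the byte 0x49 alone cannot tell a
# case-mapping routine which rule applies.
latin5_dotted = b"\xdd".decode("iso8859_9")
latin5_dotless = b"\xfd".decode("iso8859_9")
```

The point of the sketch is only that the lowercase of "I" depends on which language's rules are in force, so a single case-mapping table cannot serve both.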