Re: unicode match normal forms

From: goldgraeber-werbetechnik(at)t-online(dot)de
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: unicode match normal forms
Date: 2021-05-18 05:50:10
Message-ID: wolfgang-1210518075009.A027656@linux-tuxedo
Lists: pgsql-general

>> On Monday, May 17, 2021 at 01:27:40 p.m. -0000, hamann(dot)w(at)t-online(dot)de wrote:
>> > Hi,
>> >
>> > in unicode letter ä exists in two versions - linux and windows use a composite whereas macos prefers
>> > the decomposed form. Is there any way to make a semi-exact match that accepts both variants?
>> > This question is not about fulltext but about matching filenames across a network - I wish to avoid two equally-looking
>> > filenames.
>>
>> There is only *one* codepoint for the German letter a Umlaut:
>> LATIN SMALL LETTER A WITH DIAERESIS U+00E4
>>
Hi Matthias,

unfortunately there is also the letter a followed by a combining diaeresis (U+0061 U+0308), and
that is the form macOS uses. The Mac seems to prefer decomposed characters in other contexts as well, so in my
everyday job I used to have fun with product catalogues from a few companies.
Depending on the computer used for adding / editing a product, the relevant field could be
ISO Latin-1, UTF-8 normal (composed), or UTF-8 decomposed.
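
To illustrate the two UTF-8 variants, here is a minimal Python sketch, assuming that normalizing both sides to NFC before comparing is an acceptable definition of a "semi-exact" match. The helper name same_name is made up for this example; unicodedata is part of the Python standard library.

```python
import unicodedata


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "\u00e4"     # LATIN SMALL LETTER A WITH DIAERESIS
decomposed = "a\u0308"  # LATIN SMALL LETTER A + COMBINING DIAERESIS

# A plain comparison sees two different code point sequences.
print(composed == decomposed)

# After NFC normalization both spellings become the same string.
print(same_name(composed, decomposed))
```

A field that arrives as ISO Latin-1 bytes would first have to be decoded to text before such a comparison makes sense.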

>> Said that, having such chars (non ASCII) in file names, I count as a bad
>> idea.
I usually try to avoid whitespace and accented characters in filenames, to be able to use ssh and scp
without much hassle, but I am not the user in this case.

Now, if I look at a music collection (stored as folders with mp3 files for the tracks), I would really prefer
"Einstürzende Neubauten" over "Einstuerzende_Neubauten".

Regards
Wolfgang
