Re: Unicode + LC_COLLATE

From:	"Priem, Alexander" <ap(at)cict(dot)nl>
To:	'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Unicode + LC_COLLATE
Date:	2004-04-22 14:37:17
Message-ID:	2A07EC2D0BC2774AAD6F74769F60D52A08330B@ahmose.cict_ad.nl
Lists:	pgsql-general

> C locale basically means "sort by the byte sequence values". It'll do
> something self-consistent, but maybe not what you'd like for UTF8
> characters.
>
> Does that sort rationally at all? I should think you'd need to specify
> an LC_COLLATE setting that's designed for UTF8 encoding, not 8859-15.
>
> If you only ever store characters that are in 7-bit ASCII then none of
> this will affect you, and you can get away with broken combinations of
> encoding and locale. But if you'd like to sort characters outside the
> minimal ASCII set then you need to get it right ...
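
To make the quoted "sort by the byte sequence values" point concrete, here is a minimal Python sketch. It is not PostgreSQL code: it only emulates a C-locale comparison by sorting on the UTF-8 encoded bytes, and the word list is made up for illustration.

```python
# Sketch: emulate C-locale collation by comparing UTF-8 byte sequences.
# The sample words are hypothetical, chosen only for illustration.

def c_locale_sort(strings):
    """Sort strings by their UTF-8 encoded bytes, as a bytewise comparison would."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))

# "\u00e9" is the precomposed e-acute, which UTF-8 encodes as bytes 0xC3 0xA9.
words = ["zebra", "Zebra", "apple", "\u00e9clair", "eclair"]
byte_sorted = c_locale_sort(words)
print(byte_sorted)
```

The result is self-consistent but not dictionary order: all uppercase ASCII letters come before all lowercase ones, and any accented character sorts after every ASCII letter because its UTF-8 lead byte is 0x80 or higher.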

But if you use any locale other than C, you can't use indexes for LIKE
clauses, right?
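
As a rough illustration of why bytewise (C) ordering and prefix LIKE go together, the Python sketch below treats a sorted list of byte strings as a stand-in for an index and answers a `LIKE 'abc%'`-style query with a range scan. This is an assumption-laden toy, not how PostgreSQL implements it; the helper names `prefix_upper_bound` and `prefix_scan` and the sample keys are hypothetical.

```python
import bisect

def prefix_upper_bound(prefix: bytes):
    """Return the smallest byte string greater than every string that
    starts with `prefix`, or None if no such bound exists."""
    p = bytearray(prefix)
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    while p and p[-1] == 0xFF:
        p.pop()
    if not p:
        return None
    p[-1] += 1
    return bytes(p)

def prefix_scan(sorted_keys, prefix: bytes):
    """Return all keys starting with `prefix` using two binary searches.
    This only works because `sorted_keys` is in bytewise order."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    upper = prefix_upper_bound(prefix)
    hi = len(sorted_keys) if upper is None else bisect.bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]

# Hypothetical sample data, stored as UTF-8 bytes in bytewise order.
keys = sorted(s.encode("utf-8") for s in ["abd", "abc", "abcde", "ab", "b", "abc\u00e9"])
matches = [k.decode("utf-8") for k in prefix_scan(keys, b"abc")]
print(matches)
```

In bytewise order, every key sharing a prefix sits in one contiguous range, so the match reduces to `key >= 'abc' AND key < 'abd'`. A language-aware collation does not guarantee that contiguity, which is the concern behind the question above.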

Would LC_COLLATE=C be a bad choice in combination with UNICODE encoding?
Which LC_COLLATE setting would you recommend for UNICODE encoding to get
good sorting for most common languages (Dutch, English, French, German)?

Alexander Priem

Responses

Re: Unicode + LC_COLLATE at 2004-04-23 06:32:56 from John Sidney-Woollett
Re: Unicode + LC_COLLATE at 2004-04-23 10:06:12 from Peter Eisentraut
