Re: once again, sorting with Unicode

From:	"Troy" <tjk(at)tksoft(dot)com>
To:	antti(dot)haapala(at)iki(dot)fi (Antti Haapala)
Cc:	tjk(at)tksoft(dot)com (Troy K(dot)), postgre(at)totw(dot)org (JBJ), pgsql-sql(at)postgresql(dot)org
Subject:	Re: once again, sorting with Unicode
Date:	2003-02-20 10:51:28
Message-ID:	200302201051.h1KApSSN018184@tksoft.com
Lists:	pgsql-sql

You are right, of course. I was thinking in terms of the encoded
data. Applications usually get data in UTF-8 or UTF-16. If the
input data is true Unicode, then there is no difference in
the byte values (just skip the 0x00 bytes).
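To make the distinction between numeric values and byte patterns concrete, here is a minimal Python sketch. It assumes a sample string whose characters all fall in the ISO8859-1 range (U+0000 to U+00FF) and uses big-endian UTF-16 without a byte order mark; the variable names are purely illustrative.

```python
# Sample text, assumed to contain only characters up to U+00FF.
text = "äöü"

code_points = [ord(c) for c in text]

latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")  # big-endian, no BOM

# ISO8859-1 byte values are identical to the Unicode code points.
latin1_matches_code_points = list(latin1_bytes) == code_points

# In UTF-16BE every such character is a 0x00 byte followed by the
# ISO8859-1 byte, so the low bytes alone reproduce the ISO8859-1 data.
utf16_high_bytes = utf16_bytes[0::2]
utf16_low_bytes = utf16_bytes[1::2]

# UTF-8 uses two bytes for each of these characters, so its byte
# pattern differs from ISO8859-1 even though the code points agree.
utf8_differs = utf8_bytes != latin1_bytes

print(code_points)
print(latin1_bytes.hex(), utf16_bytes.hex(), utf8_bytes.hex())
```

The 0x00-skipping shortcut holds only for this range and only for UTF-16; it does not apply to UTF-8 or to characters above U+00FF.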

Cheers,

Troy

>
>
> On Wed, 19 Feb 2003, Troy wrote:
>
> > > I have a multi-lingual database (currently 11 languages) which sorts
> > > fine in MySQL (8859-1 character set) I have now converted the data to
> > > Unicode and compiled Postgre with unicode support.
> > >
> > > I can select and insert unicode and so was rather pleased about that.
> > > Until I saw that it wasn't working properly when ordering!
> >
> > The cause for the different values is the fact that unicode characters
> > have different numeric values from ISO8859-1 and other encodings. Only
> > ascii values are in sync with unicode numeric values. This I am sure you
> > knew.
>
> No, ISO8859-1 maps directly to unicode up to U+00FF. So the actual
> _numeric_ values are the same. But actual byte patterns are encoding
> dependent.
>
> Have you set database encoding to UTF-8? Are you using proper UTF-8
> locales? POSIX compiled locales are often charset dependent.
>
> --
> Antti Haapala
>
>
>
>
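On the locale question raised in the quoted text, a small Python sketch can show why ordering may look wrong even when the encoding is set up correctly. It assumes the comparison is done on raw UTF-8 bytes, as happens under a C/POSIX locale; the word list is made up for illustration and is not data from this thread.

```python
# Hypothetical sample words, not taken from the original database.
words = ["zebra", "Äpfel", "apple", "Zulu"]

# Ordering by Unicode code point (Python's default string comparison).
by_code_point = sorted(words)

# Ordering by raw UTF-8 bytes, as a byte-wise comparison would do.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

# UTF-8 preserves code point order, so both orderings agree, but
# neither is a linguistic order: uppercase sorts before lowercase
# and accented letters sort after all of ASCII.
print(by_code_point)
print(by_utf8_bytes)
```

A dictionary-style order needs a collation from a locale whose character set matches the database encoding, which is the point of the question about proper UTF-8 locales.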

In response to

Re: once again, sorting with Unicode at 2003-02-19 12:37:27 from Antti Haapala
