Re: String comparison problem in select - too many results

From: Durumdara <durumdara(at)gmail(dot)com>
To:
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: String comparison problem in select - too many results
Date: 2018-01-11 12:23:47
Message-ID: CAEcMXhkB5=PLaBbyZWipZ8QR682ASYMo_ZfHbrme0toTBVfvDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Dear Francesco!

My "bug" is that I commonly used Windows environment where the default
collation is ANSI, based on Windows language which is Hungarian here
(Windows1250).
But because of special characters we used UTF8 to store data in database.
I supposed that UTF8.hu_HU is working like local natural order here, and
the common ASCII chars are (like '/') in same position.
Python/Delphi/LibreOffice can sort these numbers correctly (based on local
ANSI sort).

I supposed that UTF8.hu_HU is using mostly same order which is valid here
and it contains all ASCII + ANSI characters we are using here in daily
work, and they are in very similar order.
I never thought that it can't handle normal characters in 7 bit range...

For these numbers I can use C collation, it's ok.

Thank you!

dd

2018-01-11 11:11 GMT+01:00 Francisco Olarte <folarte(at)peoplecall(dot)com>:

> On Thu, Jan 11, 2018 at 9:57 AM, Durumdara <durumdara(at)gmail(dot)com> wrote:
> > I tried in in different servers, different databases.
> > 1.) Windows local PG: LC_COLLATE = 'Hungarian_Hungary.1250' - ok.
> > 2.) Linux remote PG: LC_CTYPE = 'en_US.UTF-8' - also wrong!!! - UTF
> > problem???
>
> Your problem seems to be you consider wrong anything that doesn't
> match your expectation.
>
> > 3.) Forcing C collation: - ok
> > 4.) Replace '/' to 'A': - ok
>
> ... More examples zapped, as they do not prove anything.
>
>
> > The main problem that we have many searches in programme where we suppose
> > good evaluation, and we have more sites with different servers (and
> > collation).
>
> You must define good evaluation. In your case it seems you consider
> good evaluation is lexicographical comparison of ascii char values.
> This is called 'C' collation and you have been told to it.
>
> If your "programme" is doing string comparison in server collation and
> you need good comparison, defined as C collation, that is a bug. Fix
> it.
>
> I would recommend reading
> https://www.postgresql.org/docs/9.6/static/collation.html and related
> docs, but someone who so boldly states postgres collations are
> good/bad surely knows all about it.
>
>
> Try to build from this:
>
>
> with xx(x) as (values ('18/0113'),('18/0212'),('180/2010'))
> select x collate "C" from xx order by 1;
>
> Francisco Olarte.
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andreas Joseph Krogh 2018-01-11 12:51:21 Sv: ORDER BY custom type
Previous Message Andreas Joseph Krogh 2018-01-11 11:00:55 ORDER BY custom type