Re: [HACKERS] Implications of multi-byte support in a distribution

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Hannu Krosing <hannu(at)trust(dot)ee>, Milan Zamazal <pdm(at)debian(dot)cz>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Implications of multi-byte support in a distribution
Date: 1999-09-03 00:55:17
Message-ID: 199909030055.JAA01575@ext16.sra.co.jp
Lists: pgsql-hackers

> > > Each encoding/character set can behave however you want. You can reuse
> > > collation and sorting code from another character set, or define a
> > > unique one.
> > Is it really inside one postmaster instance ?
> > If so, then is the character encoding defined at the create table /
> > create index process (maybe even separately for each field ?) or can I
> > specify it when sort'ing ?
>
> Yes, yes, and yes ;)

But we can't avoid calling strcoll() and the other code surrounded
by #ifdef LOCALE, can we? I think what he actually wants is to define
his own collation *and* not to use locale if the column is ASCII only.

> I would propose that we implement the explicit collation features of
> SQL92 using implicit type conversion. So if you want to use a
> different sorting order on a *compatible* character set, then (looking
> up in Date and Darwen for the syntax...):
>
> 'test string' COLLATE CASE_INSENSITIVITY
>
> becomes internally
>
> case_insensitivity('test string'::text)
>
> and
>
> c1 < c2 COLLATE CASE_INSENSITIVITY
>
> becomes
>
> case_insensitivity(c1) < case_insensitivity(c2)

This idea seems great and elegant. OK, what about throwing away #ifdef
LOCALE? The same thing can be obtained by defining a special collation,
LOCALE_AWARE. This seems much more consistent to me. Or even better,
we could explicitly have a predefined COLLATION for each language (these
could be generated automatically from existing locale data). This would
avoid some platform-specific locale problems.
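
To make the rewrite concrete, here is a minimal sketch, in Python only
for brevity and not backend code. It treats each collation as a
function that maps a string to a sort key, so that
"c1 < c2 COLLATE X" becomes "x(c1) < x(c2)". The names COLLATIONS,
binary, locale_aware and less_than are hypothetical, made up for this
illustration; only case_insensitivity and LOCALE_AWARE come from the
discussion above.

```python
import locale


def binary(s: str) -> str:
    """Plain code-point ordering: the key is the string itself."""
    return s


def case_insensitivity(s: str) -> str:
    """Hypothetical CASE_INSENSITIVITY collation: compare case-folded text."""
    return s.casefold()


def locale_aware(s: str) -> str:
    """Hypothetical LOCALE_AWARE collation.

    locale.strxfrm() returns a key whose ordinary comparison matches
    strcoll() under the process's current LC_COLLATE setting.
    """
    return locale.strxfrm(s)


# Assumed registry of predefined collations, looked up by name.
COLLATIONS = {
    "BINARY": binary,
    "CASE_INSENSITIVITY": case_insensitivity,
    "LOCALE_AWARE": locale_aware,
}


def less_than(c1: str, c2: str, collate: str = "BINARY") -> bool:
    """Evaluate `c1 < c2 COLLATE <collate>` as key(c1) < key(c2)."""
    key = COLLATIONS[collate]
    return key(c1) < key(c2)
```

With this shape, less_than("apple", "Banana") is false under BINARY
but true under CASE_INSENSITIVITY, and an ASCII-only column can simply
use BINARY and never touch the locale code at all.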
---
Tatsuo Ishii
