From: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
---|---|
To: | peter_e(at)gmx(dot)net, e99re41(at)DoCS(dot)UU(dot)SE |
Cc: | t-ishii(at)sra(dot)co(dot)jp, tgl(at)sss(dot)pgh(dot)pa(dot)us, hackers(at)postgreSQL(dot)org |
Subject: | Re: [HACKERS] Multibyte in autoconf |
Date: | 1999-12-08 14:31:52 |
Message-ID: | 19991208233152I.t-ishii@sra.co.jp |
Lists: | pgsql-hackers |
> > > If no --pgencoding, you get default (non-multibyte) coding even
> > > if you compiled with --enable-mb.
> >
> > Not agreed. I think it would be better to give an error if no default
> > encoding is specified when configured with --enable-mb. Reasons:
> >
> > 1) Users tend to use only one encoding rather than switching between
> > databases with multiple encodings. Thus the user's major encoding should be
> > properly set as the default.
>
> Users also initdb only once, and that is the time to *choose* what they
> want. Then and only then. Once they're done with that they'll never have
> to worry about it again.
>
> > 2) If a non-multibyte encoding such as SQL_ASCII is accidentally set as the
> > default, and a multibyte user creates a database with no encoding
> > argument, the result would be a disaster.
>
> Huh, so if I compile my database with multibyte and then I choose
> to not have a default encoding in template1 but maybe I want to have the
> multibyte option available for some other database later on, that will be
> a disaster? Not so good.
First of all, it's not possible to have no default encoding in
template1. You probably mean that you choose SQL_ASCII (encoding no. 0)
as the default encoding. Anyway, here is an example scenario
of the disaster:
1) initdb is run with no encoding argument (suppose that SQL_ASCII is set as
the default encoding in template1).
2) A user creates a database with no encoding argument, thinking
that the default encoding is EUC_JP.
3) He makes a table, then fills it with some Japanese data.
4) Later he pulls data from the table and finds that it is no longer
Japanese!
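A minimal sketch of how step 4 can go wrong, assuming the backend treats every byte as one character when the database encoding is SQL_ASCII. This is an illustration in Python, not the PostgreSQL code path; truncate_bytes and truncate_chars are hypothetical helpers, and byte-wise truncation is only one example of a byte-oriented operation.

```python
# Illustrative only: models a byte-oriented (SQL_ASCII-style) string operation
# applied to EUC_JP data. Not PostgreSQL code; both helpers are hypothetical.

def truncate_bytes(raw: bytes, n: int) -> bytes:
    """Truncate as a single-byte backend would: one byte counts as one character."""
    return raw[:n]

def truncate_chars(raw: bytes, n: int, encoding: str) -> bytes:
    """Truncate as an encoding-aware backend would: cut on character boundaries."""
    return raw.decode(encoding)[:n].encode(encoding)

text = "日本語"                   # three characters
stored = text.encode("euc_jp")    # six bytes, two per character in EUC_JP

naive = truncate_bytes(stored, 3)            # cuts the second character in half
aware = truncate_chars(stored, 3, "euc_jp")  # keeps all three characters intact

print(len(text), len(stored))
print(naive.decode("euc_jp", errors="replace"))  # second character is damaged
print(aware.decode("euc_jp"))
```

The byte-wise result is no longer valid EUC_JP, which is the kind of silent damage the scenario describes: nothing fails when the data is stored, and the problem only shows up when the data is read back.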
> What I'm also thinking of is the package maintainer. They should be
> able to provide a "neutral" yet multibyte (and locale, and cyrillic)
> enabled package, and one should be able to use that even if one doesn't
> want to use the multibyte features right now or at all.
So you think a postgres package with the multibyte/locale/cyrillic options
enabled is a good thing for everyone? I, at least, don't like the locale
option. It is not only useless for multibyte languages such as
Japanese, it also slows down text comparison. I wouldn't say locale
is useless for everyone, however. I admit it is useful for single-byte
encodings.
I think it would be very hard to make a unified, ideal package for
everyone.
> Also, it should not be initdb's job to verify that the encodings are
> correct, supported, etc. The backend should find that out itself. That
> eliminates duplication of the same logic, which the backend can do better
> anyway.
Actually that duplication can be eliminated by using the same
code. I think pg_id command will do the job.
BTW, I don't think the current implementation of multibyte is
complete yet. The next target would be NATIONAL CHARACTER support (not sure
whether it's for 7.0, though). I would also like to find a solution for the
problem with locale I stated above.
--
Tatsuo Ishii