From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | ishii(at)postgresql(dot)org, andrew(at)dunslane(dot)net, laurenz(dot)albe(at)wien(dot)gv(dot)at, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: invalidly encoded strings |
Date: | 2007-09-11 03:29:36 |
Message-ID: | 20070911.122936.68059988.t-ishii@sraoss.co.jp |
Lists: | pgsql-hackers pgsql-patches |
> Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> >> BTW, it strikes me that there is another hole that we need to plug in
> >> this area, and that's the convert() function. Being able to create
> >> a value of type text that is not in the database encoding is simply
> >> broken. Perhaps we could make it work on bytea instead (providing
> >> a cast from text to bytea but not vice versa), or maybe we should just
> >> forbid the whole thing if the database encoding isn't SQL_ASCII.
>
> > Please don't do that. It will break a useful use case of convert().
>
> The reason we have a problem here is that we've been choosing
> convenience over safety in encoding-related issues. I wonder if we must
> stoop to having a "strict_encoding_checks" GUC variable to satisfy
> everyone.
Please show me concrete examples of how I could introduce a vulnerability
using this kind of convert() usage.
> > A user has a database encoded in UTF-8. He has English, French,
> > Chinese and Japanese data in tables. To sort the tables in the
> > language order, he would do something like this:
>
> > SELECT * FROM japanese_table ORDER BY convert(japanese_text using utf8_to_euc_jp);
>
> > Without using convert(), he will get a random order of data.
>
> I'd say that *with* convert() he will get a random order of data. This
> is making a boatload of unsupportable assumptions about the locale and
> encoding of the surrounding database. There are a lot of bad-encoding
> situations for which strcoll() simply breaks down completely and can't
> even deliver self-consistent answers.
>
> It might work the way you are expecting if the database uses SQL_ASCII
> encoding and C locale --- and I'd be fine with allowing convert() only
> when the database encoding is SQL_ASCII.
I don't believe that. With the C locale, convert() works fine as I
described.
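
To illustrate the point (this sketch is not part of the original exchange): under the C locale, string comparison is bytewise, so ordering by convert(... using utf8_to_euc_jp) amounts to sorting on the EUC-JP byte sequence instead of the UTF-8 one. The Python below is only a rough model of that behaviour. It assumes Python's built-in euc_jp codec as a stand-in for the utf8_to_euc_jp conversion and plain byte comparison as a stand-in for C-locale collation; the function names are hypothetical.

```python
# Sketch: compare the order produced by UTF-8 bytes with the order
# produced by EUC-JP bytes, modelling C-locale (bytewise) comparison.

def sort_by_utf8_bytes(words):
    """Order strings as a bytewise comparison of their UTF-8 form would."""
    return sorted(words, key=lambda w: w.encode("utf-8"))

def sort_by_euc_jp_bytes(words):
    """Order strings as a bytewise comparison of their EUC-JP form would.

    Assumption: Python's 'euc_jp' codec stands in for utf8_to_euc_jp.
    """
    return sorted(words, key=lambda w: w.encode("euc_jp"))

if __name__ == "__main__":
    # U+4E00 is the first CJK unified ideograph in Unicode, while
    # U+4E9C is the first kanji in JIS X 0208, so the two orders differ.
    words = ["\u4e00", "\u4e9c"]
    print("UTF-8 byte order: ", sort_by_utf8_bytes(words))
    print("EUC-JP byte order:", sort_by_euc_jp_bytes(words))
```

The UTF-8 order follows Unicode code points, while the EUC-JP order follows the JIS X 0208 arrangement, which is the "language order" the example query is after. Whether this carries over to locales other than C is exactly the point in dispute above.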
--
Tatsuo Ishii
SRA OSS, Inc. Japan