From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | pgsql(at)j-davis(dot)com |
Cc: | ishii(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, andrew(at)dunslane(dot)net, laurenz(dot)albe(at)wien(dot)gv(dot)at, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: invalidly encoded strings |
Date: | 2007-09-11 07:17:06 |
Message-ID: | 20070911.161706.26986487.t-ishii@sraoss.co.jp |
Lists: | pgsql-hackers pgsql-patches |
> On Tue, 2007-09-11 at 14:50 +0900, Tatsuo Ishii wrote:
> >
> > > On Tue, 2007-09-11 at 12:29 +0900, Tatsuo Ishii wrote:
> > > > Please show me concrete examples how I could introduce a
> > > > vulnerability using this kind of convert() usage.
> > >
> > > Try the sequence below. Then, try to dump and then reload the
> > > database. When you try to reload it, you will get an error:
> > >
> > > ERROR: invalid byte sequence for encoding "UTF8": 0xbd
> >
> > I know this could be a problem (like chr() with invalid byte pattern).
> > What I really want to know is, read query something like this:
> >
> > SELECT * FROM japanese_table ORDER BY convert(japanese_text using
> > utf8_to_euc_jp);
>
> I guess I don't quite understand the question.
>
> I agree that ORDER BY convert() must be safe in the C locale, because it
> just passes the strings to strcmp().
>
> Are you saying that we should not remove convert() until we can support
> multiple locales in one database?
>
> If we make convert() operate on bytea and return bytea, as Tom
> suggested, would that solve your use case?
The problem is that the use case above is only one of several I can
think of. Another use case is something like this:

SELECT sum(octet_length(convert(text_column using utf8_to_euc_jp))) FROM mytable;

to find the total byte length of a text column as if it were encoded in
EUC_JP.

So I'm not sure we could change convert() to return bytea without
complaints from users...
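
For illustration only, here is a rough sketch outside the database of what that query computes, and of why the converted bytes are a problem when they are treated as UTF8 text. It is written in Python rather than SQL, and the function names total_euc_jp_octets and is_valid_utf8 and the sample rows are hypothetical, not anything in PostgreSQL:

```python
def total_euc_jp_octets(values):
    """Sum the byte length each string would have if encoded as EUC_JP.

    Rough analogue of sum(octet_length(convert(... using utf8_to_euc_jp))).
    """
    return sum(len(v.encode("euc_jp")) for v in values)


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample data standing in for text_column.
rows = ["日本語", "abc"]

# Total size the column would occupy in EUC_JP.
euc_jp_total = total_euc_jp_octets(rows)

# The converted bytes are generally not valid UTF-8, which is the kind of
# value that makes a dump fail to reload in a UTF8 database.
converted_is_utf8 = is_valid_utf8(rows[0].encode("euc_jp"))
```

With these sample rows the EUC_JP total is smaller than the UTF-8 total, since each of the three kanji takes two bytes in EUC_JP and three in UTF-8, and the converted bytes do not pass the UTF-8 check.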
--
Tatsuo Ishii
SRA OSS, Inc. Japan