Re: Bug in UTF8-Validation Code?

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Michael Fuhr <mike(at)fuhr(dot)org>, Mario Weilguni <mweilguni(at)sime(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bug in UTF8-Validation Code?
Date: 2007-03-17 16:23:07
Message-ID: 45FC15EB.8030508@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> Last year Jeff suggested adding something like:
>> pg_verifymbstr(string,strlen(string),0);
>> to each relevant input routine. Would that be an acceptable solution?
>>
>
> The problem with that is that it duplicates effort: in many cases
> (especially COPY IN) the data's already been validated. I'm not sure
> how to fix that, but I think you'll get some push-back if you double
> the encoding verification work in COPY for nothing.
>
> Given that we are moving away from backslash-enabled literals, I'm
> not as convinced as some that this must be fixed...
>
>
>
>

They will still be available in E'\nn' form, won't they?

One thought I had was that it might make sense to have a flag that
inhibits the check, which routines that already validate for themselves,
such as COPY IN, could set (and reset). Then bulk load performance
should not suffer much.
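
To make the idea concrete, here is a rough standalone sketch of the flag. It is not backend code: skip_encoding_check, utf8_is_valid(), text_input() and bulk_load_line() are all hypothetical names invented for illustration, and the validator is a simplified structural check (lead byte plus continuation bytes only; it does not reject surrogates or every overlong form), not pg_verifymbstr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag: true while a caller has already validated its data. */
static bool skip_encoding_check = false;

/* Counts full validation passes, only so the effect can be observed. */
static int validation_count = 0;

/*
 * Simplified structural UTF-8 check (assumption: stands in for the real
 * verifier). Checks lead bytes and continuation bytes only.
 */
static bool
utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t need;    /* number of continuation bytes expected */

        if (c < 0x80)
            need = 0;
        else if (c >= 0xC2 && c <= 0xDF)
            need = 1;
        else if (c >= 0xE0 && c <= 0xEF)
            need = 2;
        else if (c >= 0xF0 && c <= 0xF4)
            need = 3;
        else
            return false;       /* stray continuation or invalid lead byte */

        if (need > len - i - 1)
            return false;       /* sequence truncated at end of input */

        for (size_t k = 1; k <= need; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */

        i += need + 1;
    }
    return true;
}

/* Stand-in for a datatype input routine that verifies its argument. */
static bool
text_input(const char *str)
{
    assert(str != NULL);

    if (skip_encoding_check)
        return true;            /* caller vouches for the encoding */

    validation_count++;
    return utf8_is_valid((const unsigned char *) str, strlen(str));
}

/*
 * Stand-in for a bulk loader: validates the whole line once, then sets
 * the flag around the per-field input calls and restores it afterwards.
 */
static bool
bulk_load_line(const char *line, const char *const *fields, size_t nfields)
{
    bool saved = skip_encoding_check;
    bool ok = true;

    validation_count++;
    if (!utf8_is_valid((const unsigned char *) line, strlen(line)))
        return false;

    skip_encoding_check = true;
    for (size_t i = 0; i < nfields; i++)
        ok = text_input(fields[i]) && ok;
    skip_encoding_check = saved;    /* reset, as the routine set it */

    return ok;
}
```

With this shape a line of N fields costs one validation pass instead of N + 1. In the backend the reset would presumably also have to happen on error exit, since an error raised inside an input routine would otherwise leave the flag set.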

cheers

andrew
