From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Mark Dilger <pgsql(at)markdilger(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us> |
Subject: | Re: Bug in UTF8-Validation Code? |
Date: | 2007-06-13 23:35:55 |
Message-ID: | 46707F5B.7080802@dunslane.net |
Lists: | pgsql-hackers |
What is the state of play with this item? I think this is a must-fix bug
for 8.3. There was a flurry of messages back in April but since then I
don't recall seeing anything.
cheers
andrew
Mark Dilger wrote:
> Mark Dilger wrote:
>> Bruce Momjian wrote:
>>> Added to TODO:
>>>
>>> * Fix cases where invalid byte encodings are accepted by the database,
>>> but throw an error on SELECT
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2007-03/msg00767.php
>>>
>>> Is anyone working on fixing this bug?
>>
>> Hi, has anyone volunteered to fix this bug? I did not see any reply
>> on the mailing list to your question above.
>>
>> mark
>
> OK, I can take a stab at fixing this. I'd like to state some
> assumptions so people can comment and reply:
>
> I assume that I need to fix *all* cases where invalid byte encodings
> get into the database through functions shipped in the core distribution.
>
> I assume I do not need to worry about people getting bad data into the
> system through their own database extensions.
>
> I assume that the COPY problem discussed up-thread goes away once you
> eliminate all the paths by which bad data can get into the system.
> However, existing database installations with bad data already loaded
> will not be magically fixed with these code patches.
>
> Do any of the string functions (see
> http://www.postgresql.org/docs/8.2/interactive/functions-string.html)
> run the risk of generating invalid utf8 encoded strings? Do I need to
> add checks? Are there known bugs with these functions in this regard?
>
> If not, I assume I can add mbverify calls to the various input
> routines (textin, varcharin, etc) where invalid utf8 could otherwise
> enter the system.
>
> I assume that this work can be limited to HEAD and that I don't need
> to back-patch it. (I suspect this assumption is a contentious one.)
>
> Advice and comments are welcome,
>
>
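Editorial sketch (not part of the original message): the quoted proposal is to verify strings in the input routines so that invalid UTF-8 never reaches storage. The kind of check involved can be illustrated with the standalone function below. It is a minimal sketch under stated assumptions, not PostgreSQL's own verification code: the name `utf8_is_valid` is hypothetical, and the rules it applies (no overlong forms, no surrogates, nothing above U+10FFFF) follow RFC 3629 rather than anything stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only; not a PostgreSQL function.
 * Returns true if the len bytes at s form well-formed UTF-8 per RFC 3629.
 */
bool
utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t        n;    /* number of continuation bytes expected */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80)
        {
            i++;            /* plain ASCII byte */
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else
            return false;   /* stray continuation byte or invalid lead byte */

        if (len - i <= n)
            return false;   /* sequence is truncated */

        for (size_t k = 1; k <= n; k++)
        {
            unsigned char cc = s[i + k];

            if ((cc & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return false;   /* overlong encoding */
        if (cp > 0x10FFFF)
            return false;   /* beyond the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;   /* UTF-16 surrogate, not a valid scalar value */

        i += n + 1;
    }
    return true;
}
```

An input routine following the approach described in the quoted message would run a check of this kind on the incoming bytes and raise an error instead of storing the value when it fails. That is the point of the TODO item: rejecting bad bytes on the way in, rather than accepting them and failing later on SELECT.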