From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: XML with invalid chars |
Date: | 2011-05-08 22:25:27 |
Message-ID: | 4DC71857.5070902@dunslane.net |
Lists: | pgsql-hackers |
On 04/27/2011 11:41 PM, Noah Misch wrote:
> On Wed, Apr 27, 2011 at 11:22:37PM -0400, Andrew Dunstan wrote:
>> On 04/27/2011 05:30 PM, Noah Misch wrote:
>>> To make things worse, the dump/reload problem seems to depend on your version
>>> of libxml2, or something. With git master, a CentOS 5 system with
>>> 2.6.26-2.1.2.8.el5_5.1 accepts the ^A byte, but an Ubuntu 8.04 LTS system with
>>> 2.6.31.dfsg-2ubuntu rejects it. Even with a patch like this, systems with a
>>> lenient libxml2 will be liable to store XML data that won't restore on a system
>>> with a strict libxml2. Perhaps we should emit a build-time warning if the local
>>> libxml2 is lenient?
>> No, I think we need to be strict ourselves.
> Then I suppose we'd also scan for invalid characters in xml_parse()? Or, at
> least, do so when linked to a libxml2 that neglects to do so itself?
Yep.
>>> Injecting the check here aids "xmlelement" and "xmlforest", but "xmlcomment"
>>> and "xmlpi" still let the invalid byte through. You can also still inject the
>>> byte into an attribute value via "xmlelement". I wonder if it wouldn't make
>>> more sense to just pass any XML that we generate from scratch through libxml2.
>>> There are a lot of holes to plug, otherwise.
>> Maybe there are, but I'd want lots of convincing that we should do that
>> at this stage. Maybe for 9.2. I think we can plug the holes fairly
>> simply for xmlpi and xmlcomment, and catch the attributes by moving this
>> check up into map_sql_value_to_xml_value().
> I don't have much convincing to offer -- hunting down the holes seems fine, too.
>
>
I think I've done that. Here's the patch I have now. It looks like we
can catch pretty much everything by putting checks in four places, which
isn't too bad.
Please review and try to break.
cheers
andrew
Attachment | Content-Type | Size |
---|---|---|
xmlchars2.patch | text/x-patch | 2.0 KB |
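
For readers without the attachment, here is a minimal sketch of the kind of strictness check discussed above. It is not the contents of xmlchars2.patch. It assumes UTF-8 input and tests each code point against the Char production of XML 1.0, which is what excludes the ^A byte mentioned in the thread. The names is_valid_xml_char() and xml_text_is_valid() are hypothetical and do not come from the PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * XML 1.0 "Char" production:
 *   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 */
static bool
is_valid_xml_char(uint32_t c)
{
    return c == 0x9 || c == 0xA || c == 0xD ||
        (c >= 0x20 && c <= 0xD7FF) ||
        (c >= 0xE000 && c <= 0xFFFD) ||
        (c >= 0x10000 && c <= 0x10FFFF);
}

/*
 * Hypothetical helper: return true if every code point in the UTF-8
 * buffer s[0..len) is a legal XML 1.0 character.  Truncated or
 * malformed sequences are rejected.  Overlong encodings are not
 * detected here; a real implementation would rely on the server's
 * own encoding verification for that.
 */
static bool
xml_text_is_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    assert(s != NULL || len == 0);

    while (i < len)
    {
        uint32_t    c = s[i];
        size_t      n;          /* number of continuation bytes */

        if (c < 0x80)
            n = 0;
        else if ((c & 0xE0) == 0xC0)
        {
            c &= 0x1F;
            n = 1;
        }
        else if ((c & 0xF0) == 0xE0)
        {
            c &= 0x0F;
            n = 2;
        }
        else if ((c & 0xF8) == 0xF0)
        {
            c &= 0x07;
            n = 3;
        }
        else
            return false;       /* stray continuation or invalid lead byte */

        if (n > len - i - 1)
            return false;       /* sequence runs past end of buffer */

        for (size_t k = 1; k <= n; k++)
        {
            unsigned char b = s[i + k];

            if ((b & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            c = (c << 6) | (b & 0x3F);
        }

        if (!is_valid_xml_char(c))
            return false;

        i += n + 1;
    }
    return true;
}
```

A check of this shape could be called at each point where user-supplied text enters generated XML (element content, attribute values, comments, processing instructions), so the result does not depend on how lenient the local libxml2 happens to be.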