Re: Fix XML handling with DOCTYPE

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Chapman Flack <chap(at)anastigmatix(dot)net>
Cc: Ryan Lambert <ryan(at)rustprooflabs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fix XML handling with DOCTYPE
Date: 2019-03-17 15:45:34
Message-ID: 5146.1552837534@sss.pgh.pa.us
Lists: pgsql-hackers

Chapman Flack <chap(at)anastigmatix(dot)net> writes:
> On 03/16/19 17:21, Tom Lane wrote:
>> Hm, so, maybe just
>>
>> (1) always try to parse as document. If successful, we're done.
>>
>> (2) otherwise, if allowed by xmloption, try to parse using our
>> current logic for the CONTENT case.

> What I don't like about that is that (a) the input could be
> arbitrarily long and complex to parse (not that you can't imagine
> a database populated with lots of short little XML snippets, but
> at the same time, a query could quite plausibly deal in yooge ones),
> and (b), step (1) could fail at the last byte of the input, followed
> by total reparsing as (2).

That doesn't seem particularly likely to me: based on what's been
said here, I'd expect parsing with the wrong expectation to usually
fail near the start of the input. In any case, the other patch
also requires repeat parsing, no? It's just doing that in a different
set of cases.
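
For illustration only, the two-step fallback quoted above might be sketched as follows. This is Python using the standard library's ElementTree, not libxml2 and not the code in PostgreSQL's xml.c. The names parse_xml_value and allow_content are hypothetical, and wrapping the input in a dummy root element is only an assumed stand-in for "our current logic for the CONTENT case".

```python
import xml.etree.ElementTree as ET


def parse_xml_value(text, allow_content=True):
    """Parse text as a document first, then fall back to content.

    Returns ("document", root) or ("content", wrapper).
    Raises ET.ParseError if neither attempt succeeds.
    allow_content is a hypothetical flag standing in for xmloption.
    """
    # Step (1): always try to parse as a document.
    try:
        return "document", ET.fromstring(text)
    except ET.ParseError:
        # A pure DOCUMENT setting has no fallback, so report the failure.
        if not allow_content:
            raise

    # Step (2): assumed stand-in for CONTENT parsing. Wrapping the input
    # in a dummy element lets text nodes and several top-level elements
    # parse; a second failure here propagates to the caller.
    return "content", ET.fromstring("<wrapper>" + text + "</wrapper>")


if __name__ == "__main__":
    for sample in ("<!DOCTYPE a><a/>", "<a/><b/>", "plain text"):
        kind, _ = parse_xml_value(sample)
        print(repr(sample), "parsed as", kind)
```

In this sketch an input with a DOCTYPE never reaches step (2), so the fallback only has to handle inputs that are not well-formed documents. The cost is that such inputs are parsed twice, which is the repeat parsing discussed above.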

The reason I'm pressing you for a simpler patch is that dump/reload
failures are pretty bad, so ideally we'd find a fix that we're
comfortable with back-patching into the released branches.
Personally I would never dare to back-patch the proposed patch:
it's too complex, so it's not real clear that it doesn't have unwanted
side-effects, and it's not at all certain that there aren't libxml
version dependencies in it. (Maybe another committer with more
familiarity with libxml would evaluate the risks differently, but
I doubt it.) But I think that something close to what I sketched
above would pass muster as safe-to-backpatch.

regards, tom lane
