From: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
---|---|
To: | GENERAL <pgsql-general(at)postgresql(dot)org> |
Subject: | XML - DOCTYPE element - documentation suggestion |
Date: | 2010-06-17 18:43:22 |
Message-ID: | 4C1A6CCA.30203@postnewspapers.com.au |
Lists: | pgsql-general |
Hi all
I've been working with XML storage in Pg and was puzzled by the fact
that Pg appears to refuse to store a document with a DOCTYPE declaration
- it was interpreting it as a regular element and rejecting it.
This turns out to be because Pg parses XML as a fragment (i.e. with XML
option CONTENT, the default) when casting, and XML fragments cannot have
a doctype.
Unfortunately the error is ... unhelpful ... and the documentation
neglects to mention this issue. Hence my post.
I didn't see anything about this in the FAQ or in the docs for the XML
datatype
(http://www.postgresql.org/docs/current/interactive/datatype-xml.html)
and was half-way through writing this post when I found a helpful
message on the list:
http://www.mail-archive.com/pgsql-general(at)postgresql(dot)org/msg119713.html
that hinted the way. Even then it took me a while to figure out that you
can't specify DOCUMENT or CONTENT on the XML type itself, but must
specify it while parsing instead and use a CHECK constraint if you want
to require storage of whole documents in a field.
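As a minimal sketch of the CHECK constraint approach (the table name
test_xml_docs is made up for this example, and the statements are untested
here), the IS DOCUMENT predicate can be used to reject fragments:

```sql
-- Hypothetical table: only whole XML documents may be stored in doc.
CREATE TABLE test_xml_docs (
    doc xml CHECK (doc IS DOCUMENT)
);

-- Should be accepted: a single root element is a well-formed document.
INSERT INTO test_xml_docs (doc)
VALUES (XMLPARSE(DOCUMENT '<test>dummy content</test>'));

-- Should be rejected by the CHECK constraint: two top-level elements
-- parse fine as CONTENT but do not form a document.
INSERT INTO test_xml_docs (doc)
VALUES (XMLPARSE(CONTENT '<a/><b/>'));
```

Note that the constraint only tests the stored value; the DOCUMENT or
CONTENT choice still has to be made when the text is parsed.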
The xml datatype documentation should probably mention that whole
documents must be loaded with XMLPARSE(DOCUMENT 'doc_text_here'); they
cannot just be cast from text to xml as happens when you pass an xml
document as text to a parameter during an INSERT. This should probably
appear under "CREATING XML VALUES" in:
http://www.postgresql.org/docs/current/static/datatype-xml.html
... and probably deserves mention in a new "CAVEATS" or "NOTES" section
too, as it *will* catch people out even if they RTFM.
I'd expect this to work:
CREATE TABLE test_xml ( doc xml );
INSERT INTO test_xml ( doc ) VALUES (
$$<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE test SYSTEM 'test.dtd'><test>dummy content</test>$$
);
... but it fails with:
ERROR: invalid XML content
LINE 2: $$<?xml version="1.0" encoding="utf-8"?>
^
DETAIL: Entity: line 2: parser error : StartTag: invalid element name
<!DOCTYPE test SYSTEM 'test.dtd'><test>dummy content</test>
^
though xmllint (from libxml) is quite happy with the document. This had
me quite confused for a while.
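For comparison, a sketch of two forms that should accept the same document,
assuming the test_xml table created above. The first parses the text
explicitly as a document; the second changes the session's XML option so
that the implicit text-to-xml cast is parsed as a document. No output is
shown because these were not run for this post.

```sql
-- Parse explicitly as a whole document instead of relying on the cast.
INSERT INTO test_xml ( doc ) VALUES (
    XMLPARSE(DOCUMENT $$<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE test SYSTEM 'test.dtd'><test>dummy content</test>$$)
);

-- Alternatively, change how implicit text-to-xml casts are parsed for
-- the current session, then insert the text exactly as before.
SET XML OPTION DOCUMENT;
INSERT INTO test_xml ( doc ) VALUES (
$$<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE test SYSTEM 'test.dtd'><test>dummy content</test>$$
);
```

The session-level setting affects every later cast in that session, so the
explicit XMLPARSE form is probably the safer choice in application code.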
--
Craig Ringer