author Tom Lane <tgl@sss.pgh.pa.us> 2019-03-23 16:24:30 -0400
committer Tom Lane <tgl@sss.pgh.pa.us> 2019-03-23 16:24:30 -0400
commit da45927cef781b5fcdc89b9b1b96c3fb1d83bbd3
tree 66b38a43472fa73b1d305f05781a61c57d55fabc /doc/src
parent 4ae9c4bbb03d0cbc4c8bc7ba01b53f388608c2e6
Accept XML documents when xmloption = content, as required by SQL:2006+.

Previously we were using the SQL:2003 definition, which doesn't allow
this, but that creates a serious dump/restore gotcha: there is no
setting of xmloption that will allow all valid XML data. Hence,
switch to the 2006 definition.

Since libxml doesn't accept <!DOCTYPE> directives in the mode we
use for CONTENT parsing, the implementation is to detect <!DOCTYPE>
in the input and switch to DOCUMENT parsing mode. This should not
cost much, because <!DOCTYPE> should be close to the front of the
input if it's there at all. It's possible that this causes the
error messages for malformed input to be slightly different than
they were before, if said input includes <!DOCTYPE>; but that does
not seem like a big problem.

In passing, buy back a few cycles in parsing of large XML documents
by not doing strlen() of the whole input in parse_xml_decl().

Back-patch because dump/restore failures are not nice. This change
shouldn't break any cases that worked before, so it seems safe to
back-patch.

Chapman Flack (revised a bit by me)

Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
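
The detection step described above can be illustrated with a small Python
sketch. This is not the C code from this commit. The function name
choose_parse_mode, the regular expression, and the assumption that only
whitespace, comments, the XML declaration, and processing instructions may
precede <!DOCTYPE> are all hypothetical choices made for the illustration.

```python
import re

# Hypothetical illustration only; PostgreSQL's real logic lives in C and may
# differ in what it skips before looking for a DOCTYPE.
# Items assumed to be allowed ahead of <!DOCTYPE>: whitespace, comments,
# and <? ... ?> constructs (XML declaration, processing instructions).
_PROLOG_MISC = re.compile(
    r"(?:\s|<!--.*?-->|<\?.*?\?>)*",
    re.DOTALL,
)


def choose_parse_mode(text: str) -> str:
    """Pick a parse mode for input received while xmloption = content.

    Returns "DOCUMENT" if <!DOCTYPE is the first thing after the skipped
    prolog items, otherwise "CONTENT".
    """
    # The pattern can match the empty string, so match() never returns None.
    pos = _PROLOG_MISC.match(text).end()
    return "DOCUMENT" if text.startswith("<!DOCTYPE", pos) else "CONTENT"
```

Because the scan stops at the first character that is not part of a skipped
prolog item, it only looks at the front of the input, which matches the
commit message's expectation that the check should be cheap.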
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/datatype.sgml 18
1 file changed, 5 insertions, 13 deletions
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index e3cfa97dcd5..9c0e8b6fafd 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4100,9 +4100,11 @@ a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
<para>
The <type>xml</type> type can store well-formed
<quote>documents</quote>, as defined by the XML standard, as well
- as <quote>content</quote> fragments, which are defined by the
- production <literal>XMLDecl? content</literal> in the XML
- standard. Roughly, this means that content fragments can have
+ as <quote>content</quote> fragments, which are defined by reference
+ to the more permissive
+ <ulink url="https://www.w3.org/TR/2010/REC-xpath-datamodel-20101214/#DocumentNode"><quote>document node</quote></ulink>
+ of the XQuery and XPath data model.
+ Roughly, this means that content fragments can have
more than one top-level element or character node. The expression
<literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal>
can be used to evaluate whether a particular <type>xml</type>
@@ -4177,16 +4179,6 @@ SET xmloption TO { DOCUMENT | CONTENT };
data are allowed.
</para>
- <note>
- <para>
- With the default XML option setting, you cannot directly cast
- character strings to type <type>xml</type> if they contain a
- document type declaration, because the definition of XML content
- fragment does not accept them. If you need to do that, either
- use <literal>XMLPARSE</literal> or change the XML option.
- </para>
- </note>
-
</sect2>
<sect2>