author | Tom Lane <tgl@sss.pgh.pa.us> | 2020-03-31 11:14:30 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2020-03-31 11:14:42 -0400 |
commit | 70dc4c509b330fdd965d795e8d7f41f09d56c9ae (patch) | |
tree | 5ee559440900f2f9a778289a9ccff8b59788858f /doc/src | |
parent | e07e2a40bd0c3c02a9baf2e5ddccf665e73208fb (diff) |
Fix lquery's NOT handling, and add ability to quantify non-'*' items.

The existing implementation of the ltree ~ lquery match operator is
sufficiently complex and undocumented that it's hard to tell exactly
what it does. But one thing it clearly gets wrong is the combination
of NOT symbols (!) and '*' symbols. A pattern such as '*.!foo.*'
should, by any ordinary understanding of regular expression behavior,
match any ltree that has at least one label that's not "foo". As best
we can tell by experimentation, what it's actually matching is any
ltree in which *no* label is "foo". That's surprising, and not at all
what the documentation says.

Now, that's arguably a useful behavior, so if we rewrite to fix the
bug we should provide some other way to get it. To do so, add the
ability to attach lquery quantifiers to non-'*' items as well as '*'s.
Then the pattern '!foo{,}' expresses "any ltree in which no label is
foo". For backwards compatibility, the default quantifier for non-'*'
items has to be "{1}", although the default for '*' items is '{,}'.
I wouldn't have done it like that in a green field, but it's not
totally horrible.

Armed with that, rewrite checkCond() from scratch. Treating '*' and
non-'*' items alike makes it simpler, not more complicated, so that
the function actually gets a lot shorter than it was.

Filip Rembiałkowski, Tom Lane, Nikita Glukhov, per a very
ancient bug report from M. Palm

Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
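
To make the semantics described in the commit message concrete, here is a minimal Python sketch of a matcher in which every item, star or not, carries a quantifier. It is not the rewritten checkCond(); the names parse_item and matches are hypothetical, label modifiers (@, *, %) are left out, and the only things taken from the message are the defaults ({1} for non-'*' items, {,} for '*') and the meaning of NOT.

```python
import re
from typing import List, Optional, Tuple

# One parsed item: (negated, alternatives, low, high).
# alternatives is None for a star item; high is None when unbounded.
Item = Tuple[bool, Optional[List[str]], int, Optional[int]]


def parse_item(text: str) -> Item:
    """Split one dot-separated lquery item into body and quantifier."""
    m = re.fullmatch(r"(.+?)(?:\{(\d*)(,?)(\d*)\})?", text)
    if m is None:
        raise ValueError(f"empty lquery item: {text!r}")
    body, lo, comma, hi = m.groups()
    star = body == "*"
    if lo is None:
        # No explicit quantifier: '*' defaults to {,}, anything else to {1}.
        low, high = (0, None) if star else (1, 1)
    elif comma:
        low = int(lo) if lo else 0
        high = int(hi) if hi else None
    elif lo:
        low = high = int(lo)
    else:
        raise ValueError(f"bad quantifier in {text!r}")
    if star:
        return (False, None, low, high)
    negated = body.startswith("!")
    alternatives = (body[1:] if negated else body).split("|")
    return (negated, alternatives, low, high)


def label_ok(item: Item, label: str) -> bool:
    negated, alternatives, _, _ = item
    if alternatives is None:          # star item accepts any label
        return True
    return (label in alternatives) != negated


def matches(path: str, pattern: str) -> bool:
    """Return True if the whole label path matches the whole pattern."""
    labels = path.split(".") if path else []
    items = [parse_item(t) for t in pattern.split(".")]

    def go(i: int, pos: int) -> bool:
        if i == len(items):
            return pos == len(labels)
        _, _, low, high = items[i]
        count = 0
        while True:
            # Try to finish the match with item i consuming `count` labels.
            if count >= low and go(i + 1, pos + count):
                return True
            if high is not None and count >= high:
                return False
            if pos + count >= len(labels):
                return False
            if not label_ok(items[i], labels[pos + count]):
                return False
            count += 1

    return go(0, 0)


print(matches("a.foo.b", "*.!foo.*"))   # True: at least one label is not foo
print(matches("foo.foo", "*.!foo.*"))   # False: every label is foo
print(matches("a.foo.b", "!foo{,}"))    # False: some label is foo
print(matches("a.b", "!foo{,}"))        # True: no label is foo
```

The sketch uses plain backtracking, which is easy to read but can take exponential time on adversarial patterns; it is meant only to show that treating '*' and non-'*' items uniformly keeps the logic short.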
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/ltree.sgml | 38 |
1 files changed, 24 insertions, 14 deletions
diff --git a/doc/src/sgml/ltree.sgml b/doc/src/sgml/ltree.sgml
index ae4b33ec85e..d7dd55540a8 100644
--- a/doc/src/sgml/ltree.sgml
+++ b/doc/src/sgml/ltree.sgml
@@ -60,7 +60,8 @@
    <type>lquery</type> represents a regular-expression-like pattern
    for matching <type>ltree</type> values. A simple word matches that
    label within a path. A star symbol (<literal>*</literal>) matches zero
-   or more labels. For example:
+   or more labels. These can be joined with dots to form a pattern that
+   must match the whole label path. For example:
 <synopsis>
 foo <lineannotation>Match the exact label path <literal>foo</literal></lineannotation>
 *.foo.* <lineannotation>Match any label path containing the label <literal>foo</literal></lineannotation>
@@ -69,19 +70,25 @@ foo <lineannotation>Match the exact label path <literal>foo</literal></l
   </para>
 
   <para>
-   Star symbols can also be quantified to restrict how many labels
-   they can match:
+   Both star symbols and simple words can be quantified to restrict how many
+   labels they can match:
 <synopsis>
 *{<replaceable>n</replaceable>} <lineannotation>Match exactly <replaceable>n</replaceable> labels</lineannotation>
 *{<replaceable>n</replaceable>,} <lineannotation>Match at least <replaceable>n</replaceable> labels</lineannotation>
 *{<replaceable>n</replaceable>,<replaceable>m</replaceable>} <lineannotation>Match at least <replaceable>n</replaceable> but not more than <replaceable>m</replaceable> labels</lineannotation>
-*{,<replaceable>m</replaceable>} <lineannotation>Match at most <replaceable>m</replaceable> labels &mdash; same as </lineannotation> *{0,<replaceable>m</replaceable>}
+*{,<replaceable>m</replaceable>} <lineannotation>Match at most <replaceable>m</replaceable> labels &mdash; same as </lineannotation>*{0,<replaceable>m</replaceable>}
+foo{<replaceable>n</replaceable>,<replaceable>m</replaceable>} <lineannotation>Match at least <replaceable>n</replaceable> but not more than <replaceable>m</replaceable> occurrences of <literal>foo</literal></lineannotation>
+foo{,} <lineannotation>Match any number of occurrences of <literal>foo</literal>, including zero</lineannotation>
 </synopsis>
+   In the absence of any explicit quantifier, the default for a star symbol
+   is to match any number of labels (that is, <literal>{,}</literal>) while
+   the default for a non-star item is to match exactly once (that
+   is, <literal>{1}</literal>).
   </para>
 
   <para>
    There are several modifiers that can be put at the end of a non-star
-   label in <type>lquery</type> to make it match more than just the exact match:
+   <type>lquery</type> item to make it match more than just the exact match:
 <synopsis>
 @ <lineannotation>Match case-insensitively, for example <literal>a@</literal> matches <literal>A</literal></lineannotation>
 * <lineannotation>Match any label with this prefix, for example <literal>foo*</literal> matches <literal>foobar</literal></lineannotation>
@@ -97,17 +104,20 @@ foo <lineannotation>Match the exact label path <literal>foo</literal></l
   </para>
 
   <para>
-   Also, you can write several possibly-modified labels separated with
-   <literal>|</literal> (OR) to match any of those labels, and you can put
-   <literal>!</literal> (NOT) at the start to match any label that doesn't
-   match any of the alternatives.
+   Also, you can write several possibly-modified non-star items separated with
+   <literal>|</literal> (OR) to match any of those items, and you can put
+   <literal>!</literal> (NOT) at the start of a non-star group to match any
+   label that doesn't match any of the alternatives. A quantifier, if any,
+   goes at the end of the group; it means some number of matches for the
+   group as a whole (that is, some number of labels matching or not matching
+   any of the alternatives).
   </para>
 
   <para>
    Here's an annotated example of <type>lquery</type>:
 <programlisting>
-Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain
-a.  b.     c.      d.               e.
+Top.*{0,2}.sport*@.!football|tennis{1,}.Russ*|Spain
+a.  b.     c.      d.                   e.
 </programlisting>
    This query will match any label path that:
   </para>
@@ -129,8 +139,8 @@ a.  b.     c.      d.               e.
     </listitem>
     <listitem>
      <para>
-      then a label not matching <literal>football</literal> nor
-      <literal>tennis</literal>
+      then has one or more labels, none of which
+      match <literal>football</literal> nor <literal>tennis</literal>
      </para>
     </listitem>
     <listitem>
@@ -632,7 +642,7 @@ ltreetest=&gt; SELECT path FROM test WHERE path ~ '*.Astronomy.*';
  Top.Collections.Pictures.Astronomy.Astronauts
 (7 rows)
 
-ltreetest=&gt; SELECT path FROM test WHERE path ~ '*.!pictures@.*.Astronomy.*';
+ltreetest=&gt; SELECT path FROM test WHERE path ~ '*.!pictures@.Astronomy.*';
                 path
 ------------------------------------
  Top.Science.Astronomy
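
The annotated example in the patched documentation can be tried out with a rough Python sketch that translates an lquery pattern into an ordinary regular expression over dot-terminated labels. This is only an illustration of the documented rules, not how ltree evaluates patterns; the function names are hypothetical, the % modifier is deliberately unsupported, and the two label paths used below are made up for the demonstration.

```python
import re


def label_regex(alt: str) -> str:
    """Translate one possibly-modified label such as 'sport*@'."""
    word = alt.rstrip("@*%")
    mods = alt[len(word):]
    if "%" in mods:
        raise NotImplementedError("% modifier is not handled in this sketch")
    rx = re.escape(word)
    if "*" in mods:                 # prefix match within one label
        rx += r"[^.]*"
    if "@" in mods:                 # case-insensitive match
        rx = f"(?i:{rx})"
    return rx


def item_regex(item: str) -> str:
    """Translate one dot-separated lquery item, including its quantifier."""
    m = re.fullmatch(r"(.+?)(?:\{(\d*)(,?)(\d*)\})?", item)
    if m is None:
        raise ValueError(f"empty lquery item: {item!r}")
    body, lo, comma, hi = m.groups()
    if body == "*":
        one = r"[^.]+\."
        default = "{0,}"            # documented default for a star symbol
    else:
        negated = body.startswith("!")
        parts = (body[1:] if negated else body).split("|")
        alts = "|".join(label_regex(p) for p in parts)
        if negated:
            one = rf"(?!(?:{alts})\.)[^.]+\."
        else:
            one = rf"(?:{alts})\."
        default = "{1}"             # documented default for a non-star item
    if lo is None:
        quant = default
    elif comma:
        quant = "{" + (lo or "0") + "," + hi + "}"
    elif lo:
        quant = "{" + lo + "}"
    else:
        raise ValueError(f"bad quantifier in {item!r}")
    return f"(?:{one}){quant}"


def lquery_match(path: str, pattern: str) -> bool:
    """Return True if the whole label path matches the whole pattern."""
    rx = "".join(item_regex(i) for i in pattern.split("."))
    text = "".join(label + "." for label in path.split(".")) if path else ""
    return re.fullmatch(rx, text) is not None


QUERY = "Top.*{0,2}.sport*@.!football|tennis{1,}.Russ*|Spain"
print(lquery_match("Top.News.Sports.hockey.Russia", QUERY))   # True
print(lquery_match("Top.Sports.football.Russia", QUERY))      # False
```

In the first path, item b consumes News, item c matches Sports case-insensitively by prefix, item d consumes hockey, and item e matches Russia by prefix. The second path fails because item d needs at least one label and football is excluded.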