From c5608ea26a1f51998ad3cf987c3f0bda643c87a8 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 12 Mar 2014 17:13:22 +0200 Subject: Allow opclasses to provide tri-valued GIN consistent functions. With the GIN "fast scan" feature, GIN can skip items without fetching all the keys for them, if it can prove that they don't match regardless of those keys. So far, it has done the proving by calling the boolean consistent function with all combinations of TRUE/FALSE for the unfetched keys, but since that's O(n^2), it becomes unfeasible with more than a few keys. We can avoid calling consistent with all the combinations, if we can tell the operator class implementation directly which keys are unknown. This commit includes a triConsistent function for the built-in array and tsvector opclasses. Alexander Korotkov, with some changes by me. --- doc/src/sgml/gin.sgml | 54 +++++++++++++++++++++++++++++++++++++++++------- doc/src/sgml/xindex.sgml | 13 +++++++++++- 2 files changed, 59 insertions(+), 8 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml index 9ffa8be7bc3..0bd759220fb 100644 --- a/doc/src/sgml/gin.sgml +++ b/doc/src/sgml/gin.sgml @@ -74,15 +74,15 @@ All it takes to get a GIN access method working is to - implement four (or five) user-defined methods, which define the behavior of + implement a few user-defined methods, which define the behavior of keys in the tree and the relationships between keys, indexed items, and indexable queries. In short, GIN combines extensibility with generality, code reuse, and a clean interface. - The four methods that an operator class for - GIN must provide are: + There are three methods that an operator class for + GIN must provide: @@ -190,7 +190,18 @@ + + + An operator class must also provide a function to check if an indexed item + matches the query. It comes in two flavors, a boolean consistent + function, and a ternary triConsistent function. + triConsistent covers the functionality of both, so providing + triConsistent alone is sufficient. However, if the boolean variant is + significantly cheaper to calculate, it can be advantegous to provide both. + If only the boolean variant is provided, some optimizations that depend on + refuting index items before fetching all the keys are disabled. + bool consistent(bool check[], StrategyNumber n, Datum query, int32 nkeys, Pointer extra_data[], bool *recheck, @@ -241,10 +252,38 @@ + + + GinLogicValue triConsistent(GinLogicValue check[], StrategyNumber n, Datum query, + int32 nkeys, Pointer extra_data[], + Datum queryKeys[], bool nullFlags[]) + + + triConsistent is similar to consistent, + but instead of a boolean check[], there are three possible + values for each key: GIN_TRUE, GIN_FALSE and + GIN_MAYBE. GIN_FALSE and GIN_TRUE + have the same meaning as regular boolean values. + GIN_MAYBE means that the presence of that key is not known. + When GIN_MAYBE values are present, the function should only + return GIN_TRUE if the item matches whether or not the index item + contains the corresponding query keys. Likewise, the function must + return GIN_FALSE only if the item does not match, whether or not it + contains the GIN_MAYBE keys. If the result depends on the GIN_MAYBE + entries, ie. the match cannot be confirmed or refuted based on the + known query keys, the function must return GIN_MAYBE. + + + When there are no GIN_MAYBE values in the check vector, + GIN_MAYBE return value is equivalent of setting + recheck flag in the boolean consistent function. + + + - Optionally, an operator class for - GIN can supply a fifth method: + Optionally, an operator class for GIN can supply the + following method: @@ -282,8 +321,9 @@ above vary depending on the operator class. The item values passed to extractValue are always of the operator class's input type, and all key values must be of the class's STORAGE type. The type of - the query argument passed to extractQuery and - consistent is whatever is specified as the right-hand input + the query argument passed to extractQuery, + consistent and triConsistent is whatever is + specified as the right-hand input type of the class member operator identified by the strategy number. This need not be the same as the item type, so long as key values of the correct type can be extracted from it. diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml index 806b35824e4..0fe97ba4b5d 100644 --- a/doc/src/sgml/xindex.sgml +++ b/doc/src/sgml/xindex.sgml @@ -567,7 +567,10 @@ consistent - determine whether value matches query condition + + determine whether value matches query condition (boolean variant) + (optional if support function 6 is present) + 4 @@ -580,6 +583,14 @@ 5 + + triConsistent + + determine whether value matches query condition (ternary variant) + (optional if support function 4 is present) + + 6 + -- cgit v1.2.3