From ac9099fc1dd460bffaafec19272159dd7bc86f5b Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Sun, 4 Apr 2021 14:28:35 -0400
Subject: Fix confusion in SP-GiST between attribute type and leaf storage type.

According to the documentation, the attType passed to the opclass config function (and also relied on by the core code) is the type of the heap column or expression being indexed. But what was actually being passed was the type stored for the index column. This made no difference for user-defined SP-GiST opclasses, because we weren't allowing the STORAGE clause of CREATE OPCLASS to be used, so the two types would be the same. But it's silly not to allow that, seeing that the built-in poly_ops opclass has a different value for opckeytype than opcintype, and that if you want to do lossy storage then the types must really be different. (Thus, user-defined opclasses doing lossy storage had to lie about what type is in the index.) Hence, remove the restriction, and make sure that we use the input column type not opckeytype where relevant.

For reasons of backwards compatibility with existing user-defined opclasses, we can't quite insist that the specified leafType match the STORAGE clause; instead just add an amvalidate() warning if they don't match.

Also fix some bugs that would only manifest when trying to return index entries when attType is different from attLeafType. It's not too surprising that these have not been reported, because the only usual reason for such a difference is to store the leaf value lossily, rendering index-only scans impossible.

Add a src/test/modules module to exercise cases where attType is different from attLeafType and yet index-only scan is supported.
Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us
---
 doc/src/sgml/ref/create_opclass.sgml |  2 +-
 doc/src/sgml/spgist.sgml             | 74 +++++++++++++++++++++++-------------
 2 files changed, 49 insertions(+), 27 deletions(-)

(limited to 'doc/src')

diff --git a/doc/src/sgml/ref/create_opclass.sgml b/doc/src/sgml/ref/create_opclass.sgml
index 2d75a1c0b0d..4f1bfae8223 100644
--- a/doc/src/sgml/ref/create_opclass.sgml
+++ b/doc/src/sgml/ref/create_opclass.sgml
@@ -234,7 +234,7 @@ CREATE OPERATOR CLASS name [ DEFAUL
  The data type actually stored in the index. Normally this is the same as the column data type, but some index methods
- (currently GiST, GIN and BRIN) allow it to be different. The
+ (currently GiST, GIN, SP-GiST and BRIN) allow it to be different. The
  STORAGE clause must be omitted unless the index method allows a different type to be used. If the column data_type is specified
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index ea88ae45e5b..054234784fa 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -205,10 +205,12 @@
- Leaf tuples of an SP-GiST tree contain values of the
- same data type as the indexed column. Leaf tuples at the root level will
- always contain the original indexed data value, but leaf tuples at lower
- levels might contain only a compressed representation, such as a suffix.
+ Leaf tuples of an SP-GiST tree usually contain values
+ of the same data type as the indexed column, although it is also possible
+ for them to contain lossy representations of the indexed column.
+ Leaf tuples stored at the root level will directly represent
+ the original indexed data value, but leaf tuples at lower
+ levels might contain only a partial value, such as a suffix.
  In that case the operator class support functions must be able to reconstruct the original value using information accumulated from the inner tuples that are passed through to reach the leaf level.
@@ -330,19 +332,29 @@ typedef struct spgConfigOut
- leafType is typically the same as
- attType. For the reasons of backward
- compatibility, method config can
- leave leafType uninitialized; that would
- give the same effect as setting leafType equal
- to attType. When attType
- and leafType are different, then optional
+ leafType should match the index storage type
+ defined by the operator class's opckeytype
+ catalog entry.
+ (Note that opckeytype can be zero,
+ implying the storage type is the same as the operator class's input
+ type, which is the most common situation.)
+ For reasons of backward compatibility, the config
+ method can set leafType to some other value,
+ and that value will be used; but this is deprecated since the index
+ contents are then incorrectly identified in the catalogs.
+ Also, it's permissible to
+ leave leafType uninitialized (zero);
+ that is interpreted as meaning the index storage type derived from
+ opckeytype.
+
+
+
+ When attType
+ and leafType are different, the optional
  method compress must be provided. Method compress is responsible for transformation of datums to be indexed from attType to leafType.
- Note: both consistent functions will get scankeys
- unchanged, without transformation using compress.
@@ -677,8 +689,7 @@ typedef struct spgInnerConsistentOut
  reconstructedValue is the value reconstructed for the parent tuple; it is (Datum) 0 at the root level or if the inner_consistent function did not provide a value at the
- parent level. reconstructedValue is always of
- spgConfigOut.leafType type.
+ parent level.
  traversalValue is a pointer to any traverse data passed down from the previous call of inner_consistent on the parent index tuple, or NULL at the root level.
@@ -713,9 +724,14 @@ typedef struct spgInnerConsistentOut
  necessarily so, so an array is used.)
  If value reconstruction is needed, set reconstructedValues to an array of the values
- of spgConfigOut.leafType type
  reconstructed for each child node to be visited; otherwise, leave reconstructedValues as NULL.
+ The reconstructed values are assumed to be of type
+ spgConfigOut.leafType.
+ (However, since the core system will do nothing with them except
+ possibly copy them, it is sufficient for them to have the
+ same typlen and typbyval
+ properties as leafType.)
  If ordered search is performed, set distances to an array of distance values according to orderbys array (nodes with lowest distances will be processed first). Leave it
@@ -797,8 +813,7 @@ typedef struct spgLeafConsistentOut
  reconstructedValue is the value reconstructed for the parent tuple; it is (Datum) 0 at the root level or if the inner_consistent function did not provide a value at the
- parent level. reconstructedValue is always of
- spgConfigOut.leafType type.
+ parent level.
  traversalValue is a pointer to any traverse data passed down from the previous call of inner_consistent on the parent index tuple, or NULL at the root level.
@@ -816,8 +831,8 @@ typedef struct spgLeafConsistentOut
  The function must return true if the leaf tuple matches the query, or false if not. In the true case, if returnData is true then
- leafValue must be set to the value of
- spgConfigIn.attType type
+ leafValue must be set to the value (of type
+ spgConfigIn.attType)
  originally supplied to be indexed for this leaf tuple. Also, recheck may be set to true if the match is uncertain and so the operator(s) must be re-applied to the actual
@@ -834,7 +849,7 @@ typedef struct spgLeafConsistentOut
- The optional user-defined method are:
+ The optional user-defined methods are:
@@ -842,15 +857,22 @@ typedef struct spgLeafConsistentOut
  Datum compress(Datum in)
- Converts the data item into a format suitable for physical storage in
- a leaf tuple of index page. It accepts
+ Converts a data item into a format suitable for physical storage in
+ a leaf tuple of the index. It accepts a value of type
  spgConfigIn.attType
- value and returns
- spgConfigOut.leafType
- value. Output value should not be toasted.
+ and returns a value of type
+ spgConfigOut.leafType.
+ The output value must not contain an out-of-line TOAST pointer.
+
+
+
+ Note: the compress method is only applied to
+ values to be stored. The consistent methods receive query scankeys
+ unchanged, without transformation using compress.
+ options
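Illustrative note, not part of the patch: the leafType rules in the revised spgist.sgml text can be sketched as a small standalone C model. It does not use PostgreSQL headers and is not the server's actual code. The `Oid` typedef, `INVALID_OID`, and all four function names are hypothetical stand-ins chosen for this sketch. Zero plays the role of "unset", as the documentation describes for both opckeytype and leafType.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for PostgreSQL's Oid type; zero models an unset/invalid OID. */
typedef unsigned int Oid;
#define INVALID_OID ((Oid) 0)

/*
 * Hypothetical helper: storage type declared by the operator class.
 * An opckeytype of zero means "same as the opclass input type".
 */
Oid
declared_storage_type(Oid att_type, Oid opckeytype)
{
    assert(att_type != INVALID_OID);    /* the indexed column always has a type */
    return (opckeytype != INVALID_OID) ? opckeytype : att_type;
}

/*
 * Hypothetical helper: leaf type that ends up being used.
 * A leafType set by the config method wins (the deprecated but still
 * honored case); a zero leafType falls back to the declared storage type.
 */
Oid
effective_leaf_type(Oid att_type, Oid opckeytype, Oid config_leaf_type)
{
    if (config_leaf_type != INVALID_OID)
        return config_leaf_type;
    return declared_storage_type(att_type, opckeytype);
}

/* True when leafType agrees with the catalogs, i.e. no validation warning. */
bool
leaf_type_matches_storage(Oid att_type, Oid opckeytype, Oid config_leaf_type)
{
    return effective_leaf_type(att_type, opckeytype, config_leaf_type)
        == declared_storage_type(att_type, opckeytype);
}

/* A compress method is needed whenever attType and leafType differ. */
bool
compress_required(Oid att_type, Oid opckeytype, Oid config_leaf_type)
{
    return effective_leaf_type(att_type, opckeytype, config_leaf_type)
        != att_type;
}
```

In this model, an opclass that declares a distinct storage type and leaves leafType at zero needs a compress method and matches the catalogs. One that sets leafType in config without a matching STORAGE clause also needs compress, but is the mismatch case that the commit message says draws an amvalidate() warning.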