From 1dc5ebc9077ab742079ce5dac9a6664248d42916 Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Thu, 14 May 2015 12:08:40 -0400
Subject: Support "expanded" objects, particularly arrays, for better performance.

This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array.

The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well.

In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.)

Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win.

Tom Lane, reviewed by Andres Freund
---
 doc/src/sgml/storage.sgml | 42 ++++++++++++++++++++++++++--
 doc/src/sgml/xtypes.sgml  | 71 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 111 insertions(+), 2 deletions(-)

(limited to 'doc/src')

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index d8c52875d82..e5b7b4b68d0 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -503,8 +503,9 @@ comparison table, in which all the HTML pages were cut down to 7 kB to fit.
 TOAST pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There
-is currently just one sub-case:
-pointers to indirect data.
+are currently two sub-cases:
+pointers to indirect data and
+pointers to expanded data.
@@ -518,6 +519,43 @@
 that the referenced data survives for as long as the pointer could exist, and there is no infrastructure to help with this.
+
+Expanded TOAST pointers are useful for complex data types
+whose on-disk representation is not especially suited for computational
+purposes. As an example, the standard varlena representation of a
+PostgreSQL array includes dimensionality information, a
+nulls bitmap if there are any null elements, then the values of all the
+elements in order. When the element type itself is variable-length, the
+only way to find the N'th element is to scan through all the
+preceding elements. This representation is appropriate for on-disk storage
+because of its compactness, but for computations with the array it's much
+nicer to have an expanded or deconstructed
+representation in which all the element starting locations have been
+identified. The TOAST pointer mechanism supports this need by
+allowing a pass-by-reference Datum to point to either a standard varlena
+value (the on-disk representation) or a TOAST pointer that
+points to an expanded representation somewhere in memory. The details of
+this expanded representation are up to the data type, though it must have
+a standard header and meet the other API requirements given
+in src/include/utils/expandeddatum.h. C-level functions
+working with the data type can choose to handle either representation.
+Functions that do not know about the expanded representation, but simply
+apply PG_DETOAST_DATUM to their inputs, will automatically
+receive the traditional varlena representation; so support for an expanded
+representation can be introduced incrementally, one function at a time.
+
+
+
+TOAST pointers to expanded values are further broken down
+into read-write and read-only pointers.
+The pointed-to representation is the same either way, but a function that
+receives a read-write pointer is allowed to modify the referenced value
+in-place, whereas one that receives a read-only pointer must not; it must
+first create a copy if it wants to make a modified version of the value.
+This distinction and some associated conventions make it possible to avoid
+unnecessary copying of expanded values during query execution.
+
+
 For all types of in-memory TOAST pointer, the TOAST management code ensures that no such pointer datum can accidentally get
diff --git a/doc/src/sgml/xtypes.sgml b/doc/src/sgml/xtypes.sgml
index 2459616281d..ac0b8a2943f 100644
--- a/doc/src/sgml/xtypes.sgml
+++ b/doc/src/sgml/xtypes.sgml
@@ -300,6 +300,77 @@ CREATE TYPE complex (
+
+
+ Another feature that's enabled by TOAST support is the
+ possibility of having an expanded in-memory data
+ representation that is more convenient to work with than the format that
+ is stored on disk. The regular or flat varlena storage format
+ is ultimately just a blob of bytes; it cannot for example contain
+ pointers, since it may get copied to other locations in memory.
+ For complex data types, the flat format may be quite expensive to work
+ with, so PostgreSQL provides a way to expand
+ the flat format into a representation that is more suited to computation,
+ and then pass that format in-memory between functions of the data type.
+
+
+
+ To use expanded storage, a data type must define an expanded format that
+ follows the rules given in src/include/utils/expandeddatum.h,
+ and provide functions to expand a flat varlena value into
+ expanded format and flatten the expanded format back to the
+ regular varlena representation. Then ensure that all C functions for
+ the data type can accept either representation, possibly by converting
+ one into the other immediately upon receipt. This does not require fixing
+ all existing functions for the data type at once, because the standard
+ PG_DETOAST_DATUM macro is defined to convert expanded inputs
+ into regular flat format. Therefore, existing functions that work with
+ the flat varlena format will continue to work, though slightly
+ inefficiently, with expanded inputs; they need not be converted until and
+ unless better performance is important.
+
+
+
+ C functions that know how to work with an expanded representation
+ typically fall into two categories: those that can only handle expanded
+ format, and those that can handle either expanded or flat varlena inputs.
+ The former are easier to write but may be less efficient overall, because
+ converting a flat input to expanded form for use by a single function may
+ cost more than is saved by operating on the expanded format.
+ When only expanded format need be handled, conversion of flat inputs to
+ expanded form can be hidden inside an argument-fetching macro, so that
+ the function appears no more complex than one working with traditional
+ varlena input.
+ To handle both types of input, write an argument-fetching function that
+ will detoast external, short-header, and compressed varlena inputs, but
+ not expanded inputs. Such a function can be defined as returning a
+ pointer to a union of the flat varlena format and the expanded format.
+ Callers can use the VARATT_IS_EXPANDED_HEADER() macro to
+ determine which format they received.
+
+
+
+ The TOAST infrastructure not only allows regular varlena
+ values to be distinguished from expanded values, but also
+ distinguishes read-write and read-only pointers to
+ expanded values. C functions that only need to examine an expanded
+ value, or will only change it in safe and non-semantically-visible ways,
+ need not care which type of pointer they receive. C functions that
+ produce a modified version of an input value are allowed to modify an
+ expanded input value in-place if they receive a read-write pointer, but
+ must not modify the input if they receive a read-only pointer; in that
+ case they have to copy the value first, producing a new value to modify.
+ A C function that has constructed a new expanded value should always
+ return a read-write pointer to it. Also, a C function that is modifying
+ a read-write expanded value in-place should take care to leave the value
+ in a sane state if it fails partway through.
+
+
+
+ For examples of working with expanded values, see the standard array
+ infrastructure, particularly
+ src/backend/utils/adt/array_expanded.c.
+
+
--
cgit v1.2.3
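
Reviewer's illustration (not part of the patch): the read-write versus read-only convention described in the xtypes.sgml text can be sketched in standalone C. This is a toy model only. ToyExpanded, ToyRef, toy_make, toy_as_read_only, toy_set_element and toy_free are hypothetical names invented for this note; they are not the definitions in expandeddatum.h or anything in array_expanded.c, and the sketch uses plain malloc rather than PostgreSQL memory contexts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an expanded value: elements are directly indexable. */
typedef struct ToyExpanded
{
    size_t  nelems;
    int    *elems;
} ToyExpanded;

/* Hypothetical stand-in for a pointer datum: the object plus its access mode. */
typedef struct ToyRef
{
    ToyExpanded *obj;
    bool         read_write;
} ToyRef;

/* Build a new value holding a copy of vals[0..n-1]; obj is NULL on failure. */
ToyRef
toy_make(const int *vals, size_t n)
{
    ToyRef       ref = {NULL, false};
    ToyExpanded *obj = malloc(sizeof(ToyExpanded));

    if (obj == NULL)
        return ref;
    obj->elems = calloc(n > 0 ? n : 1, sizeof(int));
    if (obj->elems == NULL)
    {
        free(obj);
        return ref;
    }
    if (n > 0)
        memcpy(obj->elems, vals, n * sizeof(int));
    obj->nelems = n;

    /* A freshly constructed value is handed back as read-write. */
    ref.obj = obj;
    ref.read_write = true;
    return ref;
}

/* Same object, but the recipient must not modify it in place. */
ToyRef
toy_as_read_only(ToyRef ref)
{
    ref.read_write = false;
    return ref;
}

/* Return a value equal to "in" except that element idx holds val. */
ToyRef
toy_set_element(ToyRef in, size_t idx, int val)
{
    ToyRef out = in;

    /* Validate before touching anything, so a failure leaves the input intact. */
    if (in.obj == NULL || idx >= in.obj->nelems)
    {
        out.obj = NULL;
        out.read_write = false;
        return out;
    }

    if (!in.read_write)
    {
        /* Read-only input: copy first, then modify the copy. */
        out = toy_make(in.obj->elems, in.obj->nelems);
        if (out.obj == NULL)
            return out;
        assert(out.obj != in.obj);
    }

    /* Read-write input (or fresh copy): update in place. */
    out.obj->elems[idx] = val;
    return out;
}

/* Release a value created by toy_make. */
void
toy_free(ToyRef ref)
{
    if (ref.obj != NULL)
    {
        free(ref.obj->elems);
        free(ref.obj);
    }
}
```

In this model, calling toy_set_element on a read-write reference returns the same object with the element changed, while calling it on a reference passed through toy_as_read_only returns a new read-write object and leaves the original untouched. Checking the index before any modification is one simple way to honor the "leave the value in a sane state if it fails partway through" rule.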