summaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
authorTom Lane <tgl@sss.pgh.pa.us>2017-10-14 15:21:39 -0400
committerTom Lane <tgl@sss.pgh.pa.us>2017-10-14 15:21:39 -0400
commit4de2d4fba38f4f7aff7f95401eb43a6cd05a6db4 (patch)
treed899e4a764d3d6234918a1652882c71f1af564ed /doc/src
parent5f340cb30ce2f0d9f272840b0d977b0a4b854f0b (diff)
Explicitly track whether aggregate final functions modify transition state.
Up to now, there's been hard-wired assumptions that normal aggregates' final functions never modify their transition states, while ordered-set aggregates' final functions always do. This has always been a bit limiting, and in particular it's getting in the way of improving the built-in ordered-set aggregates to allow merging of transition states. Therefore, let's introduce catalog and CREATE AGGREGATE infrastructure that lets the finalfn's behavior be declared explicitly. There are now three possibilities for the finalfn behavior: it's purely read-only, it trashes the transition state irrecoverably, or it changes the state in such a way that no more transfn calls are possible but the state can still be passed to other, compatible finalfns. There are no examples of this third case today, but we'll shortly make the built-in OSAs act like that. This change allows user-defined aggregates to explicitly disclaim support for use as window functions, and/or to prevent transition state merging, if their implementations cannot handle that. While it was previously possible to handle the window case with a run-time error check, there was not any way to prevent transition state merging, which in retrospect is something commit 804163bc2 should have provided for. But better late than never. In passing, split out pg_aggregate.c's extern function declarations into a new header file pg_aggregate_fn.h, similarly to what we've done for some other catalog headers, so that pg_aggregate.h itself can be safe for frontend files to include. This lets pg_dump use the symbolic names for relevant constants. Discussion: https://postgr.es/m/4834.1507849699@sss.pgh.pa.us
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/catalogs.sgml20
-rw-r--r--doc/src/sgml/ref/create_aggregate.sgml72
-rw-r--r--doc/src/sgml/xaggr.sgml20
3 files changed, 99 insertions, 13 deletions
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9af77c1f5ab..cfec2465d26 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -487,6 +487,26 @@
<entry>True to pass extra dummy arguments to <structfield>aggmfinalfn</structfield></entry>
</row>
<row>
+ <entry><structfield>aggfinalmodify</structfield></entry>
+ <entry><type>char</type></entry>
+ <entry></entry>
+ <entry>Whether <structfield>aggfinalfn</structfield> modifies the
+ transition state value:
+ <literal>r</literal> if it is read-only,
+ <literal>s</literal> if the <structfield>aggtransfn</structfield>
+ cannot be applied after the <structfield>aggfinalfn</structfield>, or
+ <literal>w</literal> if it writes on the value
+ </entry>
+ </row>
+ <row>
+ <entry><structfield>aggmfinalmodify</structfield></entry>
+ <entry><type>char</type></entry>
+ <entry></entry>
+ <entry>Like <structfield>aggfinalmodify</structfield>, but for
+ the <structfield>aggmfinalfn</structfield>
+ </entry>
+ </row>
+ <row>
<entry><structfield>aggsortop</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal><link linkend="catalog-pg-operator"><structname>pg_operator</structname></link>.oid</literal></entry>
diff --git a/doc/src/sgml/ref/create_aggregate.sgml b/doc/src/sgml/ref/create_aggregate.sgml
index c96e4faba7b..4d9c8b0b707 100644
--- a/doc/src/sgml/ref/create_aggregate.sgml
+++ b/doc/src/sgml/ref/create_aggregate.sgml
@@ -27,6 +27,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ <replacea
[ , SSPACE = <replaceable class="parameter">state_data_size</replaceable> ]
[ , FINALFUNC = <replaceable class="parameter">ffunc</replaceable> ]
[ , FINALFUNC_EXTRA ]
+ [ , FINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
[ , COMBINEFUNC = <replaceable class="parameter">combinefunc</replaceable> ]
[ , SERIALFUNC = <replaceable class="parameter">serialfunc</replaceable> ]
[ , DESERIALFUNC = <replaceable class="parameter">deserialfunc</replaceable> ]
@@ -37,6 +38,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ <replacea
[ , MSSPACE = <replaceable class="parameter">mstate_data_size</replaceable> ]
[ , MFINALFUNC = <replaceable class="parameter">mffunc</replaceable> ]
[ , MFINALFUNC_EXTRA ]
+ [ , MFINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
[ , MINITCOND = <replaceable class="parameter">minitial_condition</replaceable> ]
[ , SORTOP = <replaceable class="parameter">sort_operator</replaceable> ]
[ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
@@ -49,6 +51,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ [ <replac
[ , SSPACE = <replaceable class="parameter">state_data_size</replaceable> ]
[ , FINALFUNC = <replaceable class="parameter">ffunc</replaceable> ]
[ , FINALFUNC_EXTRA ]
+ [ , FINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
[ , INITCOND = <replaceable class="parameter">initial_condition</replaceable> ]
[ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
[ , HYPOTHETICAL ]
@@ -63,6 +66,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> (
[ , SSPACE = <replaceable class="parameter">state_data_size</replaceable> ]
[ , FINALFUNC = <replaceable class="parameter">ffunc</replaceable> ]
[ , FINALFUNC_EXTRA ]
+ [ , FINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
[ , COMBINEFUNC = <replaceable class="parameter">combinefunc</replaceable> ]
[ , SERIALFUNC = <replaceable class="parameter">serialfunc</replaceable> ]
[ , DESERIALFUNC = <replaceable class="parameter">deserialfunc</replaceable> ]
@@ -73,6 +77,7 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> (
[ , MSSPACE = <replaceable class="parameter">mstate_data_size</replaceable> ]
[ , MFINALFUNC = <replaceable class="parameter">mffunc</replaceable> ]
[ , MFINALFUNC_EXTRA ]
+ [ , MFINALFUNC_MODIFY = { READ_ONLY | SHARABLE | READ_WRITE } ]
[ , MINITCOND = <replaceable class="parameter">minitial_condition</replaceable> ]
[ , SORTOP = <replaceable class="parameter">sort_operator</replaceable> ]
)
@@ -197,7 +202,8 @@ CREATE AGGREGATE <replaceable class="parameter">name</replaceable> (
as described in <xref linkend="xaggr-moving-aggregates">. This requires
specifying the <literal>MSFUNC</>, <literal>MINVFUNC</>,
and <literal>MSTYPE</> parameters, and optionally
- the <literal>MSPACE</>, <literal>MFINALFUNC</>, <literal>MFINALFUNC_EXTRA</>,
+ the <literal>MSPACE</>, <literal>MFINALFUNC</>,
+ <literal>MFINALFUNC_EXTRA</>, <literal>MFINALFUNC_MODIFY</>,
and <literal>MINITCOND</> parameters. Except for <literal>MINVFUNC</>,
these parameters work like the corresponding simple-aggregate parameters
without <literal>M</>; they define a separate implementation of the
@@ -413,6 +419,21 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
</varlistentry>
<varlistentry>
+ <term><literal>FINALFUNC_MODIFY</> = { <literal>READ_ONLY</> | <literal>SHARABLE</> | <literal>READ_WRITE</> }</term>
+ <listitem>
+ <para>
+ This option specifies whether the final function is a pure function
+ that does not modify its arguments. <literal>READ_ONLY</> indicates
+ it does not; the other two values indicate that it may change the
+ transition state value. See <xref linkend="sql-createaggregate-notes"
+ endterm="sql-createaggregate-notes-title"> below for more detail. The
+ default is <literal>READ_ONLY</>, except for ordered-set aggregates,
+ for which the default is <literal>READ_WRITE</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><replaceable class="parameter">combinefunc</replaceable></term>
<listitem>
<para>
@@ -564,6 +585,16 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
</varlistentry>
<varlistentry>
+ <term><literal>MFINALFUNC_MODIFY</> = { <literal>READ_ONLY</> | <literal>SHARABLE</> | <literal>READ_WRITE</> }</term>
+ <listitem>
+ <para>
+ This option is like <literal>FINALFUNC_MODIFY</>, but it describes
+ the behavior of the moving-aggregate final function.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><replaceable class="parameter">minitial_condition</replaceable></term>
<listitem>
<para>
@@ -587,12 +618,12 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
</varlistentry>
<varlistentry>
- <term><literal>PARALLEL</literal></term>
+ <term><literal>PARALLEL =</> { <literal>SAFE</> | <literal>RESTRICTED</> | <literal>UNSAFE</> }</term>
<listitem>
<para>
The meanings of <literal>PARALLEL SAFE</>, <literal>PARALLEL
RESTRICTED</>, and <literal>PARALLEL UNSAFE</> are the same as
- for <xref linkend="sql-createfunction">. An aggregate will not be
+ in <xref linkend="sql-createfunction">. An aggregate will not be
considered for parallelization if it is marked <literal>PARALLEL
UNSAFE</> (which is the default!) or <literal>PARALLEL RESTRICTED</>.
Note that the parallel-safety markings of the aggregate's support
@@ -624,8 +655,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
</para>
</refsect1>
- <refsect1>
- <title>Notes</title>
+ <refsect1 id="sql-createaggregate-notes">
+ <title id="sql-createaggregate-notes-title">Notes</title>
<para>
In parameters that specify support function names, you can write
@@ -635,6 +666,34 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
</para>
<para>
+ Ordinarily, Postgres functions are expected to be true functions that
+ do not modify their input values. However, an aggregate transition
+ function, <emphasis>when used in the context of an aggregate</>,
+ is allowed to cheat and modify its transition-state argument in place.
+ This can provide substantial performance benefits compared to making
+ a fresh copy of the transition state each time.
+ </para>
+
+ <para>
+ Likewise, while an aggregate final function is normally expected not to
+ modify its input values, sometimes it is impractical to avoid modifying
+ the transition-state argument. Such behavior must be declared using
+ the <literal>FINALFUNC_MODIFY</> parameter. The <literal>READ_WRITE</>
+ value indicates that the final function modifies the transition state in
+ unspecified ways. This value prevents use of the aggregate as a window
+ function, and it also prevents merging of transition states for aggregate
+ calls that share the same input values and transition functions.
+ The <literal>SHARABLE</> value indicates that the transition function
+ cannot be applied after the final function, but multiple final-function
+ calls can be performed on the ending transition state value. This value
+ prevents use of the aggregate as a window function, but it allows merging
+ of transition states. (That is, the optimization of interest here is not
+ applying the same final function repeatedly, but applying different final
+ functions to the same ending transition state value. This is allowed as
+ long as none of the final functions are marked <literal>READ_WRITE</>.)
+ </para>
+
+ <para>
If an aggregate supports moving-aggregate mode, it will improve
calculation efficiency when the aggregate is used as a window function
for a window with moving frame start (that is, a frame start mode other
@@ -671,7 +730,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
Note that whether or not the aggregate supports moving-aggregate
mode, <productname>PostgreSQL</productname> can handle a moving frame
end without recalculation; this is done by continuing to add new values
- to the aggregate's state. It is assumed that the final function does
+ to the aggregate's state. This is why use of an aggregate as a window
+ function requires that the final function be read-only: it must
not damage the aggregate's state value, so that the aggregation can be
continued even after an aggregate result value has been obtained for
one set of frame boundaries.
diff --git a/doc/src/sgml/xaggr.sgml b/doc/src/sgml/xaggr.sgml
index 79a9f288b2b..9e6a6648dc3 100644
--- a/doc/src/sgml/xaggr.sgml
+++ b/doc/src/sgml/xaggr.sgml
@@ -487,6 +487,13 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
C, since their state values aren't definable as any SQL data type.
(In the above example, notice that the state value is declared as
type <type>internal</> &mdash; this is typical.)
+ Also, because the final function performs the sort, it is not possible
+ to continue adding input rows by executing the transition function again
+ later. This means the final function is not <literal>READ_ONLY</>;
+ it must be declared in <xref linkend="sql-createaggregate">
+ as <literal>READ_WRITE</>, or as <literal>SHARABLE</> if it's
+ possible for additional final-function calls to make use of the
+ already-sorted state.
</para>
<para>
@@ -622,16 +629,15 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
<programlisting>
if (AggCheckCallContext(fcinfo, NULL))
</programlisting>
- One reason for checking this is that when it is true for a transition
- function, the first input
+ One reason for checking this is that when it is true, the first input
must be a temporary state value and can therefore safely be modified
in-place rather than allocating a new copy.
See <function>int8inc()</> for an example.
- (This is the <emphasis>only</>
- case where it is safe for a function to modify a pass-by-reference input.
- In particular, final functions for normal aggregates must not
- modify their inputs in any case, because in some cases they will be
- re-executed on the same final state value.)
+ (While aggregate transition functions are always allowed to modify
+ the transition value in-place, aggregate final functions are generally
+ discouraged from doing so; if they do so, the behavior must be declared
+ when creating the aggregate. See <xref linkend="sql-createaggregate">
+ for more detail.)
</para>
<para>