summaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/func.sgml70
-rw-r--r--doc/src/sgml/queries.sgml175
-rw-r--r--doc/src/sgml/ref/select.sgml33
3 files changed, 271 insertions, 7 deletions
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b1e94d7b9e2..89a609f61c1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12228,7 +12228,9 @@ NULL baz</literallayout>(3 rows)</entry>
<xref linkend="functions-aggregate-statistics-table">.
The built-in ordered-set aggregate functions
are listed in <xref linkend="functions-orderedset-table"> and
- <xref linkend="functions-hypothetical-table">.
+ <xref linkend="functions-hypothetical-table">. Grouping operations,
+ which are closely related to aggregate functions, are listed in
+ <xref linkend="functions-grouping-table">.
The special syntax considerations for aggregate
functions are explained in <xref linkend="syntax-aggregates">.
Consult <xref linkend="tutorial-agg"> for additional introductory
@@ -13326,6 +13328,72 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
to the rule specified in the <literal>ORDER BY</> clause.
</para>
+ <table id="functions-grouping-table">
+ <title>Grouping Operations</title>
+
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Function</entry>
+ <entry>Return Type</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+
+ <tbody>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>GROUPING</primary>
+ </indexterm>
+ <function>GROUPING(<replaceable class="parameter">args...</replaceable>)</function>
+ </entry>
+ <entry>
+ <type>integer</type>
+ </entry>
+ <entry>
+ Integer bitmask indicating which arguments are not being included in the current
+ grouping set
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ Grouping operations are used in conjunction with grouping sets (see
+ <xref linkend="queries-grouping-sets">) to distinguish result rows. The
+ arguments to the <literal>GROUPING</> operation are not actually evaluated,
+ but they must match exactly expressions given in the <literal>GROUP BY</>
+ clause of the associated query level. Bits are assigned with the rightmost
+ argument being the least-significant bit; each bit is 0 if the corresponding
+ expression is included in the grouping criteria of the grouping set generating
+ the result row, and 1 if it is not. For example:
+<screen>
+<prompt>=&gt;</> <userinput>SELECT * FROM items_sold;</>
+ make | model | sales
+-------+-------+-------
+ Foo | GT | 10
+ Foo | Tour | 20
+ Bar | City | 15
+ Bar | Sport | 5
+(4 rows)
+
+<prompt>=&gt;</> <userinput>SELECT make, model, GROUPING(make,model), sum(sales) FROM items_sold GROUP BY ROLLUP(make,model);</>
+ make | model | grouping | sum
+-------+-------+----------+-----
+ Foo | GT | 0 | 10
+ Foo | Tour | 0 | 20
+ Bar | City | 0 | 15
+ Bar | Sport | 0 | 5
+ Foo | | 1 | 30
+ Bar | | 1 | 20
+ | | 3 | 50
+(7 rows)
+</screen>
+ </para>
+
</sect1>
<sect1 id="functions-window">
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 7dbad462a5d..ab49bd7e91f 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -1183,6 +1183,181 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
</para>
</sect2>
+ <sect2 id="queries-grouping-sets">
+ <title><literal>GROUPING SETS</>, <literal>CUBE</>, and <literal>ROLLUP</></title>
+
+ <indexterm zone="queries-grouping-sets">
+ <primary>GROUPING SETS</primary>
+ </indexterm>
+ <indexterm zone="queries-grouping-sets">
+ <primary>CUBE</primary>
+ </indexterm>
+ <indexterm zone="queries-grouping-sets">
+ <primary>ROLLUP</primary>
+ </indexterm>
+
+ <para>
+ More complex grouping operations than those described above are possible
+ using the concept of <firstterm>grouping sets</>. The data selected by
+ the <literal>FROM</> and <literal>WHERE</> clauses is grouped separately
+ by each specified grouping set, aggregates computed for each group just as
+ for simple <literal>GROUP BY</> clauses, and then the results returned.
+ For example:
+<screen>
+<prompt>=&gt;</> <userinput>SELECT * FROM items_sold;</>
+ brand | size | sales
+-------+------+-------
+ Foo | L | 10
+ Foo | M | 20
+ Bar | M | 15
+ Bar | L | 5
+(4 rows)
+
+<prompt>=&gt;</> <userinput>SELECT brand, size, sum(sales) FROM items_sold GROUP BY GROUPING SETS ((brand), (size), ());</>
+ brand | size | sum
+-------+------+-----
+ Foo | | 30
+ Bar | | 20
+ | L | 15
+ | M | 35
+ | | 50
+(5 rows)
+</screen>
+ </para>
+
+ <para>
+ Each sublist of <literal>GROUPING SETS</> may specify zero or more columns
+ or expressions and is interpreted the same way as though it were directly
+ in the <literal>GROUP BY</> clause. An empty grouping set means that all
+ rows are aggregated down to a single group (which is output even if no
+ input rows were present), as described above for the case of aggregate
+ functions with no <literal>GROUP BY</> clause.
+ </para>
+
+ <para>
+ References to the grouping columns or expressions are replaced
+ by <literal>NULL</> values in result rows for grouping sets in which those
+ columns do not appear. To distinguish which grouping a particular output
+ row resulted from, see <xref linkend="functions-grouping-table">.
+ </para>
+
+ <para>
+ A shorthand notation is provided for specifying two common types of grouping set.
+ A clause of the form
+<programlisting>
+ROLLUP ( <replaceable>e1</>, <replaceable>e2</>, <replaceable>e3</>, ... )
+</programlisting>
+ represents the given list of expressions and all prefixes of the list including
+ the empty list; thus it is equivalent to
+<programlisting>
+GROUPING SETS (
+ ( <replaceable>e1</>, <replaceable>e2</>, <replaceable>e3</>, ... ),
+ ...
+ ( <replaceable>e1</>, <replaceable>e2</> )
+ ( <replaceable>e1</> )
+ ( )
+)
+</programlisting>
+ This is commonly used for analysis over hierarchical data; e.g. total
+ salary by department, division, and company-wide total.
+ </para>
+
+ <para>
+ A clause of the form
+<programlisting>
+CUBE ( <replaceable>e1</>, <replaceable>e2</>, ... )
+</programlisting>
+ represents the given list and all of its possible subsets (i.e. the power
+ set). Thus
+<programlisting>
+CUBE ( a, b, c )
+</programlisting>
+ is equivalent to
+<programlisting>
+GROUPING SETS (
+ ( a, b, c ),
+ ( a, b ),
+ ( a, c ),
+ ( a ),
+ ( b, c ),
+ ( b ),
+ ( c ),
+ ( ),
+)
+</programlisting>
+ </para>
+
+ <para>
+ The individual elements of a <literal>CUBE</> or <literal>ROLLUP</>
+ clause may be either individual expressions, or sub-lists of elements in
+ parentheses. In the latter case, the sub-lists are treated as single
+ units for the purposes of generating the individual grouping sets.
+ For example:
+<programlisting>
+CUBE ( (a,b), (c,d) )
+</programlisting>
+ is equivalent to
+<programlisting>
+GROUPING SETS (
+ ( a, b, c, d )
+ ( a, b )
+ ( c, d )
+ ( )
+)
+</programlisting>
+ and
+<programlisting>
+ROLLUP ( a, (b,c), d )
+</programlisting>
+ is equivalent to
+<programlisting>
+GROUPING SETS (
+ ( a, b, c, d )
+ ( a, b, c )
+ ( a )
+ ( )
+)
+</programlisting>
+ </para>
+
+ <para>
+ The <literal>CUBE</> and <literal>ROLLUP</> constructs can be used either
+ directly in the <literal>GROUP BY</> clause, or nested inside a
+ <literal>GROUPING SETS</> clause. If one <literal>GROUPING SETS</> clause
+ is nested inside another, the effect is the same as if all the elements of
+ the inner clause had been written directly in the outer clause.
+ </para>
+
+ <para>
+ If multiple grouping items are specified in a single <literal>GROUP BY</>
+ clause, then the final list of grouping sets is the cross product of the
+ individual items. For example:
+<programlisting>
+GROUP BY a, CUBE(b,c), GROUPING SETS ((d), (e))
+</programlisting>
+ is equivalent to
+<programlisting>
+GROUP BY GROUPING SETS (
+ (a,b,c,d), (a,b,c,e),
+ (a,b,d), (a,b,e),
+ (a,c,d), (a,c,e),
+ (a,d), (a,e)
+)
+</programlisting>
+ </para>
+
+ <note>
+ <para>
+ The construct <literal>(a,b)</> is normally recognized in expressions as
+ a <link linkend="sql-syntax-row-constructors">row constructor</link>.
+ Within the <literal>GROUP BY</> clause, this does not apply at the top
+ levels of expressions, and <literal>(a,b)</> is parsed as a list of
+ expressions as described above. If for some reason you <emphasis>need</>
+ a row constructor in a grouping expression, use <literal>ROW(a,b)</>.
+ </para>
+ </note>
+ </sect2>
+
<sect2 id="queries-window">
<title>Window Function Processing</title>
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 42e04660a19..632d7935cb4 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -37,7 +37,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ * | <replaceable class="parameter">expression</replaceable> [ [ AS ] <replaceable class="parameter">output_name</replaceable> ] [, ...] ]
[ FROM <replaceable class="parameter">from_item</replaceable> [, ...] ]
[ WHERE <replaceable class="parameter">condition</replaceable> ]
- [ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
+ [ GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] <replaceable class="parameter">select</replaceable> ]
@@ -60,6 +60,15 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ WITH ORDINALITY ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ]
<replaceable class="parameter">from_item</replaceable> [ NATURAL ] <replaceable class="parameter">join_type</replaceable> <replaceable class="parameter">from_item</replaceable> [ ON <replaceable class="parameter">join_condition</replaceable> | USING ( <replaceable class="parameter">join_column</replaceable> [, ...] ) ]
+<phrase>and <replaceable class="parameter">grouping_element</replaceable> can be one of:</phrase>
+
+ ( )
+ <replaceable class="parameter">expression</replaceable>
+ ( <replaceable class="parameter">expression</replaceable> [, ...] )
+ ROLLUP ( { <replaceable class="parameter">expression</replaceable> | ( <replaceable class="parameter">expression</replaceable> [, ...] ) } [, ...] )
+ CUBE ( { <replaceable class="parameter">expression</replaceable> | ( <replaceable class="parameter">expression</replaceable> [, ...] ) } [, ...] )
+ GROUPING SETS ( <replaceable class="parameter">grouping_element</replaceable> [, ...] )
+
<phrase>and <replaceable class="parameter">with_query</replaceable> is:</phrase>
<replaceable class="parameter">with_query_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] AS ( <replaceable class="parameter">select</replaceable> | <replaceable class="parameter">values</replaceable> | <replaceable class="parameter">insert</replaceable> | <replaceable class="parameter">update</replaceable> | <replaceable class="parameter">delete</replaceable> )
@@ -665,23 +674,35 @@ WHERE <replaceable class="parameter">condition</replaceable>
<para>
The optional <literal>GROUP BY</literal> clause has the general form
<synopsis>
-GROUP BY <replaceable class="parameter">expression</replaceable> [, ...]
+GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...]
</synopsis>
</para>
<para>
<literal>GROUP BY</literal> will condense into a single row all
selected rows that share the same values for the grouped
- expressions. <replaceable
- class="parameter">expression</replaceable> can be an input column
- name, or the name or ordinal number of an output column
- (<command>SELECT</command> list item), or an arbitrary
+ expressions. An <replaceable
+ class="parameter">expression</replaceable> used inside a
+ <replaceable class="parameter">grouping_element</replaceable>
+ can be an input column name, or the name or ordinal number of an
+ output column (<command>SELECT</command> list item), or an arbitrary
expression formed from input-column values. In case of ambiguity,
a <literal>GROUP BY</literal> name will be interpreted as an
input-column name rather than an output column name.
</para>
<para>
+ If any of <literal>GROUPING SETS</>, <literal>ROLLUP</> or
+ <literal>CUBE</> are present as grouping elements, then the
+ <literal>GROUP BY</> clause as a whole defines some number of
+ independent <replaceable>grouping sets</>. The effect of this is
+ equivalent to constructing a <literal>UNION ALL</> between
+ subqueries with the individual grouping sets as their
+ <literal>GROUP BY</> clauses. For further details on the handling
+ of grouping sets see <xref linkend="queries-grouping-sets">.
+ </para>
+
+ <para>
Aggregate functions, if any are used, are computed across all rows
making up each group, producing a separate value for each group.
(If there are aggregate functions but no <literal>GROUP BY</literal>