diff options
Diffstat (limited to 'doc/src')
| -rw-r--r-- | doc/src/sgml/func.sgml | 70 | ||||
| -rw-r--r-- | doc/src/sgml/queries.sgml | 175 | ||||
| -rw-r--r-- | doc/src/sgml/ref/select.sgml | 33 |
3 files changed, 271 insertions, 7 deletions
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index b1e94d7b9e2..89a609f61c1 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -12228,7 +12228,9 @@ NULL baz</literallayout>(3 rows)</entry> <xref linkend="functions-aggregate-statistics-table">. The built-in ordered-set aggregate functions are listed in <xref linkend="functions-orderedset-table"> and - <xref linkend="functions-hypothetical-table">. + <xref linkend="functions-hypothetical-table">. Grouping operations, + which are closely related to aggregate functions, are listed in + <xref linkend="functions-grouping-table">. The special syntax considerations for aggregate functions are explained in <xref linkend="syntax-aggregates">. Consult <xref linkend="tutorial-agg"> for additional introductory @@ -13326,6 +13328,72 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; to the rule specified in the <literal>ORDER BY</> clause. </para> + <table id="functions-grouping-table"> + <title>Grouping Operations</title> + + <tgroup cols="3"> + <thead> + <row> + <entry>Function</entry> + <entry>Return Type</entry> + <entry>Description</entry> + </row> + </thead> + + <tbody> + + <row> + <entry> + <indexterm> + <primary>GROUPING</primary> + </indexterm> + <function>GROUPING(<replaceable class="parameter">args...</replaceable>)</function> + </entry> + <entry> + <type>integer</type> + </entry> + <entry> + Integer bitmask indicating which arguments are not being included in the current + grouping set + </entry> + </row> + </tbody> + </tgroup> + </table> + + <para> + Grouping operations are used in conjunction with grouping sets (see + <xref linkend="queries-grouping-sets">) to distinguish result rows. The + arguments to the <literal>GROUPING</> operation are not actually evaluated, + but they must match exactly expressions given in the <literal>GROUP BY</> + clause of the associated query level. Bits are assigned with the rightmost + argument being the least-significant bit; each bit is 0 if the corresponding + expression is included in the grouping criteria of the grouping set generating + the result row, and 1 if it is not. For example: +<screen> +<prompt>=></> <userinput>SELECT * FROM items_sold;</> + make | model | sales +-------+-------+------- + Foo | GT | 10 + Foo | Tour | 20 + Bar | City | 15 + Bar | Sport | 5 +(4 rows) + +<prompt>=></> <userinput>SELECT make, model, GROUPING(make,model), sum(sales) FROM items_sold GROUP BY ROLLUP(make,model);</> + make | model | grouping | sum +-------+-------+----------+----- + Foo | GT | 0 | 10 + Foo | Tour | 0 | 20 + Bar | City | 0 | 15 + Bar | Sport | 0 | 5 + Foo | | 1 | 30 + Bar | | 1 | 20 + | | 3 | 50 +(7 rows) +</screen> + </para> + </sect1> <sect1 id="functions-window"> diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml index 7dbad462a5d..ab49bd7e91f 100644 --- a/doc/src/sgml/queries.sgml +++ b/doc/src/sgml/queries.sgml @@ -1183,6 +1183,181 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit </para> </sect2> + <sect2 id="queries-grouping-sets"> + <title><literal>GROUPING SETS</>, <literal>CUBE</>, and <literal>ROLLUP</></title> + + <indexterm zone="queries-grouping-sets"> + <primary>GROUPING SETS</primary> + </indexterm> + <indexterm zone="queries-grouping-sets"> + <primary>CUBE</primary> + </indexterm> + <indexterm zone="queries-grouping-sets"> + <primary>ROLLUP</primary> + </indexterm> + + <para> + More complex grouping operations than those described above are possible + using the concept of <firstterm>grouping sets</>. The data selected by + the <literal>FROM</> and <literal>WHERE</> clauses is grouped separately + by each specified grouping set, aggregates computed for each group just as + for simple <literal>GROUP BY</> clauses, and then the results returned. + For example: +<screen> +<prompt>=></> <userinput>SELECT * FROM items_sold;</> + brand | size | sales +-------+------+------- + Foo | L | 10 + Foo | M | 20 + Bar | M | 15 + Bar | L | 5 +(4 rows) + +<prompt>=></> <userinput>SELECT brand, size, sum(sales) FROM items_sold GROUP BY GROUPING SETS ((brand), (size), ());</> + brand | size | sum +-------+------+----- + Foo | | 30 + Bar | | 20 + | L | 15 + | M | 35 + | | 50 +(5 rows) +</screen> + </para> + + <para> + Each sublist of <literal>GROUPING SETS</> may specify zero or more columns + or expressions and is interpreted the same way as though it were directly + in the <literal>GROUP BY</> clause. An empty grouping set means that all + rows are aggregated down to a single group (which is output even if no + input rows were present), as described above for the case of aggregate + functions with no <literal>GROUP BY</> clause. + </para> + + <para> + References to the grouping columns or expressions are replaced + by <literal>NULL</> values in result rows for grouping sets in which those + columns do not appear. To distinguish which grouping a particular output + row resulted from, see <xref linkend="functions-grouping-table">. + </para> + + <para> + A shorthand notation is provided for specifying two common types of grouping set. + A clause of the form +<programlisting> +ROLLUP ( <replaceable>e1</>, <replaceable>e2</>, <replaceable>e3</>, ... ) +</programlisting> + represents the given list of expressions and all prefixes of the list including + the empty list; thus it is equivalent to +<programlisting> +GROUPING SETS ( + ( <replaceable>e1</>, <replaceable>e2</>, <replaceable>e3</>, ... ), + ... + ( <replaceable>e1</>, <replaceable>e2</> ) + ( <replaceable>e1</> ) + ( ) +) +</programlisting> + This is commonly used for analysis over hierarchical data; e.g. total + salary by department, division, and company-wide total. + </para> + + <para> + A clause of the form +<programlisting> +CUBE ( <replaceable>e1</>, <replaceable>e2</>, ... ) +</programlisting> + represents the given list and all of its possible subsets (i.e. the power + set). Thus +<programlisting> +CUBE ( a, b, c ) +</programlisting> + is equivalent to +<programlisting> +GROUPING SETS ( + ( a, b, c ), + ( a, b ), + ( a, c ), + ( a ), + ( b, c ), + ( b ), + ( c ), + ( ), +) +</programlisting> + </para> + + <para> + The individual elements of a <literal>CUBE</> or <literal>ROLLUP</> + clause may be either individual expressions, or sub-lists of elements in + parentheses. In the latter case, the sub-lists are treated as single + units for the purposes of generating the individual grouping sets. + For example: +<programlisting> +CUBE ( (a,b), (c,d) ) +</programlisting> + is equivalent to +<programlisting> +GROUPING SETS ( + ( a, b, c, d ) + ( a, b ) + ( c, d ) + ( ) +) +</programlisting> + and +<programlisting> +ROLLUP ( a, (b,c), d ) +</programlisting> + is equivalent to +<programlisting> +GROUPING SETS ( + ( a, b, c, d ) + ( a, b, c ) + ( a ) + ( ) +) +</programlisting> + </para> + + <para> + The <literal>CUBE</> and <literal>ROLLUP</> constructs can be used either + directly in the <literal>GROUP BY</> clause, or nested inside a + <literal>GROUPING SETS</> clause. If one <literal>GROUPING SETS</> clause + is nested inside another, the effect is the same as if all the elements of + the inner clause had been written directly in the outer clause. + </para> + + <para> + If multiple grouping items are specified in a single <literal>GROUP BY</> + clause, then the final list of grouping sets is the cross product of the + individual items. For example: +<programlisting> +GROUP BY a, CUBE(b,c), GROUPING SETS ((d), (e)) +</programlisting> + is equivalent to +<programlisting> +GROUP BY GROUPING SETS ( + (a,b,c,d), (a,b,c,e), + (a,b,d), (a,b,e), + (a,c,d), (a,c,e), + (a,d), (a,e) +) +</programlisting> + </para> + + <note> + <para> + The construct <literal>(a,b)</> is normally recognized in expressions as + a <link linkend="sql-syntax-row-constructors">row constructor</link>. + Within the <literal>GROUP BY</> clause, this does not apply at the top + levels of expressions, and <literal>(a,b)</> is parsed as a list of + expressions as described above. If for some reason you <emphasis>need</> + a row constructor in a grouping expression, use <literal>ROW(a,b)</>. + </para> + </note> + </sect2> + <sect2 id="queries-window"> <title>Window Function Processing</title> diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index 42e04660a19..632d7935cb4 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -37,7 +37,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac [ * | <replaceable class="parameter">expression</replaceable> [ [ AS ] <replaceable class="parameter">output_name</replaceable> ] [, ...] ] [ FROM <replaceable class="parameter">from_item</replaceable> [, ...] ] [ WHERE <replaceable class="parameter">condition</replaceable> ] - [ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ] + [ GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...] ] [ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ] [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ] [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] <replaceable class="parameter">select</replaceable> ] @@ -60,6 +60,15 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac [ WITH ORDINALITY ] [ [ AS ] <replaceable class="parameter">alias</replaceable> [ ( <replaceable class="parameter">column_alias</replaceable> [, ...] ) ] ] <replaceable class="parameter">from_item</replaceable> [ NATURAL ] <replaceable class="parameter">join_type</replaceable> <replaceable class="parameter">from_item</replaceable> [ ON <replaceable class="parameter">join_condition</replaceable> | USING ( <replaceable class="parameter">join_column</replaceable> [, ...] ) ] +<phrase>and <replaceable class="parameter">grouping_element</replaceable> can be one of:</phrase> + + ( ) + <replaceable class="parameter">expression</replaceable> + ( <replaceable class="parameter">expression</replaceable> [, ...] ) + ROLLUP ( { <replaceable class="parameter">expression</replaceable> | ( <replaceable class="parameter">expression</replaceable> [, ...] ) } [, ...] ) + CUBE ( { <replaceable class="parameter">expression</replaceable> | ( <replaceable class="parameter">expression</replaceable> [, ...] ) } [, ...] ) + GROUPING SETS ( <replaceable class="parameter">grouping_element</replaceable> [, ...] ) + <phrase>and <replaceable class="parameter">with_query</replaceable> is:</phrase> <replaceable class="parameter">with_query_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] AS ( <replaceable class="parameter">select</replaceable> | <replaceable class="parameter">values</replaceable> | <replaceable class="parameter">insert</replaceable> | <replaceable class="parameter">update</replaceable> | <replaceable class="parameter">delete</replaceable> ) @@ -665,23 +674,35 @@ WHERE <replaceable class="parameter">condition</replaceable> <para> The optional <literal>GROUP BY</literal> clause has the general form <synopsis> -GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] +GROUP BY <replaceable class="parameter">grouping_element</replaceable> [, ...] </synopsis> </para> <para> <literal>GROUP BY</literal> will condense into a single row all selected rows that share the same values for the grouped - expressions. <replaceable - class="parameter">expression</replaceable> can be an input column - name, or the name or ordinal number of an output column - (<command>SELECT</command> list item), or an arbitrary + expressions. An <replaceable + class="parameter">expression</replaceable> used inside a + <replaceable class="parameter">grouping_element</replaceable> + can be an input column name, or the name or ordinal number of an + output column (<command>SELECT</command> list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a <literal>GROUP BY</literal> name will be interpreted as an input-column name rather than an output column name. </para> <para> + If any of <literal>GROUPING SETS</>, <literal>ROLLUP</> or + <literal>CUBE</> are present as grouping elements, then the + <literal>GROUP BY</> clause as a whole defines some number of + independent <replaceable>grouping sets</>. The effect of this is + equivalent to constructing a <literal>UNION ALL</> between + subqueries with the individual grouping sets as their + <literal>GROUP BY</> clauses. For further details on the handling + of grouping sets see <xref linkend="queries-grouping-sets">. + </para> + + <para> Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group. (If there are aggregate functions but no <literal>GROUP BY</literal> |
