Indexes with INCLUDE columns and their support in B-tree

This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
author: Teodor Sigaev <teodor@sigaev.ru> 2018-04-07 23:00:39 +0300
committer: Teodor Sigaev <teodor@sigaev.ru> 2018-04-07 23:00:39 +0300
commit: 8224de4f42ccf98e08db07b43d52fed72f962ebb (patch)
tree: 0c4aae878e522178def568fcd2dd274233780f88 /doc/src
parent: 01bb85169afadfe63e2f0e344ff671292080de7e (diff)
6 files changed, 126 insertions, 10 deletions
diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index 10abf90189e..ca81fbbc848 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -433,6 +433,23 @@ returns bool
 
 </sect1>
 
+<sect1 id="btree-included-attributes">
+ <title>Included attributes in B-tree indexes</title>
+
+ <para>
+  As of <productname>PostgreSQL</productname> 11.0 there is an optional
+  INCLUDE clause, which allows to add non-key (included) attributes to index.
+  Those included attributes allow more queries to benefit from index-only scans.
+  We never use included attributes in ScanKeys for search.  That allows us to
+  include into B-tree any datatypes, even those which don't have suitable
+  operator classes.  Included columns only stored in regular tuples on leaf
+  pages.  All pivot tuples on non-leaf pages and highkey tuples are truncated
+  to contain only key attributes.  That helps to slightly reduce the size of
+  index.
+ </para>
+
+</sect1>
+
 <sect1 id="btree-implementation">
  <title>Implementation</title>
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index e8efa13e8df..c304262fdb6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3743,8 +3743,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       <entry><structfield>indnatts</structfield></entry>
       <entry><type>int2</type></entry>
       <entry></entry>
-      <entry>The number of columns in the index (duplicates
-      <literal>pg_class.relnatts</literal>)</entry>
+      <entry>The total number of columns in the index (duplicates
+      <literal>pg_class.relnatts</literal>). This number includes both key and included attributes.</entry>
+     </row>
+
+      <row>
+      <entry><structfield>indnkeyatts</structfield></entry>
+      <entry><type>int2</type></entry>
+      <entry></entry>
+      <entry>The number of key columns in the index. "Key columns" are ordinary
+      index columns (as opposed to "included" columns).</entry>
      </row>
 
      <row>
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index a7f6c8dc6ad..24c3405f918 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -114,6 +114,8 @@ typedef struct IndexAmRoutine
     bool        amcanparallel;
     /* type of data stored in index, or InvalidOid if variable */
     Oid         amkeytype;
+    /* does AM support columns included with clause INCLUDE? */
+    bool        amcaninclude;
 
     /* interface functions */
     ambuild_function ambuild;
@@ -985,7 +987,8 @@ amparallelrescan (IndexScanDesc scan);
    using <firstterm>unique indexes</firstterm>, which are indexes that disallow
    multiple entries with identical keys.  An access method that supports this
    feature sets <structfield>amcanunique</structfield> true.
-   (At present, only b-tree supports it.)
+   (At present, only b-tree supports it.) Columns listed in the
+    <literal>INCLUDE</literal> clause are not used to enforce uniqueness.
   </para>
 
   <para>
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 0818196e766..14a1aa56cb5 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -638,7 +638,8 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
    Indexes can also be used to enforce uniqueness of a column's value,
    or the uniqueness of the combined values of more than one column.
 <synopsis>
-CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <optional>, ...</optional>);
+CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <optional>, ...</optional>)
+[ INCLUDE (<replaceable>column</replaceable> <optional>, ...</optional>) ];
 </synopsis>
    Currently, only B-tree indexes can be declared unique.
   </para>
@@ -647,7 +648,9 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
    When an index is declared unique, multiple table rows with equal
    indexed values are not allowed.  Null values are not considered
    equal.  A multicolumn unique index will only reject cases where all
-   indexed columns are equal in multiple rows.
+   indexed columns are equal in multiple rows.  Columns listed in the
+   <literal>INCLUDE</literal> clause aren't used to enforce constraints
+   (UNIQUE, PRIMARY KEY, etc).
   </para>
 
   <para>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 6a6490cac3d..91692325a50 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -23,6 +23,7 @@ PostgreSQL documentation
 <synopsis>
 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class="parameter">name</replaceable> ] ON [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ USING <replaceable class="parameter">method</replaceable> ]
     ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
+    [ INCLUDE ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
     [ WITH ( <replaceable class="parameter">storage_parameter</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] ) ]
     [ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ]
     [ WHERE <replaceable class="parameter">predicate</replaceable> ]
@@ -144,6 +145,56 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
      </varlistentry>
 
      <varlistentry>
+      <term><literal>INCLUDE</literal></term>
+      <listitem>
+       <para>
+        The optional <literal>INCLUDE</literal> clause specifies a
+        list of columns which will be included as a non-key part in the index.
+        Columns listed in this clause cannot also be present as index key columns.
+        The <literal>INCLUDE</literal> columns exist solely to
+        allow more queries to benefit from <firstterm>index-only scans</firstterm>
+        by including the values of the specified columns in the index.  These values
+        would otherwise have to be obtained by reading the table's heap.
+       </para>
+
+       <para>
+        In <literal>UNIQUE</literal> indexes, uniqueness is only enforced
+        for key columns.  Columns listed in the <literal>INCLUDE</literal>
+        clause have no effect on uniqueness enforcement.  Other constraints
+        (<literal>PRIMARY KEY</literal> and <literal>EXCLUDE</literal>) work
+        the same way.
+       </para>
+
+       <para>
+        Columns listed in the <literal>INCLUDE</literal> clause don't need
+        appropriate operator classes; the clause can contain non-key index
+        columns whose data types don't have operator classes defined for
+        a given access method.
+       </para>
+
+       <para>
+        Expressions are not supported as included columns since they cannot be
+        used in index-only scans.
+       </para>
+
+       <para>
+        Currently, only the B-tree index access method supports this feature.
+        In B-tree indexes, the values of columns listed in the
+        <literal>INCLUDE</literal> clause are included in leaf tuples which
+        are linked to the heap tuples, but are not included into pivot tuples
+        used for tree navigation.  Therefore, moving columns from the list of
+        key columns to the <literal>INCLUDE</literal> clause can slightly
+        reduce index size and improve the tree branching factor.
+       </para>
+
+       <para>
+        Indexes with columns listed in the <literal>INCLUDE</literal> clause
+        are also called <quote>covering indexes</quote>.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><replaceable class="parameter">name</replaceable></term>
       <listitem>
        <para>
@@ -729,7 +780,7 @@ Indexes:
   <title>Examples</title>
 
   <para>
-   To create a B-tree index on the column <literal>title</literal> in
+   To create a unique B-tree index on the column <literal>title</literal> in
    the table <literal>films</literal>:
 <programlisting>
 CREATE UNIQUE INDEX title_idx ON films (title);
@@ -737,6 +788,15 @@ CREATE UNIQUE INDEX title_idx ON films (title);
   </para>
 
   <para>
+   To create a unique B-tree index on the column <literal>title</literal>
+   and included columns <literal>director</literal> and <literal>rating</literal>
+   in the table <literal>films</literal>:
+<programlisting>
+CREATE UNIQUE INDEX title_idx ON films (title) INCLUDE (director, rating);
+</programlisting>
+  </para>
+
+  <para>
    To create an index on the expression <literal>lower(title)</literal>,
    allowing efficient case-insensitive searches:
 <programlisting>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index be0effa5d91..cb3867dbd52 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -73,8 +73,8 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 
 [ CONSTRAINT <replaceable class="parameter">constraint_name</replaceable> ]
 { CHECK ( <replaceable class="parameter">expression</replaceable> ) [ NO INHERIT ] |
-  UNIQUE ( <replaceable class="parameter">column_name</replaceable> [, ... ] ) <replaceable class="parameter">index_parameters</replaceable> |
-  PRIMARY KEY ( <replaceable class="parameter">column_name</replaceable> [, ... ] ) <replaceable class="parameter">index_parameters</replaceable> |
+  UNIQUE ( <replaceable class="parameter">column_name</replaceable> [, ... ] ) <replaceable class="parameter">index_parameters</replaceable> <optional> INCLUDE (<replaceable class="parameter">column_name</replaceable> [, ...]) </optional> |
+  PRIMARY KEY ( <replaceable class="parameter">column_name</replaceable> [, ... ] ) <replaceable class="parameter">index_parameters</replaceable> <optional> INCLUDE (<replaceable class="parameter">column_name</replaceable> [, ...]) </optional> |
   EXCLUDE [ USING <replaceable class="parameter">index_method</replaceable> ] ( <replaceable class="parameter">exclude_element</replaceable> WITH <replaceable class="parameter">operator</replaceable> [, ... ] ) <replaceable class="parameter">index_parameters</replaceable> [ WHERE ( <replaceable class="parameter">predicate</replaceable> ) ] |
   FOREIGN KEY ( <replaceable class="parameter">column_name</replaceable> [, ... ] ) REFERENCES <replaceable class="parameter">reftable</replaceable> [ ( <replaceable class="parameter">refcolumn</replaceable> [, ... ] ) ]
     [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ] [ ON DELETE <replaceable class="parameter">action</replaceable> ] [ ON UPDATE <replaceable class="parameter">action</replaceable> ] }
@@ -769,7 +769,8 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 
    <varlistentry>
     <term><literal>UNIQUE</literal> (column constraint)</term>
-    <term><literal>UNIQUE ( <replaceable class="parameter">column_name</replaceable> [, ... ] )</literal> (table constraint)</term>
+    <term><literal>UNIQUE ( <replaceable class="parameter">column_name</replaceable> [, ... ] )</literal>
+    <optional> INCLUDE ( <replaceable class="parameter">column_name</replaceable> [, ...]) </optional> (table constraint)</term>
 
     <listitem>
      <para>
@@ -798,12 +799,25 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
       partitioned table, as well as those of all its descendant partitioned
       tables, must be included in the constraint definition.
      </para>
+
+     <para>
+      Adding a unique constraint will automatically create a unique btree
+      index on the column or group of columns used in the constraint.
+      The optional clause <literal>INCLUDE</literal> adds to that index
+      one or more columns on which the uniqueness is not enforced.
+      Note that although the constraint is not enforced on the included columns,
+      it still depends on them.  Consequently, some operations on these columns
+      (e.g. <literal>DROP COLUMN</literal>) can cause cascade constraint and
+      index deletion.  See paragraph about <literal>INCLUDE</literal> in
+      <xref linkend="sql-createindex"/> for more information.
+     </para>
     </listitem>
    </varlistentry>
 
    <varlistentry>
     <term><literal>PRIMARY KEY</literal> (column constraint)</term>
-    <term><literal>PRIMARY KEY ( <replaceable class="parameter">column_name</replaceable> [, ... ] )</literal> (table constraint)</term>
+    <term><literal>PRIMARY KEY ( <replaceable class="parameter">column_name</replaceable> [, ... ] )</literal>
+    <optional> INCLUDE ( <replaceable class="parameter">column_name</replaceable> [, ...]) </optional> (table constraint)</term>
     <listitem>
      <para>
       The <literal>PRIMARY KEY</literal> constraint specifies that a column or
@@ -833,6 +847,17 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
       tables.
      </para>
 
+     <para>
+      Adding a <literal>PRIMARY KEY</literal> constraint will automatically
+      create a unique btree index on the column or group of columns used in the
+      constraint.  The optional <literal>INCLUDE</literal> clause allows a list
+      of columns to be specified which will be included in the non-key portion
+      of the index.  Although uniqueness is not enforced on the included columns,
+      the constraint still depends on them. Consequently, some operations on the
+      included columns (e.g. <literal>DROP COLUMN</literal>) can cause cascade
+      constraint and index deletion. See paragraph about <literal>INCLUDE</literal>
+      in <xref linkend="sql-createindex"/> for more information.
+     </para>
     </listitem>
    </varlistentry>
author	Teodor Sigaev <teodor@sigaev.ru>	2018-04-07 23:00:39 +0300
committer	Teodor Sigaev <teodor@sigaev.ru>	2018-04-07 23:00:39 +0300
commit	8224de4f42ccf98e08db07b43d52fed72f962ebb (patch)
tree	0c4aae878e522178def568fcd2dd274233780f88 /doc/src
parent	01bb85169afadfe63e2f0e344ff671292080de7e (diff)