summaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
authorTom Lane <tgl@sss.pgh.pa.us>2019-12-11 15:53:36 -0500
committerTom Lane <tgl@sss.pgh.pa.us>2019-12-11 15:53:36 -0500
commit49b83f6474349ecc756348f422003f447a5058bf (patch)
tree3a03049d46e50a01956f44388d9af664a1bb1f05 /doc/src
parent2ed302ab97034a683816a3d6610e03345e6aef3b (diff)
Doc: back-patch documentation about limitations of CHECK constraints.
Back-patch commits 36d442a25 and 1f66c657f into all supported branches. I'd considered doing this when putting in the latter commit, but failed to pull the trigger. Now that we've had an actual field complaint about the lack of such docs, let's do it. Per bug #16158 from Piotr Jander. Original patches by Lætitia Avrot, Patrick Francelle, and me. Discussion: https://postgr.es/m/16158-7ccf2f74b3d655db@postgresql.org
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/ddl.sgml53
-rw-r--r--doc/src/sgml/ref/alter_domain.sgml21
-rw-r--r--doc/src/sgml/ref/alter_table.sgml83
-rw-r--r--doc/src/sgml/ref/create_domain.sgml24
-rw-r--r--doc/src/sgml/ref/create_table.sgml3
5 files changed, 151 insertions, 33 deletions
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 7dbdca60742..06d6e48a35f 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -403,6 +403,59 @@ CREATE TABLE products (
ensure that a column does not contain null values, the not-null
constraint described in the next section can be used.
</para>
+
+ <note>
+ <para>
+ <productname>PostgreSQL</productname> does not support
+ <literal>CHECK</literal> constraints that reference table data other than
+ the new or updated row being checked. While a <literal>CHECK</literal>
+ constraint that violates this rule may appear to work in simple
+ tests, it cannot guarantee that the database will not reach a state
+ in which the constraint condition is false (due to subsequent changes
+ of the other row(s) involved). This would cause a database dump and
+ reload to fail. The reload could fail even when the complete
+ database state is consistent with the constraint, due to rows not
+ being loaded in an order that will satisfy the constraint. If
+ possible, use <literal>UNIQUE</literal>, <literal>EXCLUDE</literal>,
+ or <literal>FOREIGN KEY</literal> constraints to express
+ cross-row and cross-table restrictions.
+ </para>
+
+ <para>
+ If what you desire is a one-time check against other rows at row
+ insertion, rather than a continuously-maintained consistency
+ guarantee, a custom <link linkend="triggers">trigger</link> can be used
+ to implement that. (This approach avoids the dump/reload problem because
+ <application>pg_dump</application> does not reinstall triggers until after
+ reloading data, so that the check will not be enforced during a
+ dump/reload.)
+ </para>
+ </note>
+
+ <note>
+ <para>
+ <productname>PostgreSQL</productname> assumes that
+ <literal>CHECK</literal> constraints' conditions are immutable, that
+ is, they will always give the same result for the same input row.
+ This assumption is what justifies examining <literal>CHECK</literal>
+ constraints only when rows are inserted or updated, and not at other
+ times. (The warning above about not referencing other table data is
+ really a special case of this restriction.)
+ </para>
+
+ <para>
+ An example of a common way to break this assumption is to reference a
+ user-defined function in a <literal>CHECK</literal> expression, and
+ then change the behavior of that
+ function. <productname>PostgreSQL</productname> does not disallow
+ that, but it will not notice if there are rows in the table that now
+ violate the <literal>CHECK</literal> constraint. That would cause a
+ subsequent database dump and reload to fail.
+ The recommended way to handle such a change is to drop the constraint
+ (using <command>ALTER TABLE</command>), adjust the function definition,
+ and re-add the constraint, thereby rechecking it against all table rows.
+ </para>
+ </note>
</sect2>
<sect2>
diff --git a/doc/src/sgml/ref/alter_domain.sgml b/doc/src/sgml/ref/alter_domain.sgml
index 85253e209b0..8201cbb65fc 100644
--- a/doc/src/sgml/ref/alter_domain.sgml
+++ b/doc/src/sgml/ref/alter_domain.sgml
@@ -118,8 +118,8 @@ ALTER DOMAIN <replaceable class="parameter">name</replaceable>
<listitem>
<para>
This form validates a constraint previously added as
- <literal>NOT VALID</literal>, that is, verify that all data in columns using the
- domain satisfy the specified constraint.
+ <literal>NOT VALID</literal>, that is, it verifies that all values in
+ table columns of the domain type satisfy the specified constraint.
</para>
</listitem>
</varlistentry>
@@ -202,7 +202,7 @@ ALTER DOMAIN <replaceable class="parameter">name</replaceable>
<term><literal>NOT VALID</literal></term>
<listitem>
<para>
- Do not verify existing column data for constraint validity.
+ Do not verify existing stored data for constraint validity.
</para>
</listitem>
</varlistentry>
@@ -273,6 +273,21 @@ ALTER DOMAIN <replaceable class="parameter">name</replaceable>
<title>Notes</title>
<para>
+ Although <command>ALTER DOMAIN ADD CONSTRAINT</command> attempts to verify
+ that existing stored data satisfies the new constraint, this check is not
+ bulletproof, because the command cannot <quote>see</quote> table rows that
+ are newly inserted or updated and not yet committed. If there is a hazard
+ that concurrent operations might insert bad data, the way to proceed is to
+ add the constraint using the <literal>NOT VALID</literal> option, commit
+ that command, wait until all transactions started before that commit have
+ finished, and then issue <command>ALTER DOMAIN VALIDATE
+ CONSTRAINT</command> to search for data violating the constraint. This
+ method is reliable because once the constraint is committed, all new
+ transactions are guaranteed to enforce it against new values of the domain
+ type.
+ </para>
+
+ <para>
Currently, <command>ALTER DOMAIN ADD CONSTRAINT</command>, <command>ALTER
DOMAIN VALIDATE CONSTRAINT</command>, and <command>ALTER DOMAIN SET NOT
NULL</command> will fail if the named domain or any derived domain is used
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index c2f25cc7b68..d63490f310d 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -359,27 +359,36 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<xref linkend="sql-createtable"/>, plus the option <literal>NOT
VALID</literal>, which is currently only allowed for foreign key
and CHECK constraints.
- If the constraint is marked <literal>NOT VALID</literal>, the
- potentially-lengthy initial check to verify that all rows in the table
- satisfy the constraint is skipped. The constraint will still be
+ </para>
+
+ <para>
+ Normally, this form will cause a scan of the table to verify that all
+ existing rows in the table satisfy the new constraint. But if
+ the <literal>NOT VALID</literal> option is used, this
+ potentially-lengthy scan is skipped. The constraint will still be
enforced against subsequent inserts or updates (that is, they'll fail
unless there is a matching row in the referenced table, in the case
- of foreign keys; and they'll fail unless the new row matches the
- specified check constraints). But the
+ of foreign keys, or they'll fail unless the new row matches the
+ specified check condition). But the
database will not assume that the constraint holds for all rows in
the table, until it is validated by using the <literal>VALIDATE
- CONSTRAINT</literal> option. Foreign key constraints on partitioned
- tables may not be declared <literal>NOT VALID</literal> at present.
+ CONSTRAINT</literal> option.
+ See <xref linkend="sql-altertable-notes"
+ endterm="sql-altertable-notes-title"/> below for more information
+ about using the <literal>NOT VALID</literal> option.
</para>
<para>
- The addition of a foreign key constraint requires a
- <literal>SHARE ROW EXCLUSIVE</literal> lock on the referenced table.
+ Addition of a foreign key constraint requires a
+ <literal>SHARE ROW EXCLUSIVE</literal> lock on the referenced table,
+ in addition to the lock on the table receiving the constraint.
</para>
<para>
Additional restrictions apply when unique or primary key constraints
- are added to partitioned tables; see <xref linkend="sql-createtable" />.
+ are added to partitioned tables; see <xref linkend="sql-createtable"/>.
+ Also, foreign key constraints on partitioned
+ tables may not be declared <literal>NOT VALID</literal> at present.
</para>
</listitem>
@@ -454,23 +463,13 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<term><literal>VALIDATE CONSTRAINT</literal></term>
<listitem>
<para>
- This form validates a foreign key or check constraint that was previously created
- as <literal>NOT VALID</literal>, by scanning the table to ensure there
- are no rows for which the constraint is not satisfied.
- Nothing happens if the constraint is already marked valid.
- </para>
- <para>
- Validation can be a long process on larger tables. The value of separating
- validation from initial creation is that you can defer validation to less
- busy times, or can be used to give additional time to correct pre-existing
- errors while preventing new errors. Note also that validation on its own
- does not prevent normal write commands against the table while it runs.
- </para>
- <para>
- Validation acquires only a <literal>SHARE UPDATE EXCLUSIVE</literal> lock
- on the table being altered. If the constraint is a foreign key then
- a <literal>ROW SHARE</literal> lock is also required on
- the table referenced by the constraint.
+ This form validates a foreign key or check constraint that was
+ previously created as <literal>NOT VALID</literal>, by scanning the
+ table to ensure there are no rows for which the constraint is not
+ satisfied. Nothing happens if the constraint is already marked valid.
+ (See <xref linkend="sql-altertable-notes"
+ endterm="sql-altertable-notes-title"/> below for an explanation of the
+ usefulness of this command.)
</para>
</listitem>
</varlistentry>
@@ -1190,8 +1189,8 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</variablelist>
</refsect1>
- <refsect1>
- <title>Notes</title>
+ <refsect1 id="sql-altertable-notes">
+ <title id="sql-altertable-notes-title">Notes</title>
<para>
The key word <literal>COLUMN</literal> is noise and can be omitted.
@@ -1238,6 +1237,32 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</para>
<para>
+ Scanning a large table to verify a new foreign key or check constraint
+ can take a long time, and other updates to the table are locked out
+ until the <command>ALTER TABLE ADD CONSTRAINT</command> command is
+ committed. The main purpose of the <literal>NOT VALID</literal>
+ constraint option is to reduce the impact of adding a constraint on
+ concurrent updates. With <literal>NOT VALID</literal>,
+ the <command>ADD CONSTRAINT</command> command does not scan the table
+ and can be committed immediately. After that, a <literal>VALIDATE
+ CONSTRAINT</literal> command can be issued to verify that existing rows
+ satisfy the constraint. The validation step does not need to lock out
+ concurrent updates, since it knows that other transactions will be
+ enforcing the constraint for rows that they insert or update; only
+ pre-existing rows need to be checked. Hence, validation acquires only
+ a <literal>SHARE UPDATE EXCLUSIVE</literal> lock on the table being
+ altered. (If the constraint is a foreign key then a <literal>ROW
+ SHARE</literal> lock is also required on the table referenced by the
+ constraint.) In addition to improving concurrency, it can be useful to
+ use <literal>NOT VALID</literal> and <literal>VALIDATE
+ CONSTRAINT</literal> in cases where the table is known to contain
+ pre-existing violations. Once the constraint is in place, no new
+ violations can be inserted, and the existing problems can be corrected
+ at leisure until <literal>VALIDATE CONSTRAINT</literal> finally
+ succeeds.
+ </para>
+
+ <para>
The <literal>DROP COLUMN</literal> form does not physically remove
the column, but simply makes it invisible to SQL operations. Subsequent
insert and update operations in the table will store a null value for the
diff --git a/doc/src/sgml/ref/create_domain.sgml b/doc/src/sgml/ref/create_domain.sgml
index 49d53043304..81a89249260 100644
--- a/doc/src/sgml/ref/create_domain.sgml
+++ b/doc/src/sgml/ref/create_domain.sgml
@@ -214,6 +214,30 @@ INSERT INTO tab (domcol) VALUES ((SELECT domcol FROM tab WHERE false));
and then to apply column <literal>NOT NULL</literal> constraints to columns of
the domain type as needed, rather than directly to the domain type.
</para>
+
+ <para>
+ <productname>PostgreSQL</productname> assumes that
+ <literal>CHECK</literal> constraints' conditions are immutable, that is,
+ they will always give the same result for the same input value. This
+ assumption is what justifies examining <literal>CHECK</literal>
+ constraints only when a value is first converted to be of a domain type,
+ and not at other times. (This is essentially the same as the treatment
+ of table <literal>CHECK</literal> constraints, as described in
+ <xref linkend="ddl-constraints-check-constraints"/>.)
+ </para>
+
+ <para>
+ An example of a common way to break this assumption is to reference a
+ user-defined function in a <literal>CHECK</literal> expression, and then
+ change the behavior of that
+ function. <productname>PostgreSQL</productname> does not disallow that,
+ but it will not notice if there are stored values of the domain type that
+ now violate the <literal>CHECK</literal> constraint. That would cause a
+ subsequent database dump and reload to fail. The recommended way to
+ handle such a change is to drop the constraint (using <command>ALTER
+ DOMAIN</command>), adjust the function definition, and re-add the
+ constraint, thereby rechecking it against stored data.
+ </para>
</refsect1>
<refsect1>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 7cadd1a6aed..4f27daf1ae6 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -698,7 +698,8 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<para>
Currently, <literal>CHECK</literal> expressions cannot contain
subqueries nor refer to variables other than columns of the
- current row. The system column <literal>tableoid</literal>
+ current row (see <xref linkend="ddl-constraints-check-constraints"/>).
+ The system column <literal>tableoid</literal>
may be referenced, but not any other system column.
</para>