Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/fdwhandler.sgml | 232
1 files changed, 214 insertions, 18 deletions
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 33863f04f82..236157743a5 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -665,6 +665,108 @@ IsForeignRelUpdatable (Relation rel);
</sect2>
+ <sect2 id="fdw-callbacks-row-locking">
+ <title>FDW Routines for Row Locking</title>
+
+ <para>
+ If an FDW wishes to support <firstterm>late row locking</> (as described
+ in <xref linkend="fdw-row-locking">), it must provide the following
+ callback functions:
+ </para>
+
+ <para>
+<programlisting>
+RowMarkType
+GetForeignRowMarkType (RangeTblEntry *rte,
+ LockClauseStrength strength);
+</programlisting>
+
+ Report which row-marking option to use for a foreign table.
+ <literal>rte</> is the <structname>RangeTblEntry</> node for the table
+ and <literal>strength</> describes the lock strength requested by the
+ relevant <literal>FOR UPDATE/SHARE</> clause, if any. The result must be
+ a member of the <literal>RowMarkType</> enum type.
+ </para>
+
+ <para>
+ This function is called during query planning for each foreign table that
+ appears in an <command>UPDATE</>, <command>DELETE</>, or <command>SELECT
+ FOR UPDATE/SHARE</> query and is not the target of <command>UPDATE</>
+ or <command>DELETE</>.
+ </para>
+
+ <para>
+ If the <function>GetForeignRowMarkType</> pointer is set to
+ <literal>NULL</>, the <literal>ROW_MARK_COPY</> option is always used.
+ (This implies that <function>RefetchForeignRow</> will never be called,
+ so it need not be provided either.)
+ </para>
+
+ <para>
+ See <xref linkend="fdw-row-locking"> for more information.
+ </para>
+
+ <para>
+<programlisting>
+HeapTuple
+RefetchForeignRow (EState *estate,
+ ExecRowMark *erm,
+ Datum rowid,
+ bool *updated);
+</programlisting>
+
+ Re-fetch one tuple from the foreign table, after locking it if required.
+ <literal>estate</> is global execution state for the query.
+ <literal>erm</> is the <structname>ExecRowMark</> struct describing
+ the target foreign table and the row lock type (if any) to acquire.
+ <literal>rowid</> identifies the tuple to be fetched.
+ <literal>updated</> is an output parameter.
+ </para>
+
+ <para>
+ This function should return a palloc'ed copy of the fetched tuple,
+ or <literal>NULL</> if the row lock couldn't be obtained. The row lock
+ type to acquire is defined by <literal>erm-&gt;markType</>, which is the
+ value previously returned by <function>GetForeignRowMarkType</>.
+ (<literal>ROW_MARK_REFERENCE</> means to just re-fetch the tuple without
+ acquiring any lock, and <literal>ROW_MARK_COPY</> will never be seen by
+ this routine.)
+ </para>
+
+ <para>
+ In addition, <literal>*updated</> should be set to <literal>true</>
+ if what was fetched was an updated version of the tuple rather than
+ the same version previously obtained. (If the FDW cannot be sure about
+ this, always returning <literal>true</> is recommended.)
+ </para>
+
+ <para>
+ Note that by default, failure to acquire a row lock should result in
+ raising an error; a <literal>NULL</> return is only appropriate if
+ the <literal>SKIP LOCKED</> option is specified
+ by <literal>erm-&gt;waitPolicy</>.
+ </para>
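
The return rules stated so far for RefetchForeignRow can be summarized as a small decision function. This is a minimal, self-contained C sketch and not code from the PostgreSQL tree: the RowMarkType and LockWaitPolicy enums are local stand-ins that only mirror names from the PostgreSQL headers, while RefetchAction, refetch_action, and the lock_obtained and same_version_confirmed flags are hypothetical names introduced for illustration. A real callback would fetch and lock the remote row, return a palloc'ed HeapTuple, and report the error case through the server's error mechanism.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins mirroring PostgreSQL enum names; not the real headers. */
typedef enum
{
    ROW_MARK_EXCLUSIVE,
    ROW_MARK_NOKEYEXCLUSIVE,
    ROW_MARK_SHARE,
    ROW_MARK_KEYSHARE,
    ROW_MARK_REFERENCE,
    ROW_MARK_COPY
} RowMarkType;

typedef enum
{
    LockWaitBlock,
    LockWaitSkip,
    LockWaitError
} LockWaitPolicy;

/* Hypothetical: what the callback should do once the remote call returns. */
typedef enum
{
    REFETCH_RETURN_TUPLE,       /* return a copy of the fetched tuple */
    REFETCH_RETURN_NULL,        /* return NULL (SKIP LOCKED only) */
    REFETCH_RAISE_ERROR         /* report failure to acquire the lock */
} RefetchAction;

/* Hypothetical helper encoding the documented return rules. */
static RefetchAction
refetch_action(RowMarkType mark_type,
               LockWaitPolicy wait_policy,
               bool lock_obtained,
               bool same_version_confirmed,
               bool *updated)
{
    /* ROW_MARK_COPY rows are never re-fetched, so it cannot arrive here. */
    assert(mark_type != ROW_MARK_COPY);

    *updated = false;

    /* ROW_MARK_REFERENCE means re-fetch only; no lock is needed. */
    if (mark_type != ROW_MARK_REFERENCE && !lock_obtained)
    {
        /* A NULL return is appropriate only under SKIP LOCKED. */
        return (wait_policy == LockWaitSkip) ? REFETCH_RETURN_NULL
                                             : REFETCH_RAISE_ERROR;
    }

    /* When unsure whether the version changed, report it as updated. */
    *updated = !same_version_confirmed;
    return REFETCH_RETURN_TUPLE;
}
```

Keeping the policy separate from the remote access makes each rule easy to check on its own; whether to structure a real FDW this way is a design choice.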
+
+ <para>
+ The <literal>rowid</> is the <structfield>ctid</> value previously read
+ for the row to be re-fetched. Although the <literal>rowid</> value is
+ passed as a <type>Datum</>, it can currently only be a <type>tid</>. The
+ function API is chosen in hopes that it may be possible to allow other
+ data types for row IDs in the future.
+ </para>
+
+ <para>
+ If the <function>RefetchForeignRow</> pointer is set to
+ <literal>NULL</>, attempts to re-fetch rows will fail
+ with an error message.
+ </para>
+
+ <para>
+ See <xref linkend="fdw-row-locking"> for more information.
+ </para>
+
+ </sect2>
+
<sect2 id="fdw-callbacks-explain">
<title>FDW Routines for <command>EXPLAIN</></title>
@@ -1093,30 +1195,124 @@ GetForeignServerByName(const char *name, bool missing_ok);
</para>
<para>
- For an <command>UPDATE</> or <command>DELETE</> against an external data
- source that supports concurrent updates, it is recommended that the
- <literal>ForeignScan</> operation lock the rows that it fetches, perhaps
- via the equivalent of <command>SELECT FOR UPDATE</>. The FDW may also
- choose to lock rows at fetch time when the foreign table is referenced
- in a <command>SELECT FOR UPDATE/SHARE</>; if it does not, the
- <literal>FOR UPDATE</> or <literal>FOR SHARE</> option is essentially a
- no-op so far as the foreign table is concerned. This behavior may yield
- semantics slightly different from operations on local tables, where row
- locking is customarily delayed as long as possible: remote rows may get
- locked even though they subsequently fail locally-applied restriction or
- join conditions. However, matching the local semantics exactly would
- require an additional remote access for every row, and might be
- impossible anyway depending on what locking semantics the external data
- source provides.
- </para>
-
- <para>
<command>INSERT</> with an <literal>ON CONFLICT</> clause does not
support specifying the conflict target, as remote constraints are not
locally known. This in turn implies that <literal>ON CONFLICT DO
UPDATE</> is not supported, since the specification is mandatory there.
</para>
+ </sect1>
+
+ <sect1 id="fdw-row-locking">
+ <title>Row Locking in Foreign Data Wrappers</title>
+
+ <para>
+ If an FDW's underlying storage mechanism has a concept of locking
+ individual rows to prevent concurrent updates of those rows, it is
+ usually worthwhile for the FDW to perform row-level locking with as
+ close an approximation as practical to the semantics used in
+ ordinary <productname>PostgreSQL</> tables. There are multiple
+ considerations involved in this.
+ </para>
+
+ <para>
+ One key decision to be made is whether to perform <firstterm>early
+ locking</> or <firstterm>late locking</>. In early locking, a row is
+ locked when it is first retrieved from the underlying store, while in
+ late locking, the row is locked only when it is known that it needs to
+ be locked. (The difference arises because some rows may be discarded by
+ locally-checked restriction or join conditions.) Early locking is much
+ simpler and avoids extra round trips to a remote store, but it can cause
+ locking of rows that need not have been locked, resulting in reduced
+ concurrency or even unexpected deadlocks. Also, late locking is only
+ possible if the row to be locked can be uniquely re-identified later.
+ Preferably the row identifier should identify a specific version of the
+ row, as <productname>PostgreSQL</> TIDs do.
+ </para>
+
+ <para>
+ By default, <productname>PostgreSQL</> ignores locking considerations
+ when interfacing to FDWs, but an FDW can perform early locking without
+ any explicit support from the core code. The API functions described
+ in <xref linkend="fdw-callbacks-row-locking">, which were added
+ in <productname>PostgreSQL</> 9.5, allow an FDW to use late locking if
+ it wishes.
+ </para>
+
+ <para>
+ An additional consideration is that in <literal>READ COMMITTED</>
+ isolation mode, <productname>PostgreSQL</> may need to re-check
+ restriction and join conditions against an updated version of some
+ target tuple. Rechecking join conditions requires re-obtaining copies
+ of the non-target rows that were previously joined to the target tuple.
+ When working with standard <productname>PostgreSQL</> tables, this is
+ done by including the TIDs of the non-target tables in the column list
+ projected through the join, and then re-fetching non-target rows when
+ required. This approach keeps the join data set compact, but it
+ requires inexpensive re-fetch capability, as well as a TID that can
+ uniquely identify the row version to be re-fetched. By default,
+ therefore, the approach used with foreign tables is to include a copy of
+ the entire row fetched from a foreign table in the column list projected
+ through the join. This puts no special demands on the FDW but can
+ result in reduced performance of merge and hash joins. An FDW that is
+ capable of meeting the re-fetch requirements can choose the first
+ approach instead, projecting only a row identifier through the join and
+ re-fetching rows when required.
+ </para>
+
+ <para>
+ For an <command>UPDATE</> or <command>DELETE</> on a foreign table, it
+ is recommended that the <literal>ForeignScan</> operation on the target
+ table perform early locking on the rows that it fetches, perhaps via the
+ equivalent of <command>SELECT FOR UPDATE</>. An FDW can detect whether
+ a table is an <command>UPDATE</>/<command>DELETE</> target at plan time
+ by comparing its relid to <literal>root-&gt;parse-&gt;resultRelation</>,
+ or at execution time by using <function>ExecRelationIsTargetRelation()</>.
+ An alternative possibility is to perform late locking within the
+ <function>ExecForeignUpdate</> or <function>ExecForeignDelete</>
+ callback, but no special support is provided for this.
+ </para>
+
+ <para>
+ For foreign tables that are specified to be locked by a <command>SELECT
+ FOR UPDATE/SHARE</> command, the <literal>ForeignScan</> operation can
+ again perform early locking by fetching tuples with the equivalent
+ of <command>SELECT FOR UPDATE/SHARE</>. To perform late locking
+ instead, provide the callback functions defined
+ in <xref linkend="fdw-callbacks-row-locking">.
+ In <function>GetForeignRowMarkType</>, select rowmark option
+ <literal>ROW_MARK_EXCLUSIVE</>, <literal>ROW_MARK_NOKEYEXCLUSIVE</>,
+ <literal>ROW_MARK_SHARE</>, or <literal>ROW_MARK_KEYSHARE</> depending
+ on the requested lock strength. (The core code will act the same
+ regardless of which of these four options you choose.)
+ Elsewhere, you can detect whether a foreign table was specified to be
+ locked by this type of command by using <function>get_plan_rowmark</> at
+ plan time, or <function>ExecFindRowMark</> at execution time; you must
+ check not only whether a non-null rowmark struct is returned, but that
+ its <structfield>strength</> field is not <literal>LCS_NONE</>.
+ </para>
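
The two-part check described at the end of the preceding paragraph can be sketched as follows. This is a self-contained illustration under stated assumptions: LockClauseStrength is a local stand-in mirroring the enum in lockoptions.h (member names other than LCS_NONE are assumed to match that header), RowMarkStub is a hypothetical struct carrying only the strength field, and rowmark_requests_lock is a hypothetical helper. In a real FDW the pointer would come from get_plan_rowmark or ExecFindRowMark.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in mirroring the PostgreSQL enum; not the real header. */
typedef enum
{
    LCS_NONE,
    LCS_FORKEYSHARE,
    LCS_FORSHARE,
    LCS_FORNOKEYUPDATE,
    LCS_FORUPDATE
} LockClauseStrength;

/* Hypothetical stub: the real rowmark structs carry many more fields. */
typedef struct
{
    LockClauseStrength strength;
} RowMarkStub;

/*
 * Hypothetical helper: a table is to be row-locked only if a rowmark
 * exists AND its strength is something other than LCS_NONE.
 */
static bool
rowmark_requests_lock(const RowMarkStub *rowmark)
{
    return rowmark != NULL && rowmark->strength != LCS_NONE;
}
```

Testing only for a non-null rowmark would be insufficient, because a rowmark with strength LCS_NONE describes a table that is not to be locked.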
+
+ <para>
+ Lastly, for foreign tables that are used in an <command>UPDATE</>,
+ <command>DELETE</> or <command>SELECT FOR UPDATE/SHARE</> command but
+ are not specified to be row-locked, you can override the default choice
+ to copy entire rows by having <function>GetForeignRowMarkType</> select
+ option <literal>ROW_MARK_REFERENCE</> when it sees lock strength
+ <literal>LCS_NONE</>. This will cause <function>RefetchForeignRow</> to
+ be called with that value for <structfield>markType</>; it should then
+ re-fetch the row without acquiring any new lock. (If you have
+ a <function>GetForeignRowMarkType</> function but don't wish to re-fetch
+ unlocked rows, select option <literal>ROW_MARK_COPY</>
+ for <literal>LCS_NONE</>.)
+ </para>
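
The choices described in the last two paragraphs amount to a mapping from requested lock strength to rowmark option. The following is a minimal, self-contained C sketch of that mapping, not code from the PostgreSQL tree: both enums are local stand-ins mirroring names from lockoptions.h and plannodes.h (the LCS_FOR* member names are assumed to match lockoptions.h), and choose_row_mark with its can_refetch flag is a hypothetical helper. A real GetForeignRowMarkType callback takes a RangeTblEntry and the strength, and would include the actual headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins mirroring PostgreSQL enum names; not the real headers. */
typedef enum
{
    LCS_NONE,
    LCS_FORKEYSHARE,
    LCS_FORSHARE,
    LCS_FORNOKEYUPDATE,
    LCS_FORUPDATE
} LockClauseStrength;

typedef enum
{
    ROW_MARK_EXCLUSIVE,
    ROW_MARK_NOKEYEXCLUSIVE,
    ROW_MARK_SHARE,
    ROW_MARK_KEYSHARE,
    ROW_MARK_REFERENCE,
    ROW_MARK_COPY
} RowMarkType;

/*
 * Hypothetical helper: pick a rowmark option for a non-target foreign
 * table.  can_refetch says whether the FDW can cheaply re-fetch a row
 * by its identifier; without that, the default whole-row copy is kept.
 */
static RowMarkType
choose_row_mark(LockClauseStrength strength, bool can_refetch)
{
    if (!can_refetch)
        return ROW_MARK_COPY;

    switch (strength)
    {
        case LCS_NONE:
            return ROW_MARK_REFERENCE;      /* re-fetch, take no lock */
        case LCS_FORKEYSHARE:
            return ROW_MARK_KEYSHARE;
        case LCS_FORSHARE:
            return ROW_MARK_SHARE;
        case LCS_FORNOKEYUPDATE:
            return ROW_MARK_NOKEYEXCLUSIVE;
        case LCS_FORUPDATE:
            return ROW_MARK_EXCLUSIVE;
    }

    /* Not reached for valid input; fall back to the safe default. */
    assert(0 && "unrecognized lock strength");
    return ROW_MARK_COPY;
}
```

Returning ROW_MARK_COPY when re-fetching is not practical matches what happens when no GetForeignRowMarkType callback is provided at all; an FDW taking that route can still lock rows early in its scan.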
+
+ <para>
+ See <filename>src/include/nodes/lockoptions.h</>, the comments
+ for <type>RowMarkType</> and <type>PlanRowMark</>
+ in <filename>src/include/nodes/plannodes.h</>, and the comments for
+ <type>ExecRowMark</> in <filename>src/include/nodes/execnodes.h</> for
+ additional information.
+ </para>
+
</sect1>
</chapter>