From afb9249d06f47d7a6d4a89fea0c3625fe43c5a5d Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Tue, 12 May 2015 14:10:10 -0400
Subject: Add support for doing late row locking in FDWs.

Previously, FDWs could only do "early row locking", that is lock a row as soon as it's fetched, even though local restriction/join conditions might discard the row later. This patch adds callbacks that allow FDWs to do late locking in the same way that it's done for regular tables.

To make use of this feature, an FDW must support the "ctid" column as a unique row identifier. Currently, since ctid has to be of type TID, the feature is of limited use, though in principle it could be used by postgres_fdw. We may eventually allow FDWs to specify another data type for ctid, which would make it possible for more FDWs to use this feature.

This commit does not modify postgres_fdw to use late locking. We've tested some prototype code for that, but it's not in committable shape, and besides it's quite unclear whether it actually makes sense to do late locking against a remote server. The extra round trips required are likely to outweigh any benefit from improved concurrency.

Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me
---
 doc/src/sgml/fdwhandler.sgml | 232 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 214 insertions(+), 18 deletions(-)
(limited to 'doc/src')

diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 33863f04f82..236157743a5 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -665,6 +665,108 @@ IsForeignRelUpdatable (Relation rel);
+
+ FDW Routines For Row Locking
+
+
+ If an FDW wishes to support late row locking (as described
+ in ), it must provide the following
+ callback functions:
+
+
+
+
+RowMarkType
+GetForeignRowMarkType (RangeTblEntry *rte,
+                       LockClauseStrength strength);
+
+
+ Report which row-marking option to use for a foreign table.
+ rte is the RangeTblEntry node for the table
+ and strength describes the lock strength requested by the
+ relevant FOR UPDATE/SHARE clause, if any. The result must be
+ a member of the RowMarkType enum type.
+
+
+
+ This function is called during query planning for each foreign table that
+ appears in an UPDATE, DELETE, or SELECT
+ FOR UPDATE/SHARE query and is not the target of UPDATE
+ or DELETE.
+
+
+
+ If the GetForeignRowMarkType pointer is set to
+ NULL, the ROW_MARK_COPY option is always used.
+ (This implies that RefetchForeignRow will never be called,
+ so it need not be provided either.)
+
+
+
+ See for more information.
+
+
+
+
+HeapTuple
+RefetchForeignRow (EState *estate,
+                   ExecRowMark *erm,
+                   Datum rowid,
+                   bool *updated);
+
+
+ Re-fetch one tuple from the foreign table, after locking it if required.
+ estate is global execution state for the query.
+ erm is the ExecRowMark struct describing
+ the target foreign table and the row lock type (if any) to acquire.
+ rowid identifies the tuple to be fetched.
+ updated is an output parameter.
+
+
+
+ This function should return a palloc'ed copy of the fetched tuple,
+ or NULL if the row lock couldn't be obtained. The row lock
+ type to acquire is defined by erm->markType, which is the
+ value previously returned by GetForeignRowMarkType.
+ (ROW_MARK_REFERENCE means to just re-fetch the tuple without
+ acquiring any lock, and ROW_MARK_COPY will never be seen by
+ this routine.)
+
+
+
+ In addition, *updated should be set to true
+ if what was fetched was an updated version of the tuple rather than
+ the same version previously obtained. (If the FDW cannot be sure about
+ this, always returning true is recommended.)
+
+
+
+ Note that by default, failure to acquire a row lock should result in
+ raising an error; a NULL return is only appropriate if
+ the SKIP LOCKED option is specified
+ by erm->waitPolicy.
+
+
+
+ The rowid is the ctid value previously read
+ for the row to be re-fetched.
Although the rowid value is
+ passed as a Datum, it can currently only be a tid. The
+ function API is chosen in hopes that it may be possible to allow other
+ datatypes for row IDs in future.
+
+
+
+ If the RefetchForeignRow pointer is set to
+ NULL, attempts to re-fetch rows will fail
+ with an error message.
+
+
+
+ See for more information.
+
+
+
+
 FDW Routines for <command>EXPLAIN</>
@@ -1092,24 +1194,6 @@ GetForeignServerByName(const char *name, bool missing_ok);
 structures that copyObject knows how to copy.
-
- For an UPDATE or DELETE against an external data
- source that supports concurrent updates, it is recommended that the
- ForeignScan operation lock the rows that it fetches, perhaps
- via the equivalent of SELECT FOR UPDATE. The FDW may also
- choose to lock rows at fetch time when the foreign table is referenced
- in a SELECT FOR UPDATE/SHARE; if it does not, the
- FOR UPDATE or FOR SHARE option is essentially a
- no-op so far as the foreign table is concerned. This behavior may yield
- semantics slightly different from operations on local tables, where row
- locking is customarily delayed as long as possible: remote rows may get
- locked even though they subsequently fail locally-applied restriction or
- join conditions. However, matching the local semantics exactly would
- require an additional remote access for every row, and might be
- impossible anyway depending on what locking semantics the external data
- source provides.
-
-
 INSERT with an ON CONFLICT clause does not support specifying the conflict target, as remote constraints are not
@@ -1117,6 +1201,118 @@ GetForeignServerByName(const char *name, bool missing_ok);
 UPDATE is not supported, since the specification is mandatory there.
+
+
+
+ Row Locking in Foreign Data Wrappers
+
+
+ If an FDW's underlying storage mechanism has a concept of locking
+ individual rows to prevent concurrent updates of those rows, it is
+ usually worthwhile for the FDW to perform row-level locking with as
+ close an approximation as practical to the semantics used in
+ ordinary PostgreSQL tables. There are multiple
+ considerations involved in this.
+
+
+
+ One key decision to be made is whether to perform early
+ locking or late locking. In early locking, a row is
+ locked when it is first retrieved from the underlying store, while in
+ late locking, the row is locked only when it is known that it needs to
+ be locked. (The difference arises because some rows may be discarded by
+ locally-checked restriction or join conditions.) Early locking is much
+ simpler and avoids extra round trips to a remote store, but it can cause
+ locking of rows that need not have been locked, resulting in reduced
+ concurrency or even unexpected deadlocks. Also, late locking is only
+ possible if the row to be locked can be uniquely re-identified later.
+ Preferably the row identifier should identify a specific version of the
+ row, as PostgreSQL TIDs do.
+
+
+
+ By default, PostgreSQL ignores locking considerations
+ when interfacing to FDWs, but an FDW can perform early locking without
+ any explicit support from the core code. The API functions described
+ in , which were added
+ in PostgreSQL 9.5, allow an FDW to use late locking if
+ it wishes.
+
+
+
+ An additional consideration is that in READ COMMITTED
+ isolation mode, PostgreSQL may need to re-check
+ restriction and join conditions against an updated version of some
+ target tuple. Rechecking join conditions requires re-obtaining copies
+ of the non-target rows that were previously joined to the target tuple.
+ When working with standard PostgreSQL tables, this is
+ done by including the TIDs of the non-target tables in the column list
+ projected through the join, and then re-fetching non-target rows when
+ required. This approach keeps the join data set compact, but it
+ requires inexpensive re-fetch capability, as well as a TID that can
+ uniquely identify the row version to be re-fetched. By default,
+ therefore, the approach used with foreign tables is to include a copy of
+ the entire row fetched from a foreign table in the column list projected
+ through the join. This puts no special demands on the FDW but can
+ result in reduced performance of merge and hash joins. An FDW that is
+ capable of meeting the re-fetch requirements can choose to do it the
+ first way.
+
+
+
+ For an UPDATE or DELETE on a foreign table, it
+ is recommended that the ForeignScan operation on the target
+ table perform early locking on the rows that it fetches, perhaps via the
+ equivalent of SELECT FOR UPDATE. An FDW can detect whether
+ a table is an UPDATE/DELETE target at plan time
+ by comparing its relid to root->parse->resultRelation,
+ or at execution time by using ExecRelationIsTargetRelation().
+ An alternative possibility is to perform late locking within the
+ ExecForeignUpdate or ExecForeignDelete
+ callback, but no special support is provided for this.
+
+
+
+ For foreign tables that are specified to be locked by a SELECT
+ FOR UPDATE/SHARE command, the ForeignScan operation can
+ again perform early locking by fetching tuples with the equivalent
+ of SELECT FOR UPDATE/SHARE. To perform late locking
+ instead, provide the callback functions defined
+ in .
+ In GetForeignRowMarkType, select rowmark option
+ ROW_MARK_EXCLUSIVE, ROW_MARK_NOKEYEXCLUSIVE,
+ ROW_MARK_SHARE, or ROW_MARK_KEYSHARE depending
+ on the requested lock strength. (The core code will act the same
+ regardless of which of these four options you choose.)
+ Elsewhere, you can detect whether a foreign table was specified to be
+ locked by this type of command by using get_plan_rowmark at
+ plan time, or ExecFindRowMark at execution time; you must
+ check not only whether a non-null rowmark struct is returned, but that
+ its strength field is not LCS_NONE.
+
+
+
+ Lastly, for foreign tables that are used in an UPDATE,
+ DELETE or SELECT FOR UPDATE/SHARE command but
+ are not specified to be row-locked, you can override the default choice
+ to copy entire rows by having GetForeignRowMarkType select
+ option ROW_MARK_REFERENCE when it sees lock strength
+ LCS_NONE. This will cause RefetchForeignRow to
+ be called with that value for markType; it should then
+ re-fetch the row without acquiring any new lock. (If you have
+ a GetForeignRowMarkType function but don't wish to re-fetch
+ unlocked rows, select option ROW_MARK_COPY
+ for LCS_NONE.)
+
+
+
+ See src/include/nodes/lockoptions.h, the comments
+ for RowMarkType and PlanRowMark
+ in src/include/nodes/plannodes.h, and the comments for
+ ExecRowMark in src/include/nodes/execnodes.h for
+ additional information.
+
+
--
cgit v1.2.3
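
Illustrative note, not part of the patch: the rowmark selection that the new documentation describes for GetForeignRowMarkType could be sketched roughly as below. This is a standalone approximation, not code from the commit. The two enums are local stand-ins so the fragment compiles without PostgreSQL headers; the LCS_FOR* member names are assumed from src/include/nodes/lockoptions.h and do not appear in the patch text. The function name choose_row_mark and its can_refetch flag are hypothetical, and the real callback also receives a RangeTblEntry pointer, which is omitted here.

```c
#include <stdbool.h>

/* Stand-in for LockClauseStrength; only LCS_NONE is named in the patch
 * text, the other member names are an assumption (see lead-in). */
typedef enum LockClauseStrength
{
	LCS_NONE,
	LCS_FORKEYSHARE,
	LCS_FORSHARE,
	LCS_FORNOKEYUPDATE,
	LCS_FORUPDATE
} LockClauseStrength;

/* Stand-in for RowMarkType, using the six option names the docs mention. */
typedef enum RowMarkType
{
	ROW_MARK_EXCLUSIVE,
	ROW_MARK_NOKEYEXCLUSIVE,
	ROW_MARK_SHARE,
	ROW_MARK_KEYSHARE,
	ROW_MARK_REFERENCE,
	ROW_MARK_COPY
} RowMarkType;

/*
 * Hypothetical helper: pick a rowmark option for a foreign table.
 * can_refetch says whether this (imaginary) FDW can cheaply re-fetch a
 * row by its row identifier.
 */
static RowMarkType
choose_row_mark(LockClauseStrength strength, bool can_refetch)
{
	switch (strength)
	{
		case LCS_NONE:
			/* Not row-locked: re-fetch by reference, or copy whole rows. */
			return can_refetch ? ROW_MARK_REFERENCE : ROW_MARK_COPY;
		case LCS_FORKEYSHARE:
			return ROW_MARK_KEYSHARE;
		case LCS_FORSHARE:
			return ROW_MARK_SHARE;
		case LCS_FORNOKEYUPDATE:
			return ROW_MARK_NOKEYEXCLUSIVE;
		case LCS_FORUPDATE:
			return ROW_MARK_EXCLUSIVE;
	}
	/* Unknown strength: fall back to the default copy behavior. */
	return ROW_MARK_COPY;
}
```

In this sketch an FDW that cannot re-fetch rows ends up with ROW_MARK_COPY for unlocked tables, which matches the behavior the documentation describes when GetForeignRowMarkType is left NULL.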