author | Simon Riggs <simon@2ndQuadrant.com> | 2016-04-03 17:46:09 +0100 |
---|---|---|
committer | Simon Riggs <simon@2ndQuadrant.com> | 2016-04-03 17:46:09 +0100 |
commit | 3e4b7d87988f0835f137f15f5c1a40598dd21f3d (patch) | |
tree | 46ae019188fb235d3ad4c4de25fc44db4b7510ab /src/backend/access/nbtree/nbtree.c | |
parent | 3cc38ca7d21255721d600eb75d7cc6708c14764b (diff) |
Avoid pin scan for replay of XLOG_BTREE_VACUUM in all cases
Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require
complex interlocking that matched the requirements on the master. This required
an O(N) operation that became a significant problem with large indexes, causing
replication delays of seconds or in some cases minutes while the
XLOG_BTREE_VACUUM was replayed.
This commit skips the pin scan that was previously required, by observing in
detail when and how it is safe to do so, with full documentation. The pin
scan is skipped only in replay; the VACUUM code path on master is not
touched here and WAL is identical.
The current commit applies in all cases, effectively replacing commit
687f2cd7a0150647794efe432ae0397cb41b60ff.
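To make the cost argument above concrete, here is a minimal toy model, not PostgreSQL code. It assumes each replayed vacuum record carries a "last block vacuumed" value and a target block, and that the old pin scan visited every block in between. The type `BlockNo` and both function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNo;	/* hypothetical stand-in for a block number */

/*
 * Toy model of replaying one vacuum record WITH the pin scan: every block
 * strictly between last_vacuumed and target is visited once before the
 * target page itself is cleaned. Returns the number of pages touched.
 */
static uint64_t
replay_with_pin_scan(BlockNo last_vacuumed, BlockNo target)
{
	uint64_t	touched = 0;

	assert(last_vacuumed < target);
	for (BlockNo blk = last_vacuumed + 1; blk < target; blk++)
		touched++;				/* stands in for locking and unlocking blk */
	return touched + 1;			/* plus the target page itself */
}

/*
 * Toy model of replay WITHOUT the pin scan: the record still carries
 * last_vacuumed (the WAL is unchanged), but replay ignores it and touches
 * only the target page.
 */
static uint64_t
replay_without_pin_scan(BlockNo last_vacuumed, BlockNo target)
{
	(void) last_vacuumed;		/* present in the record, unused in replay */
	(void) target;
	return 1;
}
```

In this model, a record that skips over a long run of untouched leaf pages costs work proportional to the gap when the pin scan is performed, and constant work when it is skipped. That matches the O(N) behaviour the commit message describes, under the stated assumptions.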
Diffstat (limited to 'src/backend/access/nbtree/nbtree.c')
-rw-r--r-- | src/backend/access/nbtree/nbtree.c | 32 |
1 files changed, 9 insertions, 23 deletions
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index f2905cb734e..bf8ade375d1 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -22,7 +22,6 @@
 #include "access/relscan.h"
 #include "access/xlog.h"
 #include "catalog/index.h"
-#include "catalog/pg_namespace.h"
 #include "commands/vacuum.h"
 #include "storage/indexfsm.h"
 #include "storage/ipc.h"
@@ -833,8 +832,7 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 	/*
 	 * Check to see if we need to issue one final WAL record for this index,
 	 * which may be needed for correctness on a hot standby node when
-	 * non-MVCC index scans could take place. This now only occurs when we
-	 * perform a TOAST scan, so only occurs for TOAST indexes.
+	 * non-MVCC index scans could take place.
 	 *
 	 * If the WAL is replayed in hot standby, the replay process needs to get
 	 * cleanup locks on all index leaf pages, just as we've been doing here.
@@ -846,7 +844,6 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 	 * against the last leaf page in the index, if that one wasn't vacuumed.
 	 */
 	if (XLogStandbyInfoActive() &&
-		rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE &&
 		vstate.lastBlockVacuumed < vstate.lastBlockLocked)
 	{
 		Buffer buf;
@@ -1045,25 +1042,14 @@ restart:
 			 */
 			if (ndeletable > 0)
 			{
-				BlockNumber lastBlockVacuumed = InvalidBlockNumber;
-
-				/*
-				 * We may need to record the lastBlockVacuumed for use when
-				 * non-MVCC scans might be performed on the index on a
-				 * hot standby. See explanation in btree_xlog_vacuum().
-				 *
-				 * On a hot standby, a non-MVCC scan can only take place
-				 * when we access a Toast Index, so we need only record
-				 * the lastBlockVacuumed if we are vacuuming a Toast Index.
-				 */
-				if (rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE)
-					lastBlockVacuumed = vstate->lastBlockVacuumed;
-
 				/*
-				 * Notice that the issued XLOG_BTREE_VACUUM WAL record includes an
-				 * instruction to the replay code to get cleanup lock on all pages
-				 * between the previous lastBlockVacuumed and this page. This
-				 * ensures that WAL replay locks all leaf pages at some point.
+				 * Notice that the issued XLOG_BTREE_VACUUM WAL record includes all
+				 * information to the replay code to allow it to get a cleanup lock
+				 * on all pages between the previous lastBlockVacuumed and this page.
+				 * This ensures that WAL replay locks all leaf pages at some point,
+				 * which is important should non-MVCC scans be requested.
+				 * This is currently unused on standby, but we record it anyway, so
+				 * that the WAL contains the required information.
 				 *
 				 * Since we can visit leaf pages out-of-order when recursing,
 				 * replay might end up locking such pages an extra time, but it
@@ -1071,7 +1057,7 @@ restart:
 				 * that.
 				 */
 				_bt_delitems_vacuum(rel, buf, deletable, ndeletable,
-									lastBlockVacuumed);
+									vstate->lastBlockVacuumed);
 
 				/*
 				 * Remember highest leaf page number we've issued a