| | | |
|---|---|---|
| author | Tom Lane <tgl@sss.pgh.pa.us> | 2005-05-07 21:33:21 +0000 |
| committer | Tom Lane <tgl@sss.pgh.pa.us> | 2005-05-07 21:33:21 +0000 |
| commit | 0053e290d9e7b4046101b6c093710a30c2786a9f | |
| tree | 039631ac5254bafed7c22d40af152502209d0a04 /src/backend/access/nbtree/nbtree.c | |
| parent | 501ec7b64c25c8fd595020e02f1cb939f176b30c | |
Repair very-low-probability race condition between relation extension
and VACUUM: in the interval between adding a new page to the relation
and formatting it, it was possible for VACUUM to come along and decide
it should format the page too. Though not harmful in itself, this would
cause data loss if a third transaction were able to insert tuples into
the vacuumed page before the original extender got control back.
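The interleaving described above can be replayed deterministically in a toy model. This is only a sketch: `ToyPage`, `toy_format`, `toy_insert`, and `toy_replay_race` are hypothetical names invented for illustration and do not correspond to PostgreSQL's page or buffer APIs. It assumes that formatting a page resets its tuple count, which is how the final step loses the third transaction's tuple.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical toy page; not PostgreSQL's page layout. */
typedef struct ToyPage
{
	bool	formatted;	/* false while the page is still all zeroes */
	int		ntuples;	/* number of tuples stored on the page */
} ToyPage;

/* Formatting (re)initializes the page, discarding whatever it held. */
static void
toy_format(ToyPage *page)
{
	page->formatted = true;
	page->ntuples = 0;
}

/* Inserting only succeeds on a formatted page. */
static bool
toy_insert(ToyPage *page)
{
	if (!page->formatted)
		return false;
	page->ntuples++;
	return true;
}

/*
 * Replay the unlucky ordering from the commit message and return the
 * number of tuples left on the page at the end.
 */
static int
toy_replay_race(void)
{
	ToyPage	page;

	/* 1. The extender adds a new all-zero page but has not formatted it. */
	memset(&page, 0, sizeof(page));

	/* 2. VACUUM finds the unformatted page and formats it itself. */
	if (!page.formatted)
		toy_format(&page);

	/* 3. A third transaction inserts a tuple into the vacuumed page. */
	toy_insert(&page);
	assert(page.ntuples == 1);

	/* 4. The original extender gets control back and formats the page. */
	toy_format(&page);

	return page.ntuples;
}
```

In this model `toy_replay_race()` returns 0: the tuple inserted in step 3 is gone after step 4, even though each formatting step looks harmless on its own.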
Diffstat (limited to 'src/backend/access/nbtree/nbtree.c')
-rw-r--r-- | src/backend/access/nbtree/nbtree.c | 26 |
1 files changed, 25 insertions, 1 deletions
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 3979f79c358..07b3c8d6f68 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -12,7 +12,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- *	  $Header: /cvsroot/pgsql/src/backend/access/nbtree/nbtree.c,v 1.106 2003/09/29 23:40:26 tgl Exp $
+ *	  $Header: /cvsroot/pgsql/src/backend/access/nbtree/nbtree.c,v 1.106.2.1 2005/05/07 21:33:21 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -707,11 +707,35 @@ btvacuumcleanup(PG_FUNCTION_ARGS)
 	BlockNumber pages_deleted = 0;
 	MemoryContext mycontext;
 	MemoryContext oldcontext;
+	bool		needLock;
 
 	Assert(stats != NULL);
 
+	/*
+	 * First find out the number of pages in the index. We must acquire
+	 * the relation-extension lock while doing this to avoid a race
+	 * condition: if someone else is extending the relation, there is
+	 * a window where bufmgr/smgr have created a new all-zero page but
+	 * it hasn't yet been write-locked by _bt_getbuf(). If we manage to
+	 * scan such a page here, we'll improperly assume it can be recycled.
+	 * Taking the lock synchronizes things enough to prevent a problem:
+	 * either num_pages won't include the new page, or _bt_getbuf already
+	 * has write lock on the buffer and it will be fully initialized before
+	 * we can examine it. (See also vacuumlazy.c, which has the same issue.)
+	 *
+	 * We can skip locking for new or temp relations,
+	 * however, since no one else could be accessing them.
+	 */
+	needLock = !(rel->rd_isnew || rel->rd_istemp);
+
+	if (needLock)
+		LockPage(rel, 0, ExclusiveLock);
+
 	num_pages = RelationGetNumberOfBlocks(rel);
 
+	if (needLock)
+		UnlockPage(rel, 0, ExclusiveLock);
+
 	/* No point in remembering more than MaxFSMPages pages */
 	maxFreePages = MaxFSMPages;
 	if ((BlockNumber) maxFreePages > num_pages)
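The locking pattern in the hunk above (take the extension lock only to read the page count, and skip it for relations no one else can see) can be sketched outside PostgreSQL with a plain mutex. Everything here is an assumption made for illustration: `ToyRelation`, `toy_init`, `toy_extend`, and `toy_vacuum_count_pages` are hypothetical, and a `pthread_mutex_t` stands in for `LockPage(rel, 0, ExclusiveLock)`. The toy also simplifies one point: it formats the new page before releasing the lock, whereas the patch comment says the real code relies on `_bt_getbuf` already holding the buffer write lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PAGES 16

/* Hypothetical stand-in for a relation; not PostgreSQL's Relation struct. */
typedef struct ToyRelation
{
	pthread_mutex_t extension_lock;	/* models the relation-extension lock */
	size_t		num_pages;			/* pages currently in the relation */
	bool		formatted[TOY_MAX_PAGES];
	bool		is_private;			/* models rd_isnew || rd_istemp */
} ToyRelation;

static void
toy_init(ToyRelation *rel, bool is_private)
{
	size_t		i;

	pthread_mutex_init(&rel->extension_lock, NULL);
	rel->num_pages = 0;
	rel->is_private = is_private;
	for (i = 0; i < TOY_MAX_PAGES; i++)
		rel->formatted[i] = false;
}

/*
 * Extender: the new page becomes visible and is made usable while the
 * extension lock is held. Returns the new page number, or -1 if full.
 */
static long
toy_extend(ToyRelation *rel)
{
	long		new_page = -1;

	pthread_mutex_lock(&rel->extension_lock);
	if (rel->num_pages < TOY_MAX_PAGES)
	{
		new_page = (long) rel->num_pages;
		rel->num_pages++;					/* all-zero page now exists */
		rel->formatted[new_page] = true;	/* usable before lock release */
	}
	pthread_mutex_unlock(&rel->extension_lock);
	return new_page;
}

/*
 * Vacuum side: read the page count under the extension lock unless the
 * relation is private, mirroring the needLock logic in the patch.
 */
static size_t
toy_vacuum_count_pages(ToyRelation *rel)
{
	bool		need_lock = !rel->is_private;
	size_t		n;
	size_t		i;

	if (need_lock)
		pthread_mutex_lock(&rel->extension_lock);
	n = rel->num_pages;
	if (need_lock)
		pthread_mutex_unlock(&rel->extension_lock);

	/* Every page counted under the lock must already be formatted. */
	for (i = 0; i < n; i++)
		assert(rel->formatted[i]);

	return n;
}
```

Because the count is read under the same lock the extender holds, the scan either misses the new page entirely or sees it only after it is safe to examine, which is the either/or guarantee the patch comment describes.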