diff options
| author | Tom Lane <tgl@sss.pgh.pa.us> | 2024-01-13 13:46:27 -0500 | 
|---|---|---|
| committer | Tom Lane <tgl@sss.pgh.pa.us> | 2024-01-13 13:46:27 -0500 | 
| commit | 98e03f95743624a4418d0d4f549aa1799d4bf28e (patch) | |
| tree | 2f065e978d3772d735d840ea9bcfea7ad125d6fb /doc/src/sgml/ref/checkpoint.sgml | |
| parent | 3386afadd9a2c3c8145ffb4bc0cde97710eba2e5 (diff) | |
Cope with catcache entries becoming stale during detoasting.
We've long had a policy that any toasted fields in a catalog tuple
should be pulled in-line before entering the tuple in a catalog cache.
However, that requires access to the catalog's toast table, and we'll
typically do AcceptInvalidationMessages while opening the toast table.
So it's possible that the catalog tuple is outdated by the time we
finish detoasting it.  Since no cache entry exists yet, we can't
mark the entry stale during AcceptInvalidationMessages, and instead
we'll press forward and build an apparently-valid cache entry.  The
upshot is that we have a race condition whereby an out-of-date entry
could be made in a backend's catalog cache, and persist there
indefinitely causing indeterminate misbehavior.
To fix, use the existing systable_recheck_tuple code to recheck
whether the catalog tuple is still up-to-date after we finish
detoasting it.  If not, loop around and restart the process of
searching the catalog and constructing cache entries from the top.
The case is rare enough that this shouldn't create any meaningful
performance penalty, even in the SearchCatCacheList case where
we need to tear down and reconstruct the whole list.
Indeed, the case is so rare that AFAICT it doesn't occur during
our regression tests, and there doesn't seem to be any easy way
to build a test that would exercise it reliably.  To allow
testing of the retry code paths, add logic (in USE_ASSERT_CHECKING
builds only) that randomly pretends that the recheck failed about
one time out of a thousand.  This is enough to ensure that we'll
pass through the retry paths during most regression test runs.
By adding an extra level of looping, this commit creates a need
to reindent most of SearchCatCacheMiss and SearchCatCacheList.
I'll do that separately, to allow putting those changes in
.git-blame-ignore-revs.
Patch by me; thanks to Alexander Lakhin for having built a test
case to prove the bug is real, and to Xiaoran Wang for review.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
Diffstat (limited to 'doc/src/sgml/ref/checkpoint.sgml')
0 files changed, 0 insertions, 0 deletions
