author | Tom Lane <tgl@sss.pgh.pa.us> | 2022-11-22 14:40:20 -0500 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2022-11-22 14:40:20 -0500 |
commit | 9c6ad5eaa957bdc2132b900a96e0d2ec9264d39c (patch) | |
tree | d7416eb7c6a6779916045c655e753aeccb87a101 /src/backend/utils/adt | |
parent | 0538d4c0c33551029f408fdc29ee51b817632e11 (diff) |
YA attempt at taming worst-case behavior of get_actual_variable_range.

We've made multiple attempts at preventing get_actual_variable_range
from taking an unreasonable amount of time (3ca930fc3, fccebe421).
But there's still an issue for the very first planning attempt after
deletion of a large number of extremal-valued tuples. While that
planning attempt will set "killed" bits on the tuples it visits and
thereby reduce effort for next time, there's still a lot of work it
has to do to visit the heap and then set those bits. It's (usually?)
not worth it to do that much work at plan time to have a slightly
better estimate, especially in a context like this where the table
contents are known to be mutating rapidly.

Therefore, let's bound the amount of work to be done by giving up
after we've visited 100 heap pages. Giving up just means we'll
fall back on the extremal value recorded in pg_statistic, so it
shouldn't mean that planner estimates suddenly become worthless.

Note that this means we'll still gradually whittle down the problem
by setting a few more index "killed" bits in each planning attempt;
so eventually we'll reach a good state (barring further deletions),
even in the absence of VACUUM.

Simon Riggs, per a complaint from Jakub Wartak (with cosmetic
adjustments by me). Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKZiRmznOwi0oaV=4PHOCM4ygcH4MgSvt8=5cu_vNCfc8FSUug@mail.gmail.com
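The "gradually whittle down" behavior described in the message can be illustrated with a toy model. This is a sketch under loud assumptions, not backend code: `WhittleEntry`, `whittle_probe`, and `whittle_attempts_until_success` are hypothetical names, and the model assumes that every dead index entry a probe visits gets its "killed" bit set and is skipped at no cost by later probes. The real index access method is more subtle about when those bits are actually written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VISITED_PAGES_LIMIT 100		/* same constant as in the patch */

/* Hypothetical stand-in for one index entry; not a PostgreSQL type. */
typedef struct
{
	unsigned	heap_block;		/* heap page the entry points to */
	bool		dead;			/* heap tuple is dead to every snapshot */
	bool		killed;			/* entry already marked; skipped for free */
} WhittleEntry;

/*
 * One planning attempt: scan from the end of the index, marking dead
 * entries as killed, and give up once more than VISITED_PAGES_LIMIT
 * distinct consecutive heap pages have been charged.
 * Returns true if a live entry was reached.
 */
static bool
whittle_probe(WhittleEntry *entries, size_t n)
{
	unsigned	last_heap_block = (unsigned) -1;	/* "invalid block" */
	int			n_visited_heap_pages = 0;

	assert(entries != NULL);

	for (size_t i = 0; i < n; i++)
	{
		if (entries[i].killed)
			continue;			/* assumed: no heap visit needed */
		if (!entries[i].dead)
			return true;		/* found a usable endpoint */

		/* Dead tuple: remember that, so the next attempt can skip it. */
		entries[i].killed = true;

		if (entries[i].heap_block != last_heap_block)
		{
			last_heap_block = entries[i].heap_block;
			if (++n_visited_heap_pages > VISITED_PAGES_LIMIT)
				return false;	/* budget exhausted for this attempt */
		}
	}
	return false;				/* index is effectively empty */
}

/* Repeat probes until one succeeds; return the attempt number, or -1. */
static int
whittle_attempts_until_success(WhittleEntry *entries, size_t n,
							   int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++)
	{
		if (whittle_probe(entries, n))
			return attempt;
	}
	return -1;
}
```

Under these assumptions, 250 dead entries spread over 250 distinct heap pages in front of one live entry are cleared in three attempts, with no VACUUM involved: each failed attempt pays for a bounded number of pages and leaves less work for the next.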
Diffstat (limited to 'src/backend/utils/adt')
-rw-r--r-- | src/backend/utils/adt/selfuncs.c | 45 |
1 files changed, 40 insertions, 5 deletions
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index e0aeaa69092..f116924d3c4 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5964,7 +5964,7 @@ get_stats_slot_range(AttStatsSlot *sslot, Oid opfuncoid, FmgrInfo *opproc,
  * and fetching its low and/or high values.
  * If successful, store values in *min and *max, and return true.
  * (Either pointer can be NULL if that endpoint isn't needed.)
- * If no data available, return false.
+ * If unsuccessful, return false.
  *
  * sortop is the "<" comparison operator to use.
  * collation is the required collation.
@@ -6093,11 +6093,11 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 			}
 			else
 			{
-				/* If min not requested, assume index is nonempty */
+				/* If min not requested, still want to fetch max */
 				have_data = true;
 			}
 
-			/* If max is requested, and we didn't find the index is empty */
+			/* If max is requested, and we didn't already fail ... */
 			if (max && have_data)
 			{
 				/* scan in the opposite direction; all else is the same */
@@ -6131,7 +6131,7 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 
 /*
  * Get one endpoint datum (min or max depending on indexscandir) from the
- * specified index. Return true if successful, false if index is empty.
+ * specified index. Return true if successful, false if not.
  * On success, endpoint value is stored to *endpointDatum (and copied into
  * outercontext).
  *
@@ -6141,6 +6141,9 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
  * to probe the heap.
  * (We could compute these values locally, but that would mean computing them
  * twice when get_actual_variable_range needs both the min and the max.)
+ *
+ * Failure occurs either when the index is empty, or we decide that it's
+ * taking too long to find a suitable tuple.
  */
 static bool
 get_actual_variable_endpoint(Relation heapRel,
@@ -6157,6 +6160,8 @@ get_actual_variable_endpoint(Relation heapRel,
 	SnapshotData SnapshotNonVacuumable;
 	IndexScanDesc index_scan;
 	Buffer		vmbuffer = InvalidBuffer;
+	BlockNumber last_heap_block = InvalidBlockNumber;
+	int			n_visited_heap_pages = 0;
 	ItemPointer tid;
 	Datum		values[INDEX_MAX_KEYS];
 	bool		isnull[INDEX_MAX_KEYS];
@@ -6199,6 +6204,12 @@ get_actual_variable_endpoint(Relation heapRel,
 	 * might get a bogus answer that's not close to the index extremal value,
 	 * or could even be NULL. We avoid this hazard because we take the data
 	 * from the index entry not the heap.
+	 *
+	 * Despite all this care, there are situations where we might find many
+	 * non-visible tuples near the end of the index. We don't want to expend
+	 * a huge amount of time here, so we give up once we've read too many heap
+	 * pages. When we fail for that reason, the caller will end up using
+	 * whatever extremal value is recorded in pg_statistic.
 	 */
 	InitNonVacuumableSnapshot(SnapshotNonVacuumable,
 							  GlobalVisTestFor(heapRel));
@@ -6213,13 +6224,37 @@ get_actual_variable_endpoint(Relation heapRel,
 	/* Fetch first/next tuple in specified direction */
 	while ((tid = index_getnext_tid(index_scan, indexscandir)) != NULL)
 	{
+		BlockNumber block = ItemPointerGetBlockNumber(tid);
+
 		if (!VM_ALL_VISIBLE(heapRel,
-							ItemPointerGetBlockNumber(tid),
+							block,
 							&vmbuffer))
 		{
 			/* Rats, we have to visit the heap to check visibility */
 			if (!index_fetch_heap(index_scan, tableslot))
+			{
+				/*
+				 * No visible tuple for this index entry, so we need to
+				 * advance to the next entry. Before doing so, count heap
+				 * page fetches and give up if we've done too many.
+				 *
+				 * We don't charge a page fetch if this is the same heap page
+				 * as the previous tuple. This is on the conservative side,
+				 * since other recently-accessed pages are probably still in
+				 * buffers too; but it's good enough for this heuristic.
+				 */
+#define VISITED_PAGES_LIMIT 100
+
+				if (block != last_heap_block)
+				{
+					last_heap_block = block;
+					n_visited_heap_pages++;
+					if (n_visited_heap_pages > VISITED_PAGES_LIMIT)
+						break;
+				}
+
 				continue;		/* no visible tuple, try next index entry */
+			}
 
 			/* We don't actually need the heap tuple for anything */
 			ExecClearTuple(tableslot);
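The page-budget logic in the last hunk can be modeled outside the backend. What follows is a minimal standalone sketch, not PostgreSQL code: `ModelEntry`, `model_get_endpoint`, and `model_range_min` are hypothetical names, and the `visible` flag is an assumed stand-in for the visibility-map and heap checks. Only `VISITED_PAGES_LIMIT`, `last_heap_block`, and `n_visited_heap_pages` are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VISITED_PAGES_LIMIT 100		/* same constant as in the patch */
#define MODEL_INVALID_BLOCK ((unsigned) -1)	/* hypothetical sentinel */

/* Hypothetical stand-in for one index entry; not a PostgreSQL type. */
typedef struct
{
	unsigned	heap_block;		/* heap page the entry points to */
	bool		visible;		/* assumed result of the visibility check */
	int			value;			/* indexed key */
} ModelEntry;

/*
 * Walk entries in scan order.  On the first visible entry, store its key
 * in *endpoint and return true.  Return false if the entries run out or
 * if more than VISITED_PAGES_LIMIT heap pages were charged.
 */
static bool
model_get_endpoint(const ModelEntry *entries, size_t n, int *endpoint)
{
	unsigned	last_heap_block = MODEL_INVALID_BLOCK;
	int			n_visited_heap_pages = 0;

	assert(endpoint != NULL);

	for (size_t i = 0; i < n; i++)
	{
		if (!entries[i].visible)
		{
			/* Charge a page only when it differs from the previous one. */
			if (entries[i].heap_block != last_heap_block)
			{
				last_heap_block = entries[i].heap_block;
				n_visited_heap_pages++;
				if (n_visited_heap_pages > VISITED_PAGES_LIMIT)
					return false;	/* taking too long; give up */
			}
			continue;			/* no visible tuple, try next entry */
		}

		*endpoint = entries[i].value;
		return true;
	}
	return false;				/* ran out of entries */
}

/*
 * Caller-side fallback: use the probed endpoint when available, otherwise
 * the extremal value from statistics (passed in here as stats_min).
 */
static int
model_range_min(const ModelEntry *entries, size_t n, int stats_min)
{
	int			endpoint;

	if (model_get_endpoint(entries, n, &endpoint))
		return endpoint;
	return stats_min;
}
```

In this model, up to 100 distinct dead heap pages ahead of the first visible entry still yield the actual endpoint. One more page and `model_range_min` returns the statistics value instead, which is the bounded-effort trade the commit message argues for.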