author Peter Geoghegan <pg@bowt.ie> 2019-07-20 11:11:52 -0700
committer Peter Geoghegan <pg@bowt.ie> 2019-07-20 11:11:52 -0700
commit 577c8802d350e76bb0bb2b7ce5bcac551abd90d1
tree 58cc828476639c5dbd160c8aee8e3062ee84e5e4
parent ee9417a04fba50b72289c746caf4cbd3fa818bb9
Don't rely on estimates for amcheck Bloom filters.
Solely relying on a relation's reltuples/relpages estimate to size the Bloom filters used by amcheck verification makes verification less effective when the estimates are very stale. In extreme cases, verification options that use Bloom filters internally could be totally ineffective, without users receiving any clear indication that certain types of corruption might easily be missed.

To fix, use RelationGetNumberOfBlocks() instead of relpages to size the downlink block Bloom filter. Use the same RelationGetNumberOfBlocks() value to derive a minimum size for the heapallindexed Bloom filter, rather than completely trusting reltuples. Verification will still be reasonably effective when the projected/estimated number of Bloom filter elements is at least 1/5 of the final number of elements, which is assured by the new sizing logic.

Reported-By: Alexander Korotkov
Discussion: https://postgr.es/m/CAH2-Wzk0ke2J42KrNYBKu0Xovjy-sU5ub7PWjgpbsKdAQcL4OA@mail.gmail.com
Backpatch: 11-, where downlink/heapallindexed verification were added.
-rw-r--r-- contrib/amcheck/verify_nbtree.c 16
1 file changed, 11 insertions, 5 deletions
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 767d8e9e1e9..afbcd8ce768 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -350,11 +350,20 @@ bt_check_every_level(Relation rel, Relation heaprel, bool readonly,
 
 	if (state->heapallindexed)
 	{
+		int64 total_pages;
 		int64 total_elems;
 		uint64 seed;
 
-		/* Size Bloom filter based on estimated number of tuples in index */
-		total_elems = (int64) state->rel->rd_rel->reltuples;
+		/*
+		 * Size Bloom filter based on estimated number of tuples in index,
+		 * while conservatively assuming that each block must contain at least
+		 * MaxIndexTuplesPerPage / 5 non-pivot tuples.  (Non-leaf pages cannot
+		 * contain non-pivot tuples.  That's okay because they generally make
+		 * up no more than about 1% of all pages in the index.)
+		 */
+		total_pages = RelationGetNumberOfBlocks(rel);
+		total_elems = Max(total_pages * (MaxIndexTuplesPerPage / 5),
+						  (int64) state->rel->rd_rel->reltuples);
 		/* Random seed relies on backend srandom() call to avoid repetition */
 		seed = random();
 		/* Create Bloom filter to fingerprint index */
@@ -398,8 +407,6 @@ bt_check_every_level(Relation rel, Relation heaprel, bool readonly,
 		}
 		else
 		{
-			int64 total_pages;
-
 			/*
 			 * Extra readonly downlink check.
 			 *
@@ -410,7 +417,6 @@ bt_check_every_level(Relation rel, Relation heaprel, bool readonly,
 			 * splits and page deletions, though.  This is taken care of in
 			 * bt_downlink_missing_check().
 			 */
-			total_pages = (int64) state->rel->rd_rel->relpages;
 			state->downlinkfilter = bloom_create(total_pages, work_mem, seed);
 		}
 	}
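
The sizing rule introduced by this patch can be sketched outside the backend as a small standalone C function. This is only an illustration, not the code in verify_nbtree.c: the function name sketch_heapallindexed_elems and the constant SKETCH_MAX_INDEX_TUPLES_PER_PAGE are hypothetical, and the value 408 is an assumption that stands in for MaxIndexTuplesPerPage on a default 8kB-block build. The physical page count sets a lower bound on the element count, so a stale (or zero) reltuples can no longer shrink the filter below one fifth of a fully packed index.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed stand-in for PostgreSQL's MaxIndexTuplesPerPage on a default
 * 8kB-block build.  The real value comes from the server headers.
 */
#define SKETCH_MAX_INDEX_TUPLES_PER_PAGE 408

/*
 * Hypothetical helper mirroring the patch's Max() expression: take the
 * larger of the catalog estimate and a floor derived from the actual
 * number of blocks in the index.
 */
int64_t
sketch_heapallindexed_elems(int64_t total_pages, double reltuples)
{
	int64_t		floor_elems;
	int64_t		estimate;

	assert(total_pages >= 0);

	/* Integer division, as in the patch: 408 / 5 == 81 tuples per page */
	floor_elems = total_pages * (SKETCH_MAX_INDEX_TUPLES_PER_PAGE / 5);
	estimate = (int64_t) reltuples;

	return (floor_elems > estimate) ? floor_elems : estimate;
}
```

Under these assumptions, a 1000-page index whose reltuples is still 0 gets a filter sized for 81000 elements, while the same index with an accurate estimate of 300000 tuples keeps the larger estimate.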