author | Tom Lane <tgl@sss.pgh.pa.us> | 2020-08-30 12:21:51 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2020-08-30 12:21:51 -0400 |
commit | 3d351d916b20534f973eda760cde17d96545d4c4 (patch) | |
tree | 133283e6cd7c7308add384bd7fd5d037801ac683 /src/backend/access/table/tableam.c | |
parent | 9511fb37ac78c77736e5483118265f7e83cd9f3c (diff) |
Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.

Historically, we've considered the state with relpages and reltuples
both zero as indicating that we do not know the table's tuple density.
This is problematic because it's impossible to distinguish "never yet
vacuumed" from "vacuumed and seen to be empty". In particular, a user
cannot use VACUUM or ANALYZE to override the planner's normal heuristic
that an empty table should not be believed to be empty because it is
probably about to get populated. That heuristic is a good safety
measure, so I don't care to abandon it, but there should be a way to
override it if the table is indeed intended to stay empty.

Hence, represent the initial state of ignorance by setting reltuples
to -1 (relpages is still set to zero), and apply the minimum-ten-pages
heuristic only when reltuples is still -1. If the table is empty,
VACUUM or ANALYZE (but not CREATE INDEX) will override that to
reltuples = relpages = 0, and then we'll plan on that basis.
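As a rough illustration of the rule just described, here is a minimal standalone sketch. It is not the backend code: `RelStats` and `pages_for_planning` are hypothetical names that stand in for the relevant pg_class fields and the planner's size estimate, and the inheritance exception is taken from the code comment in the diff further down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pg_class fields the heuristic reads. */
typedef struct RelStats
{
	int			relpages;		/* 0 until VACUUM/ANALYZE stores a count */
	double		reltuples;		/* -1 means never vacuumed or analyzed */
	bool		relhassubclass; /* table has inheritance children */
} RelStats;

/*
 * Return the page count to plan with, given the relation's actual current
 * size in pages.  The ten-page floor applies only while reltuples is still
 * negative, i.e. while nothing is known about the table.
 */
int
pages_for_planning(const RelStats *stats, int curpages)
{
	assert(curpages >= 0);

	if (curpages < 10 &&
		stats->reltuples < 0 &&
		!stats->relhassubclass)
		return 10;

	return curpages;
}
```

With these assumptions, a zero-page table whose reltuples is -1 is planned as ten pages, while the same table after a VACUUM that stored reltuples = 0 is planned as zero pages.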

This requires a bunch of fiddly little changes, but we can get rid of
some ugly kluges that were formerly needed to maintain the old definition.

One notable point is that FDWs' GetForeignRelSize methods will see
baserel->tuples = -1 when no ANALYZE has been done on the foreign table.
That seems like a net improvement, since those methods were formerly
also in the dark about what baserel->tuples = 0 really meant. Still,
it is an API change.
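For FDW authors, the practical consequence might look like the sketch below. This is an assumption-laden mock, not the real callback: `MockRelOptInfo` only mimics the `tuples` and `rows` fields of the planner's RelOptInfo, and `ASSUMED_DEFAULT_TUPLES` is an arbitrary fallback chosen for illustration. The point is only that a negative value now means "no ANALYZE yet" and should not be used as a row count.

```c
#include <assert.h>

/* Hypothetical mock of the two RelOptInfo fields used here. */
typedef struct MockRelOptInfo
{
	double		tuples;			/* -1 if the foreign table was never analyzed */
	double		rows;			/* estimated rows after applying quals */
} MockRelOptInfo;

/* Arbitrary fallback used when no statistics exist (an assumption). */
#define ASSUMED_DEFAULT_TUPLES 1000.0

/*
 * Sketch of the size-estimation step of a GetForeignRelSize-style method.
 * A negative tuple count is treated as "unknown" and replaced by a default;
 * zero is believed, since it can only come from an ANALYZE that saw an
 * empty table.
 */
void
mock_get_foreign_rel_size(MockRelOptInfo *baserel, double selectivity)
{
	double		ntuples;

	assert(selectivity >= 0.0 && selectivity <= 1.0);

	if (baserel->tuples < 0)
		ntuples = ASSUMED_DEFAULT_TUPLES;
	else
		ntuples = baserel->tuples;

	baserel->rows = ntuples * selectivity;
}
```

A real FDW would more likely ask the remote side for an estimate than use a fixed constant; the constant just keeps the sketch self-contained.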

I bumped catversion because code predating this change would get confused
by seeing reltuples = -1.

Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
Diffstat (limited to 'src/backend/access/table/tableam.c')
-rw-r--r-- | src/backend/access/table/tableam.c | 22 |
1 files changed, 9 insertions, 13 deletions
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index c6383197657..6438c457161 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -701,18 +701,14 @@ table_block_relation_estimate_size(Relation rel, int32 *attr_widths,
 	 * doesn't happen instantaneously, and it won't happen at all for cases
 	 * such as temporary tables.)
 	 *
-	 * We approximate "never vacuumed" by "has relpages = 0", which means this
-	 * will also fire on genuinely empty relations. Not great, but
-	 * fortunately that's a seldom-seen case in the real world, and it
-	 * shouldn't degrade the quality of the plan too much anyway to err in
-	 * this direction.
+	 * We test "never vacuumed" by seeing whether reltuples < 0.
 	 *
 	 * If the table has inheritance children, we don't apply this heuristic.
 	 * Totally empty parent tables are quite common, so we should be willing
 	 * to believe that they are empty.
 	 */
 	if (curpages < 10 &&
-		relpages == 0 &&
+		reltuples < 0 &&
 		!rel->rd_rel->relhassubclass)
 		curpages = 10;
 
@@ -727,17 +723,17 @@ table_block_relation_estimate_size(Relation rel, int32 *attr_widths,
 	}
 
 	/* estimate number of tuples from previous tuple density */
-	if (relpages > 0)
+	if (reltuples >= 0 && relpages > 0)
 		density = reltuples / (double) relpages;
 	else
 	{
 		/*
-		 * When we have no data because the relation was truncated, estimate
-		 * tuple width from attribute datatypes. We assume here that the
-		 * pages are completely full, which is OK for tables (since they've
-		 * presumably not been VACUUMed yet) but is probably an overestimate
-		 * for indexes. Fortunately get_relation_info() can clamp the
-		 * overestimate to the parent table's size.
+		 * When we have no data because the relation was never yet vacuumed,
+		 * estimate tuple width from attribute datatypes. We assume here that
+		 * the pages are completely full, which is OK for tables but is
+		 * probably an overestimate for indexes. Fortunately
+		 * get_relation_info() can clamp the overestimate to the parent
+		 * table's size.
 		 *
 		 * Note: this code intentionally disregards alignment considerations,
 		 * because (a) that would be gilding the lily considering how crude
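The density logic touched by the second hunk can be sketched outside the backend roughly as follows. This is a simplified illustration under stated assumptions, not the function from tableam.c: `estimate_tuples`, `usable_bytes_per_page`, and `tuple_width` are hypothetical names, the fallback assumes completely full pages with whole tuples only, and rounding of the final result is omitted.

```c
#include <assert.h>

/*
 * Estimate the current tuple count of a relation.
 *
 * reltuples / relpages: stored statistics (reltuples < 0 means none yet)
 * curpages:             the relation's actual current size in pages
 * usable_bytes_per_page, tuple_width: hypothetical inputs for the fallback
 */
double
estimate_tuples(double reltuples, int relpages, int curpages,
				int usable_bytes_per_page, int tuple_width)
{
	double		density;

	assert(tuple_width > 0);
	assert(curpages >= 0);

	if (reltuples >= 0 && relpages > 0)
	{
		/* Scale the previously observed tuples-per-page to the new size. */
		density = reltuples / (double) relpages;
	}
	else
	{
		/*
		 * No usable statistics: assume pages are completely full of
		 * tuples of the estimated width (integer division, so only whole
		 * tuples per page are counted).
		 */
		density = usable_bytes_per_page / tuple_width;
	}

	return density * curpages;
}
```

For example, stored statistics of 1000 tuples in 10 pages scale to 2000 tuples once the table has grown to 20 pages, whereas a never-vacuumed table falls back to the width-based guess.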