From b3d24cc0f0aa882ceec0a74a99f94166c6fc3247 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Mon, 16 Aug 2021 17:27:52 -0400
Subject: Revert analyze support for partitioned tables

This reverts the following commits:

1b5617eb844cd2470a334c1d2eec66cf9b39c41a Describe (auto-)analyze behavior for partitioned tables
0e69f705cc1a3df273b38c9883fb5765991e04fe Set pg_class.reltuples for partitioned tables
41badeaba8beee7648ebe7923a41c04f1f3cb302 Document ANALYZE storage parameters for partitioned tables
0827e8af70f4653ba17ed773f123a60eadd9f9c9 autovacuum: handle analyze for partitioned tables

There are efficiency issues in this code when handling databases with large numbers of partitions, and it doesn't look like there is any trivial way to handle those. There are some other issues as well. It's now too late in the cycle for nontrivial fixes, so we'll have to let Postgres 14 users continue to manually ANALYZE their partitioned tables, and hopefully we can fix the issues for Postgres 15.

I kept [most of] be280cdad298 ("Don't reset relhasindex for partitioned tables on ANALYZE") because while we added it due to 0827e8af70f4, it is a good bugfix in its own right, since it affects manual analyze as well as autovacuum-induced analyze, and there's no reason to revert it.

I retained the addition of relkind 'p' to tables included by pg_stat_user_tables, because reverting that would require a catversion bump. Also, in pg14 only, I keep a struct member that was added to PgStat_TabStatEntry to avoid breaking compatibility with existing stat files.

Backpatch to 14.
Discussion: https://postgr.es/m/20210722205458.f2bug3z6qzxzpx2s@alap3.anarazel.de --- doc/src/sgml/maintenance.sgml | 6 ------ doc/src/sgml/perform.sgml | 3 +-- doc/src/sgml/ref/analyze.sgml | 40 +++++++++++--------------------------- doc/src/sgml/ref/create_table.sgml | 8 ++------ doc/src/sgml/ref/pg_restore.sgml | 6 ++---- 5 files changed, 16 insertions(+), 47 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index 998a48fc257..36f975b1e5b 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -817,12 +817,6 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu is compared to the total number of tuples inserted, updated, or deleted since the last ANALYZE. - For partitioned tables, inserts, updates and deletes on partitions - are counted towards this threshold; however, DDL - operations such as ATTACH, DETACH - and DROP are not, so running a manual - ANALYZE is recommended if the partition added or - removed contains a statistically significant volume of data. diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml index ddd6c3ff3e0..89ff58338e5 100644 --- a/doc/src/sgml/perform.sgml +++ b/doc/src/sgml/perform.sgml @@ -1767,8 +1767,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; Whenever you have significantly altered the distribution of data within a table, running ANALYZE is strongly recommended. This - includes bulk loading large amounts of data into the table as well as - attaching, detaching or dropping partitions. Running + includes bulk loading large amounts of data into the table. Running ANALYZE (or VACUUM ANALYZE) ensures that the planner has up-to-date statistics about the table. 
With no statistics or obsolete statistics, the planner might diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml index 176c7cb2256..c8fcebc1612 100644 --- a/doc/src/sgml/ref/analyze.sgml +++ b/doc/src/sgml/ref/analyze.sgml @@ -250,38 +250,20 @@ ANALYZE [ VERBOSE ] [ table_and_columns - If the table being analyzed is partitioned, ANALYZE - will gather statistics by sampling blocks randomly from its partitions; - in addition, it will recurse into each partition and update its statistics. - (However, in multi-level partitioning scenarios, each leaf partition - will only be analyzed once.) - By contrast, if the table being analyzed has inheritance children, - ANALYZE will gather statistics for it twice: - once on the rows of the parent table only, and a second time on the - rows of the parent table with all of its children. This second set of - statistics is needed when planning queries that traverse the entire - inheritance tree. The child tables themselves are not individually - analyzed in this case. + If the table being analyzed has one or more children, + ANALYZE will gather statistics twice: once on the + rows of the parent table only, and a second time on the rows of the + parent table with all of its children. This second set of statistics + is needed when planning queries that traverse the entire inheritance + tree. The autovacuum daemon, however, will only consider inserts or + updates on the parent table itself when deciding whether to trigger an + automatic analyze for that table. If that table is rarely inserted into + or updated, the inheritance statistics will not be up to date unless you + run ANALYZE manually. - The autovacuum daemon counts inserts, updates and deletes in the - partitions to determine if auto-analyze is needed. However, adding - or removing partitions does not affect autovacuum daemon decisions, - so triggering a manual ANALYZE is recommended - when this occurs. 
- - - - Tuples changed in inheritance children do not count towards analyze - on the parent table. If the parent table is empty or rarely modified, - it may never be processed by autovacuum. It's necessary to - periodically run a manual ANALYZE to keep the - statistics of the table hierarchy up to date. - - - - If any of the child tables or partitions are foreign tables whose foreign data wrappers + If any of the child tables are foreign tables whose foreign data wrappers do not support ANALYZE, those child tables are ignored while gathering inheritance statistics. diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml index 15aed2f2515..473a0a4aebd 100644 --- a/doc/src/sgml/ref/create_table.sgml +++ b/doc/src/sgml/ref/create_table.sgml @@ -1374,8 +1374,8 @@ WITH ( MODULUS numeric_literal, REM If a table parameter value is set and the equivalent toast. parameter is not, the TOAST table will use the table's parameter value. - Except where noted, these parameters are not supported on partitioned - tables; however, you can specify them on individual leaf partitions. + Specifying these parameters for partitioned tables is not supported, + but you may specify them for individual leaf partitions. @@ -1457,8 +1457,6 @@ WITH ( MODULUS numeric_literal, REM If true, the autovacuum daemon will perform automatic VACUUM and/or ANALYZE operations on this table following the rules discussed in . - This parameter can be set for partitioned tables to prevent autovacuum - from running ANALYZE on them. If false, this table will not be autovacuumed, except to prevent transaction ID wraparound. See for more about wraparound prevention. @@ -1590,7 +1588,6 @@ WITH ( MODULUS numeric_literal, REM Per-table value for parameter. - This parameter can be set for partitioned tables. @@ -1606,7 +1603,6 @@ WITH ( MODULUS numeric_literal, REM Per-table value for parameter. - This parameter can be set for partitioned tables. 
diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index 35cd56297c8..93ea937ac8e 100644 --- a/doc/src/sgml/ref/pg_restore.sgml +++ b/doc/src/sgml/ref/pg_restore.sgml @@ -922,10 +922,8 @@ CREATE DATABASE foo WITH TEMPLATE template0; Once restored, it is wise to run ANALYZE on each - restored table so the optimizer has useful statistics. - If the table is a partition or an inheritance child, it may also be useful - to analyze the parent to update statistics for the table hierarchy. - See and + restored table so the optimizer has useful statistics; see + and for more information. -- cgit v1.2.3