From e2a186b03cc1a87cf26644db18f28a20f10bd739 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Mon, 16 Apr 2007 18:30:04 +0000
Subject: Add a multi-worker capability to autovacuum.

This allows multiple worker processes to be running simultaneously. Also,
now autovacuum processes do not count towards the max_connections limit;
they are counted separately from regular processes, and are limited by the
new GUC variable autovacuum_max_workers.

The launcher now has intelligence to launch workers on each database every
autovacuum_naptime seconds, limited only on the max amount of worker slots
available.

Also, the global worker I/O utilization is limited by the vacuum cost-based
delay feature. Workers are "balanced" so that the total I/O consumption does
not exceed the established limit. This part of the patch was contributed by
ITAGAKI Takahiro.

Per discussion.
---
 doc/src/sgml/config.sgml      | 30 ++++++++++++++++++++------
 doc/src/sgml/maintenance.sgml | 49 +++++++++++++++++++++++++++++++------------
 2 files changed, 60 insertions(+), 19 deletions(-)

(limited to 'doc/src')

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 57a618faa6f..e10d2d753a3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1,4 +1,4 @@
-
+
 Server Configuration
@@ -3166,7 +3166,7 @@ SELECT * FROM parent WHERE key = 2400;
 Controls whether the server should run the
- autovacuum daemon. This is off by default.
+ autovacuum launcher daemon. This is on by default.
 stats_start_collector and stats_row_level must also be turned on for
 autovacuum to work. This parameter can only be set in the postgresql.conf
@@ -3175,6 +3175,21 @@ SELECT * FROM parent WHERE key = 2400;
+
+ autovacuum_max_workers (integer)
+
+ autovacuum_max_workers configuration parameter
+
+
+
+ Specifies the maximum number of autovacuum processes (other than the
+ autovacuum launcher) which may be running at any one time. The default
+ is three (3).
This parameter can only be set in
+ the postgresql.conf file or on the server command line.
+
+
+
+
 autovacuum_naptime (integer)
@@ -3182,9 +3197,9 @@ SELECT * FROM parent WHERE key = 2400;
- Specifies the delay between activity rounds for the autovacuum
- daemon. In each round the daemon examines one database
- and issues VACUUM and ANALYZE commands
+ Specifies the minimum delay between autovacuum runs on any given
+ database. In each round the daemon examines the
+ database and issues VACUUM and ANALYZE commands
 as needed for tables in that database. The delay is measured
 in seconds, and the default is one minute (1m).
 This parameter can only be set in the postgresql.conf
@@ -3318,7 +3333,10 @@ SELECT * FROM parent WHERE key = 2400;
 Specifies the cost limit value that will be used in automatic
 VACUUM operations. If -1 is specified (which is the
 default), the regular
- value will be used.
+ value will be used. Note that
+ the value is distributed proportionally among the running autovacuum
+ workers, if there is more than one, so that the sum of the limits of
+ each worker never exceeds the limit on this variable.
 This parameter can only be set in the postgresql.conf
 file or on the server command line.
 This setting can be overridden for individual tables by entries in
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index fe5369c19c3..2be11332c27 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1,4 +1,4 @@
-
+
 Routine Database Maintenance Tasks
@@ -466,26 +466,43 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
 general information
- Beginning in PostgreSQL 8.1, there is a
- separate optional server process called the autovacuum
- daemon, whose purpose is to automate the execution of
+ Beginning in PostgreSQL 8.1, there is an
+ optional feature called autovacuum,
+ whose purpose is to automate the execution of
 VACUUM and ANALYZE commands.
- When enabled, the autovacuum daemon runs periodically and checks for
+ When enabled, autovacuum checks for
 tables that have had a large number of inserted, updated or deleted
 tuples. These checks use the row-level statistics collection facility;
- therefore, the autovacuum daemon cannot be used unless and are set to true. Also,
- it's important to allow a slot for the autovacuum process when choosing
- the value of . In
- the default configuration, autovacuuming is enabled and the related
+ linkend="guc-stats-row-level"> are set to true.
+ In the default configuration, autovacuuming is enabled and the related
 configuration parameters are appropriately set.
- The autovacuum daemon, when enabled, runs every seconds. On each run, it selects
- one database to process and checks each table within that database.
+ Beginning in PostgreSQL 8.3, autovacuum has a
+ multi-process architecture: there is a daemon process, called the
+ autovacuum launcher, which is in charge of starting
+ an autovacuum worker process on each database every
+ seconds.
+
+
+
+ There is a limit of worker
+ processes that may be running at at any time, so if the VACUUM
+ and ANALYZE work to do takes too long to run, the deadline may
+ be failed to meet for other databases. Also, if a particular database
+ takes long to process, more than one worker may be processing it
+ simultaneously. The workers are smart enough to avoid repeating work that
+ other workers have done, so this is normally not a problem. Note that the
+ number of running workers does not count towards the nor the limits.
+
+
+
+ On each run, the worker process checks each table within that database, and
 VACUUM or ANALYZE commands are issued as needed.
@@ -581,6 +598,12 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
+
+ When multiple workers are running, the cost limit is "balanced" among all
+ the running workers, so that the total impact on the system is the same,
+ regardless of the number of workers actually running.
+
+
-- 
cgit v1.2.3
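
Illustration (not part of the patch): the documentation added above says the
autovacuum cost limit is distributed proportionally among the running workers
so that the sum of the per-worker limits never exceeds the configured value.
The Python sketch below shows one way such a proportional split could work.
It is not the server's actual balancing code; the function name
balance_cost_limit, the use of per-worker "base limits" as weights, and the
example numbers are all assumptions made for illustration.

```python
def balance_cost_limit(total_limit, base_limits):
    """Split total_limit among workers in proportion to base_limits.

    Hypothetical helper, not PostgreSQL code. Integer floor division is
    used on purpose: each share is rounded down, so the shares can never
    add up to more than total_limit.
    """
    if total_limit < 0:
        raise ValueError("total_limit must be non-negative")
    weight_sum = sum(base_limits)
    if weight_sum == 0:
        # No workers, or all weights zero: nothing to distribute.
        return [0] * len(base_limits)
    return [total_limit * base // weight_sum for base in base_limits]


# Three equally weighted workers sharing an example limit of 200.
shares = balance_cost_limit(200, [200, 200, 200])
print(shares, sum(shares))
```

With a single worker the whole limit goes to that worker; as more workers
start, each share shrinks while the total stays at or below the configured
limit, which is the property the patch's documentation describes.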