author: Alvaro Herrera <alvherre@alvh.no-ip.org> 2007-04-16 18:30:04 +0000
committer: Alvaro Herrera <alvherre@alvh.no-ip.org> 2007-04-16 18:30:04 +0000
commit: e2a186b03cc1a87cf26644db18f28a20f10bd739
tree: a11e944e89e9757808f6d86dc9aa79ea6a94948a /doc/src
parent: 42dc4b66e61cde4beb466561f12fd490b6621ee3
Add a multi-worker capability to autovacuum. This allows multiple worker
processes to run simultaneously. Also, autovacuum processes no longer count
towards the max_connections limit; they are counted separately from regular
processes and are limited by the new GUC variable autovacuum_max_workers.

The launcher now has the intelligence to launch workers on each database every
autovacuum_naptime seconds, limited only by the number of worker slots
available.

Also, the global worker I/O utilization is limited by the vacuum cost-based
delay feature. Workers are "balanced" so that the total I/O consumption does
not exceed the established limit. This part of the patch was contributed by
ITAGAKI Takahiro.

Per discussion.
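
The scheduling rule in the message above (one worker per database every autovacuum_naptime seconds, capped by the free worker slots) can be illustrated with a small sketch. This is not the launcher's actual algorithm: the function name pick_databases, its arguments, and the "oldest first" ordering are assumptions made only for illustration, and the sketch ignores the case where several workers process the same database.

```python
def pick_databases(last_run, now, naptime, max_workers, running):
    """Hypothetical helper: choose databases that are due for a worker.

    last_run    -- dict mapping database name to the time of its last run
    now         -- current time, in the same unit as last_run
    naptime     -- minimum delay between runs on one database
    max_workers -- total number of worker slots
    running     -- number of workers already running
    """
    # Only free slots can be used for new workers.
    free_slots = max(0, max_workers - running)
    # A database is due once at least `naptime` has passed since its last run.
    due = [db for db, t in last_run.items() if now - t >= naptime]
    # Assumption: serve the database that has waited longest first.
    due.sort(key=lambda db: last_run[db])
    return due[:free_slots]


if __name__ == "__main__":
    last_run = {"a": 0, "b": 30, "c": 10}
    print(pick_databases(last_run, now=70, naptime=60, max_workers=3, running=1))
```

With one of three slots already busy, only two of the due databases get a worker in this round; a database that is due but finds no free slot simply waits, which is the "deadline not met" situation the documentation change describes.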
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/config.sgml | 30
-rw-r--r-- doc/src/sgml/maintenance.sgml | 49
2 files changed, 60 insertions, 19 deletions
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 57a618faa6f..e10d2d753a3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.119 2007/04/02 15:27:02 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.120 2007/04/16 18:29:50 alvherre Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
@@ -3166,7 +3166,7 @@ SELECT * FROM parent WHERE key = 2400;
<listitem>
<para>
Controls whether the server should run the
- autovacuum daemon. This is off by default.
+ autovacuum launcher daemon. This is on by default.
<varname>stats_start_collector</> and <varname>stats_row_level</>
must also be turned on for autovacuum to work.
This parameter can only be set in the <filename>postgresql.conf</>
@@ -3175,6 +3175,21 @@ SELECT * FROM parent WHERE key = 2400;
</listitem>
</varlistentry>
+ <varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers">
+ <term><varname>autovacuum_max_workers</varname> (<type>integer</type>)</term>
+ <indexterm>
+ <primary><varname>autovacuum_max_workers</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Specifies the maximum number of autovacuum processes (other than the
+ autovacuum launcher) which may be running at any one time. The default
+ is three (<literal>3</literal>). This parameter can only be set in
+ the <filename>postgresql.conf</> file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-autovacuum-naptime" xreflabel="autovacuum_naptime">
<term><varname>autovacuum_naptime</varname> (<type>integer</type>)</term>
<indexterm>
@@ -3182,9 +3197,9 @@ SELECT * FROM parent WHERE key = 2400;
</indexterm>
<listitem>
<para>
- Specifies the delay between activity rounds for the autovacuum
- daemon. In each round the daemon examines one database
- and issues <command>VACUUM</> and <command>ANALYZE</> commands
+ Specifies the minimum delay between autovacuum runs on any given
+ database. In each round the daemon examines the
+ database and issues <command>VACUUM</> and <command>ANALYZE</> commands
as needed for tables in that database. The delay is measured
in seconds, and the default is one minute (<literal>1m</>).
This parameter can only be set in the <filename>postgresql.conf</>
@@ -3318,7 +3333,10 @@ SELECT * FROM parent WHERE key = 2400;
Specifies the cost limit value that will be used in automatic
<command>VACUUM</> operations. If <literal>-1</> is specified (which is the
default), the regular
- <xref linkend="guc-vacuum-cost-limit"> value will be used.
+ <xref linkend="guc-vacuum-cost-limit"> value will be used. Note that
+ the value is distributed proportionally among the running autovacuum
+ workers, if there is more than one, so that the sum of the limits of
+ all workers never exceeds the limit set by this variable.
This parameter can only be set in the <filename>postgresql.conf</>
file or on the server command line.
This setting can be overridden for individual tables by entries in
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index fe5369c19c3..2be11332c27 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.70 2007/02/01 19:10:24 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.71 2007/04/16 18:29:50 alvherre Exp $ -->
<chapter id="maintenance">
<title>Routine Database Maintenance Tasks</title>
@@ -466,26 +466,43 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
<secondary>general information</secondary>
</indexterm>
<para>
- Beginning in <productname>PostgreSQL </productname> 8.1, there is a
- separate optional server process called the <firstterm>autovacuum
- daemon</firstterm>, whose purpose is to automate the execution of
+ Beginning in <productname>PostgreSQL</productname> 8.1, there is an
+ optional feature called <firstterm>autovacuum</firstterm>,
+ whose purpose is to automate the execution of
<command>VACUUM</command> and <command>ANALYZE </command> commands.
- When enabled, the autovacuum daemon runs periodically and checks for
+ When enabled, autovacuum checks for
tables that have had a large number of inserted, updated or deleted
tuples. These checks use the row-level statistics collection facility;
- therefore, the autovacuum daemon cannot be used unless <xref
+ therefore, autovacuum cannot be used unless <xref
linkend="guc-stats-start-collector"> and <xref
- linkend="guc-stats-row-level"> are set to <literal>true</literal>. Also,
- it's important to allow a slot for the autovacuum process when choosing
- the value of <xref linkend="guc-superuser-reserved-connections">. In
- the default configuration, autovacuuming is enabled and the related
+ linkend="guc-stats-row-level"> are set to <literal>true</literal>.
+ In the default configuration, autovacuuming is enabled and the related
configuration parameters are appropriately set.
</para>
<para>
- The autovacuum daemon, when enabled, runs every <xref
- linkend="guc-autovacuum-naptime"> seconds. On each run, it selects
- one database to process and checks each table within that database.
+ Beginning in <productname>PostgreSQL</productname> 8.3, autovacuum has a
+ multi-process architecture: there is a daemon process, called the
+ <firstterm>autovacuum launcher</firstterm>, which is in charge of starting
+ an <firstterm>autovacuum worker</firstterm> process on each database every
+ <xref linkend="guc-autovacuum-naptime"> seconds.
+ </para>
+
+ <para>
+ There is a limit of <xref linkend="guc-autovacuum-max-workers"> worker
+ processes that may be running at any time, so if the <command>VACUUM</>
+ and <command>ANALYZE</> work takes too long to run, the deadline may
+ not be met for other databases. Also, if a particular database
+ takes a long time to process, more than one worker may be processing it
+ simultaneously. The workers are smart enough to avoid repeating work that
+ other workers have done, so this is normally not a problem. Note that the
+ number of running workers does not count towards the <xref
+ linkend="guc-max-connections"> nor the <xref
+ linkend="guc-superuser-reserved-connections"> limits.
+ </para>
+
+ <para>
+ On each run, the worker process checks each table within its database, and
<command>VACUUM</command> or <command>ANALYZE</command> commands are
issued as needed.
</para>
@@ -581,6 +598,12 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
</caution>
+ <para>
+ When multiple workers are running, the cost limit is "balanced" among all
+ the running workers, so that the total impact on the system is the same,
+ regardless of the number of workers actually running.
+ </para>
+
</sect2>
</sect1>
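
The cost-limit "balancing" described in the last hunk can be made concrete with a worked example. The following is a minimal sketch under stated assumptions, not the server's implementation: the function balance_cost_limit and the notion of per-worker integer weights are hypothetical, chosen only to show how a proportional split keeps the sum of the per-worker limits at or below the configured limit.

```python
def balance_cost_limit(total_limit, weights):
    """Hypothetical helper: split total_limit among workers by weight.

    total_limit -- the configured cost limit (non-negative integer)
    weights     -- one non-negative integer weight per running worker
    Returns one integer limit per worker. Floor division guarantees that
    the shares never add up to more than total_limit.
    """
    if not weights:
        return []
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return [total_limit * w // total_weight for w in weights]


if __name__ == "__main__":
    # Three equally weighted workers sharing a limit of 200.
    shares = balance_cost_limit(200, [1, 1, 1])
    print(shares, sum(shares))
```

A single worker receives the whole limit, while several workers each receive a fraction, so the total I/O budget stays the same regardless of how many workers are running.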