From 867e2c91a0c341111b7a5257dc4c9a2659a022dc Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Thu, 28 Jun 2007 00:02:40 +0000 Subject: Implement "distributed" checkpoints in which the checkpoint I/O is spread over a fairly long period of time, rather than being spat out in a burst. This happens only for background checkpoints carried out by the bgwriter; other cases, such as a shutdown checkpoint, are still done at full speed. Remove the "all buffers" scan in the bgwriter, and associated stats infrastructure, since this seems no longer very useful when the checkpoint itself is properly throttled. Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas, and some minor API editorialization by me. --- doc/src/sgml/config.sgml | 82 +++++++++++++------------------------------- doc/src/sgml/monitoring.sgml | 37 +++++--------------- doc/src/sgml/wal.sgml | 34 +++++++++++++++--- 3 files changed, 62 insertions(+), 91 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index a38b02fd211..c7d2d395c7a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1,4 +1,4 @@ - + Server Configuration @@ -1168,21 +1168,17 @@ SET ENABLE_SEQSCAN TO OFF; Beginning in PostgreSQL 8.0, there is a separate server - process called the background writer, whose sole function + process called the background writer, whose function is to issue writes of dirty shared buffers. The intent is that server processes handling user queries should seldom or never have to wait for a write to occur, because the background writer will do it. - This arrangement also reduces the performance penalty associated with - checkpoints. The background writer will continuously trickle out dirty - pages to disk, so that only a few pages will need to be forced out when - checkpoint time arrives, instead of the storm of dirty-buffer writes that - formerly occurred at each checkpoint. 
However there is a net overall + However there is a net overall increase in I/O load, because where a repeatedly-dirtied page might before have been written only once per checkpoint interval, the background writer might write it several times in the same interval. In most situations a continuous low load is preferable to periodic - spikes, but the parameters discussed in this subsection can be used to tune - the behavior for local needs. + spikes, but the parameters discussed in this subsection can be used to + tune the behavior for local needs. @@ -1242,62 +1238,14 @@ SET ENABLE_SEQSCAN TO OFF; - - - bgwriter_all_percent (floating point) - - bgwriter_all_percent configuration parameter - - - - To reduce the amount of work that will be needed at checkpoint time, - the background writer also does a circular scan through the entire - buffer pool, writing buffers that are found to be dirty. - In each round, it examines up to - bgwriter_all_percent of the buffers for this purpose. - The default value is 0.333 (0.333% of the total number - of shared buffers). With the default bgwriter_delay - setting, this will allow the entire shared buffer pool to be scanned - about once per minute. - This parameter can only be set in the postgresql.conf - file or on the server command line. - - - - - - bgwriter_all_maxpages (integer) - - bgwriter_all_maxpages configuration parameter - - - - In each round, no more than this many buffers will be written - as a result of the scan of the entire buffer pool. (If this - limit is reached, the scan stops, and resumes at the next buffer - during the next round.) - The default value is five buffers. - This parameter can only be set in the postgresql.conf - file or on the server command line. - - - - Smaller values of bgwriter_all_percent and - bgwriter_all_maxpages reduce the extra I/O load - caused by the background writer, but leave more work to be done - at checkpoint time. To reduce load spikes at checkpoints, - increase these two values. 
- Similarly, smaller values of bgwriter_lru_percent and + Smaller values of bgwriter_lru_percent and bgwriter_lru_maxpages reduce the extra I/O load caused by the background writer, but make it more likely that server processes will have to issue writes for themselves, delaying interactive queries. - To disable background writing entirely, - set both maxpages values and/or both - percent values to zero. @@ -1307,7 +1255,7 @@ SET ENABLE_SEQSCAN TO OFF; See also for details on WAL - tuning. + and checkpoint tuning. @@ -1565,6 +1513,22 @@ SET ENABLE_SEQSCAN TO OFF; + + checkpoint_completion_target (floating point) + + checkpoint_completion_target configuration parameter + + + + Specifies the target length of checkpoints, as a fraction of + the checkpoint interval. The default is 0.5. + + This parameter can only be set in the postgresql.conf + file or on the server command line. + + + + checkpoint_warning (integer) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 3db50198771..42816bd5d2d 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -1,4 +1,4 @@ - + Monitoring Database Activity @@ -251,9 +251,9 @@ postgres: user database host pg_stat_bgwriter One row only, showing cluster-wide statistics from the background writer: number of scheduled checkpoints, requested - checkpoints, buffers written by checkpoints, lru-scans and all-scans, - and the number of times the bgwriter aborted a round because it had - written too many buffers during lru-scans and all-scans. + checkpoints, buffers written by checkpoints and cleaning scans, + and the number of times the bgwriter stopped a cleaning scan + because it had written too many buffers. 
@@ -777,43 +777,24 @@ postgres: user database host - pg_stat_get_bgwriter_buf_written_lru() + pg_stat_get_bgwriter_buf_written_clean() bigint - The number of buffers written by the bgwriter when performing a - LRU scan of the buffer cache + The number of buffers written by the bgwriter for routine cleaning of + dirty pages - pg_stat_get_bgwriter_buf_written_all() + pg_stat_get_bgwriter_maxwritten_clean() bigint - The number of buffers written by the bgwriter when performing a - scan of all the buffer cache - - - - - pg_stat_get_bgwriter_maxwritten_lru() - bigint - - The number of times the bgwriter has stopped its LRU round because + The number of times the bgwriter has stopped its cleaning scan because it has written more buffers than specified in the bgwriter_lru_maxpages parameter - - pg_stat_get_bgwriter_maxwritten_all() - bigint - - The number of times the bgwriter has stopped its all-buffer round - because it has written more buffers than specified in the - bgwriter_all_maxpages parameter - - - pg_stat_clear_snapshot() void diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml index cf0c3d2e912..aaf1d0c71ef 100644 --- a/doc/src/sgml/wal.sgml +++ b/doc/src/sgml/wal.sgml @@ -1,4 +1,4 @@ - + Reliability and the Write-Ahead Log @@ -217,15 +217,41 @@ - There will be at least one WAL segment file, and will normally - not be more than 2 * checkpoint_segments + 1 + To avoid flooding the I/O system with a burst of page writes, + writing dirty buffers during a checkpoint is spread over a period of time. + That period is controlled by + , which is + given as a fraction of the checkpoint interval. + The I/O rate is adjusted so that the checkpoint finishes when the + given fraction of checkpoint_segments WAL segments + have been consumed since checkpoint start, or the given fraction of + checkpoint_timeout seconds have elapsed, + whichever is sooner. 
With the default value of 0.5, + PostgreSQL can be expected to complete each checkpoint + in about half the time before the next checkpoint starts. On a system + that's very close to maximum I/O throughput during normal operation, + you might want to increase checkpoint_completion_target + to reduce the I/O load from checkpoints. The disadvantage of this is that + prolonging checkpoints affects recovery time, because more WAL segments + will need to be kept around for possible use in recovery. Although + checkpoint_completion_target can be set as high as 1.0, + it is best to keep it less than that (perhaps 0.9 at most) since + checkpoints include some other activities besides writing dirty buffers. + A setting of 1.0 is quite likely to result in checkpoints not being + completed on time, which would result in performance loss due to + unexpected variation in the number of WAL segments needed. + + + + There will always be at least one WAL segment file, and will normally + not be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 files. Each segment file is normally 16 MB (though this size can be altered when building the server). You can use this to estimate space requirements for WAL. Ordinarily, when old log segment files are no longer needed, they are recycled (renamed to become the next segments in the numbered sequence). If, due to a short-term peak of log output rate, there - are more than 2 * checkpoint_segments + 1 + are more than 3 * checkpoint_segments + 1 segment files, the unneeded segment files will be deleted instead of recycled until the system gets back under this limit. -- cgit v1.2.3
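The sizing rules described in the wal.sgml hunk above can be turned into a small arithmetic sketch. This is only an illustration of the formulas quoted in the patch text, not code from the PostgreSQL tree: the function names are hypothetical, the input values are illustrative, and the segment count is left unrounded because the text gives only an upper bound.

```python
# Sketch of the WAL sizing arithmetic quoted in the wal.sgml hunk.
# All function names here are hypothetical helpers, not PostgreSQL APIs.

SEGMENT_SIZE_MB = 16  # "normally 16 MB", per the text; can be changed at build time


def checkpoint_write_window(checkpoint_segments, checkpoint_timeout_s, completion_target):
    """Budget within which checkpoint writes aim to finish.

    Returns (WAL segments consumed, seconds elapsed); per the text, the
    checkpoint targets whichever of the two limits is reached sooner.
    """
    return (completion_target * checkpoint_segments,
            completion_target * checkpoint_timeout_s)


def normal_max_wal_segments(checkpoint_segments, completion_target):
    """Upper bound on segment files in normal operation:
    (2 + checkpoint_completion_target) * checkpoint_segments + 1."""
    return (2 + completion_target) * checkpoint_segments + 1


def recycle_limit_segments(checkpoint_segments):
    """Above 3 * checkpoint_segments + 1 files, unneeded segments are
    deleted instead of recycled."""
    return 3 * checkpoint_segments + 1


def wal_space_mb(segment_count):
    """Disk space estimate for a given number of segment files."""
    return segment_count * SEGMENT_SIZE_MB


if __name__ == "__main__":
    # Illustrative inputs (assumed for this example):
    segments, timeout_s, target = 3, 300, 0.5
    print(checkpoint_write_window(segments, timeout_s, target))
    bound = normal_max_wal_segments(segments, target)
    print(bound, wal_space_mb(bound))
    print(recycle_limit_segments(segments))
```

Raising the hypothetical `target` argument toward 0.9 widens the write window but also raises the normal segment bound, which is the recovery-time trade-off the patch text describes.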