From d9bae5317300cf983dd9f01cc2e561c0eecd109a Mon Sep 17 00:00:00 2001
From: Magnus Hagander
Date: Wed, 26 Oct 2011 20:13:33 +0200
Subject: Implement streaming xlog for backup tools

Add option for parallel streaming of the transaction log while a
base backup is running, to get the logfiles before the server has
removed them.

Also add a tool called pg_receivexlog, which streams the transaction
log into files, creating a log archive without having to wait for
segments to complete, thus decreasing the window of data loss without
having to waste space using archive_timeout. This works best in
combination with archive_command - suggested usage docs etc coming later.
---
 doc/src/sgml/ref/allfiles.sgml       |   1 +
 doc/src/sgml/ref/pg_basebackup.sgml  |  65 +++++++--
 doc/src/sgml/ref/pg_receivexlog.sgml | 270 +++++++++++++++++++++++++++++++++++
 doc/src/sgml/reference.sgml          |   1 +
 4 files changed, 325 insertions(+), 12 deletions(-)
 create mode 100644 doc/src/sgml/ref/pg_receivexlog.sgml

(limited to 'doc/src')

diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 8a8616b0008..382d297bdb2 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -172,6 +172,7 @@ Complete list of usable sgml source files in this directory.
+
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 25280045412..8c8c78f0d15 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -143,8 +143,8 @@ PostgreSQL documentation
-
-
+
+
 Includes the required transaction log files (WAL files) in the
@@ -154,16 +154,43 @@ PostgreSQL documentation
 to consult the log archive, thus making this a completely standalone backup.
-
-
- The transaction log files are collected at the end of the backup.
- Therefore, it is necessary for the
- parameter to be set high
- enough that the log is not removed before the end of the backup.
- If the log has been rotated when it's time to transfer it, the
- backup will fail and be unusable.
-
-
+
+ The following methods for collecting the transaction logs are
+ supported:
+
+
+
+ f
+ fetch
+
+
+ The transaction log files are collected at the end of the backup.
+ Therefore, it is necessary for the
+ parameter to be set high
+ enough that the log is not removed before the end of the backup.
+ If the log has been rotated when it's time to transfer it, the
+ backup will fail and be unusable.
+
+
+
+
+
+ s
+ stream
+
+
+ Stream the transaction log while the backup is created. This will
+ open a second connection to the server and start streaming the
+ transaction log in parallel while running the backup. Therefore,
+ it will use up two slots configured by the
+ parameter. As long as the
+ client can keep up with transaction log received, using this mode
+ requires no extra transaction logs to be saved on the master.
+
+
+
+
+
@@ -260,6 +287,20 @@ PostgreSQL documentation
 The following command-line options control the database connection parameters.
+
+
+
+
+
+ Specifies the number of seconds between status packets sent back to the
+ server. This is required when streaming the transaction log (using
+ --xlog=stream) if replication timeout is configured
+ on the server, and allows for easier monitoring. The default value is
+ 10 seconds.
+
+
+
+
diff --git a/doc/src/sgml/ref/pg_receivexlog.sgml b/doc/src/sgml/ref/pg_receivexlog.sgml
new file mode 100644
index 00000000000..9a2a24ba2e3
--- /dev/null
+++ b/doc/src/sgml/ref/pg_receivexlog.sgml
@@ -0,0 +1,270 @@
+
+
+
+
+ pg_receivexlog
+ 1
+ Application
+
+
+
+ pg_receivexlog
+ streams transaction logs from a PostgreSQL cluster
+
+
+
+ pg_receivexlog
+
+
+
+
+ pg_receivexlog
+ option
+
+
+
+
+
+ Description
+
+
+ pg_receivexlog is used to stream transaction log
+ from a running PostgreSQL cluster.
+ The transaction
+ log is streamed using the streaming replication protocol, and is written
+ to a local directory of files. This directory can be used as the archive
+ location for doing a restore using point-in-time recovery (see
+ ).
+
+
+
+ pg_receivexlog streams the transaction
+ log in real time as it's being generated on the server, and does not wait
+ for segments to complete like does.
+ For this reason, it is not necessary to set
+ when using
+ pg_receivexlog.
+
+
+
+ The transaction log is streamed over a regular
+ PostgreSQL connection, and uses the
+ replication protocol. The connection must be
+ made with a user having REPLICATION permissions (see
+ ), and the user must be granted explicit
+ permissions in pg_hba.conf. The server must also
+ be configured with set high enough
+ to leave at least one session available for the stream.
+
+
+
+
+ Options
+
+
+ The following command-line options control the location and format of the
+ output.
+
+
+
+
+
+
+
+ Directory to write the output to.
+
+
+ This parameter is required.
+
+
+
+
+
+
+ The following command-line options control the running of the program.
+
+
+
+
+
+
+
+ Enables verbose mode.
+
+
+
+
+
+
+
+
+ The following command-line options control the database connection parameters.
+
+
+
+
+
+
+
+ Specifies the number of seconds between status packets sent back to the
+ server. This is required if replication timeout is configured on the
+ server, and allows for easier monitoring. The default value is
+ 10 seconds.
+
+
+
+
+
+
+
+
+
+ Specifies the host name of the machine on which the server is
+ running. If the value begins with a slash, it is used as the
+ directory for the Unix domain socket. The default is taken
+ from the PGHOST environment variable, if set,
+ else a Unix domain socket connection is attempted.
+
+
+
+
+
+
+
+
+
+ Specifies the TCP port or local Unix domain socket file
+ extension on which the server is listening for connections.
+ Defaults to the PGPORT environment variable, if
+ set, or a compiled-in default.
+
+
+
+
+
+
+
+
+
+ User name to connect as.
+
+
+
+
+
+
+
+
+
+ Never issue a password prompt. If the server requires
+ password authentication and a password is not available by
+ other means such as a .pgpass file, the
+ connection attempt will fail. This option can be useful in
+ batch jobs and scripts where no user is present to enter a
+ password.
+
+
+
+
+
+
+
+
+
+ Force pg_receivexlog to prompt for a
+ password before connecting to a database.
+
+
+
+ This option is never essential, since
+ pg_receivexlog will automatically prompt
+ for a password if the server demands password authentication.
+ However, pg_receivexlog will waste a
+ connection attempt finding out that the server wants a password.
+ In some cases it is worth typing
+
+
+
+
+
+
+ Other, less commonly used, parameters are also available:
+
+
+
+
+
+
+
+ Print the pg_receivexlog version and exit.
+
+
+
+
+
+
+
+
+
+ Show help about pg_receivexlog command line
+ arguments, and exit.
+
+
+
+
+
+
+
+
+
+
+ Environment
+
+
+ This utility, like most other PostgreSQL utilities,
+ uses the environment variables supported by libpq
+ (see ).
+
+
+
+
+
+ Notes
+
+
+ When using pg_receivexlog instead of
+ , the server will continue to
+ recycle transaction log files even if the backups are not properly
+ archived, since there is no command that fails. This can be worked
+ around by having an that fails
+ when the file has not been properly archived yet.
+
+
+
+
+
+ Examples
+
+
+ To stream the transaction log from the server at
+ mydbserver and store it in the local directory
+ /usr/local/pgsql/archive:
+
+ $ pg_receivexlog -h mydbserver -D /usr/local/pgsql/archive
+
+
+
+
+
+ See Also
+
+
+
+
+
+
+
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 5fd6410991d..7326519708e 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -220,6 +220,7 @@
 &pgConfig;
 &pgDump;
 &pgDumpall;
+ &pgReceivexlog;
 &pgRestore;
 &psqlRef;
 &reindexdb;
--
cgit v1.2.3