From 40f908bdcdc726fc11912cd95dfd2f89603d1f37 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Fri, 15 Jan 2010 09:19:10 +0000 Subject: Introduce Streaming Replication. This includes two new kinds of postmaster processes, walsenders and walreceiver. Walreceiver is responsible for connecting to the primary server and streaming WAL to disk, while walsender runs in the primary server and streams WAL from disk to the client. Documentation still needs work, but the basics are there. We will probably pull the replication section to a new chapter later on, as well as the sections describing file-based replication. But let's do that as a separate patch, so that it's easier to see what has been added/changed. This patch also adds a new section to the chapter about FE/BE protocol, documenting the protocol used by walsender/walreceivxer. Bump catalog version because of two new functions, pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for monitoring the progress of replication. Fujii Masao, with additional hacking by me --- doc/src/sgml/backup.sgml | 228 +++++++++++++++++++++++++++++++++++- doc/src/sgml/client-auth.sgml | 4 +- doc/src/sgml/config.sgml | 47 +++++++- doc/src/sgml/func.sgml | 40 ++++++- doc/src/sgml/high-availability.sgml | 11 +- doc/src/sgml/perform.sgml | 7 +- doc/src/sgml/protocol.sgml | 62 +++++++++- 7 files changed, 385 insertions(+), 14 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index 961cbfdd1ba..da52f42af69 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -1,4 +1,4 @@ - + Backup and Restore @@ -1088,6 +1088,8 @@ restore_command = 'cp /mnt/server/archivedir/%f %p' file, and apply only for the duration of the recovery. They must be reset for any subsequent recovery you wish to perform. They cannot be changed once recovery has begun. + The parameters for streaming replication are described in . @@ -1097,7 +1099,8 @@ restore_command = 'cp /mnt/server/archivedir/%f %p' The shell command to execute to retrieve an archived segment of - the WAL file series. This parameter is required. + the WAL file series. This parameter is required for archive recovery, + but optional for streaming replication. Any %f in the string is replaced by the name of the file to retrieve from the archive, and any %p is replaced by the path name to copy @@ -1843,6 +1846,227 @@ if (!triggered) cooperation of the restore_command script with the data copying program. + + + Starting with PostgreSQL version 8.5, you can use + streaming replication (see ) to + achieve the same with less effort. + + + + + Streaming Replication + + + PostgreSQL includes a simple streaming replication + mechanism, which lets the standby server to stay more up-to-date than + file-based replication allows. The standby connects to the primary + and the primary starts streaming WAL records from where the standby + left off, and continues streaming them as they are generated, without + waiting for the WAL file to be filled. So with streaming replication, + archive_timeout does not need to be configured. + + + + Streaming replication relies on file-based continuous archiving for + making the base backup and for allowing a standby to catch up if it's + disconnected from the primary for long enough for the primary to + delete old WAL files still required by the standby. + + + + Setup + + The short procedure for configuring streaming replication is as follows. + For full details of each step, refer to other sections as noted. + + + + Set up primary and standby systems as near identically as possible, + including two identical copies of PostgreSQL at the + same release level. + + + + + Set up continuous archiving from the primary to a WAL archive located + in a directory on the standby server. Ensure that + , + and + + are set appropriately on the primary + (see ). + + + + + + Set up connections and authentication so that the standby server can + successfully connect to the pseudo replication database of + the primary server (see + ). Ensure that + and pg_hba.conf are + configured appropriately on the primary. + + + On systems that support the keepalive socket option, setting + , + and + helps you to find the + troubles with replication (e.g., the network outage or the failure of + the standby server) as soon as possible. + + + + + Set the maximum number of concurrent connections from the standby servers + (see for details). + + + + + Enable WAL archiving in the primary server because we need to make a base + backup of it later (see and + for details). + + + + + Start the PostgreSQL server on the primary. + + + + + Make a base backup of the primary server (see + ), and load this data onto the + standby. Note that all files present in pg_xlog + and pg_xlog/archive_status on the standby + server should be removed because they might be obsolete. + + + + + Set up WAL archiving, connections and authentication like the primary + server, because the standby server might work as a primary server after + failover. Ensure that your settings are consistent with the + future environment after the primary and the standby + server are interchanged by failover. If you're setting up the standby + server for e.g reporting purposes, with no plans to fail over to it, + configure the standby accordingly. + + + + + Create a recovery command file recovery.conf in the data + directory on the standby server. + + + + + standby_mode (boolean) + + + Specifies whether to start the PostgreSQL server as + a standby. If this parameter is on, the streaming + replication is enabled and the standby server will try to connect + to the primary to receive and apply WAL records continuously. The + default is off, which allows only an archive recovery + without replication. So, streaming replication requires this + parameter to be explicitly set to on. + + + + + primary_conninfo (string) + + + Specifies a connection string which is used for the standby server + to connect with the primary. This string is in the same format as + described in . If any option is + unspecified in this string, then the corresponding environment + variable (see ) is checked. If the + environment variable is not set either, then the indicated built-in + defaults are used. + + + The built-in replication requires that a host name (or host address) + or port number which the primary server listens on should be + specified in this string, respectively. Also ensure that a role with + the SUPERUSER and LOGIN privileges on the + primary is set (see + ). Note that + the password needs to be set if the primary demands password + authentication. + + + + + trigger_file (string) + + + Specifies a trigger file whose presence activates the standby. + If no trigger file is specified, the standby never exits + recovery. + + + + + + + + Start the PostgreSQL server on the standby. The standby + server will go into recovery mode and proceeds to receive WAL records + from the primary and apply them continuously. + + + + + + + Authentication + + It's very important that the access privilege for replication are set + properly so that only trusted users can read the WAL stream, because it's + easy to extract serious information from it. + + + Only superuser is allowed to connect to the primary as the replication + standby. So a role with the SUPERUSER and LOGIN + privileges needs to be created in the primary. + + + Client authentication for replication is controlled by the + pg_hba.conf record specifying replication in the + database field. For example, if the standby is running on + host IP 192.168.1.100 and the superuser's name for replication + is foo, the administrator can add the following line to the + pg_hba.conf file on the primary. + + +# Allow the user "foo" from host 192.168.1.100 to connect to the primary +# as a replication standby if the user's password is correctly supplied. +# +# TYPE DATABASE USER CIDR-ADDRESS METHOD +host replication foo 192.168.1.100/32 md5 + + + + The host name and port number of the primary, user name to connect as, + and password are specified in the recovery.conf file or + the corresponding environment variable on the standby. + For example, if the primary is running on host IP 192.168.1.50, + port 5432, the superuser's name for replication is + foo, and the password is foopass, the administrator + can add the following line to the recovery.conf file on the + standby. + + +# The standby connects to the primary that is running on host 192.168.1.50 +# and port 5432 as the user "foo" whose password is "foopass". +primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass' + + + diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml index e1d8c9503b5..85a24140009 100644 --- a/doc/src/sgml/client-auth.sgml +++ b/doc/src/sgml/client-auth.sgml @@ -1,4 +1,4 @@ - + Client Authentication @@ -181,6 +181,8 @@ hostnossl database user the requested user must be a member of the role with the same name as the requested database. (samegroup is an obsolete but still accepted spelling of samerole.) + The value replication specifies that the record + matches if streaming replication is requested. Otherwise, this is the name of a specific PostgreSQL database. Multiple database names can be supplied by separating them with diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index dbccaa75691..03667754b85 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1,4 +1,4 @@ - + Server Configuration @@ -1746,6 +1746,51 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows + + Streaming Replication + + + These settings control the behavior of the built-in + streaming replication feature. + + + + + max_wal_senders (integer) + + max_wal_senders configuration parameter + + + + Specifies the maximum number of concurrent connections from standby + servers (i.e., the maximum number of simultaneously running WAL sender + processes). The default is zero. This parameter can only be set at + server start. + + + + + wal_sender_delay (integer) + + wal_sender_delay configuration parameter + + + + Specifies the delay between activity rounds for the WAL sender. + In each round the WAL sender sends any WAL accumulated since last + round to the standby server. It then sleeps for + wal_sender_delay milliseconds, and repeats. The default + value is 200 milliseconds (200ms). + Note that on many systems, the effective resolution of sleep delays is + 10 milliseconds; setting wal_sender_delay to a value that + is not a multiple of 10 might have the same results as setting it to + the next higher multiple of 10. This parameter can only be set in the + postgresql.conf file or on the server command line. + + + + + Standby Servers diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 59ac346c142..49ca0874677 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1,4 +1,4 @@ - + Functions and Operators @@ -12984,7 +12984,8 @@ SELECT set_config('log_statement_stats', 'off', false); The functions shown in assist in making on-line backups. - Use of the first three functions is restricted to superusers. + Use of the first three functions is restricted to superusers. The first + five functions cannot be executed during recovery. @@ -13135,11 +13136,17 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); pg_is_in_recovery + + pg_last_xlog_receive_location + + + pg_last_xlog_replay_location + The functions shown in provide information - about the current status of Hot Standby. + about the current status of the standby. These functions may be executed during both recovery and in normal running. @@ -13160,6 +13167,33 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup()); True if recovery is still in progress. + + + pg_last_xlog_receive_location() + + text + Get last transaction log location received and synced to disk during + streaming recovery. If streaming recovery is still in progress + this will increase monotonically. If streaming recovery has completed + then this value will remain static at the value of the last WAL record + received and synced to disk during that recovery. When the server has + been started without a streaming recovery then the return value will be + InvalidXLogRecPtr (0/0). + + + + + pg_last_xlog_replay_location() + + text + Get last transaction log location replayed during recovery. + If recovery is still in progress this will increase monotonically. + If recovery has completed then this value will remain static at + the value of the last WAL record applied during that recovery. + When the server has been started normally without a recovery + then the return value will be InvalidXLogRecPtr (0/0). + +
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml index efd4adced79..a8627408685 100644 --- a/doc/src/sgml/high-availability.sgml +++ b/doc/src/sgml/high-availability.sgml @@ -1,4 +1,4 @@ - + High Availability, Load Balancing, and Replication @@ -146,11 +146,16 @@ protocol to make nodes agree on a serializable transactional order. made the new master database server. This is asynchronous and can only be done for the entire database server. + + A PITR warm standby server can be kept more up-to-date using the + streaming replication feature built into PostgreSQL 8.5 + onwards. + - Master-Slave Replication + Trigger-Based Master-Slave Replication @@ -278,7 +283,7 @@ protocol to make nodes agree on a serializable transactional order. Shared Disk Failover File System Replication Warm Standby Using PITR - Master-Slave Replication + Trigger-Based Master-Slave Replication Statement-Based Replication Middleware Asynchronous Multimaster Replication Synchronous Multimaster Replication diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml index d1ae3efb330..1a9f7c1452d 100644 --- a/doc/src/sgml/perform.sgml +++ b/doc/src/sgml/perform.sgml @@ -1,4 +1,4 @@ - + Performance Tips @@ -836,8 +836,9 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; needs to be written, because in case of an error, the files containing the newly loaded data will be removed anyway. However, this consideration does not apply when - is on, as all commands - must write WAL in that case. + is on or streaming replication + is allowed (i.e., is more + than or equal to one), as all commands must write WAL in that case.
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 1ec079e41c8..54e03998ff9 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -1,4 +1,4 @@ - + Frontend/Backend Protocol @@ -4140,6 +4140,66 @@ not line breaks. + +Streaming Replication Protocol + + +To initiate streaming replication, the frontend sends the "replication" +parameter in the startup message. This tells the backend to go into +walsender mode, where a small set of replication commands can be issued +instead of SQL statements. Only the simple query protocol can be used in +walsender mode. + +The commands accepted in walsender mode are: + + + + IDENTIFY_SYSTEM + + + Requests the server to idenfity itself. Server replies with a result + set of a single row, and two fields: + + systemid: The unique system identifier identifying the cluster. This + can be used to check that the base backup used to initialize the + slave came from the same cluster. + + timeline: Current TimelineID. Also used to check that the slave is + consistent with the master. + + + + + + START_REPLICATION XXX/XXX + + + Instructs backend to start streaming WAL, starting at point XXX/XXX. + Server can reply with an error e.g if the requested piece of WAL has + already been recycled. On success, server responds with a + CopyOutResponse message, and backend starts to stream WAL as CopyData + messages. + + + + The payload in each CopyData message consists of an XLogRecPtr, + indicating the starting point of the WAL in the message, immediately + followed by the WAL data itself. + + + A single WAL record is never split across two CopyData messages. When + a WAL record crosses a WAL page boundary, however, and is therefore + already split using continuation records, it can be split at the page + boundary. In other words, the first main WAL record and its + continuation records can be split across different CopyData messages. + + + + + + + + Summary of Changes since Protocol 2.0 -- cgit v1.2.3