summaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/backup.sgml228
-rw-r--r--doc/src/sgml/client-auth.sgml4
-rw-r--r--doc/src/sgml/config.sgml47
-rw-r--r--doc/src/sgml/func.sgml40
-rw-r--r--doc/src/sgml/high-availability.sgml11
-rw-r--r--doc/src/sgml/perform.sgml7
-rw-r--r--doc/src/sgml/protocol.sgml62
7 files changed, 385 insertions, 14 deletions
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 961cbfdd1ba..da52f42af69 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.135 2010/01/12 02:33:45 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.136 2010/01/15 09:18:56 heikki Exp $ -->
<chapter id="backup">
<title>Backup and Restore</title>
@@ -1088,6 +1088,8 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
file, and apply only for the duration of the recovery. They must be
reset for any subsequent recovery you wish to perform. They cannot be
changed once recovery has begun.
+ The parameters for streaming replication are described in <xref
+ linkend="replication-config-settings">.
</para>
<variablelist>
@@ -1097,7 +1099,8 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
<listitem>
<para>
The shell command to execute to retrieve an archived segment of
- the WAL file series. This parameter is required.
+ the WAL file series. This parameter is required for archive recovery,
+ but optional for streaming replication.
Any <literal>%f</> in the string is
replaced by the name of the file to retrieve from the archive,
and any <literal>%p</> is replaced by the path name to copy
@@ -1843,6 +1846,227 @@ if (!triggered)
cooperation of the <varname>restore_command</> script with the data
copying program.
</para>
+
+ <para>
+ Starting with <productname>PostgreSQL</> version 8.5, you can use
+ streaming replication (see <xref linkend="streaming-replication">) to
+ achieve the same with less effort.
+ </para>
+ </sect2>
+
+ <sect2 id="streaming-replication">
+ <title>Streaming Replication</title>
+
+ <para>
+ <productname>PostgreSQL</> includes a simple streaming replication
+ mechanism, which lets the standby server to stay more up-to-date than
+ file-based replication allows. The standby connects to the primary
+ and the primary starts streaming WAL records from where the standby
+ left off, and continues streaming them as they are generated, without
+ waiting for the WAL file to be filled. So with streaming replication,
+ <varname>archive_timeout</> does not need to be configured.
+ </para>
+
+ <para>
+ Streaming replication relies on file-based continuous archiving for
+ making the base backup and for allowing a standby to catch up if it's
+ disconnected from the primary for long enough for the primary to
+ delete old WAL files still required by the standby.
+ </para>
+
+ <sect3 id="streaming-replication-setup">
+ <title>Setup</title>
+ <para>
+ The short procedure for configuring streaming replication is as follows.
+ For full details of each step, refer to other sections as noted.
+ <orderedlist>
+ <listitem>
+ <para>
+ Set up primary and standby systems as near identically as possible,
+ including two identical copies of <productname>PostgreSQL</> at the
+ same release level.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Set up continuous archiving from the primary to a WAL archive located
+ in a directory on the standby server. Ensure that
+ <xref linkend="guc-archive-mode">,
+ <xref linkend="guc-archive-command"> and
+ <xref linkend="guc-archive-timeout">
+ are set appropriately on the primary
+ (see <xref linkend="backup-archiving-wal">).
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Set up connections and authentication so that the standby server can
+ successfully connect to the pseudo <literal>replication</> database of
+ the primary server (see
+ <xref linkend="streaming-replication-authentication">). Ensure that
+ <xref linkend="guc-listen-addresses"> and <filename>pg_hba.conf</> are
+ configured appropriately on the primary.
+ </para>
+ <para>
+ On systems that support the keepalive socket option, setting
+ <xref linkend="guc-tcp-keepalives-idle">,
+ <xref linkend="guc-tcp-keepalives-interval"> and
+ <xref linkend="guc-tcp-keepalives-count"> helps you to find the
+ troubles with replication (e.g., the network outage or the failure of
+ the standby server) as soon as possible.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Set the maximum number of concurrent connections from the standby servers
+ (see <xref linkend="guc-max-wal-senders"> for details).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Enable WAL archiving in the primary server because we need to make a base
+ backup of it later (see <xref linkend="guc-archive-mode"> and
+ <xref linkend="guc-archive-command"> for details).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Start the <productname>PostgreSQL</> server on the primary.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Make a base backup of the primary server (see
+ <xref linkend="backup-base-backup">), and load this data onto the
+ standby. Note that all files present in <filename>pg_xlog</>
+ and <filename>pg_xlog/archive_status</> on the <emphasis>standby</>
+ server should be removed because they might be obsolete.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Set up WAL archiving, connections and authentication like the primary
+ server, because the standby server might work as a primary server after
+ failover. Ensure that your settings are consistent with the
+ <emphasis>future</> environment after the primary and the standby
+ server are interchanged by failover. If you're setting up the standby
+ server for e.g reporting purposes, with no plans to fail over to it,
+ configure the standby accordingly.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Create a recovery command file <filename>recovery.conf</> in the data
+ directory on the standby server.
+ </para>
+
+ <variablelist id="replication-config-settings" xreflabel="Replication Settings">
+ <varlistentry id="standby-mode" xreflabel="standby_mode">
+ <term><varname>standby_mode</varname> (<type>boolean</type>)</term>
+ <listitem>
+ <para>
+ Specifies whether to start the <productname>PostgreSQL</> server as
+ a standby. If this parameter is <literal>on</>, the streaming
+ replication is enabled and the standby server will try to connect
+ to the primary to receive and apply WAL records continuously. The
+ default is <literal>off</>, which allows only an archive recovery
+ without replication. So, streaming replication requires this
+ parameter to be explicitly set to <literal>on</>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry id="primary-conninfo" xreflabel="primary_conninfo">
+ <term><varname>primary_conninfo</varname> (<type>string</type>)</term>
+ <listitem>
+ <para>
+ Specifies a connection string which is used for the standby server
+ to connect with the primary. This string is in the same format as
+ described in <xref linkend="libpq-connect">. If any option is
+ unspecified in this string, then the corresponding environment
+ variable (see <xref linkend="libpq-envars">) is checked. If the
+ environment variable is not set either, then the indicated built-in
+ defaults are used.
+ </para>
+ <para>
+ The built-in replication requires that a host name (or host address)
+ or port number which the primary server listens on should be
+ specified in this string, respectively. Also ensure that a role with
+ the <literal>SUPERUSER</> and <literal>LOGIN</> privileges on the
+ primary is set (see
+ <xref linkend="streaming-replication-authentication">). Note that
+ the password needs to be set if the primary demands password
+ authentication.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry id="trigger-file" xreflabel="trigger_file">
+ <term><varname>trigger_file</varname> (<type>string</type>)</term>
+ <listitem>
+ <para>
+ Specifies a trigger file whose presence activates the standby.
+ If no trigger file is specified, the standby never exits
+ recovery.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </listitem>
+ <listitem>
+ <para>
+ Start the <productname>PostgreSQL</> server on the standby. The standby
+ server will go into recovery mode and proceeds to receive WAL records
+ from the primary and apply them continuously.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+ </sect3>
+ <sect3 id="streaming-replication-authentication">
+ <title>Authentication</title>
+ <para>
+ It's very important that the access privilege for replication are set
+ properly so that only trusted users can read the WAL stream, because it's
+ easy to extract serious information from it.
+ </para>
+ <para>
+ Only superuser is allowed to connect to the primary as the replication
+ standby. So a role with the <literal>SUPERUSER</> and <literal>LOGIN</>
+ privileges needs to be created in the primary.
+ </para>
+ <para>
+ Client authentication for replication is controlled by the
+ <filename>pg_hba.conf</> record specifying <literal>replication</> in the
+ <replaceable>database</> field. For example, if the standby is running on
+ host IP <literal>192.168.1.100</> and the superuser's name for replication
+ is <literal>foo</>, the administrator can add the following line to the
+ <filename>pg_hba.conf</> file on the primary.
+
+<programlisting>
+# Allow the user "foo" from host 192.168.1.100 to connect to the primary
+# as a replication standby if the user's password is correctly supplied.
+#
+# TYPE DATABASE USER CIDR-ADDRESS METHOD
+host replication foo 192.168.1.100/32 md5
+</programlisting>
+ </para>
+ <para>
+ The host name and port number of the primary, user name to connect as,
+ and password are specified in the <filename>recovery.conf</> file or
+ the corresponding environment variable on the standby.
+ For example, if the primary is running on host IP <literal>192.168.1.50</>,
+ port <literal>5432</literal>, the superuser's name for replication is
+ <literal>foo</>, and the password is <literal>foopass</>, the administrator
+ can add the following line to the <filename>recovery.conf</> file on the
+ standby.
+
+<programlisting>
+# The standby connects to the primary that is running on host 192.168.1.50
+# and port 5432 as the user "foo" whose password is "foopass".
+primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
+</programlisting>
+ </para>
+ </sect3>
</sect2>
<sect2 id="backup-incremental-updated">
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index e1d8c9503b5..85a24140009 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/client-auth.sgml,v 1.125 2009/12/12 21:35:21 mha Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/client-auth.sgml,v 1.126 2010/01/15 09:18:56 heikki Exp $ -->
<chapter id="client-authentication">
<title>Client Authentication</title>
@@ -181,6 +181,8 @@ hostnossl <replaceable>database</replaceable> <replaceable>user</replaceable>
the requested user must be a member of the role with the same
name as the requested database. (<literal>samegroup</> is an
obsolete but still accepted spelling of <literal>samerole</>.)
+ The value <literal>replication</> specifies that the record
+ matches if streaming replication is requested.
Otherwise, this is the name of
a specific <productname>PostgreSQL</productname> database.
Multiple database names can be supplied by separating them with
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index dbccaa75691..03667754b85 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.243 2010/01/06 02:41:37 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.244 2010/01/15 09:18:58 heikki Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
@@ -1746,6 +1746,51 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
</variablelist>
</sect2>
+ <sect2 id="runtime-config-replication">
+ <title>Streaming Replication</title>
+
+ <para>
+ These settings control the behavior of the built-in
+ <firstterm>streaming replication</> feature.
+ </para>
+
+ <variablelist>
+ <varlistentry id="guc-max-wal-senders" xreflabel="max_wal_senders">
+ <term><varname>max_wal_senders</varname> (<type>integer</type>)</term>
+ <indexterm>
+ <primary><varname>max_wal_senders</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Specifies the maximum number of concurrent connections from standby
+ servers (i.e., the maximum number of simultaneously running WAL sender
+ processes). The default is zero. This parameter can only be set at
+ server start.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry id="guc-wal-sender-delay" xreflabel="wal_sender_delay">
+ <term><varname>wal_sender_delay</varname> (<type>integer</type>)</term>
+ <indexterm>
+ <primary><varname>wal_sender_delay</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Specifies the delay between activity rounds for the WAL sender.
+ In each round the WAL sender sends any WAL accumulated since last
+ round to the standby server. It then sleeps for
+ <varname>wal_sender_delay</> milliseconds, and repeats. The default
+ value is 200 milliseconds (<literal>200ms</>).
+ Note that on many systems, the effective resolution of sleep delays is
+ 10 milliseconds; setting <varname>wal_sender_delay</> to a value that
+ is not a multiple of 10 might have the same results as setting it to
+ the next higher multiple of 10. This parameter can only be set in the
+ <filename>postgresql.conf</> file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </sect2>
<sect2 id="runtime-config-standby">
<title>Standby Servers</title>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 59ac346c142..49ca0874677 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.495 2009/12/19 17:49:50 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.496 2010/01/15 09:18:58 heikki Exp $ -->
<chapter id="functions">
<title>Functions and Operators</title>
@@ -12984,7 +12984,8 @@ SELECT set_config('log_statement_stats', 'off', false);
<para>
The functions shown in <xref
linkend="functions-admin-backup-table"> assist in making on-line backups.
- Use of the first three functions is restricted to superusers.
+ Use of the first three functions is restricted to superusers. The first
+ five functions cannot be executed during recovery.
</para>
<table id="functions-admin-backup-table">
@@ -13135,11 +13136,17 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<indexterm>
<primary>pg_is_in_recovery</primary>
</indexterm>
+ <indexterm>
+ <primary>pg_last_xlog_receive_location</primary>
+ </indexterm>
+ <indexterm>
+ <primary>pg_last_xlog_replay_location</primary>
+ </indexterm>
<para>
The functions shown in <xref
linkend="functions-recovery-info-table"> provide information
- about the current status of Hot Standby.
+ about the current status of the standby.
These functions may be executed during both recovery and in normal running.
</para>
@@ -13160,6 +13167,33 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<entry>True if recovery is still in progress.
</entry>
</row>
+ <row>
+ <entry>
+ <literal><function>pg_last_xlog_receive_location</function>()</literal>
+ </entry>
+ <entry><type>text</type></entry>
+ <entry>Get last transaction log location received and synced to disk during
+ streaming recovery. If streaming recovery is still in progress
+ this will increase monotonically. If streaming recovery has completed
+ then this value will remain static at the value of the last WAL record
+ received and synced to disk during that recovery. When the server has
+ been started without a streaming recovery then the return value will be
+ InvalidXLogRecPtr (0/0).
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <literal><function>pg_last_xlog_replay_location</function>()</literal>
+ </entry>
+ <entry><type>text</type></entry>
+ <entry>Get last transaction log location replayed during recovery.
+ If recovery is still in progress this will increase monotonically.
+ If recovery has completed then this value will remain static at
+ the value of the last WAL record applied during that recovery.
+ When the server has been started normally without a recovery
+ then the return value will be InvalidXLogRecPtr (0/0).
+ </entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index efd4adced79..a8627408685 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.35 2009/04/27 16:27:35 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.36 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="high-availability">
<title>High Availability, Load Balancing, and Replication</title>
@@ -146,11 +146,16 @@ protocol to make nodes agree on a serializable transactional order.
made the new master database server. This is asynchronous and
can only be done for the entire database server.
</para>
+ <para>
+ A PITR warm standby server can be kept more up-to-date using the
+ streaming replication feature built into <productname>PostgreSQL</> 8.5
+ onwards.
+ </para>
</listitem>
</varlistentry>
<varlistentry>
- <term>Master-Slave Replication</term>
+ <term>Trigger-Based Master-Slave Replication</term>
<listitem>
<para>
@@ -278,7 +283,7 @@ protocol to make nodes agree on a serializable transactional order.
<entry>Shared Disk Failover</entry>
<entry>File System Replication</entry>
<entry>Warm Standby Using PITR</entry>
- <entry>Master-Slave Replication</entry>
+ <entry>Trigger-Based Master-Slave Replication</entry>
<entry>Statement-Based Replication Middleware</entry>
<entry>Asynchronous Multimaster Replication</entry>
<entry>Synchronous Multimaster Replication</entry>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index d1ae3efb330..1a9f7c1452d 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.72 2009/08/07 20:54:31 alvherre Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.73 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="performance-tips">
<title>Performance Tips</title>
@@ -836,8 +836,9 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
needs to be written, because in case of an error, the files
containing the newly loaded data will be removed anyway.
However, this consideration does not apply when
- <xref linkend="guc-archive-mode"> is on, as all commands
- must write WAL in that case.
+ <xref linkend="guc-archive-mode"> is on or streaming replication
+ is allowed (i.e., <xref linkend="guc-max-wal-senders"> is more
+ than or equal to one), as all commands must write WAL in that case.
</para>
</sect2>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 1ec079e41c8..54e03998ff9 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.76 2009/12/02 04:54:10 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.77 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="protocol">
<title>Frontend/Backend Protocol</title>
@@ -4140,6 +4140,66 @@ not line breaks.
</sect1>
+<sect1 id="protocol-replication">
+<title>Streaming Replication Protocol</title>
+
+<para>
+To initiate streaming replication, the frontend sends the "replication"
+parameter in the startup message. This tells the backend to go into
+walsender mode, where a small set of replication commands can be issued
+instead of SQL statements. Only the simple query protocol can be used in
+walsender mode.
+
+The commands accepted in walsender mode are:
+
+<variablelist>
+ <varlistentry>
+ <term>IDENTIFY_SYSTEM</term>
+ <listitem>
+ <para>
+ Requests the server to idenfity itself. Server replies with a result
+ set of a single row, and two fields:
+
+ systemid: The unique system identifier identifying the cluster. This
+ can be used to check that the base backup used to initialize the
+ slave came from the same cluster.
+
+ timeline: Current TimelineID. Also used to check that the slave is
+ consistent with the master.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>START_REPLICATION XXX/XXX</term>
+ <listitem>
+ <para>
+ Instructs backend to start streaming WAL, starting at point XXX/XXX.
+ Server can reply with an error e.g if the requested piece of WAL has
+ already been recycled. On success, server responds with a
+ CopyOutResponse message, and backend starts to stream WAL as CopyData
+ messages.
+ </para>
+
+ <para>
+ The payload in each CopyData message consists of an XLogRecPtr,
+ indicating the starting point of the WAL in the message, immediately
+ followed by the WAL data itself.
+ </para>
+ <para>
+ A single WAL record is never split across two CopyData messages. When
+ a WAL record crosses a WAL page boundary, however, and is therefore
+ already split using continuation records, it can be split at the page
+ boundary. In other words, the first main WAL record and its
+ continuation records can be split across different CopyData messages.
+ </para>
+ </listitem>
+ </varlistentry>
+</variablelist>
+
+</para>
+
+</sect1>
<sect1 id="protocol-changes">
<title>Summary of Changes since Protocol 2.0</title>