From e1693e514c197b6ddd27b59678f78b9385ac0539 Mon Sep 17 00:00:00 2001
From: Bruce Momjian <bruce@momjian.us>
Date: Fri, 17 Nov 2006 16:38:44 +0000
Subject: Rename replication section "High Availability and Load Balancing".

---
 doc/src/sgml/failover.sgml          | 246 -----------------------------------
 doc/src/sgml/filelist.sgml          |   4 +-
 doc/src/sgml/high-availability.sgml | 249 ++++++++++++++++++++++++++++++++++++
 doc/src/sgml/postgres.sgml          |   4 +-
 4 files changed, 253 insertions(+), 250 deletions(-)
 delete mode 100644 doc/src/sgml/failover.sgml
 create mode 100644 doc/src/sgml/high-availability.sgml

(limited to 'doc/src')
diff --git a/doc/src/sgml/failover.sgml b/doc/src/sgml/failover.sgml
deleted file mode 100644
index 29d770c73e1..00000000000
--- a/doc/src/sgml/failover.sgml
+++ /dev/null
@@ -1,246 +0,0 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/failover.sgml,v 1.13 2006/11/17 13:29:53 momjian Exp $ -->
-
-<chapter id="failover">
- <title>Failover, Replication, Load Balancing, and Clustering Options</title>
-
- <indexterm><primary>failover</></>
- <indexterm><primary>replication</></>
- <indexterm><primary>load balancing</></>
- <indexterm><primary>clustering</></>
-
- <para>
-  Database servers can work together to allow a second server to
-  quickly take over if the primary server fails (failover), or to
-  allow several computers to serve the same data (load balancing).
-  Ideally, database servers could work together seamlessly.  Web
-  servers serving static web pages can be combined quite easily by
-  merely load-balancing web requests to multiple machines.  In
-  fact, read-only database servers can be combined relatively easily
-  too.  Unfortunately, most database servers have a read/write mix
-  of requests, and read/write servers are much harder to combine.
-  This is because though read-only data needs to be placed on each
-  server only once, a write to any server has to be propagated to
-  all servers so that future read requests to those servers return
-  consistent results.
- </para>
-
- <para>
-  This synchronization problem is the fundamental difficulty for servers
-  working together.  Because there is no single solution that eliminates
-  the impact of the sync problem for all use cases, there are multiple
-  solutions.  Each solution addresses this problem in a different way, and
-  minimizes its impact for a specific workload.
- </para>
-
- <para>
-  Some solutions deal with synchronization by allowing only one
-  server to modify the data.  Servers that can modify data are
-  called read/write or "master" servers.  Servers that can reply
-  to read-only queries are called "slave" servers.  Servers that
-  cannot be accessed until they are changed to master servers are
-  called "standby" servers.
- </para>
-
- <para>
-  Some failover and load balancing solutions are synchronous, meaning that
-  a data-modifying transaction is not considered committed until all
-  servers have committed the transaction.  This guarantees that a failover
-  will not lose any data and that all load-balanced servers will return
-  consistent results with no propagation delay. Asynchronous updating has
-  a small delay between the time of commit and its propagation to the
-  other servers, opening the possibility that some transactions might be
-  lost in the switch to a backup server, and that load balanced servers
-  might return slightly stale results.  Asynchronous communication is used
-  when synchronous would be too slow.
- </para>
-
- <para>
-  Solutions can also be categorized by their granularity.  Some solutions
-  can deal only with an entire database server, while others allow control
-  at the per-table or per-database level.
- </para>
-
- <para>
-  Performance must be considered in any failover or load balancing
-  choice.  There is usually a tradeoff between functionality and
-  performance.  For example, a full synchronous solution over a slow
-  network might cut performance by more than half, while an asynchronous
-  one might have a minimal performance impact.
- </para>
-
- <para>
-  The remainder of this section outlines various failover, replication,
-  and load balancing solutions.
- </para>
-
- <variablelist>
-
- <varlistentry>
-  <term>Shared Disk Failover</term>
-  <listitem>
-
-   <para>
-    Shared disk failover avoids synchronization overhead by having only one
-    copy of the database.  It uses a single disk array that is shared by
-    multiple servers.  If the main database server fails, the standby server
-    is able to mount and start the database as though it was recovering from
-    a database crash.  This allows rapid failover with no data loss.
-   </para>
-
-   <para>
-    Shared hardware functionality is common in network storage
-    devices.  Using a network file system is also possible, though
-    care must be taken that the file system has full POSIX behavior.
-    One significant limitation of this method is that if the shared
-    disk array fails or becomes corrupt, the primary and standby
-    servers are both nonfunctional.  Another issue is that the
-    standby server should never access the shared storage while
-    the primary server is running.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term>Warm Standby Using Point-In-Time Recovery</term>
-  <listitem>
-
-   <para>
-    A warm standby server (see <xref linkend="warm-standby">) can
-    be kept current by reading a stream of write-ahead log (WAL)
-    records.  If the main server fails, the warm standby contains
-    almost all of the data of the main server, and can be quickly
-    made the new master database server.  This is asynchronous and
-    can only be done for the entire database server.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term>Master/Slave Replication</term>
-  <listitem>
-
-   <para>
-    A master/slave replication setup sends all data modification
-    queries to the master server.  The master server asynchronously
-    sends data changes to the slave server.  The slave can answer
-    read-only queries while the master server is running.  The
-    slave server is ideal for data warehouse queries.
-   </para>
-
-   <para>
-    Slony-I is an example of this type of replication, with per-table
-    granularity, and support for multiple slaves.  Because it
-    updates the slave server asynchronously (in batches), there is
-    possible data loss during fail over.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term>Query Broadcasting</term>
-  <listitem>
-
-   <para>
-    In query broadcasting, a program intercepts every SQL query
-    and sends it to all servers.  Each server operates independently.
-    Read-only queries can be sent to a single server because there
-    is no need for all servers to process it.
-   </para>
-
-   <para>
-    One limitation of this solution is that functions like
-    <function>random()</>, <function>CURRENT_TIMESTAMP</>, and
-    sequences can have different values on different servers.  This
-    is because each server operates independently, and because SQL
-    queries are broadcast (and not actual modified rows).  If this
-    is unacceptable, applications must query such values from a
-    single server and then use those values in write queries.
-    Also, care must be taken that all transactions either commit
-    or abort on all servers, perhaps using two-phase commit (<xref
-    linkend="sql-prepare-transaction"
-    endterm="sql-prepare-transaction-title"> and <xref
-    linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">.
-    Pgpool is an example of this type of replication.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term>Multi-Master Replication Using Clustering</term>
-  <listitem>
-
-   <para>
-    In clustering, each server can accept write requests, and
-    modified data is transmitted from the original server to every
-    other server before each transaction commits.  Heavy write
-    activity can cause excessive locking, leading to poor performance.
-    In fact, write performance is often worse than that of a single
-    server.  Read requests can be sent to any server.  Clustering
-    is best for mostly read workloads, though its big advantage
-    is that any server can accept write requests &mdash; there is
-    no need to partition workloads between master and slave servers,
-    and because the changes are sent from one server to another,
-    there is not a problem with non-deterministic functions like
-    <function>random()</>.
-   </para>
-
-   <para>
-    Clustering is implemented by <productname>Oracle</> in their
-    <productname><acronym>RAC</></> product.  <productname>PostgreSQL</>
-    does not offer this type of load balancing, though
-    <productname>PostgreSQL</> two-phase commit (<xref
-    linkend="sql-prepare-transaction"
-    endterm="sql-prepare-transaction-title"> and <xref
-    linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">)
-    can be used to implement this in application code or middleware.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term>Data Partitioning</term>
-  <listitem>
-
-   <para>
-    Data partitioning splits tables into data sets.  Each set can
-    be modified by only one server.  For example, data can be
-    partitioned by offices, e.g. London and Paris, with a server
-    in each office.  If queries combining London and Paris data
-    are necessary, an application can query both servers, or
-    master/slave replication can be used to keep a read-only copy
-    of the other office's data on each server.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term>Clustering For Parallel Query Execution</term>
-  <listitem>
-
-   <para>
-    This allows multiple servers to work concurrently on a single
-    query.  One possible way this could work is for the data to be
-    split among servers and for each server to execute its part of
-    the query and results sent to a central server to be combined
-    and returned to the user.  There currently is no
-    <productname>PostgreSQL</> open source solution for this.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term>Commercial Solutions</term>
-  <listitem>
-
-   <para>
-    Because <productname>PostgreSQL</> is open source and easily
-    extended, a number of companies have taken <productname>PostgreSQL</>
-    and created commercial closed-source solutions with unique
-    failover, replication, and load balancing capabilities.
-   </para>
-  </listitem>
- </varlistentry>
-
- </variablelist>
-
-</chapter>
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 945b08375fc..fe1fdfd5794 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/filelist.sgml,v 1.48 2006/10/26 15:26:54 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/filelist.sgml,v 1.49 2006/11/17 16:38:44 momjian Exp $ -->
 
 <!entity history    SYSTEM "history.sgml">
 <!entity info       SYSTEM "info.sgml">
@@ -34,7 +34,7 @@
 <!entity charset       SYSTEM "charset.sgml">
 <!entity client-auth   SYSTEM "client-auth.sgml">
 <!entity diskusage     SYSTEM "diskusage.sgml">
-<!entity failover      SYSTEM "failover.sgml">
+<!entity high-availability      SYSTEM "high-availability.sgml">
 <!entity installation  SYSTEM "installation.sgml">
 <!entity installw      SYSTEM "install-win32.sgml">
 <!entity maintenance   SYSTEM "maintenance.sgml">
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
new file mode 100644
index 00000000000..56733e38e84
--- /dev/null
+++ b/doc/src/sgml/high-availability.sgml
@@ -0,0 +1,249 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.1 2006/11/17 16:38:44 momjian Exp $ -->
+
+<chapter id="high-availability">
+ <title>High Availability and Load Balancing</title>
+
+ <indexterm><primary>high availability</></>
+ <indexterm><primary>failover</></>
+ <indexterm><primary>replication</></>
+ <indexterm><primary>load balancing</></>
+ <indexterm><primary>clustering</></>
+ <indexterm><primary>data partitioning</></>
+
+ <para>
+  Database servers can work together to allow a second server to
+  take over quickly if the primary server fails (high
+  availability), or to allow several computers to serve the same
+  data (load balancing).  Ideally, database servers could work
+  together seamlessly.  Web servers serving static web pages can
+  be combined quite easily by merely load-balancing web requests
+  to multiple machines.  In fact, read-only database servers can
+  be combined relatively easily too.  Unfortunately, most database
+  servers have a read/write mix of requests, and read/write servers
+  are much harder to combine.  This is because though read-only
+  data needs to be placed on each server only once, a write to any
+  server has to be propagated to all servers so that future read
+  requests to those servers return consistent results.
+ </para>
+
+ <para>
+  This synchronization problem is the fundamental difficulty for
+  servers working together.  Because there is no single solution
+  that eliminates the impact of the sync problem for all use cases,
+  there are multiple solutions.  Each solution addresses this
+  problem in a different way, and minimizes its impact for a specific
+  workload.
+ </para>
+
+ <para>
+  Some solutions deal with synchronization by allowing only one
+  server to modify the data.  Servers that can modify data are
+  called read/write or "master" servers.  Servers that can reply
+  to read-only queries are called "slave" servers.  Servers that
+  cannot be accessed until they are changed to master servers are
+  called "standby" servers.
+ </para>
+
+ <para>
+  Some failover and load balancing solutions are synchronous, meaning that
+  a data-modifying transaction is not considered committed until all
+  servers have committed the transaction.  This guarantees that a failover
+  will not lose any data and that all load-balanced servers will return
+  consistent results with no propagation delay. Asynchronous updating has
+  a small delay between the time of commit and its propagation to the
+  other servers, opening the possibility that some transactions might be
+  lost in the switch to a backup server, and that load-balanced servers
+  might return slightly stale results.  Asynchronous communication is used
+  when synchronous would be too slow.
+ </para>
+
+ <para>
+  Solutions can also be categorized by their granularity.  Some solutions
+  can deal only with an entire database server, while others allow control
+  at the per-table or per-database level.
+ </para>
+
+ <para>
+  Performance must be considered in any failover or load balancing
+  choice.  There is usually a tradeoff between functionality and
+  performance.  For example, a full synchronous solution over a slow
+  network might cut performance by more than half, while an asynchronous
+  one might have a minimal performance impact.
+ </para>
+
+ <para>
+  The remainder of this section outlines various failover, replication,
+  and load balancing solutions.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+  <term>Shared Disk Failover</term>
+  <listitem>
+
+   <para>
+    Shared disk failover avoids synchronization overhead by having only one
+    copy of the database.  It uses a single disk array that is shared by
+    multiple servers.  If the main database server fails, the standby server
+    is able to mount and start the database as though it were recovering from
+    a database crash.  This allows rapid failover with no data loss.
+   </para>
+
+   <para>
+    Shared hardware functionality is common in network storage
+    devices.  Using a network file system is also possible, though
+    care must be taken that the file system has full POSIX behavior.
+    One significant limitation of this method is that if the shared
+    disk array fails or becomes corrupt, the primary and standby
+    servers are both nonfunctional.  Another issue is that the
+    standby server should never access the shared storage while
+    the primary server is running.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Warm Standby Using Point-In-Time Recovery</term>
+  <listitem>
+
+   <para>
+    A warm standby server (see <xref linkend="warm-standby">) can
+    be kept current by reading a stream of write-ahead log (WAL)
+    records.  If the main server fails, the warm standby contains
+    almost all of the data of the main server, and can be quickly
+    made the new master database server.  This is asynchronous and
+    can only be done for the entire database server.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Master/Slave Replication</term>
+  <listitem>
+
+   <para>
+    A master/slave replication setup sends all data modification
+    queries to the master server.  The master server asynchronously
+    sends data changes to the slave server.  The slave can answer
+    read-only queries while the master server is running.  The
+    slave server is ideal for data warehouse queries.
+   </para>
+
+   <para>
+    Slony-I is an example of this type of replication, with per-table
+    granularity, and support for multiple slaves.  Because it
+    updates the slave server asynchronously (in batches), data
+    loss is possible during failover.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Query Broadcasting</term>
+  <listitem>
+
+   <para>
+    In query broadcasting, a program intercepts every SQL query
+    and sends it to all servers.  Each server operates independently.
+    Read-only queries can be sent to a single server because there
+    is no need for all servers to process them.
+   </para>
+
+   <para>
+    One limitation of this solution is that functions like
+    <function>random()</>, <function>CURRENT_TIMESTAMP</>, and
+    sequences can have different values on different servers.  This
+    is because each server operates independently, and because SQL
+    queries are broadcast (and not actual modified rows).  If this
+    is unacceptable, applications must query such values from a
+    single server and then use those values in write queries.
+    Also, care must be taken that all transactions either commit
+    or abort on all servers, perhaps using two-phase commit (<xref
+    linkend="sql-prepare-transaction"
+    endterm="sql-prepare-transaction-title"> and <xref
+    linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">).
+    Pgpool is an example of this type of replication.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Multi-Master Replication Using Clustering</term>
+  <listitem>
+
+   <para>
+    In clustering, each server can accept write requests, and
+    modified data is transmitted from the original server to every
+    other server before each transaction commits.  Heavy write
+    activity can cause excessive locking, leading to poor performance.
+    In fact, write performance is often worse than that of a single
+    server.  Read requests can be sent to any server.  Clustering
+    is best for mostly read workloads, though its big advantage
+    is that any server can accept write requests &mdash; there is
+    no need to partition workloads between master and slave servers,
+    and because the changes are sent from one server to another,
+    there is not a problem with non-deterministic functions like
+    <function>random()</>.
+   </para>
+
+   <para>
+    Clustering is implemented by <productname>Oracle</> in their
+    <productname><acronym>RAC</></> product.  <productname>PostgreSQL</>
+    does not offer this type of load balancing, though
+    <productname>PostgreSQL</> two-phase commit (<xref
+    linkend="sql-prepare-transaction"
+    endterm="sql-prepare-transaction-title"> and <xref
+    linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">)
+    can be used to implement this in application code or middleware.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Data Partitioning</term>
+  <listitem>
+
+   <para>
+    Data partitioning splits tables into data sets.  Each set can
+    be modified by only one server.  For example, data can be
+    partitioned by offices, e.g. London and Paris, with a server
+    in each office.  If queries combining London and Paris data
+    are necessary, an application can query both servers, or
+    master/slave replication can be used to keep a read-only copy
+    of the other office's data on each server.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Clustering For Parallel Query Execution</term>
+  <listitem>
+
+   <para>
+    This allows multiple servers to work concurrently on a single
+    query.  One possible way this could work is for the data to be
+    split among servers and for each server to execute its part of
+    the query and send its results to a central server to be combined
+    and returned to the user.  There currently is no
+    <productname>PostgreSQL</> open source solution for this.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Commercial Solutions</term>
+  <listitem>
+
+   <para>
+    Because <productname>PostgreSQL</> is open source and easily
+    extended, a number of companies have taken <productname>PostgreSQL</>
+    and created commercial closed-source solutions with unique
+    failover, replication, and load balancing capabilities.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+</chapter>
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index bba2bcd6019..4f55c2dcacd 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.79 2006/10/26 15:26:54 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.80 2006/11/17 16:38:44 momjian Exp $ -->
 
 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
 
@@ -151,7 +151,7 @@
   &charset;
   &maintenance;
   &backup;
-  &failover;
+  &high-availability;
   &monitoring;
   &diskusage;
   &wal;
-- 
cgit v1.2.3