diff options
| author | Alexander Korotkov <akorotkov@postgresql.org> | 2025-11-05 11:43:55 +0200 |
|---|---|---|
| committer | Alexander Korotkov <akorotkov@postgresql.org> | 2025-11-05 11:44:13 +0200 |
| commit | 447aae13b0305780e87cac7b0dd669db6fab3d9d (patch) | |
| tree | 2ba9e956343d3f85d9d91011220047b8e80d6171 /doc/src | |
| parent | 3b4e53a075ea5671b075f8fd873241179f8e64af (diff) | |
Implement WAIT FOR command
WAIT FOR is to be used on standby and specifies waiting for
the specific WAL location to be replayed. This option is useful when
the user makes some data changes on primary and needs a guarantee to see
these changes are on standby.
WAIT FOR needs to wait without any snapshot held. Otherwise, the snapshot
could prevent the replay of WAL records, implying a kind of self-deadlock.
This is why separate utility command seems appears to be the most robust
way to implement this functionality. It's not possible to implement this as
a function. Previous experience shows that stored procedures also have
limitation in this aspect.
Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com
Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Diffstat (limited to 'doc/src')
| -rw-r--r-- | doc/src/sgml/high-availability.sgml | 54 | ||||
| -rw-r--r-- | doc/src/sgml/ref/allfiles.sgml | 1 | ||||
| -rw-r--r-- | doc/src/sgml/ref/wait_for.sgml | 234 | ||||
| -rw-r--r-- | doc/src/sgml/reference.sgml | 1 |
4 files changed, 290 insertions, 0 deletions
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml index b47d8b4106e..742deb037b7 100644 --- a/doc/src/sgml/high-availability.sgml +++ b/doc/src/sgml/high-availability.sgml @@ -1376,6 +1376,60 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)' </sect3> </sect2> + <sect2 id="read-your-writes-consistency"> + <title>Read-Your-Writes Consistency</title> + + <para> + In asynchronous replication, there is always a short window where changes + on the primary may not yet be visible on the standby due to replication + lag. This can lead to inconsistencies when an application writes data on + the primary and then immediately issues a read query on the standby. + However, it is possible to address this without switching to synchronous + replication. + </para> + + <para> + To address this, PostgreSQL offers a mechanism for read-your-writes + consistency. The key idea is to ensure that a client sees its own writes + by synchronizing the WAL replay on the standby with the known point of + change on the primary. + </para> + + <para> + This is achieved by the following steps. After performing write + operations, the application retrieves the current WAL location using a + function call like this. + + <programlisting> +postgres=# SELECT pg_current_wal_insert_lsn(); +pg_current_wal_insert_lsn +-------------------- +0/306EE20 +(1 row) + </programlisting> + </para> + + <para> + The <acronym>LSN</acronym> obtained from the primary is then communicated + to the standby server. This can be managed at the application level or + via the connection pooler. On the standby, the application issues the + <xref linkend="sql-wait-for"/> command to block further processing until + the standby's WAL replay process reaches (or exceeds) the specified + <acronym>LSN</acronym>. + + <programlisting> +postgres=# WAIT FOR LSN '0/306EE20'; + status +-------- + success +(1 row) + </programlisting> + Once the command returns a status of success, it guarantees that all + changes up to the provided <acronym>LSN</acronym> have been applied, + ensuring that subsequent read queries will reflect the latest updates. + </para> + </sect2> + <sect2 id="continuous-archiving-in-standby"> <title>Continuous Archiving in Standby</title> diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index f5be638867a..e167406c744 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -188,6 +188,7 @@ Complete list of usable sgml source files in this directory. <!ENTITY update SYSTEM "update.sgml"> <!ENTITY vacuum SYSTEM "vacuum.sgml"> <!ENTITY values SYSTEM "values.sgml"> +<!ENTITY waitFor SYSTEM "wait_for.sgml"> <!-- applications and utilities --> <!ENTITY clusterdb SYSTEM "clusterdb.sgml"> diff --git a/doc/src/sgml/ref/wait_for.sgml b/doc/src/sgml/ref/wait_for.sgml new file mode 100644 index 00000000000..3b8e842d1de --- /dev/null +++ b/doc/src/sgml/ref/wait_for.sgml @@ -0,0 +1,234 @@ +<!-- +doc/src/sgml/ref/wait_for.sgml +PostgreSQL documentation +--> + +<refentry id="sql-wait-for"> + <indexterm zone="sql-wait-for"> + <primary>WAIT FOR</primary> + </indexterm> + + <refmeta> + <refentrytitle>WAIT FOR</refentrytitle> + <manvolnum>7</manvolnum> + <refmiscinfo>SQL - Language Statements</refmiscinfo> + </refmeta> + + <refnamediv> + <refname>WAIT FOR</refname> + <refpurpose>wait for target <acronym>LSN</acronym> to be replayed, optionally with a timeout</refpurpose> + </refnamediv> + + <refsynopsisdiv> +<synopsis> +WAIT FOR LSN '<replaceable class="parameter">lsn</replaceable>' [ WITH ( <replaceable class="parameter">option</replaceable> [, ...] ) ] + +<phrase>where <replaceable class="parameter">option</replaceable> can be:</phrase> + + TIMEOUT '<replaceable class="parameter">timeout</replaceable>' + NO_THROW +</synopsis> + </refsynopsisdiv> + + <refsect1> + <title>Description</title> + + <para> + Waits until recovery replays <parameter>lsn</parameter>. + If no <parameter>timeout</parameter> is specified or it is set to + zero, this command waits indefinitely for the + <parameter>lsn</parameter>. + On timeout, or if the server is promoted before + <parameter>lsn</parameter> is reached, an error is emitted, + unless <literal>NO_THROW</literal> is specified in the WITH clause. + If <parameter>NO_THROW</parameter> is specified, then the command + doesn't throw errors. + </para> + + <para> + The possible return values are <literal>success</literal>, + <literal>timeout</literal>, and <literal>not in recovery</literal>. + </para> + </refsect1> + + <refsect1> + <title>Parameters</title> + + <variablelist> + <varlistentry> + <term><replaceable class="parameter">lsn</replaceable></term> + <listitem> + <para> + Specifies the target <acronym>LSN</acronym> to wait for. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>WITH ( <replaceable class="parameter">option</replaceable> [, ...] )</literal></term> + <listitem> + <para> + This clause specifies optional parameters for the wait operation. + The following parameters are supported: + + <variablelist> + <varlistentry> + <term><literal>TIMEOUT</literal> '<replaceable class="parameter">timeout</replaceable>'</term> + <listitem> + <para> + When specified and <parameter>timeout</parameter> is greater than zero, + the command waits until <parameter>lsn</parameter> is reached or + the specified <parameter>timeout</parameter> has elapsed. + </para> + <para> + The <parameter>timeout</parameter> might be given as integer number of + milliseconds. Also it might be given as string literal with + integer number of milliseconds or a number with unit + (see <xref linkend="config-setting-names-values"/>). + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>NO_THROW</literal></term> + <listitem> + <para> + Specify to not throw an error in the case of timeout or + running on the primary. In this case the result status can be get from + the return value. + </para> + </listitem> + </varlistentry> + </variablelist> + </para> + </listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Outputs</title> + + <variablelist> + <varlistentry> + <term><literal>success</literal></term> + <listitem> + <para> + This return value denotes that we have successfully reached + the target <parameter>lsn</parameter>. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>timeout</literal></term> + <listitem> + <para> + This return value denotes that the timeout happened before reaching + the target <parameter>lsn</parameter>. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>not in recovery</literal></term> + <listitem> + <para> + This return value denotes that the database server is not in a recovery + state. This might mean either the database server was not in recovery + at the moment of receiving the command, or it was promoted before + reaching the target <parameter>lsn</parameter>. + </para> + </listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Notes</title> + + <para> + <command>WAIT FOR</command> command waits till + <parameter>lsn</parameter> to be replayed on standby. + That is, after this command execution, the value returned by + <function>pg_last_wal_replay_lsn</function> should be greater or equal + to the <parameter>lsn</parameter> value. This is useful to achieve + read-your-writes-consistency, while using async replica for reads and + primary for writes. In that case, the <acronym>lsn</acronym> of the last + modification should be stored on the client application side or the + connection pooler side. + </para> + + <para> + <command>WAIT FOR</command> command should be called on standby. + If a user runs <command>WAIT FOR</command> on primary, it + will error out unless <parameter>NO_THROW</parameter> is specified in the WITH clause. + However, if <command>WAIT FOR</command> is + called on primary promoted from standby and <literal>lsn</literal> + was already replayed, then the <command>WAIT FOR</command> command just + exits immediately. + </para> + +</refsect1> + + <refsect1> + <title>Examples</title> + + <para> + You can use <command>WAIT FOR</command> command to wait for + the <type>pg_lsn</type> value. For example, an application could update + the <literal>movie</literal> table and get the <acronym>lsn</acronym> after + changes just made. This example uses <function>pg_current_wal_insert_lsn</function> + on primary server to get the <acronym>lsn</acronym> given that + <varname>synchronous_commit</varname> could be set to + <literal>off</literal>. + + <programlisting> +postgres=# UPDATE movie SET genre = 'Dramatic' WHERE genre = 'Drama'; +UPDATE 100 +postgres=# SELECT pg_current_wal_insert_lsn(); +pg_current_wal_insert_lsn +-------------------- +0/306EE20 +(1 row) +</programlisting> + + Then an application could run <command>WAIT FOR</command> + with the <parameter>lsn</parameter> obtained from primary. After that the + changes made on primary should be guaranteed to be visible on replica. + +<programlisting> +postgres=# WAIT FOR LSN '0/306EE20'; + status +-------- + success +(1 row) +postgres=# SELECT * FROM movie WHERE genre = 'Drama'; + genre +------- +(0 rows) +</programlisting> + </para> + + <para> + If the target LSN is not reached before the timeout, the error is thrown. + +<programlisting> +postgres=# WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '0.1s'); +ERROR: timed out while waiting for target LSN 0/306EE20 to be replayed; current replay LSN 0/306EA60 +</programlisting> + </para> + + <para> + The same example uses <command>WAIT FOR</command> with + <parameter>NO_THROW</parameter> option. +<programlisting> +postgres=# WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '100ms', NO_THROW); + status +-------- + timeout +(1 row) +</programlisting> + </para> + </refsect1> +</refentry> diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index ff85ace83fc..2cf02c37b17 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -216,6 +216,7 @@ &update; &vacuum; &values; + &waitFor; </reference> |
