From 447aae13b0305780e87cac7b0dd669db6fab3d9d Mon Sep 17 00:00:00 2001 From: Alexander Korotkov Date: Wed, 5 Nov 2025 11:43:55 +0200 Subject: Implement WAIT FOR command MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit WAIT FOR is to be used on standby and specifies waiting for the specific WAL location to be replayed. This option is useful when the user makes some data changes on primary and needs a guarantee to see these changes are on standby. WAIT FOR needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records, implying a kind of self-deadlock. This is why separate utility command seems appears to be the most robust way to implement this functionality. It's not possible to implement this as a function. Previous experience shows that stored procedures also have limitation in this aspect. Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com Author: Kartyshov Ivan Author: Alexander Korotkov Author: Xuneng Zhou Reviewed-by: Michael Paquier Reviewed-by: Peter Eisentraut Reviewed-by: Dilip Kumar Reviewed-by: Amit Kapila Reviewed-by: Alexander Lakhin Reviewed-by: Bharath Rupireddy Reviewed-by: Euler Taveira Reviewed-by: Heikki Linnakangas Reviewed-by: Kyotaro Horiguchi Reviewed-by: jian he Reviewed-by: Álvaro Herrera Reviewed-by: Xuneng Zhou --- doc/src/sgml/high-availability.sgml | 54 +++++++++ doc/src/sgml/ref/allfiles.sgml | 1 + doc/src/sgml/ref/wait_for.sgml | 234 ++++++++++++++++++++++++++++++++++++ doc/src/sgml/reference.sgml | 1 + 4 files changed, 290 insertions(+) create mode 100644 doc/src/sgml/ref/wait_for.sgml (limited to 'doc/src') diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml index b47d8b4106e..742deb037b7 100644 --- a/doc/src/sgml/high-availability.sgml +++ b/doc/src/sgml/high-availability.sgml @@ -1376,6 +1376,60 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)' + + Read-Your-Writes Consistency + + + In asynchronous replication, there is always a short window where changes + on the primary may not yet be visible on the standby due to replication + lag. This can lead to inconsistencies when an application writes data on + the primary and then immediately issues a read query on the standby. + However, it is possible to address this without switching to synchronous + replication. + + + + To address this, PostgreSQL offers a mechanism for read-your-writes + consistency. The key idea is to ensure that a client sees its own writes + by synchronizing the WAL replay on the standby with the known point of + change on the primary. + + + + This is achieved by the following steps. After performing write + operations, the application retrieves the current WAL location using a + function call like this. + + +postgres=# SELECT pg_current_wal_insert_lsn(); +pg_current_wal_insert_lsn +-------------------- +0/306EE20 +(1 row) + + + + + The LSN obtained from the primary is then communicated + to the standby server. This can be managed at the application level or + via the connection pooler. On the standby, the application issues the + command to block further processing until + the standby's WAL replay process reaches (or exceeds) the specified + LSN. + + +postgres=# WAIT FOR LSN '0/306EE20'; + status +-------- + success +(1 row) + + Once the command returns a status of success, it guarantees that all + changes up to the provided LSN have been applied, + ensuring that subsequent read queries will reflect the latest updates. + + + Continuous Archiving in Standby diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index f5be638867a..e167406c744 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -188,6 +188,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/wait_for.sgml b/doc/src/sgml/ref/wait_for.sgml new file mode 100644 index 00000000000..3b8e842d1de --- /dev/null +++ b/doc/src/sgml/ref/wait_for.sgml @@ -0,0 +1,234 @@ + + + + + WAIT FOR + + + + WAIT FOR + 7 + SQL - Language Statements + + + + WAIT FOR + wait for target LSN to be replayed, optionally with a timeout + + + + +WAIT FOR LSN 'lsn' [ WITH ( option [, ...] ) ] + +where option can be: + + TIMEOUT 'timeout' + NO_THROW + + + + + Description + + + Waits until recovery replays lsn. + If no timeout is specified or it is set to + zero, this command waits indefinitely for the + lsn. + On timeout, or if the server is promoted before + lsn is reached, an error is emitted, + unless NO_THROW is specified in the WITH clause. + If NO_THROW is specified, then the command + doesn't throw errors. + + + + The possible return values are success, + timeout, and not in recovery. + + + + + Parameters + + + + lsn + + + Specifies the target LSN to wait for. + + + + + + WITH ( option [, ...] ) + + + This clause specifies optional parameters for the wait operation. + The following parameters are supported: + + + + TIMEOUT 'timeout' + + + When specified and timeout is greater than zero, + the command waits until lsn is reached or + the specified timeout has elapsed. + + + The timeout might be given as integer number of + milliseconds. Also it might be given as string literal with + integer number of milliseconds or a number with unit + (see ). + + + + + + NO_THROW + + + Specify to not throw an error in the case of timeout or + running on the primary. In this case the result status can be get from + the return value. + + + + + + + + + + + + Outputs + + + + success + + + This return value denotes that we have successfully reached + the target lsn. + + + + + + timeout + + + This return value denotes that the timeout happened before reaching + the target lsn. + + + + + + not in recovery + + + This return value denotes that the database server is not in a recovery + state. This might mean either the database server was not in recovery + at the moment of receiving the command, or it was promoted before + reaching the target lsn. + + + + + + + + Notes + + + WAIT FOR command waits till + lsn to be replayed on standby. + That is, after this command execution, the value returned by + pg_last_wal_replay_lsn should be greater or equal + to the lsn value. This is useful to achieve + read-your-writes-consistency, while using async replica for reads and + primary for writes. In that case, the lsn of the last + modification should be stored on the client application side or the + connection pooler side. + + + + WAIT FOR command should be called on standby. + If a user runs WAIT FOR on primary, it + will error out unless NO_THROW is specified in the WITH clause. + However, if WAIT FOR is + called on primary promoted from standby and lsn + was already replayed, then the WAIT FOR command just + exits immediately. + + + + + + Examples + + + You can use WAIT FOR command to wait for + the pg_lsn value. For example, an application could update + the movie table and get the lsn after + changes just made. This example uses pg_current_wal_insert_lsn + on primary server to get the lsn given that + synchronous_commit could be set to + off. + + +postgres=# UPDATE movie SET genre = 'Dramatic' WHERE genre = 'Drama'; +UPDATE 100 +postgres=# SELECT pg_current_wal_insert_lsn(); +pg_current_wal_insert_lsn +-------------------- +0/306EE20 +(1 row) + + + Then an application could run WAIT FOR + with the lsn obtained from primary. After that the + changes made on primary should be guaranteed to be visible on replica. + + +postgres=# WAIT FOR LSN '0/306EE20'; + status +-------- + success +(1 row) +postgres=# SELECT * FROM movie WHERE genre = 'Drama'; + genre +------- +(0 rows) + + + + + If the target LSN is not reached before the timeout, the error is thrown. + + +postgres=# WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '0.1s'); +ERROR: timed out while waiting for target LSN 0/306EE20 to be replayed; current replay LSN 0/306EA60 + + + + + The same example uses WAIT FOR with + NO_THROW option. + +postgres=# WAIT FOR LSN '0/306EE20' WITH (TIMEOUT '100ms', NO_THROW); + status +-------- + timeout +(1 row) + + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index ff85ace83fc..2cf02c37b17 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -216,6 +216,7 @@ &update; &vacuum; &values; + &waitFor; -- cgit v1.2.3