diff options
author | Daniel Gustafsson <dgustafsson@postgresql.org> | 2022-04-05 14:45:31 +0200 |
---|---|---|
committer | Daniel Gustafsson <dgustafsson@postgresql.org> | 2022-04-05 14:45:31 +0200 |
commit | 16915126746e2d8597a92197a346fea0756f8e3e (patch) | |
tree | 22fdc1a2a8144bbd55fb5ed4fc572d9d570d03a3 /src/bin/pg_rewind/libpq_source.c | |
parent | 98fe74218d97becb2a53581304c96091409fd929 (diff) |
pg_rewind: Fetch small files according to new size.
There's a race condition if a file changes in the source system
after we have collected the file list. If the file becomes larger,
we only fetched up to its original size. That can easily result in
a truncated file. That's not a problem for relation files, files
in pg_xact, etc. because any actions on them will be replayed from
the WAL. However, configuration files are affected.
This commit mitigates the race condition by fetching small files in
whole, even if they have grown. A test is added in which an extra
file copied is concurrently grown with the output of pg_rewind thus
guaranteeing it to have changed in size during the operation. This
is not a full fix: we still believe the original file size for files
larger than 1 MB. That should be enough for configuration files,
and doing more than that would require big changes to the chunking
logic in libpq_source.c.
This mitigates the race condition if the file is modified between
the original scan of files and copying the file, but there's still
a race condition if a file is changed while it's being copied.
That's a much smaller window, though, and pg_basebackup has the
same issue.
This race can be seen with pg_auto_failover, which frequently uses
ALTER SYSTEM, which updates postgresql.auto.conf. Often, pg_rewind
will fail, because the postgresql.auto.conf file changed concurrently
and a partial version of it was copied to the target. The partial
file would fail to parse, preventing the server from starting up.
Author: Heikki Linnakangas
Reviewed-by: Cary Huang
Discussion: https://postgr.es/m/f67feb24-5833-88cb-1020-19a4a2b83ac7%40iki.fi
Diffstat (limited to 'src/bin/pg_rewind/libpq_source.c')
-rw-r--r-- | src/bin/pg_rewind/libpq_source.c | 32 |
1 files changed, 32 insertions, 0 deletions
diff --git a/src/bin/pg_rewind/libpq_source.c b/src/bin/pg_rewind/libpq_source.c index 997d4e2b482..011c9cce6eb 100644 --- a/src/bin/pg_rewind/libpq_source.c +++ b/src/bin/pg_rewind/libpq_source.c @@ -63,6 +63,7 @@ static void process_queued_fetch_requests(libpq_source *src); /* public interface functions */ static void libpq_traverse_files(rewind_source *source, process_file_callback_t callback); +static void libpq_queue_fetch_file(rewind_source *source, const char *path, size_t len); static void libpq_queue_fetch_range(rewind_source *source, const char *path, off_t off, size_t len); static void libpq_finish_fetch(rewind_source *source); @@ -88,6 +89,7 @@ init_libpq_source(PGconn *conn) src->common.traverse_files = libpq_traverse_files; src->common.fetch_file = libpq_fetch_file; + src->common.queue_fetch_file = libpq_queue_fetch_file; src->common.queue_fetch_range = libpq_queue_fetch_range; src->common.finish_fetch = libpq_finish_fetch; src->common.get_current_wal_insert_lsn = libpq_get_current_wal_insert_lsn; @@ -308,6 +310,36 @@ libpq_traverse_files(rewind_source *source, process_file_callback_t callback) } /* + * Queue up a request to fetch a file from remote system. + */ +static void +libpq_queue_fetch_file(rewind_source *source, const char *path, size_t len) +{ + /* + * Truncate the target file immediately, and queue a request to fetch it + * from the source. If the file is small, smaller than MAX_CHUNK_SIZE, + * request fetching a full-sized chunk anyway, so that if the file has + * become larger in the source system, after we scanned the source + * directory, we still fetch the whole file. This only works for files up + * to MAX_CHUNK_SIZE, but that's good enough for small configuration files + * and such that are changed every now and then, but not WAL-logged. For + * larger files, we fetch up to the original size. + * + * Even with that mechanism, there is an inherent race condition if the + * file is modified at the same instant that we're copying it, so that we + * might copy a torn version of the file with one half from the old + * version and another half from the new. But pg_basebackup has the same + * problem, and it hasn't been a problem in practice. + * + * It might seem more natural to truncate the file later, when we receive + * it from the source server, but then we'd need to track which + * fetch-requests are for a whole file. + */ + open_target_file(path, true); + libpq_queue_fetch_range(source, path, 0, Max(len, MAX_CHUNK_SIZE)); +} + +/* * Queue up a request to fetch a piece of a file from remote system. */ static void |