summaryrefslogtreecommitdiff
path: root/src/backend/access/transam
diff options
context:
space:
mode:
authorThomas Munro <tmunro@postgresql.org>2018-11-19 13:40:57 +1300
committerThomas Munro <tmunro@postgresql.org>2018-11-19 13:54:00 +1300
commitb9cce9ddfa1748c76d1d35c62aa35fe70ddc2dbe (patch)
treef7bfffd0123a509cfd36d2bcb5570780382c7599 /src/backend/access/transam
parenta4d6d6a25c12ca9a19a5e7ec6f76115bced69521 (diff)
PANIC on fsync() failure.
On some operating systems, it doesn't make sense to retry fsync(), because dirty data cached by the kernel may have been dropped on write-back failure. In that case the only remaining copy of the data is in the WAL. A subsequent fsync() could appear to succeed, but not have flushed the data. That means that a future checkpoint could apparently complete successfully but have lost data. Therefore, violently prevent any future checkpoint attempts by panicking on the first fsync() failure. Note that we already did the same for WAL data; this change extends that behavior to non-temporary data files. Provide a GUC data_sync_retry to control this new behavior, for users of operating systems that don't eject dirty data, and possibly forensic/testing uses. If it is set to on and the write-back error was transient, a later checkpoint might genuinely succeed (on a system that does not throw away buffers on failure); if the error is permanent, later checkpoints will continue to fail. The GUC defaults to off, meaning that we panic. Back-patch to all supported releases. There is still a narrow window for error-loss on some operating systems: if the file is closed and later reopened and a write-back error occurs in the intervening time, but the inode has the bad luck to be evicted due to memory pressure before we reopen, we could miss the error. A later patch will address that with a scheme for keeping files with dirty data open at all times, but we judge that to be too complicated to back-patch. Author: Craig Ringer, with some adjustments by Thomas Munro Reported-by: Craig Ringer Reviewed-by: Robert Haas, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de
Diffstat (limited to 'src/backend/access/transam')
-rw-r--r--src/backend/access/transam/slru.c2
-rw-r--r--src/backend/access/transam/timeline.c4
-rw-r--r--src/backend/access/transam/xlog.c2
3 files changed, 4 insertions, 4 deletions
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 9789d470780..e708967e2f3 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -921,7 +921,7 @@ SlruReportIOError(SlruCtl ctl, int pageno, TransactionId xid)
path, offset)));
break;
case SLRU_FSYNC_FAILED:
- ereport(ERROR,
+ ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not access status of transaction %u", xid),
errdetail("Could not fsync file \"%s\": %m.",
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index bd91573708b..5d73cc942cd 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -402,7 +402,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
}
if (pg_fsync(fd) != 0)
- ereport(ERROR,
+ ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
@@ -478,7 +478,7 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
}
if (pg_fsync(fd) != 0)
- ereport(ERROR,
+ ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d7505aa454b..004700febeb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3283,7 +3283,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
}
if (pg_fsync(fd) != 0)
- ereport(ERROR,
+ ereport(data_sync_elevel(ERROR),
(errcode_for_file_access(),
errmsg("could not fsync file \"%s\": %m", tmppath)));