summaryrefslogtreecommitdiff
path: root/src/backend/utils/misc/timeout.c
AgeCommit message (Collapse)Author
2025-01-01Update copyright for 2025Bruce Momjian
Backpatch-through: 13
2024-03-04Remove unused #include's from backend .c filesPeter Eisentraut
as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
2024-02-14Centralize logic for restoring errno in signal handlers.Nathan Bossart
Presently, we rely on each individual signal handler to save the initial value of errno and then restore it before returning if needed. This is easily forgotten and, if missed, often goes undetected for a long time. In commit 3b00fdba9f, we introduced a wrapper signal handler function that checks whether MyProcPid matches getpid(). This commit moves the aforementioned errno restoration code from the individual signal handlers to the new wrapper handler so that we no longer need to worry about missing it. Reviewed-by: Andres Freund, Noah Misch Discussion: https://postgr.es/m/20231121212008.GA3742740%40nathanxps13
2024-01-03Update copyright for 2024Bruce Momjian
Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
2023-01-02Update copyright for 2023Bruce Momjian
Backpatch-through: 11
2022-02-10Make timeout.c more robust against missed timer interrupts.Tom Lane
Commit 09cf1d522 taught schedule_alarm() to not do anything if the next requested event is after when we expect the next interrupt to fire. However, if somehow an interrupt gets lost, we'll continue to not do anything indefinitely, even after the "next interrupt" time is obviously in the past. Thus, one missed interrupt can break timeout scheduling for the life of the session. Michael Harris reported a scenario where a bug in a user-defined function caused this to happen, so you don't even need to assume kernel bugs exist to think this is worth fixing. We can make things more robust at little cost by detecting the case where signal_due_at is before "now" and forcing a new setitimer call to occur. This isn't a completely bulletproof fix of course; but in our typical usage pattern where we frequently set timeouts and clear them before they are reached, the interrupt will get re-enabled after at most one timeout interval, which with a little luck will be before we really need it. While here, let's mark signal_due_at as volatile, since the signal handler can both examine and set it. I'm not sure there's any actual risk given that signal_pending is already volatile, but it's surely questionable. Backpatch to v14 where this logic came in. Michael Harris and Tom Lane Discussion: https://postgr.es/m/CADofcAWbMrvgwSMqO4iG_iD3E2v8ZUrC-_crB41my=VMM02-CA@mail.gmail.com
2022-01-07Update copyright for 2022Bruce Momjian
Backpatch-through: 10
2021-10-25Add enable_timeout_every() to fire the same timeout repeatedly.Robert Haas
enable_timeout_at() and enable_timeout_after() can still be used when you want to fire a timeout just once. Patch by me, per a suggestion from Tom Lane. Discussion: http://postgr.es/m/2992585.1632938816@sss.pgh.pa.us Discussion: http://postgr.es/m/CA+TgmoYqSF5sCNrgTom9r3Nh=at4WmYFD=gsV-omStZ60S0ZUQ@mail.gmail.com
2021-01-06Improve commentary in timeout.c.Tom Lane
On re-reading I realized that I'd missed one race condition in the new timeout code. It's safe, but add a comment explaining it. Discussion: https://postgr.es/m/CA+hUKG+o6pbuHBJSGnud=TadsuXySWA7CCcPgCt2QE9F6_4iHQ@mail.gmail.com
2021-01-06Improve timeout.c's handling of repeated timeout set/cancel.Tom Lane
A very common usage pattern is that we set a timeout that we don't expect to reach, cancel it after a little bit, and later repeat. With the original implementation of timeout.c, this results in one setitimer() call per timeout set or cancel. We can do a lot better by being lazy about changing the timeout interrupt request, namely: (1) never cancel the outstanding interrupt, even when we have no active timeout events; (2) if we need to set an interrupt, but there already is one pending at or before the required time, leave it alone. When the interrupt happens, the signal handler will reschedule it at whatever time is then needed. For example, with a one-second setting for statement_timeout, this method results in having to interact with the kernel only a little more than once a second, no matter how many statements we execute in between. The mainline code might never call setitimer() at all after the first time, while each time the signal handler fires, it sees that the then-pending request is most of a second away, and that's when it sets the next interrupt request for. Each mainline timeout-set request after that will observe that the time it wants is past the pending interrupt request time, and do nothing. This also works pretty well for cases where a few different timeout lengths are in use, as long as none of them are very short. But that describes our usage well. Idea and original patch by Thomas Munro; I fixed a race condition and improved the comments. Discussion: https://postgr.es/m/CA+hUKG+o6pbuHBJSGnud=TadsuXySWA7CCcPgCt2QE9F6_4iHQ@mail.gmail.com
2021-01-02Update copyright for 2021Bruce Momjian
Backpatch-through: 9.5
2020-01-01Update copyrights for 2020Bruce Momjian
Backpatch-through: update all files in master, backpatch legal files through 9.4
2019-10-25Improve management of statement timeouts.Tom Lane
Commit f8e5f156b added private state in postgres.c to track whether a statement timeout is running. This seems like bad design to me; timeout.c's private state should be the single source of truth about that. We already fixed one bug associated with failure to keep those states in sync (cf. be42015fc), and I've got little faith that we won't find more in future. So get rid of postgres.c's local variable by exposing a way to ask timeout.c whether a timeout is running. (Obviously, such an inquiry is subject to race conditions, but it seems fine for the purpose at hand.) To make get_timeout_active() as cheap as possible, add a flag in the per-timeout struct showing whether that timeout is active. This allows some small savings elsewhere in timeout.c, mainly elimination of unnecessary searches of the active_timeouts array. While at it, fix enable_statement_timeout to not call disable_timeout when statement_timeout is 0 and the timeout is not running. This avoids a useless deschedule-and-reschedule-timeouts cycle, which represents a significant savings (at least one kernel call) when there is any other active timeout. Right now, there usually isn't, but there are proposals around to change that. Discussion: https://postgr.es/m/16035-456e6e69ebfd4374@postgresql.org
2019-01-02Update copyright for 2019Bruce Momjian
Backpatch-through: certain files through 9.4
2018-01-02Update copyright for 2018Bruce Momjian
Backpatch-through: certain files through 9.3
2017-09-07Reduce excessive dereferencing of function pointersPeter Eisentraut
It is equivalent in ANSI C to write (*funcptr) () and funcptr(). These two styles have been applied inconsistently. After discussion, we'll use the more verbose style for plain function pointer variables, to make it clear that it's a variable, and the shorter style when the function pointer is in a struct (s.func() or s->func()), because then it's clear that it's not a plain function name, and otherwise the excessive punctuation makes some of those invocations hard to read. Discussion: https://www.postgresql.org/message-id/f52c16db-14ed-757d-4b48-7ef360b1631d@2ndquadrant.com
2017-01-03Update copyright via script for 2017Bruce Momjian
2016-05-27Be more predictable about reporting "lock timeout" vs "statement timeout".Tom Lane
If both timeout indicators are set when we arrive at ProcessInterrupts, we've historically just reported "lock timeout". However, some buildfarm members have been observed to fail isolationtester's timeouts test by reporting "lock timeout" when the statement timeout was expected to fire first. The cause seems to be that the process is allowed to sleep longer than expected (probably due to heavy machine load) so that the lock timeout happens before we reach the point of reporting the error, and then this arbitrary tiebreak rule does the wrong thing. We can improve matters by comparing the scheduled timeout times to decide which error to report. I had originally proposed greatly reducing the 1-second window between the two timeouts in the test cases. On reflection that is a bad idea, at least for the case where the lock timeout is expected to fire first, because that would assume that it takes negligible time to get from statement start to the beginning of the lock wait. Thus, this patch doesn't completely remove the risk of test failures on slow machines. Empirically, however, the case this handles is the one we are seeing in the buildfarm. The explanation may be that the other case requires the scheduler to take the CPU away from a busy process, whereas the case fixed here only requires the scheduler to not give the CPU back right away to a process that has been woken from a multi-second sleep (and, perhaps, has been swapped out meanwhile). Back-patch to 9.3 where the isolationtester timeouts test was added. Discussion: <8693.1464314819@sss.pgh.pa.us>
2016-01-02Update copyright for 2016Bruce Momjian
Backpatch certain files through 9.1
2015-02-03Remove remnants of ImmediateInterruptOK handling.Andres Freund
Now that nothing sets ImmediateInterruptOK to true anymore, we can remove all the supporting code. Reviewed-By: Heikki Linnakangas
2015-01-14Add a default local latch for use in signal handlers.Andres Freund
To do so, move InitializeLatchSupport() into the new common process initialization functions, and add a new global variable MyLatch. MyLatch is usable as soon InitPostmasterChild() has been called (i.e. very early during startup). Initially it points to a process local latch that exists in all processes. InitProcess/InitAuxiliaryProcess then replaces that local latch with PGPROC->procLatch. During shutdown the reverse happens. This is primarily advantageous for two reasons: For one it simplifies dealing with the shared process latch, especially in signal handlers, because instead of having to check for MyProc, MyLatch can be used unconditionally. For another, a later patch that makes FEs/BE communication use latches, now can rely on the existence of a latch, even before having gone through InitProcess. Discussion: 20140927191243.GD5423@alap3.anarazel.de
2015-01-06Update copyright for 2015Bruce Momjian
Backpatch certain files through 9.0
2014-05-06pgindent run for 9.4Bruce Momjian
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
2014-01-07Update copyright for 2014Bruce Momjian
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
2013-12-13Don't let timeout interrupts happen unless ImmediateInterruptOK is set.Tom Lane
Serious oversight in commit 16e1b7a1b7f7ffd8a18713e83c8cd72c9ce48e07: we should not allow an interrupt to take control away from mainline code except when ImmediateInterruptOK is set. Just to be safe, let's adopt the same save-clear-restore dance that's been used for many years in HandleCatchupInterrupt and HandleNotifyInterrupt, so that nothing bad happens if a timeout handler invokes code that tests or even manipulates ImmediateInterruptOK. Per report of "stuck spinlock" failures from Christophe Pettus, though many other symptoms are possible. Diagnosis by Andres Freund.
2013-11-29Fix assorted race conditions in the new timeout infrastructure.Tom Lane
Prevent handle_sig_alarm from losing control partway through due to a query cancel (either an asynchronous SIGINT, or a cancel triggered by one of the timeout handler functions). That would at least result in failure to schedule any required future interrupt, and might result in actual corruption of timeout.c's data structures, if the interrupt happened while we were updating those. We could still lose control if an asynchronous SIGINT arrives just as the function is entered. This wouldn't break any data structures, but it would have the same effect as if the SIGALRM interrupt had been silently lost: we'd not fire any currently-due handlers, nor schedule any new interrupt. To forestall that scenario, forcibly reschedule any pending timer interrupt during AbortTransaction and AbortSubTransaction. We can avoid any extra kernel call in most cases by not doing that until we've allowed LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events. Another hazard is that some platforms (at least Linux and *BSD) block a signal before calling its handler and then unblock it on return. When we longjmp out of the handler, the unblock doesn't happen, and the signal is left blocked indefinitely. Again, we can fix that by forcibly unblocking signals during AbortTransaction and AbortSubTransaction. These latter two problems do not manifest when the longjmp reaches postgres.c, because the error recovery code there kills all pending timeout events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal mask is restored. So errors thrown outside any transaction should be OK already, and cleaning up in AbortTransaction and AbortSubTransaction should be enough to fix these issues. (We're assuming that any code that catches a query cancel error and doesn't re-throw it will do at least a subtransaction abort to clean up; but that was pretty much required already by other subsystems.) Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when disabling that event: if a lock timeout interrupt happened after the lock was granted, the ensuing query cancel is still going to happen at the next CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user cancel. Per reports from Dan Wood. Back-patch to 9.3 where the new timeout handling infrastructure was introduced. We may at some point decide to back-patch the signal unblocking changes further, but I'll desist from that until we hear actual field complaints about it.
2013-03-17Improve signal-handler lockout mechanism in timeout.c.Tom Lane
Rather than doing a fairly-expensive setitimer() call to prevent interrupts from happening, let's just invent a simple boolean flag that the signal handler is required to check. This is not only faster but considerably more robust than before, since the previous code effectively assumed that only ITIMER_REAL events would ever fire the SIGALRM handler, which is obviously something that can be broken easily by third-party code. Zoltán Böszörményi and Tom Lane
2013-03-17Move pqsignal() to libpgport.Tom Lane
We had two copies of this function in the backend and libpq, which was already pretty bogus, but it turns out that we need it in some other programs that don't use libpq (such as pg_test_fsync). So put it where it probably should have been all along. The signal-mask-initialization support in src/backend/libpq/pqsignal.c stays where it is, though, since we only need that in the backend.
2013-03-16Add lock_timeout configuration parameter.Tom Lane
This GUC allows limiting the time spent waiting to acquire any one heavyweight lock. In support of this, improve the recently-added timeout infrastructure to permit efficiently enabling or disabling multiple timeouts at once. That reduces the performance hit from turning on lock_timeout, though it's still not zero. Zoltán Böszörményi, reviewed by Tom Lane, Stephen Frost, and Hari Babu
2013-01-01Update copyrights for 2013Bruce Momjian
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
2012-07-16Introduce timeout handling frameworkAlvaro Herrera
Management of timeouts was getting a little cumbersome; what we originally had was more than enough back when we were only concerned about deadlocks and query cancel; however, when we added timeouts for standby processes, the code got considerably messier. Since there are plans to add more complex timeouts, this seems a good time to introduce a central timeout handling module. External modules register their timeout handlers during process initialization, and later enable and disable them as they see fit using a simple API; timeout.c is in charge of keeping track of which timeouts are in effect at any time, installing a common SIGALRM signal handler, and calling setitimer() as appropriate to ensure timely firing of external handlers. timeout.c additionally supports pluggable modules to add their own timeouts, though this capability isn't exercised anywhere yet. Additionally, as of this commit, walsender processes are aware of timeouts; we had a preexisting bug there that made those ignore SIGALRM, thus being subject to unhandled deadlocks, particularly during the authentication phase. This has already been fixed in back branches in commit 0bf8eb2a, which see for more details. Main author: Zoltán Böszörményi Some review and cleanup by Álvaro Herrera Extensive reworking by Tom Lane