| Age | Commit message (Collapse) | Author |
|
The main optimization is for LockBufHdr() to delay initializing
SpinDelayStatus, similar to what LWLockWaitListLock already did. The
initialization is sufficiently expensive & buffer header lock acquisitions are
sufficiently frequent, to make it worthwhile to instead have a fastpath (via a
likely() branch) that does not initialize the SpinDelayStatus.
While LWLockWaitListLock() already the aforementioned optimization, it did not
use likely(), and inspection of the assembly shows that this indeed leads to
worse code generation (also observed in a microbenchmark). Fix that by adding
the likely().
While the LockBufHdr() improvement is a small gain on its own, it mainly is
aimed at preventing a regression after a future commit, which requires
additional locking to set hint bits.
While touching both, also make the comments more similar to each other.
Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff
|
|
The idea is to encourage more the use of these new routines across the
tree, as these offer stronger type safety guarantees than palloc().
This batch of changes includes most of the trivial changes suggested by
the author for src/backend/.
A total of 334 files are updated here. Among these files, 48 of them
have their build change slightly; these are caused by line number
changes as the new allocation formulas are simpler, shaving around 100
lines of code in total.
Similar work has been done in 0c3c5c3b06a3 and 31d3847a37be.
Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
|
|
This removes some casts where the input already has the same type as
the type specified by the cast. Their presence could cause risks of
hiding actual type mismatches in the future or silently discarding
qualifiers. It also improves readability. Same kind of idea as
7f798aca1d5 and ef8fe693606. (This does not change all such
instances, but only those hand-picked by the author.)
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
Accidentally the code in LWLockWakeup() checked the list of to-be-woken up
processes to see if LW_FLAG_HAS_WAITERS should be unset. That means that
HAS_WAITERS would not get unset immediately, but only during the next,
unnecessary, call to LWLockWakeup().
Luckily, as the code stands, this is just a small efficiency issue.
However, if there were (as in a patch of mine) a case in which LWLockWakeup()
would not find any backend to wake, despite the wait list not being empty,
we'd wrongly unset LW_FLAG_HAS_WAITERS, leading to potentially hanging.
While the consequences in the backbranches are limited, the code as-is
confusing, and it is possible that there are workloads where the additional
wait list lock acquisitions hurt, therefore backpatch.
Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff
Backpatch-through: 14
|
|
Their presence causes (small) risks of hiding actual type mismatches
or silently discarding qualifiers. Some have been missed in
7f798aca1d5 and some are new ones along the same lines.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/aR8Yv%2BuATLKbJCgI%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
These files relied on transitive inclusion via port/atomics.h for
constants CHAR_BIT and INT_MAX.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/536409d2-c9df-4ef3-808d-1ffc3182868c@iki.fi
|
|
Before commit e25626677f, spinlocks were implemented using semaphores
on some platforms (--disable-spinlocks). That made it necessary to
initialize semaphores early, before any spinlocks could be used. Now
that we don't support --disable-spinlocks anymore, we can allocate the
shared memory needed for semaphores the same way as other shared
memory structures. Since the semaphores are used only in the PGPROC
array, move the semaphore shmem size estimation and initialization
calls to ProcGlobalShmemSize() and InitProcGlobal().
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAExHW5seSZpPx-znjidVZNzdagGHOk06F+Ds88MpPUbxd1kTaA@mail.gmail.com
|
|
WAIT FOR is to be used on standby and specifies waiting for
the specific WAL location to be replayed. This option is useful when
the user makes some data changes on primary and needs a guarantee to see
these changes are on standby.
WAIT FOR needs to wait without any snapshot held. Otherwise, the snapshot
could prevent the replay of WAL records, implying a kind of self-deadlock.
This is why separate utility command seems appears to be the most robust
way to implement this functionality. It's not possible to implement this as
a function. Previous experience shows that stored procedures also have
limitation in this aspect.
Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com
Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
|
|
This is a follow up 991295f. I searched over src/ and made all
ItemPointer arguments as const as much as possible.
Note: We cut out from the original patch the pieces that would have
created incompatibilities in the index or table AM APIs. Those could
be considered separately.
Author: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAEoWx2nBaypg16Z5ciHuKw66pk850RFWw9ACS2DqqJ_AkKeRsw%40mail.gmail.com
|
|
When shared memory is re-initialized after a crash, the named
LWLock tranche request array that was copied to shared memory will
no longer be accessible. To fix, save the pointer to the original
array in postmaster's local memory, and switch to it when
re-initializing the LWLock-related shared memory.
Oversight in commit ed1aad15e0. Per buildfarm member batta.
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aMoejB3iTWy1SxfF%40paquier.xyz
Discussion: https://postgr.es/m/f8ca018f-3479-49f6-a92c-e31db9f849d7%40gmail.com
|
|
If someone is stuck behind a lock for more than a second, that is
almost always a problem that is worth a log entry.
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-By: Michael Banck <mbanck@gmx.net>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Christoph Berg <myon@debian.org>
Reviewed-By: Stephen Frost <sfrost@snowman.net>
Discussion: https://postgr.es/m/b8b8502915e50f44deb111bc0b43a99e2733e117.camel%40cybertec.at
|
|
Per discussion, this compiler suite is no longer maintained, and
it has not been able to compile PostgreSQL since at least PostgreSQL
17.
This removes all the remaining support code for this compiler.
Note that the Solaris operating system continues to be supported, but
using GCC as the compiler.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/a0f817ee-fb86-483a-8a14-b6f7f5991b6e%40eisentraut.org
|
|
In EXEC_BACKEND builds, GetNamedLWLockTranche() can segfault when
called outside of the postmaster process, as it might access
NamedLWLockTrancheRequestArray, which won't be initialized. Given
the lack of reports, this is apparently unusual, presumably because
it is usually called from a shmem_startup_hook like this:
mystruct = ShmemInitStruct(..., &found);
if (!found)
{
mystruct->locks = GetNamedLWLockTranche(...);
...
}
This genre of shmem_startup_hook evades the aforementioned
segfaults because the struct is initialized in the postmaster, so
all other callers skip the !found path.
We considered modifying the documentation or requiring
GetNamedLWLockTranche() to be called from the postmaster, but
ultimately we decided to simply move the request array to shared
memory (and add it to the BackendParameters struct), thereby
allowing calls outside postmaster on all platforms. Since the main
shared memory segment is initialized after accepting LWLock tranche
requests, postmaster builds the request array in local memory first
and then copies it to shared memory later.
Given the lack of reports, back-patching seems unnecessary.
Reported-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0v1_15QPg5Sqd2Qz5rh_qcsyCeHHmRDY89xVHcy2yt5BQ%40mail.gmail.com
|
|
Commit 38b602b028 modified this function to allocate enough space
for MAX_NAMED_TRANCHES (256) requests, which is likely far more
than most clusters need. This commit reverts that change so that
it first allocates enough space for only 16 requests and resizes
the array when necessary. While at it, remove the check for too
many tranches from this function. We can now rely on
InitializeLWLocks() to do that check via its calls to
LWLockNewTrancheId() for the named tranches.
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aLmzwC2dRbqk14y6%40nathan
|
|
There are two ways for shared libraries to allocate their own
LWLock tranches. One way is to call RequestNamedLWLockTranche() in
a shmem_request_hook, which requires the library to be loaded via
shared_preload_libraries. The other way is to call
LWLockNewTrancheId(), which is not subject to the same
restrictions. However, LWLockNewTrancheId() does require each
backend to store the tranche's name in backend-local memory via
LWLockRegisterTranche(). This API is a little cumbersome and leads
to things like unhelpful pg_stat_activity.wait_event values in
backends that haven't loaded the library.
This commit moves these LWLock tranche names to shared memory, thus
eliminating the need for each backend to call
LWLockRegisterTranche(). Instead, the tranche name must be
provided to LWLockNewTrancheId(), which immediately makes the name
available to all backends. Since the tranche name array is
append-only, lookups can ordinarily avoid locking as long as their
local copy of the LWLock counter is greater than the requested
tranche ID.
One downside of this approach is that we now have a hard limit on
both the length of tranche names (NAMEDATALEN-1 bytes) and the
number of dynamically-allocated tranches (256). Besides a limit of
NAMEDATALEN-1 bytes for tranche names registered via
RequestNamedLWLockTranche(), no such limits previously existed. We
could avoid these new limits by using dynamic shared memory, but
the complexity involved didn't seem worth it. We briefly
considered making the tranche limit user-configurable but
ultimately decided against that, too. Since there is still a lot
of time left in the v19 development cycle, it's possible we will
revisit this choice.
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAA5RZ0vvED3naph8My8Szv6DL4AxOVK3eTPS0qXsaKi%3DbVdW2A%40mail.gmail.com
|
|
Show the requested lock level and the object being waited on,
in the same format we use for deadlock reports and similar errors.
This is particularly helpful for debugging lock-timeout errors,
since otherwise the user has very little to go on about which
lock timed out. The performance cost of setting up the callback
should be negligible compared to the other tracing support already
present in WaitOnLock.
As in the deadlock-report case, we just show numeric object OIDs,
because it seems too scary to try to perform catalog lookups
in this context.
Reported-by: Steve Baldwin <steve.baldwin@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1602369.1752167154@sss.pgh.pa.us
|
|
Using the LWLockCounter requires first calculating its address in
shared memory like this:
LWLockCounter = (int *) ((char *) MainLWLockArray - sizeof(int));
Commit 82e861fbe1 started this trend in order to fix EXEC_BACKEND
builds, but it could also be fixed by adding it to the
BackendParameters struct. The current approach is somewhat
difficult to follow, so this commit switches to the latter. While
at it, swap around the code in LWLockShmemSize() to match the order
of assignments in CreateLWLocks() for added readability.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aLDLnan9gNCS9fHx%40nathan
|
|
The functions LockTuple, ConditionalLockTuple, UnlockTuple, and
XactLockTableWait take an ItemPointer argument that they do not
modify, so the argument can be const-qualified to better convey intent
and allow the compiler to enforce immutability.
Author: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2m9e4rECHBwpRE4%2BGCH%2BpbYZXLh2f4rB1Du5hDfKug%2BOg%40mail.gmail.com
|
|
This code was relying on "long", which is signed 8 bytes everywhere
except on Windows where it is 4 bytes, that could potentially expose it
to overflows, even if the current uses in the code are fine as far as I
know. This code is now able to rely on the same sizeof() variable
everywhere, with int64. long was used for sizes, partition counts and
entry counts.
Some callers of the dynahash.c routines used long declarations, that can
be cleaned up to use int64 instead. There was one shortcut based on
SIZEOF_LONG, that can be removed. long is entirely removed from
dynahash.c and hsearch.h.
Similar work was done in b1e5c9fa9ac4.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aKQYp-bKTRtRauZ6@paquier.xyz
|
|
This comment mentions that pg_buffercache locks all buffer
partitions simultaneously, but it hasn't done so since v10.
Oversight in commit 6e654546fb.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/aKTuAHVEuYCUmmIy%40nathan
|
|
Add various missing conversions from and to Datum. The previous code
mostly relied on implicit conversions or its own explicit casts
instead of using the correct DatumGet*() or *GetDatum() functions.
We think these omissions are harmless. Some actual bugs that were
discovered during this process have been committed
separately (80c758a2e1d, fd2ab03fea2).
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
|
|
lwlock.c, lwlock.h, and wait_event_names.txt each contain a list of
built-in LWLock tranches. It is easy to miss one or the other when
adding or removing tranches, and discrepancies have adverse effects
(e.g., breaking JOINs between pg_stat_activity and pg_wait_events).
This commit moves the lists of built-in tranches in lwlock.{c,h} to
lwlocklist.h and adds a cross-check to the script that generates
lwlocknames.h. If the lists do not match exactly, building will
fail.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aHpOgwuFQfcFMZ/B%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
Oversight in commit da952b415f.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aHpOgwuFQfcFMZ/B%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
The terms used in wait_event_names.txt and lwlock.c were inconsistent
for MultiXactOffsetSLRU and MultiXactMemberSLRU, which could cause joins
between pg_wait_events and pg_stat_activity to fail. lwlock.c is
adjusted in this commit to what the historical name of the event has
always been, and what is documented.
Oversight in 53c2a97a9266. 08b9b9e043bb has fixed a similar
inconsistency some time ago.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/aHdxN0D0hKXzHFQG@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 17
|
|
This refactoring is a follow-up of the work done in 5a1dfde8334b, that
has switched 2PC file names to use FullTransactionIds when written on
disk. This will help with the integration of a follow-up solution
related to the handling of two-phase files during recovery, to address
older defects while reading these from disk after a crash.
This change is useful in itself as it reduces the need to build the
file names from epoch numbers and TransactionIds, because we can use
directly FullTransactionIds from which the 2PC file names are guessed.
So this avoids a lot of back-and-forth between the FullTransactionIds
retrieved from the file names and how these are passed around in the
internal 2PC logic.
Note that the core of the change is the use of a FullTransactionId
instead of a TransactionId in GlobalTransactionData, that tracks 2PC
file information in shared memory. The change in TwoPhaseCallback makes
this commit unfit for stable branches.
Noah has contributed a good chunk of this patch. I have spent some time
on it as well while working on the issues with two-phase state files and
recovery.
Author: Noah Misch <noah@leadboat.com>
Co-Authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z5sd5O9JO7NYNK-C@paquier.xyz
Discussion: https://postgr.es/m/20250116205254.65.nmisch@google.com
|
|
This commit renames the GUC log_lock_failure to log_lock_failures
to align with the existing similar setting log_lock_waits, which uses
the plural form. This improves naming consistency across related GUCs.
Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Author: Fujii Masao <masao.fujii@gmail.com
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/7a8198b6-d5b8-4910-b41e-8d3efcbb015d@eisentraut.org
|
|
Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter
a non-interruptible loop when they successfully acquired a lock on a transaction
but the transaction still appeared to be running. Since this loop continued
until the transaction completed, it could result in long, uninterruptible waits.
Although this scenario is generally unlikely since XactLockTableWait() and
ConditionalXactLockTableWait() can basically acquire a transaction lock
only when the transaction is not running, it can occur in a hot standby.
In such cases, the transaction may still appear active due to
the KnownAssignedXids list, even while no lock on the transaction exists.
For example, this situation can happen when creating a logical replication
slot on a standby.
The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS()
within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions,
ensuring they can be interrupted safely.
Back-patch to all supported branches.
Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
Backpatch-through: 13
|
|
Due to concerns raised about the approach, and memory leaks found
in sensitive contexts the functionality is reverted. This reverts
commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291
for v18 with an intent to revisit this patch for v19.
Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us
|
|
c4d5cb71d2 adjusted the fast-path locking code to allow some
configuration of the number of fast-path locking slots via the
max_locks_per_transaction GUC. In that commit the FAST_PATH_REL_GROUP()
macro used integer division to determine the fast-path locking group slot
to use for the lock.
The divisor in this case is always a power-of-two value. Here we swap
out the divide by a bitwise-AND, which is a significantly faster
operation to perform.
In passing, adjust the code that's setting FastPathLockGroupsPerBackend
so that it's more clear that the value being set is a power-of-two.
Also, adjust some comments in the area which contained some magic
numbers. It seems better to justify the 1024 upper limit in the
location where the #define is made instead of where it is used.
Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAApHDvodr3bcnpxcs7+k-3cFwYR0tP-BYhyd2PpDhe-bCx9i=g@mail.gmail.com
|
|
Commit 0bada39c83a150079567a6e97b1a25a198f30ea3 fixed a bug of this kind,
which existed in all branches for six days before detection. While the
probability of reaching the trouble was low, the disruption was extreme. No
new backends could start, and service restoration needed an immediate
shutdown. Hence, add this to catch the next bug like it.
The new check in RelationIdGetRelation() suffices to make autovacuum detect
the bug in commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 that led to commit
0bada39. This also checks in a number of similar places. It replaces each
Assert(IsTransactionState()) that pertained to a conditional catalog read.
No back-patch for now, but a back-patch of commit 243e9b4 should back-patch
this, too. A back-patch could omit the src/test/regress changes, since back
branches won't gain new index columns.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com
Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
|
|
This fixes typos in docs and comments introduced during the v18
development cycle, to keep them from ending up in backbranches.
Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CA+COZaCgGua25f2hSrjrDLJcJJAHkwoKgTTqUy-wyL1=64JNjw@mail.gmail.com
|
|
This adds a function for retrieving memory context statistics
and information from backends as well as auxiliary processes.
The intended usecase is cluster debugging when under memory
pressure or unanticipated memory usage characteristics.
When calling the function it sends a signal to the specified
process to submit statistics regarding its memory contexts
into dynamic shared memory. Each memory context is returned
in detail, followed by a cumulative total in case the number
of contexts exceed the max allocated amount of shared memory.
Each process is limited to use at most 1Mb memory for this.
A summary can also be explicitly requested by the user, this
will return the TopMemoryContext and a cumulative total of
all lower contexts.
In order to not block on busy processes the caller specifies
the number of seconds during which to retry before timing out.
In the case where no statistics are published within the set
timeout, the last known statistics are returned, or NULL if
no previously published statistics exist. This allows dash-
board type queries to continually publish even if the target
process is temporarily congested. Context records contain a
timestamp to indicate when they were submitted.
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com
|
|
Various places allocated shared memory by first allocating a small chunk
using ShmemInitStruct(), followed by ShmemAlloc() calls to allocate more
memory. Unfortunately, ShmemAlloc() does not update ShmemIndex, so this
affected pg_shmem_allocations - it only shown the initial chunk.
This commit modifies the following allocations, to allocate everything
as a single chunk, and then split it internally.
- PredXactList
- RWConflictPool
- PGPROC structures
- Fast-Path Lock Array
The fast-path lock array is allocated separately, not as a part of the
PGPROC structures allocation.
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com
|
|
The refactoring in commit 3c0fd64fec removed the clearing of
awaitedLock from LockErrorCleanup(). It's still needed, otherwise
LockErrorCleanup() during abort processing will try to update the
LOCALLOCK struct even after the lock has already been released. Put it
back.
Reported-by: Richard Guo <guofenglinux@gmail.com>
Reported-by: Robins Tharakan <tharakan@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4_dNX1SzBmvFdoY-LxJh_4W_BjtVd5i008ihfU-wFF=eg@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/18832-38e5575b1bbd7277@postgresql.org
Discussion: https://www.postgresql.org/message-id/e11a30e5-c0d8-491d-8546-3a1b50c10ad4@gmail.com
|
|
Performing AIO using io_uring can be considerably faster than
io_method=worker, particularly when lots of small IOs are issued, as
a) the context-switch overhead for worker based AIO becomes more significant
b) the number of IO workers can become limiting
io_uring, however, is linux specific and requires an additional compile-time
dependency (liburing).
This implementation is fairly simple and there are substantial optimization
opportunities.
The description of the existing AIO_IO_COMPLETION wait event is updated to
make the difference between it and the new AIO_IO_URING_EXECUTION clearer.
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
|
|
This commit introduces a new GUC, log_lock_failure, which controls whether
a detailed log message is produced when a lock acquisition fails. Currently,
it only supports logging lock failures caused by SELECT ... NOWAIT.
The log message includes information about all processes holding or
waiting for the lock that couldn't be acquired, helping users analyze and
diagnose the causes of lock failures.
Currently, this option does not log failures from SELECT ... SKIP LOCKED,
as that could generate excessive log messages if many locks are skipped,
causing unnecessary noise.
This mechanism can be extended in the future to support for logging
lock failures from other commands, such as LOCK TABLE ... NOWAIT.
Author: Yuki Seino <seinoyu@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/411280a186cc26ef7034e0f2dfe54131@oss.nttdata.com
|
|
This commit improves efficiency in FastPathTransferRelationLocks()
and GetLockConflicts(), which iterate over PGPROCs to search for
fast-path locks.
Previously, these functions recalculated the fast-path group during
every loop iteration, even though it remained constant. This update
optimizes the process by calculating the group once and reusing it
throughout the loop.
The functions also now skip empty fast-path groups, avoiding
unnecessary scans of their slots. Additionally, groups belonging to
inactive backends (with pid=0) are always empty, so checking
the group is sufficient to bypass these backends, further enhancing
performance.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/07d5fd6a-71f1-4ce8-8602-4cc6883f4bd1@oss.nttdata.com
|
|
We can make the output look a bit better by aligning each lock's
definition, so add some padding space to achieve that. This change
makes no practical difference, but casual onlookers will be less
distracted by (lack of) whitespace.
Author: Gurjeet Singh <gurjeet@singh.im>
Discussion: https://postgr.es/m/CABwTF4VxfwDtRV-H22_XK4XeDogaV-Vaobu+af5U=8ZAZn9ZZQ@mail.gmail.com
|
|
Teach parallel nbtree index scans to use an LWLock (not a spinlock) to
protect the scan's shared descriptor state.
Preparation for an upcoming patch that will add skip scan optimizations
to nbtree. That patch will create the need to occasionally allocate
memory while the scan descriptor is locked, while copying datums that
were serialized by another backend.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
|
|
The FP_LOCK_SLOTS_PER_BACKEND macro looks like a constant, but it
depends on the max_locks_per_transaction GUC, and thus can change. This
is non-obvious and confusing, so make it look more like a function by
renaming it to FastPathLockSlotsPerBackend().
While at it, use the macro when initializing fast-path shared memory,
instead of using the formula.
Reported-by: Andres Freund
Discussion: https://postgr.es/m/ffiwtzc6vedo6wb4gbwelon5nefqg675t5c7an2ta7pcz646cg%40qwmkdb3l4ett
|
|
So far the various dependencies were documented in the comment above
MAX_BACKENDS, but not checked.
Discussion: https://postgr.es/m/CA+COZaBO_s3LfALq=b+HcBHFSOEGiApVjrRacCe4VP9m7CJsNQ@mail.gmail.com
|
|
Jacob reported that comments for LW_SHARED_MASK referenced a MAX_BACKENDS
limit of 2^23-1, but that MAX_BACKENDS is actually limited to 2^18-1. The
limit was lowered in 48354581a49c, but the comment in lwlock.c wasn't updated.
Instead of just fixing the comment, it seems better to directly base the
lwlock defines on MAX_BACKENDS and add static assertions to ensure that there
is enough space. That way there's no comment that can go out of sync in the
future.
As part of that change I noticed that for some reason the high bit wasn't used
for flags, which seems somewhat odd. Redefine the flag values to start at the
highest bit.
Reported-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaBO_s3LfALq=b+HcBHFSOEGiApVjrRacCe4VP9m7CJsNQ@mail.gmail.com
|
|
MAX_BACKENDS influences many things besides postmaster. I e.g. noticed that we
don't have static assertions ensuring BUF_REFCOUNT_MASK is big enough for
MAX_BACKENDS, adding them would require including postmaster.h in
buf_internals.h which doesn't seem right.
While at that, add MAX_BACKENDS_BITS, as that's useful in various places for
static assertions (to be added in subsequent commits).
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/wptizm4qt6yikgm2pt52xzyv6ycmqiutloyvypvmagn7xvqkce@d4xuv3mylpg4
|
|
To implement AIO writes, the backend initiating writes needs to transfer the
lock ownership to the AIO subsystem, so the lock held during the write can be
released in another backend.
Other backends need to be able to "complete" an asynchronously started IO to
avoid deadlocks (consider e.g. one backend starting IO for a buffer and then
waiting for a heavyweight lock held by another relation followed by the
current holder of the heavyweight lock waiting for the IO to complete).
To that end, this commit adds LWLockDisown() and LWLockReleaseDisowned(). If
code uses LWLockDisown() it's the code's responsibility to ensure that the
lock is released in case of errors.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
|
|
Some places spelled it "it's", which is short for "it is".
In passing, fix a couple other nearby grammatical errors.
Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaAO8g1KJCV0T48=CkJMjAnnfTGLWOATz+2aCh40c2Nm+g@mail.gmail.com
|
|
This commit introduces a new parameter named
autovacuum_worker_slots that controls how many autovacuum worker
slots to reserve during server startup. Modifying this new
parameter's value does require a server restart, but it should
typically be set to the upper bound of what you might realistically
need to set autovacuum_max_workers. With that new parameter in
place, autovacuum_max_workers can now be changed with a SIGHUP
(e.g., pg_ctl reload).
If autovacuum_max_workers is set higher than
autovacuum_worker_slots, a WARNING is emitted, and the server will
only start up to autovacuum_worker_slots workers at a given time.
If autovacuum_max_workers is set to a value less than the number of
currently-running autovacuum workers, the existing workers will
continue running, but no new workers will be started until the
number of running autovacuum workers drops below
autovacuum_max_workers.
Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma
Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13
|
|
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/5812a0b9-b0cf-4151-9a14-d9f00e4f2858@gmail.com
|
|
Backpatch-through: 13
|
|
Commit 34486b609 effectively redefined isBackgroundWorker as meaning
"not a regular backend", whereas before it had the narrower
meaning of AmBackgroundWorkerProcess(). For clarity, rename the
field to isRegularBackend and invert its sense.
Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
|
|
Cause parallel workers to not check datallowconn, rolcanlogin, and
ACL_CONNECT privileges. The leader already checked these things
(except for rolcanlogin which might have been checked for a different
role). Re-checking can accomplish little except to induce unexpected
failures in applications that might not even be aware that their query
has been parallelized. We already had the principle that parallel
workers rely on their leader to pass a valid set of authorization
information, so this change just extends that a bit further.
Also, modify the ReservedConnections, datconnlimit and rolconnlimit
logic so that these limits are only enforced against regular backends,
and only regular backends are counted while checking if the limits
were already reached. Previously, background processes that had an
assigned database or role were subject to these limits (with rather
random exclusions for autovac workers and walsenders), and the set of
existing processes that counted against each limit was quite haphazard
as well. The point of these limits, AFAICS, is to ensure the
availability of PGPROC slots for regular backends. Since all other
types of processes have their own separate pools of PGPROC slots, it
makes no sense either to enforce these limits against them or to count
them while enforcing the limit.
While edge-case failures of these sorts have been possible for a
long time, the problem got a good deal worse with commit 5a2fed911
(CVE-2024-10978), which caused parallel workers to make some of these
checks using the leader's current role where before we had used its
AuthenticatedUserId, thus allowing parallel queries to fail after
SET ROLE. The previous behavior was fairly accidental and I have
no desire to return to it.
This patch includes reverting 73c9f91a1, which was an emergency hack
to suppress these same checks in some cases. It wasn't complete,
as shown by a recent bug report from Laurenz Albe. We can also revert
fd4d93d26 and 492217301, which hacked around the same problems in one
regression test.
In passing, remove the special case for autovac workers in
CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass
the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's
needed.
Like 5a2fed911, back-patch to supported branches (which sadly no
longer includes v12).
Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
|