summaryrefslogtreecommitdiff
path: root/src/backend/storage
AgeCommit message (Collapse)Author
11 hoursAdd missing Datum conversionsPeter Eisentraut
Add various missing conversions from and to Datum. The previous code mostly relied on implicit conversions or its own explicit casts instead of using the correct DatumGet*() or *GetDatum() functions. We think these omissions are harmless. Some actual bugs that were discovered during this process have been committed separately (80c758a2e1d, fd2ab03fea2). Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
18 hoursRemove obsolete comment.Thomas Munro
Remove a comment about potential for AIO in StartReadBuffersImpl(), because that change happened.
3 daysSuppress maybe-uninitialized warning.Masahiko Sawada
Following commit e035863c9a0, building with -O0 began triggering warnings about potentially uninitialized 'workbuf' usage. While theoretically the initialization isn't necessary since VARDATA() doesn't access the contents of the pointed-to object, this commit explicitly initializes the workbuf variable to suppress the warning. Buildfarm members adder and flaviventris have shown the warning. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAD21AoCOZxfqnNgfM5yVKJZYnOq5m2Q96fBGy1fovEqQ9V4OZA@mail.gmail.com
4 daysFix various hash function usesPeter Eisentraut
These instances were using Datum-returning functions where a lower-level function returning uint32 would be more appropriate. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
6 daysFix MemoryContextAllocAligned's interaction with Valgrind.Tom Lane
Arrange that only the "aligned chunk" part of the allocated space is included in a Valgrind vchunk. This suppresses complaints about that vchunk being possibly lost because PG is retaining only pointers to the aligned chunk. Also make sure that trailing wasted space is marked NOACCESS. As a tiny performance improvement, arrange that MCXT_ALLOC_ZERO zeroes only the returned "aligned chunk", not the wasted padding space. In passing, fix GetLocalBufferStorage to use MemoryContextAllocAligned instead of rolling its own implementation, which was equally broken according to Valgrind. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
10 daysHandle cancel requests with PID 0 gracefullyHeikki Linnakangas
If the client sent a query cancel request with backend PID 0, it tripped an assertion. With assertions disabled, you got this in the log instead: LOG: invalid cancel request with PID 0 LOG: wrong key in cancel request for process 0 Query cancellations don't even require authentication, so we better tolerate bogus requests. Fix by turning the assertion into a regular runtime check. Spotted while testing libpq behavior with a modified server that didn't send BackendKeyData to the client. Backpatch-through: 18
11 daysRun pgindent.Robert Haas
Per buildfarm member koel, Nathan Bossart, and David Rowley.
12 daysRemove misleading hint for "unexpected data beyond EOF" error.Robert Haas
Commit ffae5cc5a6024b4e25ec920ed5c4dfac649605f8 added this hint in 2006, but it's now obsolete and doesn't reflect what users should really check in this situation. We were not able to agree on a new hint, so just delete the existing one and update the comments to mention one possibility that is known to cause problems of this kind: something other than PostgreSQL is modifying files in the PostgreSQL data directory. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Robert Haas <rhaas@postgresql.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/CAKZiRmxNbcaL76x=09Sxf7aUmrRQJBf8drzDdUHo+j9_eM+VMg@mail.gmail.com
2025-07-25Fix assertion failure with latch wait in single-user modeMichael Paquier
LatchWaitSetPostmasterDeathPos, the latch event position for the postmaster death event, is initialized under IsUnderPostmaster. WaitLatch() considered it as a valid wait target in single-user mode (!IsUnderPostmaster), which was incorrect. One code path found to fail with an assertion failure is a database drop in single-user mode while waiting in WaitForProcSignalBarrier() after the drop. Oversight in commit 84e5b2f07a5e. Author: Patrick Stählin <me@packi.ch> Co-authored-by: Ronan Dunklau <ronan.dunklau@aiven.io> Discussion: https://postgr.es/m/18996-3a2744c8140488de@postgresql.org Backpatch-through: 18
2025-07-23Cross-check lists of built-in LWLock tranches.Nathan Bossart
lwlock.c, lwlock.h, and wait_event_names.txt each contain a list of built-in LWLock tranches. It is easy to miss one or the other when adding or removing tranches, and discrepancies have adverse effects (e.g., breaking JOINs between pg_stat_activity and pg_wait_events). This commit moves the lists of built-in tranches in lwlock.{c,h} to lwlocklist.h and adds a cross-check to the script that generates lwlocknames.h. If the lists do not match exactly, building will fail. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aHpOgwuFQfcFMZ/B%40ip-10-97-1-34.eu-west-3.compute.internal
2025-07-23Preserve conflict-relevant data during logical replication.Amit Kapila
Logical replication requires reliable conflict detection to maintain data consistency across nodes. To achieve this, we must prevent premature removal of tuples deleted by other origins and their associated commit_ts data by VACUUM, which could otherwise lead to incorrect conflict reporting and resolution. This patch introduces a mechanism to retain deleted tuples on the subscriber during the application of concurrent transactions from remote nodes. Retaining these tuples allows us to correctly ignore concurrent updates to the same tuple. Without this, an UPDATE might be misinterpreted as an INSERT during resolutions due to the absence of the original tuple. Additionally, we ensure that origin metadata is not prematurely removed by vacuum freeze, which is essential for detecting update_origin_differs and delete_origin_differs conflicts. To support this, a new replication slot named pg_conflict_detection is created and maintained by the launcher on the subscriber. Each apply worker tracks its own non-removable transaction ID, which the launcher aggregates to determine the appropriate xmin for the slot, thereby retaining necessary tuples. Conflict information retention (deleted tuples and commit_ts) can be enabled per subscription via the retain_conflict_info option. This is disabled by default to avoid unnecessary overhead for configurations that do not require conflict resolution or logging. During upgrades, if any subscription on the old cluster has retain_conflict_info enabled, a conflict detection slot will be created to protect relevant tuples from deletion when the new cluster starts. This is a foundational work to correctly detect update_deleted conflict which will be done in a follow-up patch. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-07-22aio: Fix assertion, clarify READMEAndres Freund
The assertion wouldn't have triggered for a long while yet, but this won't accidentally fail to detect the issue if/when it occurs. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2Wj-43JV4YufW23gm=Uwr7Lkj+p0yKctKHxNm1rwFC+_DQ@mail.gmail.com Backpatch-through: 18
2025-07-18Remove unused variable in generate-lwlocknames.pl.Nathan Bossart
Oversight in commit da952b415f. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aHpOgwuFQfcFMZ/B%40ip-10-97-1-34.eu-west-3.compute.internal
2025-07-17Fix inconsistent LWLock tranche names for MultiXact*Michael Paquier
The terms used in wait_event_names.txt and lwlock.c were inconsistent for MultiXactOffsetSLRU and MultiXactMemberSLRU, which could cause joins between pg_wait_events and pg_stat_activity to fail. lwlock.c is adjusted in this commit to what the historical name of the event has always been, and what is documented. Oversight in 53c2a97a9266. 08b9b9e043bb has fixed a similar inconsistency some time ago. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/aHdxN0D0hKXzHFQG@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 17
2025-07-12Remove long-unused TransactionIdIsActive()Andres Freund
TransactionIdIsActive() has not been used since bb38fb0d43c, in 2014. There are no known uses in extensions either and it's hard to see valid uses for it. Therefore remove TransactionIdIsActive(). Discussion: https://postgr.es/m/odgftbtwp5oq7cxjgf4kjkmyq7ypoftmqy7eqa7w3awnouzot6@hrwnl5tdqrgu
2025-07-12aio: Fix configuration reload in IO workers.Thomas Munro
method_worker.c installed SignalHandlerForConfigReload, but it failed to actually process reload requests. That hasn't yet produced any concrete problem reports in terms of GUC changes it should have cared about in v18, but it was inconsistent. It did cause problems for a couple of patches in development that need IO workers to react to ALTER SYSTEM + pg_reload_conf(). Fix extracted from one of those patches. Back-patch to 18. Reported-by: Dmitry Dolgov <9erthalion6@gmail.com> Discussion: https://postgr.es/m/sh5uqe4a4aqo5zkkpfy5fobe2rg2zzouctdjz7kou4t74c66ql%40yzpkxb7pgoxf
2025-07-12aio: Regularize IO worker internal naming.Thomas Munro
Adopt PgAioXXX convention for pgaio module type names. Rename a function that didn't use a pgaio_worker_ submodule prefix. Rename the internal submit function's arguments to match the indirectly relevant function pointer declaration and nearby examples. Rename the array of handle IDs in PgAioSubmissionQueue to sqes, a term of art seen in the systems it emulates, also clarifying that they're not IO handle pointers as the old name might imply. No change in behavior, just type, variable and function name cleanup. Back-patch to 18. Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com
2025-07-12Fix stale idle flag when IO workers exit.Thomas Munro
Otherwise we could choose a worker that has exited and crash while trying to wake it up. Back-patch to 18. Reported-by: Tomas Vondra <tomas@vondra.me> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/t5aqjhkj6xdkido535pds7fk5z4finoxra4zypefjqnlieevbg%40357aaf6u525j
2025-07-11Rename CHECKPOINT_IMMEDIATE to CHECKPOINT_FAST.Nathan Bossart
The new name more accurately reflects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "fast" instead of "immediate", too. Likewise, references to "immediate" checkpoints in the documentation have been updated to say "fast". This is preparatory work for a follow-up commit that will add a MODE option to the CHECKPOINT command. Author: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de
2025-07-11Rename CHECKPOINT_FLUSH_ALL to CHECKPOINT_FLUSH_UNLOGGED.Nathan Bossart
The new name more accurately relects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "flush-unlogged" instead of "flush-all", too. This is preparatory work for a follow-up commit that will add a FLUSH_UNLOGGED option to the CHECKPOINT command. Author: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de
2025-07-09Introduce pg_dsm_registry_allocations view.Nathan Bossart
This commit adds a new system view that provides information about entries in the dynamic shared memory (DSM) registry. Specifically, it returns the name, type, and size of each entry. Note that since we cannot discover the size of dynamic shared memory areas (DSAs) and hash tables backed by DSAs (dshashes) without first attaching to them, the size column is left as NULL for those. Bumps catversion. Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Sungwoo Chang <swchangdev@gmail.com> Discussion: https://postgr.es/m/4D445D3E-81C5-4135-95BB-D414204A0AB4%40gmail.com
2025-07-07aio: Combine io_uring memory mappings, if supportedAndres Freund
By default io_uring creates a shared memory mapping for each io_uring instance, leading to a large number of memory mappings. Unfortunately a large number of memory mappings slows things down, backend exit is particularly affected. To address that, newer kernels (6.5) support using user-provided memory for the memory. By putting the relevant memory into shared memory we don't need any additional mappings. On a system with a new enough kernel and liburing, there is no discernible overhead when doing a pgbench -S -C anymore. Reported-by: MARK CALLAGHAN <mdcallag@gmail.com> Reviewed-by: "Burd, Greg" <greg@burd.me> Reviewed-by: Jim Nasby <jnasby@upgrade.com> Discussion: https://postgr.es/m/CAFbpF8OA44_UG+RYJcWH9WjF7E3GA6gka3gvH6nsrSnEe9H0NA@mail.gmail.com Backpatch-through: 18
2025-07-07Standardize LSN formatting by zero paddingÁlvaro Herrera
This commit standardizes the output format for LSNs to ensure consistent representation across various tools and messages. Previously, LSNs were inconsistently printed as `%X/%X` in some contexts, while others used zero-padding. This often led to confusion when comparing. To address this, the LSN format is now uniformly set to `%X/%08X`, ensuring the lower 32-bit part is always zero-padded to eight hexadecimal digits. Author: Japin Li <japinli@hotmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/ME0P300MB0445CA53CA0E4B8C1879AF84B641A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-07-07Integrate FullTransactionIds deeper into two-phase codeMichael Paquier
This refactoring is a follow-up of the work done in 5a1dfde8334b, that has switched 2PC file names to use FullTransactionIds when written on disk. This will help with the integration of a follow-up solution related to the handling of two-phase files during recovery, to address older defects while reading these from disk after a crash. This change is useful in itself as it reduces the need to build the file names from epoch numbers and TransactionIds, because we can use directly FullTransactionIds from which the 2PC file names are guessed. So this avoids a lot of back-and-forth between the FullTransactionIds retrieved from the file names and how these are passed around in the internal 2PC logic. Note that the core of the change is the use of a FullTransactionId instead of a TransactionId in GlobalTransactionData, that tracks 2PC file information in shared memory. The change in TwoPhaseCallback makes this commit unfit for stable branches. Noah has contributed a good chunk of this patch. I have spent some time on it as well while working on the issues with two-phase state files and recovery. Author: Noah Misch <noah@leadboat.com> Co-Authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Z5sd5O9JO7NYNK-C@paquier.xyz Discussion: https://postgr.es/m/20250116205254.65.nmisch@google.com
2025-07-04Speed up truncation of temporary relations.Fujii Masao
Previously, truncating a temporary relation required scanning the entire local buffer pool once per relation fork to invalidate buffers. This could be slow, especially with a large local buffers, as the scan was repeated multiple times. A similar issue with regular tables (shared buffers) was addressed in commit 6d05086c0a7 by scanning the buffer pool only once for all forks. This commit applies the same optimization to temporary relations, improving truncation performance. Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Discussion: https://postgr.es/m/CAJDiXggNqsJOH7C5co4jA8nDk8vw-=sokyh5s1_TENWnC6Ofcg@mail.gmail.com
2025-07-02Add GetNamedDSA() and GetNamedDSHash().Nathan Bossart
Presently, the dynamic shared memory (DSM) registry only provides GetNamedDSMSegment(), which allocates a fixed-size segment. To use the DSM registry for more sophisticated things like dynamic shared memory areas (DSAs) or a hash table backed by a DSA (dshash), users need to create a DSM segment that stores various handles and LWLock tranche IDs and to write fairly complicated initialization code. Furthermore, there is likely little variation in this initialization code between libraries. This commit introduces functions that simplify allocating a DSA or dshash within the DSM registry. These functions are very similar to GetNamedDSMSegment(). Notable differences include the lack of an initialization callback parameter and the prohibition of calling the functions more than once for a given entry in each backend (which should be trivially avoidable in most circumstances). While at it, this commit bumps the maximum DSM registry entry name length from 63 bytes to 127 bytes. Also note that even though one could presumably detach/destroy the DSAs and dshashes created in the registry, such use-cases are not yet well-supported, if for no other reason than the associated DSM registry entries cannot be removed. Adding such support is left as a future exercise. The test_dsm_registry test module contains tests for the new functions and also serves as a complete usage example. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Discussion: https://postgr.es/m/aEC8HGy2tRQjZg_8%40nathan
2025-07-01Make safeguard against incorrect flags for fsync more portable.Tom Lane
The existing code assumed that O_RDONLY is defined as 0, but this is not required by POSIX and is not true on GNU Hurd. We can avoid the assumption by relying on O_ACCMODE to mask the fcntl() result. (Hopefully, all supported platforms define that.) Author: Michael Banck <mbanck@gmx.net> Co-authored-by: Samuel Thibault Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6862e8d1.050a0220.194b8d.76fa@mx.google.com Discussion: https://postgr.es/m/68480868.5d0a0220.1e214d.68a6@mx.google.com Backpatch-through: 13
2025-07-01Silence valgrind about pg_numa_touch_mem_if_requiredTomas Vondra
When querying NUMA status of pages in shared memory, we need to touch the memory first to get valid results. This may trigger valgrind reports, because some of the memory (e.g. unpinned buffers) may be marked as noaccess. Solved by adding a valgrind suppresion. An alternative would be to adjust the access/noaccess status before touching the memory, but that seems far too invasive. It would require all those places to have detailed knowledge of what the shared memory stores. The pg_numa_touch_mem_if_required() macro is replaced with a function. Macros are invisible to suppressions, so it'd have to suppress reports for the caller - e.g. pg_get_shmem_allocations_numa(). So we'd suppress reports for the whole function, and that seems to heavy-handed. It might easily hide other valid issues. Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de Backpatch-through: 18
2025-06-16aio: Add missing memory barrier when waiting for IO handleAndres Freund
Previously there was no memory barrier enforcing correct memory ordering when waiting for a free IO handle. However, in the much more common case of waiting for IO to complete, memory barriers already were present. On strongly ordered architectures like x86 this had no negative consequences, but on some armv8 hardware (observed on Apple hardware), it was possible for the update, in the IO worker, to PgAioHandle->state to become visible before ->distilled_result becoming visible, leading to rather confusing assertion failures. The failures were rare enough that the bug sometimes took days to reproduce when running 027_stream_regress in a loop. Once finally debugged, it was easy enough to come up with a much quicker repro: Trigger a lot of very fast IO by limiting io_combine_limit to 1 and ensure that we always have to wait for a free handle by setting io_max_concurrency to 1. Triggering lots of concurrent seqscans in that setup triggers the issue within seconds. One reason this was hard to debug was that the assertion failure most commonly happened in WaitReadBuffers(), rather than in the AIO subsystem itself. The assertions added in this commit make problems like this easier to understand. Also add a comment to the IO worker explaining that we rely on the lwlock acquisition for correct memory ordering. I think it'd be good to add a tap test that stress tests buffer IO, but that's material for a separate patch. Thanks a lot to Alexander and Konstantin for all the debugging help. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Alexander Lakhin <exclusion@gmail.com> Investigated-by: Andres Freund <andres@anarazel.de> Investigated-by: Alexander Lakhin <exclusion@gmail.com> Investigated-by: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/2dkz7azclpeiqcmouamdixyn5xhlzy4rvikxrbovyzvi6rnv5c@pz7o7osv2ahf
2025-06-13Replace %llu by PRIu64 in AIO io_uring codeMichael Paquier
This is a continuation of 15a79c73111f, cleaning up the AIO io_uring code that has been committed after that while still using %llu. The code changed here is new in v18, so cleaning things now means less conflicts if this area of the code changes on backpatch once the 18 stable branch is created. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/aEZcGCnYFq642q8k@paquier.xyz
2025-06-05Fix copy-pasto with process count calculation in method_io_uring.cMichael Paquier
This commit replaces the formula used for "TotalProcs" with a call to pgaio_uring_procs() in pgaio_uring_shmem_init() for the shared memory initialization, which is exactly the same, removing a duplication. pgaio_uring_procs() is used for shared memory sizing and a sanity check, and it has some documentation explaining some reasoning behind the formula. Author: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/ME0P300MB044521067A1EDDA9EDEC3793B66DA@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-06-03Fix incorrect format placeholdersPeter Eisentraut
2025-06-03Rename log_lock_failure GUC to log_lock_failures for consistency.Fujii Masao
This commit renames the GUC log_lock_failure to log_lock_failures to align with the existing similar setting log_lock_waits, which uses the plural form. This improves naming consistency across related GUCs. Suggested-by: Peter Eisentraut <peter@eisentraut.org> Author: Fujii Masao <masao.fujii@gmail.com Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/7a8198b6-d5b8-4910-b41e-8d3efcbb015d@eisentraut.org
2025-06-02Fix incorrect format placeholdersPeter Eisentraut
Fixes for return type of dclist_count().
2025-05-31Make XactLockTableWait() and ConditionalXactLockTableWait() interruptable more.Fujii Masao
Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter a non-interruptible loop when they successfully acquired a lock on a transaction but the transaction still appeared to be running. Since this loop continued until the transaction completed, it could result in long, uninterruptible waits. Although this scenario is generally unlikely since XactLockTableWait() and ConditionalXactLockTableWait() can basically acquire a transaction lock only when the transaction is not running, it can occur in a hot standby. In such cases, the transaction may still appear active due to the KnownAssignedXids list, even while no lock on the transaction exists. For example, this situation can happen when creating a logical replication slot on a standby. The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS() within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions, ensuring they can be interrupted safely. Back-patch to all supported branches. Author: Kevin K Biju <kevinkbiju@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com Backpatch-through: 13
2025-05-23Revert function to get memory context stats for processesDaniel Gustafsson
Due to concerns raised about the approach, and memory leaks found in sensitive contexts the functionality is reverted. This reverts commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291 for v18 with an intent to revisit this patch for v19. Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us
2025-05-21Adjust operation names of pg_aios to match the documentationMichael Paquier
pg_aios used the terms "read" and "write" for vectored I/O read and write operations, respectively. The documentation refers to them as "readv" and "writev", and the code uses internally the terms PGAIO_OP_READV and PGAIO_OP_WRITEV for them, as of "vectored". This commit adjusts these operation names to match with the code and the documentation. Oversight in 8e293e689bab. Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Discussion: https://postgr.es/m/6df1e949d1d759ad2767c18e5845963e@oss.nttdata.com
2025-05-19aio: Fix possible state confusions due to interrupt processingAndres Freund
elog()/ereport() process interrupts, iff the log message is < ERROR and the log message will be emitted. aio's debug messages are emitted via ereport(), but in some places the code is not ready for interrupts to be processed. Fix the issue using a few different methods: 1) handle interrupts arriving concurrently - in some places it's easy to detect that by fetching the handle's generation a bit earlier 2) Check if interrupts made the work needing to be done obsolete 3) Disallow interrupts, as there's no sane way to make interrupt processing safe To prevent some similar issues from being re-introduced, assert that interrupts are held in pgaio_io_update_state(). This commit also fixes the contents of a debug message I added in 039bfc457e4. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/mvpm7ga3dfgz7bvum22hmuz26cariylmcppb3irayftc7bwk3r@l7gb6gr7azhc
2025-05-10Remove GLOBALTABLESPACE_OID assert for locked buffers.Noah Misch
Commit f4ece891fc2f3f96f0571720a1ae30db8030681b added the assertion in an attempt to catch some defects even after VACUUM FULL or REINDEX. However, IsCatalogTextUniqueIndexOid(tag.relNumber) always returns false after a relfilenode change, provoking unintended assertion failures. Reported-by: Adam Guo <adamguo@amazon.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Bug: #18912 Discussion: https://postgr.es/m/18912-a41c9bd0e0ad19b1@postgresql.org
2025-05-10aio: Use runtime arguments with injections points in testsMichael Paquier
This cleans up the code related to the testing infrastructure of AIO that used injection points, switching the test code to use the new facility for injection points added by 371f2db8b05e rather than tweaks to pass and reset arguments to the callbacks run. This removes all the dependencies to USE_INJECTION_POINTS in the AIO code. pgaio_io_call_inj(), pgaio_inj_io_get() and pgaio_inj_cur_handle are now gone. Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-10Add support for runtime arguments in injection pointsMichael Paquier
The macros INJECTION_POINT() and INJECTION_POINT_CACHED() are extended with an optional argument that can be passed down to the callback attached when an injection point is run, giving to callbacks the possibility to manipulate a stack state given by the caller. The existing callbacks in modules injection_points and test_aio have their declarations adjusted based on that. da7226993fd4 (core AIO infrastructure) and 93bc3d75d8e1 (test_aio) and been relying on a set of workarounds where a static variable called pgaio_inj_cur_handle is used as runtime argument in the injection point callbacks used by the AIO tests, in combination with a TRY/CATCH block to reset the argument value. The infrastructure introduced in this commit will be reused for the AIO tests, simplifying them. Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz
2025-05-08Use 'void *' for arbitrary buffers, 'uint8 *' for byte arraysHeikki Linnakangas
A 'void *' argument suggests that the caller might pass an arbitrary struct, which is appropriate for functions like libc's read/write, or pq_sendbytes(). 'uint8 *' is more appropriate for byte arrays that have no structure, like the cancellation keys or SCRAM tokens. Some places used 'char *', but 'uint8 *' is better because 'char *' is commonly used for null-terminated strings. Change code around SCRAM, MD5 authentication, and cancellation key handling to follow these conventions. Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
2025-05-07Fix some comments related to IO workersMichael Paquier
IO workers are treated as auxiliary processes. The comments fixed in this commit stated that there could be only one auxiliary process of each BackendType at the same time. This is not true for IO workers, as up to MAX_IO_WORKERS of them can co-exist at the same time. Author: Cédric Villemain <Cedric.Villemain@data-bene.io> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/e4a3ac45-abce-4b58-a043-b4a31cd11113@Data-Bene.io
2025-04-27Don't use double-quotes in #include's of system headers, redux.Tom Lane
This cleans up some loose ends left by commit e8ca9ed1d. I hadn't looked closely enough at these places before, but now I have. The use of double-quoted #includes for Perl headers in plperl_system.h seems to be simply a mistake introduced in 6c944bf3c and faithfully copied forward since then. (I had thought possibly it was required by some weird Windows build setup, but there's no evidence of that in our history.) The occurrences in SectionMemoryManager.h and SectionMemoryManager.cpp evidently stem from those files' origin as LLVM code. It's understandable that LLVM would treat their own files as needing double-quoted #includes; but they're still system headers to us. I also applied the same check to *.c files, and found a few other random incorrect usages in both directions. Our ECPG headers and test files routinely use angle brackets to refer to ECPG headers. I left those usages alone, since it seems reasonable for an ECPG user to regard those headers as system headers.
2025-04-27Eliminate divide in new fast-path locking codeDavid Rowley
c4d5cb71d2 adjusted the fast-path locking code to allow some configuration of the number of fast-path locking slots via the max_locks_per_transaction GUC. In that commit the FAST_PATH_REL_GROUP() macro used integer division to determine the fast-path locking group slot to use for the lock. The divisor in this case is always a power-of-two value. Here we swap out the divide by a bitwise-AND, which is a significantly faster operation to perform. In passing, adjust the code that's setting FastPathLockGroupsPerBackend so that it's more clear that the value being set is a power-of-two. Also, adjust some comments in the area which contained some magic numbers. It seems better to justify the 1024 upper limit in the location where the #define is made instead of where it is used. Author: David Rowley <drowleyml@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAApHDvodr3bcnpxcs7+k-3cFwYR0tP-BYhyd2PpDhe-bCx9i=g@mail.gmail.com
2025-04-25aio: Improve debug logging around waiting for IOsAndres Freund
Trying to investigate a bug report by Alexander Lakhin made it apparent that the debug logging around waiting for IO completion is insufficient. Fix that. Discussion: https://postgr.es/m/h4in2db37vepagmi2oz5vvqymjasc5gyb4lpqkunj4eusu274i@37jpd3c2spd3
2025-04-25Fix bug allowing io_combine_limit > io_max_combine_combine limitAndres Freund
10f66468475 intended to limit the value of io_combine_limit to the minimum of io_combine_limit and io_max_combine_limit. To avoid issues with interdependent GUCs, it introduced io_combine_limit_guc and set io_combine_limit in assign hooks. That plan was thwarted by guc_tables.c accidentally still referencing io_combine_limit, instead of io_combine_limit_guc. That lead to the GUC machinery overriding the work done in the assign hooks, potentially leaving io_combine_limit with a too high value. The consequence of this bug was that when running with io_combine_limit > io_combine_limit_guc the AIO machinery would not have reserved large enough iovec and IO data arrays, with one IO's arrays overlapping with another IO's, leading to total confusion. To make such a problem easier to detect in the future, add assertions to pgaio_io_set_handle_data_* checking the length is smaller than io_max_combine_limit (not just PG_IOV_MAX). It'd be nice to have a few tests for this, but it's not entirely obvious how to do so portably. As remarked upon by Tom, the GUC assignment hooks really shouldn't set the underlying variable, that's the job of the GUC machinery. Change that as well. Discussion: https://postgr.es/m/c5jyqnuwrpigd35qe7xdypxsisdjrdba5iw63mhcse4mzjogxo@qdjpv22z763f
2025-04-25aio: Fix crash potential for pg_aios views due to late state updateAndres Freund
pgaio_io_reclaim() reset the fields in PgAioHandle before updating the state to IDLE or incrementing the generation. For most things that's OK, but for pg_get_aios() it is not - if it copied the PgAioHandle while fields were being reset, we wouldn't detect that and could call pgaio_io_get_target_description() with ioh->target == PGAIO_TID_INVALID, leading to a crash. Fix this issue by incrementing the generation and state earlier, before resetting. Also add an assertion to pgaio_io_get_target_description() for the target to be valid - that'd have made this case a bit easier to debug. While at it, add/update a few related assertions. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/062daca9-dfad-4750-9da8-b13388301ad9@gmail.com
2025-04-21Fix a few more duplicate words in commentsDavid Rowley
Similar to 84fd3bc14 but these ones were found using a regex that can span multiple lines. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
2025-04-19Fix typos and grammar in the codeMichael Paquier
The large majority of these have been introduced by recent commits done in the v18 development cycle. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com