summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2024-11-11Ensure cached plans are correctly marked as dependent on role.Nathan Bossart
If a CTE, subquery, sublink, security invoker view, or coercion projection references a table with row-level security policies, we neglected to mark the plan as potentially dependent on which role is executing it. This could lead to later executions in the same session returning or hiding rows that should have been hidden or returned instead. Reported-by: Wolfgang Walther Reviewed-by: Noah Misch Security: CVE-2024-10976 Backpatch-through: 12
2024-11-11Block environment variable mutations from trusted PL/Perl.Noah Misch
Many process environment variables (e.g. PATH), bypass the containment expected of a trusted PL. Hence, trusted PLs must not offer features that achieve setenv(). Otherwise, an attacker having USAGE privilege on the language often can achieve arbitrary code execution, even if the attacker lacks a database server operating system user. To fix PL/Perl, replace trusted PL/Perl %ENV with a tied hash that just replaces each modification attempt with a warning. Sites that reach these warnings should evaluate the application-specific implications of proceeding without the environment modification: Can the application reasonably proceed without the modification? If no, switch to plperlu or another approach. If yes, the application should change the code to stop attempting environment modifications. If that's too difficult, add "untie %main::ENV" in any code executed before the warning. For example, one might add it to the start of the affected function or even to the plperl.on_plperl_init setting. In passing, link to Perl's guidance about the Perl features behind the security posture of PL/Perl. Back-patch to v12 (all supported versions). Andrew Dunstan and Noah Misch Security: CVE-2024-10979
2024-11-11Translation updatesPeter Eisentraut
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: c3f75b7ae5afec17f770aec259e92dc654704465
2024-11-11libpq: Bail out during SSL/GSS negotiation errorsMichael Paquier
This commit changes libpq so that errors reported by the backend during the protocol negotiation for SSL and GSS are discarded by the client, as these may include bytes that could be consumed by the client and write arbitrary bytes to a client's terminal. A failure with the SSL negotiation now leads to an error immediately reported, without a retry on any other methods allowed, like a fallback to a plaintext connection. A failure with GSS discards the error message received, and we allow a fallback as it may be possible that the error is caused by a connection attempt with a pre-11 server, GSS encryption having been introduced in v12. This was a problem only with v17 and newer versions; older versions discard the error message already in this case, assuming a failure caused by a lack of support for GSS encryption. Author: Jacob Champion Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Michael Paquier Security: CVE-2024-10977 Backpatch-through: 12
2024-11-10Fix broken collate.icu.utf8 test in v12 branch.Tom Lane
collate.icu.utf8.sql is not run by default before v13, and commits 9c4757491/46d9be5ef evidently didn't bother to test it, with the completely predictable result that it was broken. We didn't have enable_incremental_sort in v12, plus EXPLAIN makes different table alias choices than later branches. Per buildfarm member copperhead.
2024-11-08Improve fix for not entering parallel mode when holding interrupts.Tom Lane
Commit ac04aa84a put the shutoff for this into the planner, which is not ideal because it doesn't prevent us from re-using a previously made parallel plan. Revert the planner change and instead put the shutoff into InitializeParallelDSM, modeling it on the existing code there for recovering from failure to allocate a DSM segment. However, that code path is mostly untested, and testing a bit harder showed there's at least one bug: ExecHashJoinReInitializeDSM is not prepared for us to have skipped doing parallel DSM setup. I also thought the Assert in ReinitializeParallelWorkers is pretty ill-advised, and replaced it with a silent Min() operation. The existing test case added by ac04aa84a serves fine to test this version of the fix, so no change needed there. Patch by me, but thanks to Noah Misch for the core idea that we could shut off worker creation when !INTERRUPTS_CAN_BE_PROCESSED. Back-patch to v12, as ac04aa84a was. Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
2024-11-08Disallow partitionwise join when collations don't matchAmit Langote
If the collation of any join key column doesn’t match the collation of the corresponding partition key, partitionwise joins can yield incorrect results. For example, rows that would match under the join key collation might be located in different partitions due to the partitioning collation. In such cases, a partitionwise join would yield different results from a non-partitionwise join, so disallow it in such cases. Reported-by: Tender Wang <tndrwang@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com Backpatch-through: 12
2024-11-08Disallow partitionwise grouping when collations don't matchAmit Langote
If the collation of any grouping column doesn’t match the collation of the corresponding partition key, partitionwise grouping can yield incorrect results. For example, rows that would be grouped under the grouping collation may end up in different partitions under the partitioning collation. In such cases, full partitionwise grouping would produce results that differ from those without partitionwise grouping, so disallowed that. Partial partitionwise aggregation is still allowed, as the Finalize step reconciles partition-level aggregates with grouping requirements across all partitions, ensuring that the final output remains consistent. This commit also fixes group_by_has_partkey() by ensuring the RelabelType node is stripped from grouping expressions when matching them to partition key expressions to avoid false mismatches. Bug: #18568 Reported-by: Webbo Han <1105066510@qq.com> Author: Webbo Han <1105066510@qq.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/18568-2a9afb6b9f7e6ed3@postgresql.org Discussion: https://postgr.es/m/tencent_9D9103CDA420C07768349CC1DFF88465F90A@qq.com Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com Backpatch-through: 12
2024-11-08Message style improvementPeter Eisentraut
Backpatch the part of edee0c621de that applies to a90bdd7a44d, which was also backpatched. That way, the message is consistent in all branches.
2024-11-06Monkey-patch LLVM code to fix ARM relocation bug.Thomas Munro
Supply a new memory manager for RuntimeDyld, to avoid crashes in generated code caused by memory placement that can overflow a 32 bit data type. This is a drop-in replacement for the llvm::SectionMemoryManager class in the LLVM library, with Michael Smith's proposed fix from https://www.github.com/llvm/llvm-project/pull/71968. We hereby slurp it into our own source tree, after moving into a new namespace llvm::backport and making some minor adjustments so that it can be compiled with older LLVM versions as far back as 12. It's harder to make it work on even older LLVM versions, but it doesn't seem likely that people are really using them so that is not investigated for now. The problem could also be addressed by switching to JITLink instead of RuntimeDyld, and that is the LLVM project's recommended solution as the latter is about to be deprecated. We'll have to do that soon enough anyway, and then when the LLVM version support window advances far enough in a few years we'll be able to delete this code. Unfortunately that wouldn't be enough for PostgreSQL today: in most relevant versions of LLVM, JITLink is missing or incomplete. Several other projects have already back-ported this fix into their fork of LLVM, which is a vote of confidence despite the lack of commit into LLVM as of today. We don't have our own copy of LLVM so we can't do exactly what they've done; instead we have a copy of the whole patched class so we can pass an instance of it to RuntimeDyld. The LLVM project hasn't chosen to commit the fix yet, and even if it did, it wouldn't be back-ported into the releases of LLVM that most of our users care about, so there is not much point in waiting any longer for that. If they make further changes and commit it to LLVM 19 or 20, we'll still need this for older versions, but we may want to resynchronize our copy and update some comments. The changes that we've had to make to our copy can be seen by diffing our SectionMemoryManager.{h,cpp} files against the ones in the tree of the pull request. Per the LLVM project's license requirements, a copy is in SectionMemoryManager.LICENSE. This should fix the spate of crash reports we've been receiving lately from users on large memory ARM systems. Back-patch to all supported releases. Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> (license aspects) Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
2024-11-02Suppress new "may be used uninitialized" warning.Noah Misch
Buildfarm member mamba fails to deduce that the function never uses this variable without initializing it. Back-patch to v12, like commit b412f402d1e020c5dac94f3bf4a005db69519b99.
2024-11-02Move I/O before the index_update_stats() buffer lock region.Noah Misch
Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 enlarged the work done here under the pg_class heap buffer lock. Two preexisting actions are best done before holding that lock. Both RelationGetNumberOfBlocks() and visibilitymap_count() do I/O, and the latter might exclusive-lock a visibility map buffer. Moving these reduces contention and risk of undetected LWLock deadlock. Back-patch to v12, like that commit. Discussion: https://postgr.es/m/20241031200139.b4@rfd.leadboat.com
2024-11-02Revert "For inplace update, send nontransactional invalidations."Noah Misch
This reverts commit 95c5acb3fc261067ab65ddc0b2dca8e162f09442 (v17) and counterparts in each other non-master branch. If released, that commit would have caused a worst-in-years minor release regression, via undetected LWLock self-deadlock. This commit and its self-deadlock fix warrant more bake time in the master branch. Reported by Alexander Lakhin. Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
2024-11-02Revert "WAL-log inplace update before revealing it to other sessions."Noah Misch
This reverts commit bfd5c6e279c8e1702eea882439dc7ebdf4d4b3a5 (v17) and counterparts in each other non-master branch. This unblocks reverting a commit on which it depends. Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
2024-10-29Unpin buffer before inplace update waits for an XID to end.Noah Misch
Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed inplace updates to wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. By keeping the pin during that wait, a sequence of autovacuum workers and an uncommitted GRANT starved one foreground LockBufferForCleanup() for six minutes, on buildfarm member sarus. Prevent, at the cost of a bit of complexity. Back-patch to v12, like the earlier commit. That commit and heap_inplace_lock() have not yet appeared in any release. Discussion: https://postgr.es/m/20241026184936.ae.nmisch@google.com
2024-10-29Update time zone data files to tzdata release 2024b.Tom Lane
Historical corrections for Mexico, Mongolia, and Portugal. Notably, Asia/Choibalsan is now an alias for Asia/Ulaanbaatar rather than being a separate zone, mainly because the differences between those zones were found to be based on untrustworthy data.
2024-10-25WAL-log inplace update before revealing it to other sessions.Noah Misch
A buffer lock won't stop a reader having already checked tuple visibility. If a vac_update_datfrozenid() and then a crash happened during inplace update of a relfrozenxid value, datfrozenxid could overtake relfrozenxid. That could lead to "could not access status of transaction" errors. Back-patch to v12 (all supported versions). In v14 and earlier, this also back-patches the assertion removal from commit 7fcf2faf9c7dd473208fd6d5565f88d7f733782b. Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com
2024-10-25For inplace update, send nontransactional invalidations.Noah Misch
The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Back-patch to v12 (all supported versions). The back branch versions don't change WAL, because those branches just added end-of-recovery SIResetAll(). All branches change the ABI of extern function PrepareToInvalidateCacheTuple(). No PGXN extension calls that, and there's no apparent use case in extensions. Reviewed by Nitin Motiani and (in earlier versions) Andres Freund. Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
2024-10-25At end of recovery, reset all sinval-managed caches.Noah Misch
An inplace update's invalidation messages are part of its transaction's commit record. However, the update survives even if its transaction aborts or we stop recovery before replaying its transaction commit. After recovery, a backend that started in recovery could update the row without incorporating the inplace update. That could result in a table with an index, yet relhasindex=f. That is a source of index corruption. This bulk invalidation avoids the functional consequences. A future change can fix the !RecoveryInProgress() scenario without changing the WAL format. Back-patch to v17 - v12 (all supported versions). v18 will instead add invalidations to WAL. Discussion: https://postgr.es/m/20240618152349.7f.nmisch@google.com
2024-10-24Stop reading uninitialized memory in heap_inplace_lock().Noah Misch
Stop computing a never-used value. This removes the read; the read had no functional implications. Back-patch to v12, like commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257. Reported by Alexander Lakhin. Discussion: https://postgr.es/m/6c92f59b-f5bc-e58c-9bdd-d1f21c17c786@gmail.com
2024-10-23Remove unnecessary word in a commentAmit Langote
Relations opened by the executor are only closed once in ExecCloseRangeTableRelations(), so the word "again" in the comment for ExecGetRangeTableRelation() is misleading and unnecessary. Discussion: https://postgr.es/m/CA+HiwqHnw-zR+u060i3jp4ky5UR0CjByRFQz50oZ05de7wUg=Q@mail.gmail.com Backpatch-through: 12
2024-10-23ecpg: Fix out-of-bound read in DecodeDateTime()Michael Paquier
It was possible for the code to read out-of-bound data from the "day_tab" table with some crafted input data. Let's treat these as invalid input as the month number is incorrect. A test is added to test this case with a check on the errno returned by the decoding routine. A test close to the new one added in this commit was testing for a failure, but did not look at the errno generated, so let's use this commit to also change it, adding a check on the errno returned by DecodeDateTime(). Like the other test scripts, dt_test should likely be expanded to include more checks based on the errnos generated in these code paths. This is left as future work. This issue exists since 2e6f97560a83, so backpatch all the way down. Reported-by: Pavel Nekrasov Author: Bruce Momjian, Pavel Nekrasov Discussion: https://postgr.es/m/18614-6bbe00117352309e@postgresql.org Backpatch-through: 12
2024-10-21Fix wrong assertion and poor error messages in "COPY (query) TO".Tom Lane
If the query is rewritten into a NOTIFY command by a DO INSTEAD rule, we'd get an assertion failure, or in non-assert builds issue a rather confusing error message. Improve that. Also fix a longstanding grammar mistake in a nearby error message. Per bug #18664 from Alexander Lakhin. Back-patch to all supported branches. Tender Wang and Tom Lane Discussion: https://postgr.es/m/18664-ffd0ebc2386598df@postgresql.org
2024-10-21Fix race condition in committing a serializable transactionHeikki Linnakangas
The finished transaction list can contain XIDs that are older than the serializable global xmin. It's a short-lived state; ClearOldPredicateLocks() removes any such transactions from the list, and it's called whenever the global xmin advances. But if another backend calls SummarizeOldestCommittedSxact() in that window, it will call SerialAdd() on an XID that's older than the global xmin, or if there are no more transactions running, when global xmin is invalid. That trips the assertion in SerialAdd(). Fixes bug #18658 reported by Andrew Bille. Thanks to Alexander Lakhin for analysis. Backpatch to all versions. Discussion: https://www.postgresql.org/message-id/18658-7dab125ec688c70b%40postgresql.org
2024-10-17Fix extreme skew detection in Parallel Hash Join.Thomas Munro
After repartitioning the inner side of a hash join that would have exceeded the allowed size, we check if all the tuples from a parent partition moved to one child partition. That is evidence that it contains duplicate keys and later attempts to repartition will also fail, so we should give up trying to limit memory (for lack of a better fallback strategy). A thinko prevented the check from working correctly in partition 0 (the one that is partially loaded into memory already). After repartitioning, we should check for extreme skew if the *parent* partition's space_exhausted flag was set, not the child partition's. The consequence was repeated futile repartitioning until per-partition data exceeded various limits including "ERROR: invalid DSA memory alloc request size 1811939328", OS allocation failure, or temporary disk space errors. (We could also do something about some of those symptoms, but that's material for separate patches.) This problem only became likely when PostgreSQL 16 introduced support for Parallel Hash Right/Full Join, allowing NULL keys into the hash table. Repartitioning always leaves NULL in partition 0, no matter how many times you do it, because the hash value is all zero bits. That's unlikely for other hashed values, but they might still have caused wasted extra effort before giving up. Back-patch to all supported releases. Reported-by: Craig Milhiser <craig@milhiser.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CA%2BwnhO1OfgXbmXgC4fv_uu%3DOxcDQuHvfoQ4k0DFeB0Qqd-X-rQ%40mail.gmail.com
2024-10-16Further refine _SPI_execute_plan's rule for atomic execution.Tom Lane
Commit 2dc1deaea turns out to have been still a brick shy of a load, because CALL statements executing within a plpgsql exception block could still pass the wrong snapshot to stable functions within the CALL's argument list. That happened because standard_ProcessUtility forces isAtomicContext to true if IsTransactionBlock is true, which it always will be inside a subtransaction. Then ExecuteCallStmt would think it does not need to push a new snapshot --- but _SPI_execute_plan didn't do so either, since it thought it was in nonatomic mode. The best fix for this seems to be for _SPI_execute_plan to operate in atomic execution mode if IsSubTransaction() is true, even when the SPI context as a whole is non-atomic. This makes _SPI_execute_plan have the same rules about when non-atomic execution is allowed as _SPI_commit/_SPI_rollback have about when COMMIT/ROLLBACK are allowed, which seems appropriately symmetric. (If anyone ever tries to allow COMMIT/ROLLBACK inside a subtransaction, this would all need to be rethought ... but I'm unconvinced that such a thing could be logically consistent at all.) For further consistency, also check IsSubTransaction() in SPI_inside_nonatomic_context. That does not matter for its one present-day caller StartTransaction, which can't be reached inside a subtransaction. But if any other callers ever arise, they'd presumably want this definition. Per bug #18656 from Alexander Alehin. Back-patch to all supported branches, like previous fixes in this area. Discussion: https://postgr.es/m/18656-cade1780866ef66c@postgresql.org
2024-10-16Reduce memory block size for decoded tuple storage to 8kB.Masahiko Sawada
Commit a4ccc1cef introduced the Generation Context and modified the logical decoding process to use a Generation Context with a fixed block size of 8MB for storing tuple data decoded during logical decoding (i.e., rb->tup_context). Several reports have indicated that the logical decoding process can be terminated due to out-of-memory (OOM) situations caused by excessive memory usage in rb->tup_context. This issue can occur when decoding a workload involving several concurrent transactions, including a long-running transaction that modifies tuples. By design, the Generation Context does not free a memory block until all chunks within that block are released. Consequently, if tuples modified by the long-running transaction are stored across multiple memory blocks, these blocks remain allocated until the long-running transaction completes, leading to substantial memory fragmentation. The memory usage during logical decoding, tracked by rb->size, does not account for memory fragmentation, resulting in potentially much higher memory consumption than the value of the logical_decoding_work_mem parameter. Various improvement strategies were discussed in the relevant thread. This change reduces the block size of the Generation Context used in rb->tup_context from 8MB to 8kB. This modification significantly decreases the likelihood of substantial memory fragmentation occurring and is relatively straightforward to backport. Performance testing across multiple platforms has confirmed that this change will not introduce any performance degradation that would impact actual operation. Backport to all supported branches. Reported-by: Alex Richman, Michael Guissine, Avi Weinberg Reviewed-by: Amit Kapila, Fujii Masao, David Rowley Tested-by: Hayato Kuroda, Shlok Kyal Discussion: https://postgr.es/m/CAD21AoBTY1LATZUmvSXEssvq07qDZufV4AF-OHh9VD2pC0VY2A%40mail.gmail.com Backpatch-through: 12
2024-10-09Remove incorrect function import from pgindentDaniel Gustafsson
Commit 149ac7d4559 which re-implemented pgindent in Perl explicitly imported the devnull function from File::Spec, but the module does not export anything. In recent versions of Perl calling a missing import function cause a warning, which combined with warnings being fatal cause pgindent to error out. Backpatch to all supported versions. Author: Erik Wienhold <ewie@ewie.name> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discusson: https://postgr.es/m/2372cd74-11b0-46f9-b28e-8f9627215d19@ewie.name Backpatch-through: v12
2024-10-07vacuumdb: Schema-qualify operator in catalog query's WHERE clause.Nathan Bossart
Commit 1ab67c9dfa, which modified this catalog query so that it doesn't return temporary relations, forgot to schema-qualify the operator. A comment earlier in the function implores us to fully qualify everything in the query: * Since we execute the constructed query with the default search_path * (which could be unsafe), everything in this query MUST be fully * qualified. This commit fixes that. While at it, add a newline for consistency with surrounding code. Reviewed-by: Noah Misch Discussion: https://postgr.es/m/ZwQJYcuPPUsF0reU%40nathan Backpatch-through: 12
2024-10-07Fix Y2038 issues with MyStartTime.Nathan Bossart
Several places treat MyStartTime as a "long", which is only 32 bits wide on some platforms. In reality, MyStartTime is a pg_time_t, i.e., a signed 64-bit integer. This will lead to interesting bugs on the aforementioned systems in 2038 when signed 32-bit integers are no longer sufficient to store Unix time (e.g., "pg_ctl start" hanging). To fix, ensure that MyStartTime is handled as a 64-bit value everywhere. (Of course, users will need to ensure that time_t is 64 bits wide on their system, too.) Co-authored-by: Max Johnson Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com Backpatch-through: 12
2024-10-06Ignore not-yet-defined Portals in pg_cursors view.Tom Lane
pg_cursor() supposed that any Portal it finds in the hash table must have sourceText set up, but there's an edge case where that is not so. A newly-created Portal has sourceText = NULL, and that doesn't change until PortalDefineQuery is called. In SPI_cursor_open_internal, we perform GetCachedPlan between CreatePortal and PortalDefineQuery, and it's possible for user-defined code to execute during that planning and cause a fetch from the pg_cursors view, resulting in a null-pointer-dereference crash. (It looks like the same could happen in exec_bind_message, but I've not tried to provoke a failure there.) I considered trying to fix this by setting sourceText sooner, but there may be instances of this same calling pattern in extensions, and we couldn't be sure they'd get the memo promptly. It seems better to redefine pg_cursor as not showing Portals that have not yet had PortalDefineQuery called on them, which we can do by just skipping them if sourceText is still NULL. (Before a1c692358, pg_cursor would instead return a row with NULL in the statement column. We could revert to that behavior but it doesn't really seem like a better definition, especially since our documentation doesn't suggest that the column could be NULL.) Per report from PetSerAl. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAKygsHTBXLXjwV43kpZa+Cs+XTiaeeJiZdL4cPBm9f4MTdw7wg@mail.gmail.com
2024-10-02Parse libpq's "keepalives" option more like other integer options.Tom Lane
Use pqParseIntParam (nee parse_int_param) instead of using strtol directly. This allows trailing whitespace, which the previous coding didn't, and makes the spelling of the error message consistent with other similar cases. This seems to be an oversight in commit e7a221797, which introduced parse_int_param. That fixed places that were using atoi(), but missed this place which was randomly using strtol() instead. Ordinarily I'd consider this minor cleanup not worth back-patching. However, it seems that ecpg assumes it can add trailing whitespace to URL parameters, so that use of the keepalives option fails in that context. Perhaps that's worth improving as a separate matter. In the meantime, back-patch this to all supported branches. Yuto Sasaki (some further cleanup by me) Discussion: https://postgr.es/m/TY2PR01MB36286A7B97B9A15793335D18C1772@TY2PR01MB3628.jpnprd01.prod.outlook.com
2024-10-01Fix race condition in COMMIT PREPARED causing orphaned 2PC filesMichael Paquier
COMMIT PREPARED removes on-disk 2PC files near its end, but the state checked if a file is on-disk or not gets read from shared memory while not holding the two-phase state lock. Because of that, there was a small window where a second backend doing a PREPARE TRANSACTION could reuse the GlobalTransaction put back into the 2PC free list by the COMMIT PREPARED, overwriting the "ondisk" flag read afterwards by the COMMIT PREPARED to decide if its on-disk two-phase state file should be removed, preventing the file deletion. This commit fixes this issue so as the "ondisk" flag in the GlobalTransaction is read while holding the two-phase state lock, not from shared memory after its entry has been added to the free list. Orphaned two-phase state files flushed to disk after a checkpoint are discarded at the beginning of recovery. However, a truncation of pg_xact/ would make the startup process issue a FATAL when it cannot read the SLRU page holding the state of the transaction whose 2PC file was orphaned, which is a necessary step to decide if the 2PC file should be removed or not. Removing manually the file would be necessary in this case. Issue introduced by effe7d9552dd, so backpatch all the way down. Mea culpa. Author: wuchengwen Discussion: https://postgr.es/m/tencent_A7F059B5136A359625C7B2E4A386B3C3F007@qq.com Backpatch-through: 12
2024-09-29Remove NULL dereference from RenameRelationInternal().Noah Misch
Defect in last week's commit aac2c9b4fde889d13f859c233c2523345e72d32b, per Coverity. Reaching this would need catalog corruption. Back-patch to v12, like that commit.
2024-09-27Avoid 037_invalid_database.pl hang under debug_discard_caches.Noah Misch
Back-patch to v12 (all supported versions).
2024-09-27Fix incorrect memory access in VACUUM FULL with invalid toast indexesMichael Paquier
An invalid toast index is skipped in reindex_relation(). These would be remnants of a failed REINDEX CONCURRENTLY and they should never been rebuilt as there can only be one valid toast index at a time. REINDEX_REL_SUPPRESS_INDEX_USE, used by CLUSTER and VACUUM FULL, needs to maintain a list of the indexes being processed. The list of indexes is retrieved from the relation cache, and includes invalid indexes. The code has missed that invalid toast indexes are ignored in reindex_relation() as this leads to a hard failure in reindex_index(), and they were left in the reindex pending list, making the list inconsistent when rechecked. The incorrect memory access was happening when scanning pg_class for the refresh of pg_database.datfrozenxid, when doing a scan of pg_class. This issue exists since REINDEX CONCURRENTLY exists, where invalid toast indexes can exist, so backpatch all the way down. Reported-by: Alexander Lakhin Author: Tender Wang Discussion: https://postgr.es/m/18630-9aed99c38830657d@postgresql.org Backpatch-through: 12
2024-09-26tests: Restrict pg_locks queries in advisory_locks.sql to current databaseAndres Freund
Otherwise testing an existing installation can fail, if there are other locks, e.g. from one of the isolation tests. This was originally applied as c3315a7da57b in 16~, but it is possible to see this test fail depending on the concurrent activity for older active branches. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20221003234111.4ob7yph6r4g4ywhu@awork3.anarazel.de Backpatch-through: 12
2024-09-25vacuumdb: Skip temporary tables in query to build list of relationsMichael Paquier
Running vacuumdb with a non-superuser while another user has created a temporary table would lead to a mid-flight permission failure, interrupting the operation. vacuum_rel() skips temporary relations of other backends, and it makes no sense for vacuumdb to know about these relations, so let's switch it to ignore temporary relations entirely. Adding a qual in the query based on relpersistence simplifies the generation of its WHERE clause in vacuum_one_database(), per se the removal of "has_where". Author: VaibhaveS, Michael Paquier Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CAM_eQjwfAR=y3G1fGyS1U9FTmc+FyJm9amNfY2QCZBnDDbNPZg@mail.gmail.com Backpatch-through: 12
2024-09-24For inplace update durability, make heap_update() callers wait.Noah Misch
The previous commit fixed some ways of losing an inplace update. It remained possible to lose one when a backend working toward a heap_update() copied a tuple into memory just before inplace update of that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE to govern admission to the steps of copying an old tuple, modifying it, and issuing heap_update(). This includes MERGE commands. To avoid changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when holding a relation lock sufficient to exclude inplace updaters. Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE pg_class" or "UPDATE pg_database" can still lose an inplace update. The v14+ UPDATE fix needs commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35, and it wasn't worth reimplementing that fix without such infrastructure. Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas. Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
2024-09-24Fix data loss at inplace update after heap_update().Noah Misch
As previously-added tests demonstrated, heap_inplace_update() could instead update an unrelated tuple of the same catalog. It could lose the update. Losing relhasindex=t was a source of index corruption. Inplace-updating commands like VACUUM will now wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. That isn't ideal, but a long-running GRANT already hurts VACUUM progress more just by keeping an XID running. The VACUUM will behave like a DELETE or UPDATE waiting for the uncommitted change. For implementation details, start at the systable_inplace_update_begin() header comment and README.tuplock. Back-patch to v12 (all supported versions). In back branches, retain a deprecated heap_inplace_update(), for extensions. Reported by Smolkin Grigory. Reviewed by Nitin Motiani, (in earlier versions) Heikki Linnakangas, and (in earlier versions) Alexander Lakhin. Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com
2024-09-24Warn if LOCKTAG_TUPLE is held at commit, under debug_assertions.Noah Misch
The current use always releases this locktag. A planned use will continue that intent. It will involve more areas of code, making unlock omissions easier. Warn under debug_assertions, like we do for various resource leaks. Back-patch to v12 (all supported versions), the plan for the commit of the new use. Reviewed by Heikki Linnakangas. Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-09-24Back-patch "Refactor code in tablecmds.c to check and process tablespace moves"Noah Misch
Back-patch commits 4c9c359d38ff1e2de388eedd860785be6a49201c and 24843297a96d7be16cc3f4b090aacfc6e5e6839e to v13 and v12. Before those commits, we held the modifiable copy of the relation's pg_class row throughout a table_relation_copy_data(). That can last long enough to copy MaxBlockNumber of data. A subsequent fix will hold LockTuple() for the lifespan of that modifiable copy. By back-patching this first, we avoid a needless long-duration LOCKTAG_TUPLE. Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
2024-09-24Drop global objects after completed testDaniel Gustafsson
Project policy is to not leave global objects behind after a regress test run. This was found as a result of the development of a patch to make pg_regress detect such leftovers automatically, which in the end was withdrawn due to issues with parallel runs. This was originally committed as 936e3fa3787a, but the issue also exists in the 12~16 range. Discussion: https://postgr.es/m/E1phvk7-000VAH-7k@gemulon.postgresql.org Backpatch-through: 12
2024-09-19Improve Perl script which adds commit links to release notesBruce Momjian
Reported-by: Andrew Dunstan Discussion: https://postgr.es/m/b2465837-56df-4794-a0b5-5e6ed44ed870@dunslane.net Author: Andrew Dunstan Backpatch-through: 12
2024-09-17Don't enter parallel mode when holding interrupts.Noah Misch
Doing so caused the leader to hang in wait_event=ParallelFinish, which required an immediate shutdown to resolve. Back-patch to v12 (all supported versions). Francesco Degrassi Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
2024-09-16scripts: add Perl script to add links to release notesBruce Momjian
Reported-by: jian he Discussion: https://postgr.es/m/ZuYsS5XdA7hVcV9l@momjian.us Backpatch-through: 12
2024-09-15Replace usages of xmlXPathCompile() with xmlXPathCtxtCompile().Tom Lane
In existing releases of libxml2, xmlXPathCompile can be driven to stack overflow because it fails to protect itself against too-deeply-nested input. While there is an upstream fix as of yesterday, it will take years for that to propagate into all shipping versions. In the meantime, we can protect our own usages basically for free by calling xmlXPathCtxtCompile instead. (The actual bug is that libxml2 keeps its nesting counter in the xmlXPathContext, and its parsing code was willing to just skip counting nesting levels if it didn't have a context. So if we supply a context, all is well. It seems odd actually that it works at all to not supply a context, because this means that XPath parsing does not have access to XML namespace info. Apparently libxml2 never checks namespaces until runtime? Anyway, this seems like good future-proofing even if its only immediate effect is to dodge a bug.) Sadly, this hack only offers protection with libxml2 2.9.11 and newer. Before that there are multiple similar problems, so if you are processing untrusted XML it behooves you to get a newer version. But we have some pretty old libxml2 in the buildfarm, so it seems impractical to add a regression test to verify this fix. Per bug #18617 from Jingzhou Fu. Back-patch to all supported versions. Discussion: https://postgr.es/m/18617-1cee4d2ed1f4e7ae@postgresql.org Discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/799
2024-09-14Run regression tests with timezone America/Los_Angeles.Tom Lane
Historically we've used timezone "PST8PDT", but the recent release 2024b of tzdb changes the definition of that zone in a way that breaks many test cases concerned with dates before 1970. Although we've not yet adopted 2024b into our own tree, this is already problematic for people using --with-system-tzdata if their platform has already adopted 2024b. To work with both older and newer versions of tzdb, switch to using "America/Los_Angeles", accepting the ensuing changes in regression test results. Back-patch to all supported branches. Per report and patch from Wolfgang Walther. Discussion: https://postgr.es/m/0a997455-5aba-4cf2-a354-d26d8bcbfae6@technowledgy.de
2024-09-14Only define NO_THREAD_SAFE_LOCALE for MSVC plperl when requiredAndrew Dunstan
Latest versions of Strawberry Perl define USE_THREAD_SAFE_LOCALE, and we therefore get a handshake error when building against such instances. The solution is to perform a test to see if USE_THREAD_SAFE_LOCALE is defined and only define NO_THREAD_SAFE_LOCALE if it isn't. Backpatch the meson.build fix back to release 16 and apply the same logic to Mkvcbuild.pm in releases 12 through 16. Original report of the issue from Muralikrishna Bandaru.
2024-09-13Allow _h_indexbuild() to be interrupted.Tom Lane
When we are building a hash index that is large enough to need pre-sorting (larger than either maintenance_work_mem or NBuffers), the initial sorting phase is interruptible, but the insertion phase wasn't. Add the missing CHECK_FOR_INTERRUPTS(). Per bug #18616 from Alexander Lakhin. Back-patch to all supported branches. Pavel Borisov Discussion: https://postgr.es/m/18616-acbb9e5caf41e964@postgresql.org