summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2025-05-23Fix per-relation memory leakage in autovacuum.Tom Lane
PgStat_StatTabEntry and AutoVacOpts structs were leaked until the end of the autovacuum worker's run, which is bad news if there are a lot of relations in the database. Note: pfree'ing the PgStat_StatTabEntry structs here seems a bit risky, because pgstat_fetch_stat_tabentry_ext does not guarantee anything about whether its result is long-lived. It appears okay so long as autovacuum forces PGSTAT_FETCH_CONSISTENCY_NONE, but I think that API could use a re-think. Also ensure that the VacuumRelation structure passed to vacuum() is in recoverable storage. Back-patch to v15 where we started to manage table statistics this way. (The AutoVacOpts leakage is probably older, but I'm not excited enough to worry about just that part.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us Backpatch-through: 15
2025-05-23Fix AlignedAllocRealloc to cope sanely with OOM.Tom Lane
If the inner allocation call returns NULL, we should restore the previous state and return NULL. Previously this code pfree'd the old chunk anyway, which is surely wrong. Also, make it call MemoryContextAllocationFailure rather than summarily returning NULL. The fact that we got control back from the inner call proves that MCXT_ALLOC_NO_OOM was passed, so this change is just cosmetic, but someday it might be less so. This is just a latent bug at present: AFAICT no in-core callers use this function at all, let alone call it with MCXT_ALLOC_NO_OOM. Still, it's the kind of bug that might bite back-patched code pretty hard someday, so let's back-patch to v17 where the bug was introduced (by commit 743112a2e). Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us Backpatch-through: 17
2025-05-22Fix memory leak in XMLSERIALIZE(... INDENT).Tom Lane
xmltotext_with_options sometimes tries to replace the existing root node of a libxml2 document. In that case xmlDocSetRootElement will unlink and return the old root node; if we fail to free it, it's leaked for the remainder of the session. The amount of memory at stake is not large, a couple hundred bytes per occurrence, but that could still become annoying in heavy usage. Our only other xmlDocSetRootElement call is not at risk because it's working on a just-created document, but let's modify that code too to make it clear that it's dependent on that. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Discussion: https://postgr.es/m/1358967.1747858817@sss.pgh.pa.us Backpatch-through: 16
2025-05-21Fix incorrect WAL description for PREPARE TRANSACTION record.Fujii Masao
Since commit 8b1dccd37c7, the PREPARE TRANSACTION WAL record includes information about dropped statistics entries. However, the WAL resource manager description function for PREPARE TRANSACTION record failed to parse this information correctly and always assumed there were no such entries. As a result, for example, pg_waldump could not display the dropped statistics entries stored in PREPARE TRANSACTION records. The root cause was that ParsePrepareRecord() did not set the number of statistics entries to drop on commit or abort. These values remained zero-initialized and were never updated from the parsed record. This commit fixes the issue by properly setting those values during parsing. With this fix, pg_waldump can now correctly report dropped statistics entries in PREPARE TRANSACTION records. Back-patch to v15, where commit 8b1dccd37c7 was introduced. Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAJDiXgh-6Epb2XiJe4uL0zF-cf0_s_7Lw1TfEHDMLzYjEmfGOw@mail.gmail.com Backpatch-through: 15
2025-05-20Fix cross-version upgrade test failureHeikki Linnakangas
Commit 29f7ce6fe7 added another view that needs adjustment in the cross-version upgrade test. This should fix the XversionUpgrade failures in the buildfarm. Backpatch-through: 16 Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
2025-05-19Fix deparsing FETCH FIRST <expr> ROWS WITH TIESHeikki Linnakangas
In the grammar, <expr> is a c_expr, which accepts only a limited set of integer literals and simple expressions without parens. The deparsing logic didn't quite match the grammar rule, and failed to use parens e.g. for "5::bigint". To fix, always surround the expression with parens. Would be nice to omit the parens in simple cases, but unfortunately it's non-trivial to detect such simple cases. Even if the expression is a simple literal 123 in the original query, after parse analysis it becomes a FuncExpr with COERCE_IMPLICIT_CAST rather than a simple Const. Reported-by: yonghao lee Backpatch-through: 13 Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
2025-05-19Don't retreat slot's confirmed_flush LSN.Amit Kapila
Prevent moving the confirmed_flush backwards, as this could lead to data duplication issues caused by replicating already replicated changes. This can happen when a client acknowledges an LSN it doesn't have to do anything for, and thus didn't store persistently. After a restart, the client can send the prior LSN that it stored persistently as an acknowledgement, but we need to ignore such an LSN to avoid retreating confirm_flush LSN. Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Author: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Tested-by: Nisha Moond <nisha.moond412@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com Discussion: https://postgr.es/m/OS0PR01MB57164AB5716AF2E477D53F6F9489A@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-05-18Make our usage of memset_s() conform strictly to the C11 standard.Tom Lane
Per the letter of the C11 standard, one must #define __STDC_WANT_LIB_EXT1__ as 1 before including <string.h> in order to have access to memset_s(). It appears that many platforms are lenient about this, because we weren't doing it and yet the code appeared to work anyway. But we now find that with -std=c11, macOS is strict and doesn't declare memset_s, leading to compile failures since we try to use it anyway. (Given the lack of prior reports, perhaps this is new behavior in the latest SDK? No matter, we're clearly in the wrong.) In addition to the immediate problem, which could be fixed merely by adding the needed #define to explicit_bzero.c, it seems possible that our configure-time probe for memset_s() could fail in case a platform implements the function in some odd way due to this spec requirement. This concern can be fixed in largely the same way that we dealt with strchrnul() in 6da2ba1d8: switch to using a declaration-based configure probe instead of a does-it-link probe. Back-patch to v13 where we started using memset_s(). Reported-by: Lakshmi Narayana Velayudam <dev.narayana.v@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAA4pTnLcKGG78xeOjiBr5yS7ZeE-Rh=FaFQQGOO=nPzA1L8yEA@mail.gmail.com Backpatch-through: 13
2025-05-15Fix Assert failure in XMLTABLE parserRichard Guo
In an XMLTABLE expression, columns can be marked NOT NULL, and the parser internally fabricates an option named "is_not_null" to represent this. However, the parser also allows users to specify arbitrary option names. This creates a conflict: a user can explicitly use "is_not_null" as an option name and assign it a non-Boolean value, which violates internal assumptions and triggers an assertion failure. To fix, this patch checks whether a user-supplied name collides with the internally reserved option name and raises an error if so. Additionally, the internal name is renamed to "__pg__is_not_null" to further reduce the risk of collision with user-defined names. Reported-by: Евгений Горбанев <gorbanyoves@basealt.ru> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/6bac9886-65bf-4cec-96bd-e304159f28db@basealt.ru Backpatch-through: 15
2025-05-13Fix order of parameters in POD documentationDaniel Gustafsson
The documentation for log_check() had the parameters in the wrong order. Also while there, rename %parameters to %params to better documentation for similar functions which use %params. Backpatch down to v14 where this was introduced. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/9F503B5-32F2-45D7-A0AE-952879AD65F1@yesql.se Backpatch-through: 14
2025-05-11Fix comment of tsquerysend()Álvaro Herrera
The comment describes the order in which fields are sent, and it had one of the fields in the wrong place. This has been wrong since e6dbcb72fafa (2008), so backpatch all the way back. Author: Emre Hasegeli <emre@hasegeli.com> Discussion: https://postgr.es/m/CAE2gYzzf38bR_R=izhpMxAmqHXKeM5ajkmukh4mNs_oXfxcMCA@mail.gmail.com
2025-05-10Fix incorrect "return NULL" in BumpAllocLarge().Tom Lane
This must be "return MemoryContextAllocationFailure(context, size, flags)" instead. The effect of this oversight is that if we got a malloc failure right here, the code would act as though MCXT_ALLOC_NO_OOM had been specified, whether it was or not. That would likely lead to a null-pointer-dereference crash at the unsuspecting call site. Noted while messing with a patch to improve our Valgrind leak detection support. Back-patch to v17 where this code came in.
2025-05-09Skip RSA-PSS ssl test when using LibreSSL.Tom Lane
Presently, LibreSSL does not have working support for RSA-PSS, so disable that test. Per discussion at https://marc.info/?l=libressl&m=174664225002441&w=2 they do intend to fix this, but it's a ways off yet. Reported-by: Thomas Munro <thomas.munro@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com Backpatch-through: 15
2025-05-09Centralize ssl tests' check for whether we're using LibreSSL.Tom Lane
Right now there's only one caller, so that this is merely an exercise in shoving code from one module to another, but there will shortly be another one. It seems better to avoid having two copies of this highly-subject-to-change test. Back-patch to v15, where we first introduced some tests that don't work with LibreSSL. Reported-by: Thomas Munro <thomas.munro@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA+hUKG+fLqyweHqFSBcErueUVT0vDuSNWui-ySz3+d_APmq7dw@mail.gmail.com Backpatch-through: 15
2025-05-05With GB18030, prevent SIGSEGV from reading past end of allocation.Noah Misch
With GB18030 as source encoding, applications could crash the server via SQL functions convert() or convert_from(). Applications themselves could crash after passing unterminated GB18030 input to libpq functions PQescapeLiteral(), PQescapeIdentifier(), PQescapeStringConn(), or PQescapeString(). Extension code could crash by passing unterminated GB18030 input to jsonapi.h functions. All those functions have been intended to handle untrusted, unterminated input safely. A crash required allocating the input such that the last byte of the allocation was the last byte of a virtual memory page. Some malloc() implementations take measures against that, making the SIGSEGV hard to reach. Back-patch to v13 (all supported versions). Author: Noah Misch <noah@leadboat.com> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Backpatch-through: 13 Security: CVE-2025-4207
2025-05-05Refactor test_escape.c for additional ways of testing.Noah Misch
Start the file with static functions not specific to pe_test_vectors tests. This way, new tests can use them without disrupting the file's layout. Change report_result() PQExpBuffer arguments to plain strings. Back-patch to v13 (all supported versions), for the next commit. Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Backpatch-through: 13 Security: CVE-2025-4207
2025-05-05Translation updatesPeter Eisentraut
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: ff466b83eb6fbf9434ff087426546e3dc988135d
2025-05-03Fix typos in comments.Etsuro Fujita
Also adjust the phrasing in the comments. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Author: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAPmGK17%3DPHSDZ%2B0G6jcj12buyyE1bQQc3sbp1Wxri7tODT-SDw%40mail.gmail.com Backpatch-through: 15
2025-05-02Handle self-referencing FKs correctly in partitioned tablesÁlvaro Herrera
For self-referencing foreign keys in partitioned tables, we weren't handling creation of pg_constraint rows during CREATE TABLE PARTITION AS as well as ALTER TABLE ATTACH PARTITION. This is an old bug -- mostly, we broke this in 614a406b4ff1 while trying to fix it (so 12.13, 13.9, 14.6 and 15.0 and up all behave incorrectly). This commit reverts part of that with additional fixes for full correctness, and installs more tests to verify the parts we broke, not just the catalog contents but also the user-visible behavior. Backpatch to all live branches. In branches 13 and 14, commit 46a8c27a7226 changed the behavior during DETACH to drop a FK constraint rather than trying to repair it, because the complete fix of repairing catalog constraints was problematic due to lack of previous fixes. For this reason, the test behavior in those branches is a bit different. However, as best as I can tell, the fix works correctly there. In release notes we have to recommend that all self-referencing foreign keys on partitioned tables be recreated if partitions have been created or attached after the FK was created, keeping in mind that violating rows might already be present on the referencing side. Reported-by: Guillaume Lelarge <guillaume@lelarge.info> Reported-by: Matthew Gabeler-Lee <fastcat@gmail.com> Reported-by: Luca Vallisa <luca.vallisa@gmail.com> Discussion: https://postgr.es/m/CAECtzeWHCA+6tTcm2Oh2+g7fURUJpLZb-=pRXgeWJ-Pi+VU=_w@mail.gmail.com Discussion: https://postgr.es/m/18156-a44bc7096f0683e6@postgresql.org Discussion: https://postgr.es/m/CAAT=myvsiF-Attja5DcWoUWh21R12R-sfXECY2-3ynt8kaOqjw@mail.gmail.com
2025-04-30Update time zone data files to tzdata release 2025b.Tom Lane
DST law changes in Chile: there is a new time zone America/Coyhaique for Chile's Aysén Region, to account for it changing to UTC-03 year-round and thus diverging from America/Santiago. Historical corrections for Iran. Backpatch-through: 13
2025-04-28Fix xmin advancement during fast_forward decoding.Amit Kapila
During logical decoding, we advance catalog_xmin of logical too early in fast_forward mode, resulting in required catalog data being removed by vacuum. This mode is normally used to advance the slot without processing the changes, but we still can't let the slot's xmin to advance to an incorrect value. Commit f49a80c481 fixed a similar issue where the logical slot's catalog_xmin was getting advanced prematurely during non-fast-forward mode. During xl_running_xacts processing, instead of directly advancing the slot's xmin to the oldest running xid in the record, it allowed the xmin to be held back for snapshots that can be used for not-yet-replayed transactions, as those might consider older txns as running too. However, it missed the fact that the same problem can happen during fast_forward mode decoding, as we won't build a base snapshot in that mode, and the future call to get_changes from the same slot can miss seeing the required catalog changes leading to incorrect reslts. This commit allows building the base snapshot even in fast_forward mode to prevent the early advancement of xmin. Reported-by: Amit Kapila <amit.kapila16@gmail.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com Discussion: https://postgr.es/m/OS0PR01MB57163087F86621D44D9A72BF94BB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-04-23Avoid possibly-theoretical OOM crash hazard in hash_create().Tom Lane
One place in hash_create() used DynaHashAlloc() as a convenient shorthand for MemoryContextAlloc(). That was fine when it was written, but it stopped being fine when 9c911ec06 changed DynaHashAlloc() to use MCXT_ALLOC_NO_OOM (mea culpa). Change the code to call plain MemoryContextAlloc() as intended. I think that this bug may be unreachable in practice, since we now always create AllocSets with some space already allocated, so that an OOM failure here for a non-shared hash table should be impossible (with a hash table name of reasonable length anyway). And there aren't enough shared hash tables to make a crash for one of those probable. Nonetheless it's clearly not operating as designed, so back-patch to v16 where 9c911ec06 came in. Reported-by: Maksim Korotkov <m.korotkov@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/219bdccd460510efaccf90b57e5e5ef2@postgrespro.ru Backpatch-through: 16
2025-04-23Fix an oversight in 3f28b2fcac.Amit Kapila
Commit 3f28b2fcac tried to ensure that the replication origin shouldn't be advanced in case of an ERROR in the apply worker, so that it can request the same data again after restart. However, it is possible that an ERROR was caught and handled by a (say PL/pgSQL) function, and the apply worker continues to apply further changes, in which case, we shouldn't reset the replication origin. Ensure to reset the origin only when the apply worker exits after an ERROR. Commit 3f28b2fcac added new function geterrlevel, which we removed in HEAD as part of this commit, but kept it in backbranches to avoid breaking any applications. A separate case can be made to have such a function even for HEAD. Reported-by: Shawn McCoy <shawn.the.mccoy@gmail.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16, where it was introduced Discussion: https://postgr.es/m/CALsgZNCGARa2mcYNVTSj9uoPcJo-tPuWUGECReKpNgTpo31_Pw@mail.gmail.com
2025-04-23Remove assertion based on pending_since in pgstat_report_stat()Michael Paquier
This assertion, based on pending_since (timestamp used to prevent stats reports to be too frequent or should a partial flush happen), is reached when it is found that no data can be flushed but a previous call of pgstat_report_stat() determined that some stats data has been found as in need of a flush. So pending_since is set when some stats data is pending (in non-force mode) or if report attempts are too frequent, and reset to 0 once all stats have been flushed. Since 5cbbe70a9cc6, WAL senders have begun to report their stats on a periodic basis for IO stats in v16~ and backend stats on HEAD, creating some friction with the concurrent pgstat_report_stat() calls that can happen in the context of a WAL sender (shutdown callback doing a final report or backend-related code paths). This problem is the cause of spurious failures in the TAP tests. In theory, this assertion can be also reached in v15, even if that's very unlikely. For example, a process, say a background worker, could do periodic and direct stats flushes with concurrent calls of pgstat_report_stat() that could cause conflicting values of pending_since. This can be done with WAL or SLRU stats flushes using pgstat_flush_wal() or pgstat_slru_flush(). HEAD makes this situation easier to happen with custom cumulative stats. This commit removes the assertion altogether, per discussion, as it is more useful to keep the state of things as they are for the WAL sender. The assertion could use a special state based on for example am_walsender, but I doubt that this would be meaningful in the long run based on the other arguments raised while discussing this issue. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/1489124.1744685908@sss.pgh.pa.us Discussion: https://postgr.es/m/dwrkeszz6czvtkxzr5mqlciy652zau5qqnm3cp5f3p2po74ppk@omg4g3cc6dgq Backpatch-through: 15
2025-04-22Rename injection point for invalidation messages at end of transactionMichael Paquier
This injection point was named "AtEOXact_Inval-with-transInvalInfo", not respecting the implied naming convention that injection points should use lower-case characters, with terms separated by dashes. All the other points defined in the tree follow this style, so let's be more consistent. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://postgr.es/m/OSCPR01MB14966E14C1378DEE51FB7B7C5F5B32@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 17
2025-04-20Test restartpoints in archive recovery.Noah Misch
v14 commit 1f95181b44c843729caaa688f74babe9403b5850 and its v13 equivalent caused timing-dependent failures in archive recovery, at restartpoints. The symptom was "invalid magic number 0000 in log segment X, offset 0", "unexpected pageaddr X in log segment Y, offset 0" [X < Y], or an assertion failure. Commit 3635a0a35aafd3bfa80b7a809bc6e91ccd36606a and predecessors back-patched v15 changes to fix that. This test reproduces the problem probabilistically, typically in less than 1000 iterations of the test. Hence, buildfarm and CI runs would have surfaced enough failures to get attention within a day. Reported-by: Arun Thirupathi <arunth@google.com> Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
2025-04-20Avoid ERROR at ON COMMIT DELETE ROWS after relhassubclass=f.Noah Misch
Commit 7102070329d8147246d2791321f9915c3b5abf31 fixed a similar bug, but it missed the case of database-wide ANALYZE ("use_own_xacts" mode). Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed consequences from silent discard of a pg_class stats (relpages et al.) update to ERROR "tuple to be updated was already modified". Losing a relpages update of an ON COMMIT DELETE ROWS table was negligible, but a COMMIT-time error isn't negligible. Back-patch to v13 (all supported versions). Reported-by: Richard Guo <guofenglinux@gmail.com Reported-by: Robins Tharakan <tharakan@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-XwMKMKJ_GT=p3_-_=j9rQSEs1FbDFUnW9zHuKPsPNEQ@mail.gmail.com Backpatch-through: 13
2025-04-20Fix issue with ORDER BY / DISTINCT aggregates and FILTERDavid Rowley
1349d2790 added support so that aggregate functions with an ORDER BY or DISTINCT clause could make use of presorted inputs to avoid an implicit sort within nodeAgg.c. That commit failed to consider that a FILTER clause may exist that filters rows before the aggregate function arguments are evaluated. That can be problematic if an aggregate argument contains an expression which could error out during evaluation. It's perfectly valid to want to have a FILTER clause which eliminates such values, and with the pre-sorted path added in 1349d2790, it was possible that the planner would produce a plan with a Sort node above the Aggregate to perform the sort on the aggregate's arguments long before the Aggregate node would filter out the non-matching values. Here we fix this by inspecting ORDER BY / DISTINCT aggregate functions which have a FILTER clause to see if the aggregate's arguments are anything more complex than a Var or a Const. Evaluating these isn't going to cause an error. If we find any non-Var, non-Const parameters then the planner will now opt to perform the sort in the Aggregate node for these aggregates, i.e. disable the presorted aggregate optimization. An alternative fix would have been to completely disallow the presorted optimization for Aggrefs with any FILTER clause, but that wasn't done as that could cause large performance regressions for queries that see significant gains from 1349d2790 due to presorted results coming in from an Index Scan. Backpatch to 16, where 1349d2790 was introduced Author: David Rowley <dgrowleyml@gmail.com> Reported-by: Kaimeh <kkaimeh@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAK-%2BJz9J%3DQ06-M7cDJoPNeYbz5EZDqkjQbJnmRyQyzkbRGsYkA%40mail.gmail.com Backpatch-through: 16
2025-04-16Fix pg_dump --clean with partitioned indexes.Tom Lane
We'd try to drop the partitions of a partitioned index separately, which is disallowed by the backend, leading to an error during restore. While the error is harmless, it causes problems if you try to use --single-transaction mode. Fortunately, there seems no need to do a DROP at all, since the partition will go away silently when we drop either the parent index or the partition's table. So just make the DROP conditional on not being a partition. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com Backpatch-through: 13
2025-04-15Fix failure for generated column with a not-null domain constraint.Tom Lane
If a GENERATED column is declared to have a domain data type where the domain's constraints disallow null values, INSERT commands failed because we built a targetlist that included coercing a null constant to the domain's type. The failure occurred even when the generated value would have been perfectly OK. This is adjacent to the issues fixed in 0da39aa76, but we didn't notice for lack of testing a domain with such a constraint. We aren't going to use the result of the targetlist entry for the generated column --- ExecComputeStoredGenerated will overwrite it. So it's not really necessary that it have the exact datatype of the generated column. This patch fixes the problem by changing the targetlist entry to be a null Const of the domain's base type, which should be sufficiently legal. (We do have to tweak ExecCheckPlanOutput to accept the situation, though.) This has been broken since we implemented generated columns. However, this patch only applies easily as far back as v14, partly because I (tgl) only carried 0da39aa76 back that far, but mostly because v14 significantly refactored the handling of INSERT/UPDATE targetlists. Given the lack of field complaints and the short remaining support lifetime of v13, I judge the cost-benefit ratio not good for devising a version that would work in v13. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxG59tip2+9h=rEv-ykOFjt0cbsPVchhi0RTij8bABBA0Q@mail.gmail.com Backpatch-through: 14
2025-04-15pg_combinebackup: Fix incorrect code documentationDaniel Gustafsson
The code comment for parse_oid accidentally used the wrong parameter when referring to the location of the last backup. Also, while there, improve sentence wording by removing a superfluous word. Backpatch to v17 where pg_combinebackup was addedd Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAAJ_b95ecWgzcS4K3Dx0E_Yp-SLwK5JBasFgioKMSjhQLw9xvg@mail.gmail.com Backpatch-through: 17
2025-04-12Fix GIN's shimTriConsistentFn to not corrupt its input.Tom Lane
Commit 0f21db36d made an assumption that GIN triConsistentFns would not modify their input entryRes[] arrays. But in fact, the "shim" triConsistentFn that we use for opclasses that don't supply their own did exactly that, potentially leading to wrong answers from a GIN index search. Through bad luck, none of the test cases that we have for such opclasses exposed the bug. One response to this could be that the assumption of consistency check functions not modifying entryRes[] arrays is a bad one, but it still seems reasonable to me. Notably, shimTriConsistentFn is itself assuming that with respect to the underlying boolean consistentFn, so it's sure being self-centered in supposing that it gets to do so. Fortunately, it's quite simple to fix shimTriConsistentFn to restore the entry-time state of entryRes[], so let's do that instead. This issue doesn't affect any core GIN opclasses, since they all supply their own triConsistentFns. It does affect contrib modules btree_gin, hstore, and intarray. Along the way, I (tgl) noticed that shimTriConsistentFn failed to pick up on a "recheck" flag returned by its first call to the boolean consistentFn. This may be only a latent problem, since it would be unlikely for a consistentFn to set recheck for the all-false case and not any other cases. (Indeed, none of our contrib modules do that.) Nonetheless, it's formally wrong. Reported-by: Vinod Sridharan <vsridh90@gmail.com> Author: Vinod Sridharan <vsridh90@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com Backpatch-through: 13
2025-04-11Fix race with synchronous_standby_names at startupMichael Paquier
synchronous_standby_names cannot be reloaded safely by backends, and the checkpointer is in charge of updating a state in shared memory if the GUC is enabled in WalSndCtl, to let the backends know if they should wait or not for a given LSN. This provides a strict control on the timing of the waiting queues if the GUC is enabled or disabled, then reloaded. The checkpointer is also in charge of waking up the backends that could be waiting for a LSN when the GUC is disabled. This logic had a race condition at startup, where it would be possible for backends to not wait for a LSN even if synchronous_standby_names is enabled. This would cause visibility issues with transactions that we should be waiting for but they were not. The problem lasts until the checkpointer does its initial update of the shared memory state when it loads synchronous_standby_names. In order to take care of this problem, the shared memory state in WalSndCtl is extended to detect if it has been initialized by the checkpointer, and not only check if synchronous_standby_names is defined. In WalSndCtlData, sync_standbys_defined is renamed to sync_standbys_status, a bits8 able to know about two states: - If the shared memory state has been initialized. This flag is set by the checkpointer at startup once, and never removed. - If synchronous_standby_names is known as defined in the shared memory state. This is the same as the previous sync_standbys_defined in WalSndCtl. This method gives a way for backends to decide what they should do until the shared memory area is initialized, and they now ultimately fall back to a check on the GUC value in this case, which is the best thing that can be done. Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by the checkpointer when this process starts, so the window is very narrow. It is possible to enlarge the problematic window by making the checkpointer wait at the beginning of SyncRepUpdateSyncStandbysDefined() with a hardcoded sleep for example, and doing so has showed that a 2PC visibility test is indeed failing. On machines slow enough, this bug would cause spurious failures. In 17~, we have looked at the possibility of adding an injection point to have a reproducible test, but as the problematic window happens at early startup, we would need to invent a way to make an injection point optionally persistent across restarts when attached, something that would be fine for this case as it would involve the checkpointer. This issue is quite old, and can be reproduced on all the stable branches. Author: Melnikov Maksim <m.melnikov@postgrespro.ru> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/163fcbec-900b-4b07-beaa-d2ead8634bec@postgrespro.ru Backpatch-through: 13
2025-04-10Fix data loss in logical replication.Amit Kapila
Data loss can happen when the DDLs like ALTER PUBLICATION ... ADD TABLE ... or ALTER TYPE ... that don't take a strong lock on table happens concurrently to DMLs on the tables involved in the DDL. This happens because logical decoding doesn't distribute invalidations to concurrent transactions and those transactions use stale cache data to decode the changes. The problem becomes bigger because we keep using the stale cache even after those in-progress transactions are finished and skip the changes required to be sent to the client. This commit fixes the issue by distributing invalidation messages from catalog-modifying transactions to all concurrent in-progress transactions. This allows the necessary rebuild of the catalog cache when decoding new changes after concurrent DDL. We observed performance regression primarily during frequent execution of *publication DDL* statements that modify the published tables. The regression is minor or nearly nonexistent for DDLs that do not affect the published tables or occur infrequently, making this a worthwhile cost to resolve a longstanding data loss issue. An alternative approach considered was to take a strong lock on each affected table during publication modification. However, this would only address issues related to publication DDLs (but not the ALTER TYPE ...) and require locking every relation in the database for publications created as FOR ALL TABLES, which is impractical. The bug exists in all supported branches, but we are backpatching till 14. The fix for 13 requires somewhat bigger changes than this fix, so the fix for that branch is still under discussion. Reported-by: hubert depesz lubaczewski <depesz@depesz.com> Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Tested-by: Benoit Lobréau <benoit.lobreau@dalibo.com> Backpatch-through: 14 Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
2025-04-09Fix test races between syscache-update-pruned.spec and autovacuum.Noah Misch
This spec fails ~3% of my Valgrind runs, and the spec has failed on Valgrind buildfarm member skink at a similar rate. Two problems contributed to that: - A competing buffer pin triggered VACUUM's lazy_scan_noprune() path, causing "tuples missed: 1 dead from 1 pages not removed due to cleanup lock contention". FREEZE fixes that. - The spec ran lazy VACUUM immediately after VACUUM FULL. The spec implicitly assumed lazy VACUUM prunes the one tuple that VACUUM FULL made dead. First wait for old snapshots, making that assumption reliable. This also adds two forms of defense in depth: - Wait for snapshots using shared catalog pruning rules (VISHORIZON_SHARED). This avoids the removable cutoff moving backward when an XID-bearing autoanalyze process runs in another database. That may never happen in this test, but it's cheap insurance. - Use lazy VACUUM option DISABLE_PAGE_SKIPPING. Commit c2dc1a79767a0f947e1145f82eb65dfe4360d25f did this for a related requirement in other tests, but I suspect FREEZE is necessary and sufficient in all these tests. Back-patch to v17, where the test first appeared. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/sv3taq4e6ea4qckimien3nxp3sz4b6cw6sfcy4nhwl52zpur4g@h6i6tohxmizu Backpatch-through: 17
2025-04-08Stabilize 035_standby_logical_decoding.pl.Amit Kapila
Some tests try to invalidate logical slots on the standby server by running VACUUM on the primary. The problem is that xl_running_xacts was getting generated and replayed before the VACUUM command, leading to the advancement of the active slot's catalog_xmin. Due to this, active slots were not getting invalidated, leading to test failures. We fix it by skipping the generation of xl_running_xacts for the required tests with the help of injection points. As the required interface for injection points was not present in back branches, we fixed the failing tests in them by disallowing the slot to become active for the required cases (where rows_removed conflict could be generated). Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16, where it was introduced Discussion: https://postgr.es/m/Z6oQXc8LmiTLfwLA@ip-10-97-1-34.eu-west-3.compute.internal
2025-04-07Fix PG 17 [NOT] NULL optimization bug for domainsBruce Momjian
A PG 17 optimization allowed columns with NOT NULL constraints to skip table scans for IS NULL queries, and to skip IS NOT NULL checks for IS NOT NULL queries. This didn't work for domain types, since domain types don't follow the IS NULL/IS NOT NULL constraint logic. To fix, disable this optimization for domains for PG 17+. Reported-by: Jan Behrens Diagnosed-by: Tom Lane Discussion: https://postgr.es/m/Z37p0paENWWUarj-@momjian.us Backpatch-through: 17
2025-04-08Flush the IO statistics of active WAL senders more frequentlyMichael Paquier
WAL senders do not flush their statistics until they exit, limiting the monitoring possible for live processes. This is penalizing when WAL senders are running for a long time, like in streaming or logical replication setups, because it is not possible to know the amount of IO they generate while running. This commit makes WAL senders more aggressive with their statistics flush, using an internal of 1 second, with the flush timing calculated based on the existing GetCurrentTimestamp() done before the sleeps done to wait for some activity. Note that the sleep done for logical and physical WAL senders happens in two different code paths, so the stats flushes need to happen in these two places. One test is added for the physical WAL sender case, and one for the logical WAL sender case. This can be done in a stable fashion by relying on the WAL generated by the TAP tests in combination with a stats reset while a server is running, but only on HEAD as WAL data has been added to pg_stat_io in a051e71e28a1. This issue exists since a9c70b46dbe and the introduction of pg_stat_io, so backpatch down to v16. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/Z73IsKBceoVd4t55@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 16
2025-04-06Fix unintentional 'NULL' string literal in pg_upgrade.Jeff Davis
Introduced in 2a083ab807. Note: backport of commit 945126234b, which was missed at the time. Discussion: https://postgr.es/m/e852442da35b4f31acc600ed98bbee0f12e65e0c.camel@j-davis.com Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 16
2025-04-05Fix parse_cte.c's failure to examine sub-WITHs in DML statements.Tom Lane
makeDependencyGraphWalker thought that only SelectStmt nodes could contain a WithClause. Which was true in our original implementation of WITH, but astonishingly we missed updating this code when we added the ability to attach WITH to INSERT/UPDATE/DELETE (and later MERGE). Moreover, since it was coded to deliberately block recursion to a WithClause, even updating raw_expression_tree_walker didn't save it. The upshot of this was that we didn't see references to outer CTE names appearing within an inner WITH, and would neither complain about disallowed recursion nor account for such references when sorting CTEs into a usable order. The lack of complaints about this is perhaps not so surprising, because typical usage of WITH wouldn't hit either case. Still, it's pretty broken; failing to detect recursion here leads to assert failures or worse later on. Fix by factoring out the processing of sub-WITHs into a new function WalkInnerWith, and invoking that for all the statement types that can have WITH. Bug: #18878 Reported-by: Yu Liang <luy70@psu.edu> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18878-a26fa5ab6be2f2cf@postgresql.org Backpatch-through: 13
2025-04-05Avoid double transformation of json_array()'s subquery.Tom Lane
transformJsonArrayQueryConstructor() applied transformStmt() to the same subquery tree twice. While this causes no issue in many cases, there are some where it causes a coredump, thanks to the parser's habit of scribbling on its input. Fix by making a copy before the first transformation (compare 0f43083d1). This is quite brute-force, but then so is the whole business of transforming the input twice. Per discussion in the bug thread, this implementation of json_array() parsing should be replaced completely. But that will take some work and will surely not be back-patchable, so for the moment let's take the easy way out. Oversight in 7081ac46a. Back-patch to v16 where that came in. Bug: #18877 Reported-by: Yu Liang <luy70@psu.edu> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18877-c3c3ad75845833bb@postgresql.org Backpatch-through: 16
2025-04-04Repair misbehavior with duplicate entries in FK SET column lists.Tom Lane
Since v15 we've had an option to apply a foreign key constraint's ON DELETE SET DEFAULT or SET NULL action to just some of the referencing columns. There was not a check for duplicate entries in the list of columns-to-set, though. That caused a potential memory stomp in CreateConstraintEntry(), which incautiously assumed that the list of columns-to-set couldn't be longer than the number of key columns. Even after fixing that, the case doesn't work because you get an error like "multiple assignments to same column" from the SQL command that is generated to do the update. We could either raise an error for duplicate columns or silently suppress the dups, and after a bit of thought I chose to do the latter. This is motivated by the fact that duplicates in the FK column list are legal, so it's not real clear why duplicates in the columns-to-set list shouldn't be. Of course there's no need to actually set the column more than once. I left in the fix in CreateConstraintEntry() too, just because it didn't seem like such low-level code ought to be making assumptions about what it's handed. Bug: #18879 Reported-by: Yu Liang <luy70@psu.edu> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18879-259fc59d072bd4d7@postgresql.org Backpatch-through: 15
2025-04-04Relax assertion in finding correct GiST parentHeikki Linnakangas
Commit 28d3c2ddcf introduced an assertion that if the memorized downlink location in the insertion stack isn't valid, the parent's LSN should've changed too. Turns out that was too strict. In gistFindCorrectParent(), if we walk right, we update the parent's block number and clear its memorized 'downlinkoffnum'. That triggered the assertion on next call to gistFindCorrectParent(), if the parent needed to be split too. Relax the assertion, so that it's OK if downlinkOffnum is InvalidOffsetNumber. Backpatch to v13-, all supported versions. The assertion was added in commit 28d3c2ddcf in v12. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://www.postgresql.org/message-id/18396-03cac9beb2f7aac3@postgresql.org
2025-04-04Fix logical decoding test to correctly check slot removal on standby.Fujii Masao
The regression test for logical decoding verifies whether a logical slot is correctly dropped on a standby when its associated database is dropped. However, the test mistakenly retrieved slot information from the primary instead of the standby, causing incorrect behavior. This commit fixes the issue by ensuring the test correctly checks the slot on the standby. Back-patch to all supported versions. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/1fdfd020-a509-403c-bd8f-a04664aba148@oss.nttdata.com Backpatch-through: 13
2025-04-04Fix logical decoding regression tests to correctly check slot existence.Fujii Masao
The regression tests for logical decoding verify whether a logical slot exists or has been dropped. Previously, these tests attempted to retrieve "slot_name" from the result of slot(), but since "slot_name" was not included in the result, slot()->{'slot_name'} always returned undef, leading to incorrect behavior. This commit fixes the issue by checking the "plugin" field in the result of slot() instead, ensuring the tests properly verify slot existence. Back-patch to all supported versions. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB149667EC4E738769CA80B7EA5F5AE2@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 13
2025-04-03Restrict copying of invalidated replication slots.Masahiko Sawada
Previously, invalidated logical and physical replication slots could be copied using the pg_copy_logical_replication_slot and pg_copy_physical_replication_slot functions. Replication slots that were invalidated for reasons other than WAL removal retained their restart_lsn. This meant that a new slot copied from an invalidated slot could have a restart_lsn pointing to a WAL segment that might have already been removed. This commit restricts the copying of invalidated replication slots. Backpatch to v16, where slots could retain their restart_lsn when invalidated for reasons other than WAL removal. For v15 and earlier, this check is not required since slots can only be invalidated due to WAL removal, and existing checks already handle this issue. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com Backpatch-through: 16
2025-04-02Remove unnecessary type violation in tsvectorrecv().Tom Lane
compareentry() is declared to work on WordEntryIN structs, but tsvectorrecv() is using it in two places to work on WordEntry structs. This is almost okay, since WordEntry is the first field of WordEntryIN. But on machines with 8-byte pointers, WordEntryIN will have a larger alignment spec than WordEntry, and it's at least theoretically possible that the compiler could generate code that depends on the larger alignment. Given the lack of field reports, this may be just a hypothetical bug that upsets nothing except sanitizer tools. Or it may be real on certain hardware but nobody's tried to use tsvectorrecv() on such hardware. In any case we should fix it, and the fix is trivial: just change compareentry() so that it works on WordEntry without any mention of WordEntryIN. We can also get rid of the quite-useless intermediate function WordEntryCMP. Bug: #18875 Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18875-07a29c49c825a608@postgresql.org Backpatch-through: 13
2025-04-02Remove HeapBitmapScan's skip_fetch optimizationAndres Freund
The optimization does not take the removal of TIDs by a concurrent vacuum into account. The concurrent vacuum can remove dead TIDs and make pages ALL_VISIBLE while those dead TIDs are referenced in the bitmap. This can lead to a skip_fetch scan returning too many tuples. It likely would be possible to implement this optimization safely, but we don't have the necessary infrastructure in place. Nor is it clear that it's worth building that infrastructure, given how limited the skip_fetch optimization is. In the backbranches we just disable the optimization by always passing need_tuples=true to table_beginscan_bm(). We can't perform API/ABI changes in the backbranches and we want to make the change as minimal as possible. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Reported-By: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com Backpatch-through: 13
2025-04-02Need to do CommandCounterIncrement after StoreAttrMissingVal.Tom Lane
Without this, an additional change to the same pg_attribute row within the same command will fail. This is possible at least with ALTER TABLE ADD COLUMN on a multiple-inheritance-pathway structure. (Another potential hazard is that immediately-following operations might not see the missingval.) Introduced by 95f650674, which split the former coding that used a single pg_attribute update to change both atthasdef and atthasmissing/attmissingval into two updates, but missed that this should entail two CommandCounterIncrements as well. Like that fix, back-patch through v13. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/025a3ffa-5eff-4a88-97fb-8f583b015965@gmail.com Backpatch-through: 13
2025-04-02Fix code commentPeter Eisentraut
The changes made in commit d2b4b4c2259 contained incorrect comments: They said that certain forward declarations were necessary to "avoid including pathnodes.h here", but the file is itself pathnodes.h! So change the comment to just say it's a forward declaration in one case, and in the other case we don't need the declaration at all because it already appeared earlier in the file.