summaryrefslogtreecommitdiff
path: root/src/backend
AgeCommit message (Collapse)Author
2025-09-02Add max_retention_duration option to subscriptions.Amit Kapila
This commit introduces a new subscription parameter, max_retention_duration, aimed at mitigating excessive accumulation of dead tuples when retain_dead_tuples is enabled and the apply worker lags behind the publisher. When the time spent advancing a non-removable transaction ID exceeds the max_retention_duration threshold, the apply worker will stop retaining conflict detection information. In such cases, the conflict slot's xmin will be set to InvalidTransactionId, provided that all apply workers associated with the subscription (with retain_dead_tuples enabled) confirm the retention duration has been exceeded. To ensure retention status persists across server restarts, a new column subretentionactive has been added to the pg_subscription catalog. This prevents unnecessary reactivation of retention logic after a restart. The conflict detection slot will not be automatically re-initialized unless a new subscription is created with retain_dead_tuples = true, or the user manually re-enables retain_dead_tuples. A future patch will introduce support for automatic slot re-initialization once at least one apply worker confirms that the retention duration is within the configured max_retention_duration. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-08-31Fix const-simplification for constraints and statsRichard Guo
Constraint expressions and statistics expressions loaded from the system catalogs need to be run through const-simplification, because the planner will be comparing them to similarly-processed qual clauses. Without this step, the planner may fail to detect valid matches. Currently, NullTest clauses in these expressions may not be reduced correctly during const-simplification. This happens because their Var nodes do not yet have the correct varno when eval_const_expressions is applied. Since eval_const_expressions relies on varno to reduce NullTest quals, incorrect varno can cause problems. Additionally, for statistics expressions, eval_const_expressions is called with root set to NULL, which also inhibits NullTest reduction. This patch fixes the issue by ensuring that Vars are updated to have the correct varno before const-simplification, and that a valid root is passed to eval_const_expressions when needed. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org
2025-08-29Prepare DSM registry for upcoming changes to LWLock tranche names.Nathan Bossart
A proposed patch would place a limit of NAMEDATALEN-1 (i.e., 63) bytes on the names of dynamically-allocated LWLock tranches, but GetNamedDSA() and GetNamedDSHash() may register tranches with longer names. This commit lowers the maximum DSM registry entry name length to NAMEDATALEN-1 bytes and modifies GetNamedDSHash() to create only one tranche, thereby allowing us to keep the DSM registry's tranche names below NAMEDATALEN bytes. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/aKzIg1JryN1qhNuy%40nathan
2025-08-29Provide error context when an error is thrown within WaitOnLock().Tom Lane
Show the requested lock level and the object being waited on, in the same format we use for deadlock reports and similar errors. This is particularly helpful for debugging lock-timeout errors, since otherwise the user has very little to go on about which lock timed out. The performance cost of setting up the callback should be negligible compared to the other tracing support already present in WaitOnLock. As in the deadlock-report case, we just show numeric object OIDs, because it seems too scary to try to perform catalog lookups in this context. Reported-by: Steve Baldwin <steve.baldwin@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1602369.1752167154@sss.pgh.pa.us
2025-08-29Make LWLockCounter a global variable.Nathan Bossart
Using the LWLockCounter requires first calculating its address in shared memory like this: LWLockCounter = (int *) ((char *) MainLWLockArray - sizeof(int)); Commit 82e861fbe1 started this trend in order to fix EXEC_BACKEND builds, but it could also be fixed by adding it to the BackendParameters struct. The current approach is somewhat difficult to follow, so this commit switches to the latter. While at it, swap around the code in LWLockShmemSize() to match the order of assignments in CreateLWLocks() for added readability. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aLDLnan9gNCS9fHx%40nathan
2025-08-29Remove unused parameter from ProcessSlotSyncInterrupts().Nathan Bossart
Oversight in commit 93db6cbda0. Author: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/tencent_7B42BBE8D0A5C28DDAB91436192CBCCB8307%40qq.com
2025-08-30Fix possible use after free in expand_partitioned_rtentry()David Rowley
It's possible that if the only live partition is concurrently dropped and try_table_open() fails, that the bms_del_member() will pfree the live_parts Bitmapset. Since the bms_del_member() call does not assign the result back to the live_parts local variable, the while loop could segfault as that variable would still reference the pfree'd Bitmapset. Backpatch to 15. 52f3de874 was backpatched to 14, but there's no bms_del_member() there due to live_parts not yet existing in RelOptInfo in that version. Technically there's no bug in version 15 as bms_del_member() didn't pfree when the set became empty prior to 00b41463c (from v16). Applied to v15 anyway to keep the code similar and to avoid the bad coding pattern. Author: Bernd Reiß <bd_reiss@gmx.at> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/6b88f27a-c45c-4826-8e37-d61a04d90182@gmx.at Backpatch-through: 15
2025-08-29CREATE STATISTICS: improve misleading error messageÁlvaro Herrera
I think the error message for a different condition was inadvertently copied. This problem seems to have been introduced by commit a4d75c86bf15. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 14 Discussion: https://postgr.es/m/CACJufxEZ48toGH0Em_6vdsT57Y3L8pLF=DZCQ_gCii6=C3MeXw@mail.gmail.com
2025-08-29Mark ItemPointer arguments as const in tuple/table lock functionsPeter Eisentraut
The functions LockTuple, ConditionalLockTuple, UnlockTuple, and XactLockTableWait take an ItemPointer argument that they do not modify, so the argument can be const-qualified to better convey intent and allow the compiler to enforce immutability. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2m9e4rECHBwpRE4%2BGCH%2BpbYZXLh2f4rB1Du5hDfKug%2BOg%40mail.gmail.com
2025-08-29Remove unneeded casts of BufferGetPage() resultPeter Eisentraut
BufferGetPage() already returns type Page, so casting it to Page doesn't achieve anything. A sizable number of call sites does this casting; remove that. This was already done inconsistently in the code in the first import in 1996 (but didn't exist in the pre-1995 code), and it was then apparently just copied around. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CALdSSPgFhc5=vLqHdk-zCcnztC0zEY3EU_Q6a9vPEaw7FkE9Vw@mail.gmail.com
2025-08-29Fix semijoin unique-ification for child relationsRichard Guo
For a child relation, we should not assume that its parent's unique-ified relation (or unique-ified path in v18) always exists. In cases where all RHS columns that need to be unique-ified are equated to constants, the unique-ified relation/path for the parent table is not built, as there are no columns left to unique-ify. Failing to account for this can result in a SIGSEGV crash during planning. This patch checks whether the parent's unique-ified relation or path exists and skips unique-ification of the child relation if it does not. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs49MOdLW2c+qbLHHBt8VBu=4ONpM91D19=AWeW93eFUF6A@mail.gmail.com Backpatch-through: 18
2025-08-28Use LW_SHARED in walsummarizer.c for WALSummarizerLock lock where possible.Masahiko Sawada
Previously, we used LW_EXCLUSIVE in several places despite only reading WalSummarizerCtl fields. This patch reduces the lock level to LW_SHARED where we are only reading the shared fields. Backpatch to 17, where wal summarization was introduced. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CAD21AoDdKhf_9oriEYxY-JCdF+Oe_muhca3pcdkMEdBMzyHyKw@mail.gmail.com Backpatch-through: 17
2025-08-28Fix "variable not found in subplan target lists" in semijoin de-duplication.Tom Lane
One mechanism we have for implementing semi-joins is to de-duplicate the output of the RHS and then treat the join as a plain inner join. Initial construction of the join's SpecialJoinInfo identifies the RHS columns that need to be de-duplicated, but later we may find that some of those don't need to be handled explicitly, either because they're known to be constant or because they are redundant with some previous column. Up to now, while sort-based de-duplication handled such cases well, hash-based de-duplication didn't: we'd still hash on all of the originally-identified columns. This is probably not a very big deal performance-wise, but in the wake of commit a3179ab69 it can cause planner errors. That happens when join elimination causes recalculation of variables' attr_needed bitmapsets, and we decide that a variable mentioned in a semijoin clause doesn't need to be propagated up to the join level anymore. There are a number of ways we could slice the blame for this, but the only fix that doesn't result in pessimizing plans for loosely-related cases is to be more careful about not hashing columns we don't actually need to de-duplicate. We can install that consideration into create_unique_paths in master, or the predecessor code in create_unique_path in v18, without much refactoring. (As follow-up work, it might be a good idea to look at more-invasive refactoring, in hopes of preventing other bugs in this area. But with v18 release so close, there's not time for that now, nor would we be likely to want to put such refactoring into v18 anyway.) Reported-by: Sergey Soloviev <sergey.soloviev@tantorlabs.ru> Diagnosed-by: Richard Guo <guofenglinux@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/1fd1a421-4609-4d46-a1af-ab74d5de504a@tantorlabs.ru Backpatch-through: 18
2025-08-28Avoid including commands/dbcommands.h in so many placesÁlvaro Herrera
This has been done historically because of get_database_name (which since commit cb98e6fb8fd4 belongs in lsyscache.c/h, so let's move it there) and get_database_oid (which is in the right place, but whose declaration should appear in pg_database.h rather than dbcommands.h). Clean this up. Also, xlogreader.h and stringinfo.h are no longer needed by dbcommands.h since commit f1fd515b393a, so remove them. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/202508191031.5ipojyuaswzt@alvherre.pgsql
2025-08-28Message style improvementsPeter Eisentraut
An improvement pass over the new stats import functionality.
2025-08-27aio: Stop using enum bitfields due to bad code generationAndres Freund
During an investigation into rather odd aio related errors on macos, observed by Alexander and Konstantin, we started to wonder if bitfield access is related to the error. At the moment it looks like it is related, we cannot reproduce the failures when replacing the bitfields. In addition, the problem can only be reproduced with some compiler [versions] and not everyone has been able to reproduce the issue. The observed problem is that, very rarely, PgAioHandle->{state,target} are in an inconsistent state, after having been checked to be in a valid state not long before, triggering an assertion failure. Unfortunately, this could be caused by wrong compiler code generation or somehow of missing memory barriers - we don't really know. In theory there should not be any concurrent write access to the handle in the state the bug is triggered, as the handle was idle and is just being initialized. Separately from the bug, we observed that at least gcc and clang generate rather terrible code for the bitfield access. Even if it's not clear if the observed assertion failure is actually caused by the bitfield somehow, the bad code generation alone is sufficient reason to stop using bitfields. Therefore, replace the enum bitfields with uint8s and instead cast in each switch statement. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reported-by: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/1500090.1745443021@sss.pgh.pa.us Backpatch-through: 18
2025-08-27Improve objectNamesToOids() commentPeter Eisentraut
Commit d31bbfb6590 removed the comment at objectNamesToOids() that there is no locking, because that commit added locking. But to fix all the problems, we'd still need a stronger lock. So put the comment back with more a detailed explanation. Co-authored-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/bf72b82c-124d-4efa-a484-bb928e9494e4@eisentraut.org
2025-08-27Fix: Don't strip $libdir from nested module_pathnamesPeter Eisentraut
This patch fixes a bug in how 'load_external_function' handles '$libdir/ prefixes in module paths. Previously, 'load_external_function' would unconditionally strip '$libdir/' from the beginning of the 'filename' string. This caused an issue when the path was nested, such as "$libdir/nested/my_lib". Stripping the prefix resulted in a path of "nested/my_lib", which would fail to be found by the expand_dynamic_library_name function because the original '$libdir' macro was removed. To fix this, the code now checks for the presence of an additional directory separator ('/' or '\') after the '$libdir/' prefix. The prefix is only stripped if the remaining string does not contain a separator. This ensures that simple filenames like '"$libdir/my_lib"' are correctly handled, while nested paths are left intact for 'expand_dynamic_library_name' to process correctly. Reported-by: Dilip Kumar <dilipbalaut@gmail.com> Co-authored-by: Matheus Alcantara <matheusssilv97@gmail.com> Co-authored-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAFiTN-uKNzAro4tVwtJhF1UqcygfJ%2BR%2BRL%3Db-_ZMYE3LdHoGhA%40mail.gmail.com
2025-08-26Message style improvementsPeter Eisentraut
Mostly adding some quoting.
2025-08-26Put "excludeOnly" GIN scan keys at the end of the scankey array.Tom Lane
Commit 4b754d6c1 introduced the concept of an excludeOnly scan key, which cannot select matching index entries but can reject non-matching tuples, for example a tsquery such as '!term'. There are poorly-documented assumptions that such scan keys do not appear as the first scan key. ginNewScanKey did nothing to ensure that, however, with the result that certain GIN index searches could go into an infinite loop while apparently-equivalent queries with the clauses in a different order were fine. Fix by teaching ginNewScanKey to place all excludeOnly scan keys after all not-excludeOnly ones. So far as we know at present, it might be sufficient to avoid the case where the very first scan key is excludeOnly; but I'm not very convinced that there aren't other dependencies on the ordering. Bug: #19031 Reported-by: Tim Wood <washwithcare@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19031-0638148643d25548@postgresql.org Backpatch-through: 13
2025-08-26Do CHECK_FOR_INTERRUPTS inside, not before, scanGetItem.Tom Lane
The CHECK_FOR_INTERRUPTS call in gingetbitmap turns out to be inadequate to prevent a long uninterruptible loop, because we now know a case where looping occurs within scanGetItem. While the next patch will fix the bug that caused that, it seems foolish to assume that no similar patterns are possible. Let's do the CFI within scanGetItem's retry loop, instead. This demonstrably allows canceling out of the loop exhibited in bug #19031. Bug: #19031 Reported-by: Tim Wood <washwithcare@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19031-0638148643d25548@postgresql.org Backpatch-through: 13
2025-08-26Improve RowMark handling during Self-Join EliminationAlexander Korotkov
The Self-Join Elimination SJE feature messes up keeping and removing RowMark's in remove_self_joins_one_group(). That didn't lead to user-level error, because the planned RowMark is only used to reference a rtable entry in later execution stages. An RTE entry for keeping and removing relations is identical and refers to the same relation OID. To reduce confusion and prevent future issues, this commit cleans up the code and fixes the incorrect behaviour. Furthermore, it includes sanity checks in setrefs.c on existing non-null RTE and RelOptInfo entries for each RowMark. Discussion: https://postgr.es/m/18c6bd6c-6d2a-419a-b0da-dfedef34b585%40gmail.com Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Backpatch-through: 18
2025-08-26Refactor variable names in remove_self_joins_one_group()Alexander Korotkov
Rename inner and outer to rrel and krel, respectively, to highlight their connection to r and k indexes. For the same reason, rename imark and omark to rmark and kmark. Discussion: https://postgr.es/m/18c6bd6c-6d2a-419a-b0da-dfedef34b585%40gmail.com Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Backpatch-through: 18
2025-08-25Message wording improvementsPeter Eisentraut
Use "row" instead of "tuple" for user-facing information for logical replication conflicts.
2025-08-25Use PqMsg_* macros in applyparallelworker.c.Nathan Bossart
Oversight in commit f4b54e1ed9. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAobFsHaLMypA6C96-9YExvF4AcU1xNPoPuNYRVm3mq4dg%40mail.gmail.com
2025-08-25Formatting cleanup of guc_tables.cPeter Eisentraut
This cleans up a few minor formatting inconsistencies. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org
2025-08-22Revert "Get rid of WALBufMappingLock"Alexander Korotkov
This reverts commit bc22dc0e0ddc2dcb6043a732415019cc6b6bf683. It appears that conditional variables are not suitable for use inside critical sections. If WaitLatch()/WaitEventSetWaitBlock() face postmaster death, they exit, releasing all locks instead of PANIC. In certain situations, this leads to data corruption. Reported-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/B3C69B86-7F82-4111-B97F-0005497BB745%40yandex-team.ru Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Yura Sokolov <y.sokolov@postgrespro.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 18
2025-08-22Use ereport() rather than elog()Heikki Linnakangas
Noah pointed this out before I committed 50f770c3d9, but I accidentally pushed the old version with elog() anyway. Oops. Reported-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/20250820003756.31.nmisch@google.com
2025-08-22Revert GetTransactionSnapshot() to return historic snapshot during LRHeikki Linnakangas
Commit 1585ff7387 changed GetTransactionSnapshot() to throw an error if it's called during logical decoding, instead of returning the historic snapshot. I made that change for extra protection, because a historic snapshot can only be used to access catalog tables while GetTransactionSnapshot() is usually called when you're executing arbitrary queries. You might get very subtle visibility problems if you tried to use the historic snapshot for arbitrary queries. There's no built-in code in PostgreSQL that calls GetTransactionSnapshot() during logical decoding, but it turns out that the pglogical extension does just that, to evaluate row filter expressions. You would get weird results if the row filter runs arbitrary queries, but it is sane as long as you don't access any non-catalog tables. Even though there are no checks to enforce that in pglogical, a typical row filter expression does not access any tables and works fine. Accessing tables marked with the user_catalog_table = true option is also OK. To fix pglogical with row filters, and any other extensions that might do similar things, revert GetTransactionSnapshot() to return a historic snapshot during logical decoding. To try to still catch the unsafe usage of historic snapshots, add checks in heap_beginscan() and index_beginscan() to complain if you try to use a historic snapshot to scan a non-catalog table. We're very close to the version 18 release however, so add those new checks only in master. Backpatch-through: 18 Reported-by: Noah Misch <noah@leadboat.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/20250809222338.cc.nmisch@google.com
2025-08-22Reduce lock level for ALTER DOMAIN ... VALIDATE CONSTRAINTPeter Eisentraut
Reduce from ShareLock to ShareUpdateExclusivelock. Validation during ALTER DOMAIN ... ADD CONSTRAINT keeps using ShareLock. Example: create domain d1 as int; create table t (a d1); alter domain d1 add constraint cc10 check (value > 10) not valid; begin; alter domain d1 validate constraint cc10; -- another session insert into t values (8); Now we should still be able to perform DML operations on table t while the domain constraint is being validated. The equivalent works already on table constraints. Author: jian he <jian.universality@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxHz92A88NLRTA2msgE2dpXpE-EoZ2QO61od76-6bfqurA%40mail.gmail.com
2025-08-22Change dynahash.c and hsearch.h to use int64 instead of longMichael Paquier
This code was relying on "long", which is signed 8 bytes everywhere except on Windows where it is 4 bytes, that could potentially expose it to overflows, even if the current uses in the code are fine as far as I know. This code is now able to rely on the same sizeof() variable everywhere, with int64. long was used for sizes, partition counts and entry counts. Some callers of the dynahash.c routines used long declarations, that can be cleaned up to use int64 instead. There was one shortcut based on SIZEOF_LONG, that can be removed. long is entirely removed from dynahash.c and hsearch.h. Similar work was done in b1e5c9fa9ac4. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aKQYp-bKTRtRauZ6@paquier.xyz
2025-08-22Ignore temporary relations in RelidByRelfilenumber()Michael Paquier
Temporary relations may share the same RelFileNumber with a permanent relation, or other temporary relations associated with other sessions. Being able to uniquely identify a temporary relation would require RelidByRelfilenumber() to know about the proc number of the temporary relation it wants to identify, something it is not designed for since its introduction in f01d1ae3a104. There are currently three callers of RelidByRelfilenumber(): - autoprewarm. - Logical decoding, reorder buffer. - pg_filenode_relation(), that attempts to find a relation OID based on a tablespace OID and a RelFileNumber. This makes the situation problematic particularly for the first two cases, leading to the possibility of random ERRORs due to inconsistencies that temporary relations can create in the cache maintained by RelidByRelfilenumber(). The third case should be less of an issue, as I suspect that there are few direct callers of pg_filenode_relation(). The window where the ERRORs are happen is very narrow, requiring an OID wraparound to create a lookup conflict in RelidByRelfilenumber() with a temporary table reusing the same OID as another relation already cached. The problem is easier to reach in workloads with a high OID consumption rate, especially with a higher number of temporary relations created. We could get pg_filenode_relation() and RelidByRelfilenumber() to work with temporary relations if provided the means to identify them with an optional proc number given in input, but the years have also shown that we do not have a use case for it, yet. Note that this could not be backpatched if pg_filenode_relation() needs changes. It is simpler to ignore temporary relations. Reported-by: Shenhao Wang <wangsh.fnst@fujitsu.com> Author: Vignesh C <vignesh21@gmail.com> Reviewed-By: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-By: Takamichi Osumi <osumi.takamichi@fujitsu.com> Reviewed-By: Michael Paquier <michael@paquier.xyz> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Reported-By: Shenhao Wang <wangsh.fnst@fujitsu.com> Discussion: https://postgr.es/m/bbaaf9f9-ebb2-645f-54bb-34d6efc7ac42@fujitsu.com Backpatch-through: 13
2025-08-21Use consistent type for pgaio_io_get_id() resultPeter Eisentraut
The result of pgaio_io_get_id() was being assigned to a mix of int and uint32 variables. This fixes it to use int consistently, which seems the most correct. Also change the queue empty special value in method_worker.c to -1 from UINT32_MAX. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/70c784b3-f60b-4652-b8a6-75e5f051243e%40eisentraut.org
2025-08-21Disallow server start with sync_replication_slots = on and wal_level < logical.Fujii Masao
Replication slot synchronization (sync_replication_slots = on) requires wal_level to be logical. This commit prevents the server from starting if sync_replication_slots is enabled but wal_level is set to minimal or replica. Failing early during startup helps users catch invalid configurations immediately, which is important because changing wal_level requires a server restart. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAH0PTU_pc3oHi__XESF9ZigCyzai1Mo3LsOdFyQA4aUDkm01RA@mail.gmail.com
2025-08-20Fix re-execution of a failed SQLFunctionCache entry.Tom Lane
If we error out during execution of a SQL-language function, we will often leave behind non-null pointers in its SQLFunctionCache's cplan and eslist fields. This is problematic if the SQLFunctionCache is re-used, because those pointers will point at resources that were released during error cleanup. This problem escaped detection so far because ordinarily we won't re-use an FmgrInfo+SQLFunctionCache struct after a query error. However, in the rather improbable case that someone implements an opclass support function in SQL language, there will be long-lived FmgrInfos for it in the relcache, and then the problem is reachable after the function throws an error. To fix, add a flag to SQLFunctionCache that tracks whether execution escapes out of fmgr_sql, and clear out the relevant fields during init_sql_fcache if so. (This is going to need more thought if we ever try to share FMgrInfos across threads; but it's very far from being the only problem such a project will encounter, since many functions regard fn_extra as being query-local state.) This broke at commit 0313c5dc6; before that we did not try to re-use SQLFunctionCache state across calls. Hence, back-patch to v18. Bug: #19026 Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19026-90aed5e71d0c8af3@postgresql.org Backpatch-through: 18
2025-08-20Minor error message enhancementPeter Eisentraut
In refuseDupeIndexAttach(), change from errdetail("Another index is already attached for partition \"%s\"."...) to errdetail("Another index \"%s\" is already attached for partition \"%s\"."...) so we can easily understand which index is already attached for partition \"%s\". Author: Jian He <jian.universality@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CACJufxGBfykJ_1ztk9T%2BL_gLmkOSOF%2BmL9Mn4ZPydz-rh%3DLccQ%40mail.gmail.com
2025-08-20Fix assertion failure with replication slot release in single-user modeMichael Paquier
Some replication slot manipulations (logical decoding via SQL, advancing) were failing an assertion when releasing a slot in single-user mode, because active_pid was not set in a ReplicationSlot when its slot is acquired. ReplicationSlotAcquire() has some logic to be able to work with the single-user mode. This commit sets ReplicationSlot->active_pid to MyProcPid, to let the slot-related logic fall-through, considering the single process as the one holding the slot. Some TAP tests are added for various replication slot functions with the single-user mode, while on it, for slot creation, drop, advancing, copy and logical decoding with multiple slot types (temporary, physical vs logical). These tests are skipped on Windows, as direct calls of postgres --single would fail on permission failures. There is no platform-specific behavior that needs to be checked, so living with this restriction should be fine. The CI is OK with that, now let's see what the buildfarm tells. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Mutaamba Maasha <maasha@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB14966ED588A0328DAEBE8CB25F5FA2@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 13
2025-08-19Fix comment for MAX_SIMUL_LWLOCKS.Nathan Bossart
This comment mentions that pg_buffercache locks all buffer partitions simultaneously, but it hasn't done so since v10. Oversight in commit 6e654546fb. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/aKTuAHVEuYCUmmIy%40nathan
2025-08-19Fix self-deadlock during DROP SUBSCRIPTION.Amit Kapila
The DROP SUBSCRIPTION command performs several operations: it stops the subscription workers, removes subscription-related entries from system catalogs, and deletes the replication slot on the publisher server. Previously, this command acquired an AccessExclusiveLock on pg_subscription before initiating these steps. However, while holding this lock, the command attempts to connect to the publisher to remove the replication slot. In cases where the connection is made to a newly created database on the same server as subscriber, the cache-building process during connection tries to acquire an AccessShareLock on pg_subscription, resulting in a self-deadlock. To resolve this issue, we reduce the lock level on pg_subscription during DROP SUBSCRIPTION from AccessExclusiveLock to RowExclusiveLock. Earlier, the higher lock level was used to prevent the launcher from starting a new worker during the drop operation, as a restarted worker could become orphaned. Now, instead of relying on a strict lock, we acquire an AccessShareLock on the specific subscription being dropped and re-validate its existence after acquiring the lock. If the subscription is no longer valid, the worker exits gracefully. This approach avoids the deadlock while still ensuring that orphan workers are not created. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/18988-7312c868be2d467f@postgresql.org
2025-08-19Refactor ReadMultiXactCounts() into GetMultiXactInfo()Michael Paquier
This provides a single entry point to access some information about the state of MultiXacts, able to return some data about multixacts offsets and counts. Originally this function was only able to return some information about the number of multixacts and multixact members, extended here to provide some data about the oldest multixact ID in use and the oldest offset, if known. This change has been proposed in a patch that aims at providing more monitoring capabilities for multixacts, and it is useful on its own. GetMultiXactInfo() is added to multixact.h, becoming available for out-of-core code. Extracted from a larger patch by the same author. Author: Naga Appani <nagnrik@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com
2025-08-19Remove useless pointer update in StatsShmemInit()Michael Paquier
This pointer was not used after its last update. This variable assignment was most likely a vestige artifact of the earlier versions of the patch set that have led to 5891c7a8ed8f. This pointer update is useless, so let's remove it. It removes one call to pgstat_dsa_init_size(), making the code slightly easier to grasp. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aKLsu2sdpnyeuSSc@ip-10-97-1-34.eu-west-3.compute.internal
2025-08-19Simplify relation_has_unique_index_for()Richard Guo
Now that the only call to relation_has_unique_index_for() that supplied an exprlist and oprlist has been removed, the loop handling those lists is effectively dead code. This patch removes that loop and simplifies the function accordingly. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
2025-08-19Pathify RHS unique-ification for semijoin planningRichard Guo
There are two implementation techniques for semijoins: one uses the JOIN_SEMI jointype, where the executor emits at most one matching row per left-hand side (LHS) row; the other unique-ifies the right-hand side (RHS) and then performs a plain inner join. The latter technique currently has some drawbacks related to the unique-ification step. * Only the cheapest-total path of the RHS is considered during unique-ification. This may cause us to miss some optimization opportunities; for example, a path with a better sort order might be overlooked simply because it is not the cheapest in total cost. Such a path could help avoid a sort at a higher level, potentially resulting in a cheaper overall plan. * We currently rely on heuristics to choose between hash-based and sort-based unique-ification. A better approach would be to generate paths for both methods and allow add_path() to decide which one is preferable, consistent with how path selection is handled elsewhere in the planner. * In the sort-based implementation, we currently pay no attention to the pathkeys of the input subpath or the resulting output. This can result in redundant sort nodes being added to the final plan. This patch improves semijoin planning by creating a new RelOptInfo for the RHS rel to represent its unique-ified version. It then generates multiple paths that represent elimination of distinct rows from the RHS, considering both a hash-based implementation using the cheapest total path of the original RHS rel, and sort-based implementations that either exploit presorted input paths or explicitly sort the cheapest total path. All resulting paths compete in add_path(), and those deemed worthy of consideration are added to the new RelOptInfo. Finally, the unique-ified rel is joined with the other side of the semijoin using a plain inner join. As a side effect, most of the code related to the JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER jointypes -- used to indicate that the LHS or RHS path should be made unique -- has been removed. Besides, the T_Unique path now has the same meaning for both semijoins and upper DISTINCT clauses: it represents adjacent-duplicate removal on presorted input. This patch unifies their handling by sharing the same data structures and functions. This patch also removes the UNIQUE_PATH_NOOP related code along the way, as it is dead code -- if the RHS rel is provably unique, the semijoin should have already been simplified to a plain inner join by analyzejoins.c. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
2025-08-19Remove unneeded header declarations in multixact.cMichael Paquier
Two header declarations were related to SQL-callable functions, that should have been cleaned up in df9133fa6384. Some more includes can be removed on closer inspection, so let's clean up these as well, while on it. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/345438.1755524834@sss.pgh.pa.us
2025-08-19Remove HASH_DEBUG output from dynahash.cDavid Rowley
This existed in a semi broken stated from be0a66666 until 296cba276. Recent discussion has questioned the value of having this at all as it only outputs static information from various of the hash table's properties when the hash table is created. Author: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/OSCPR01MB1496650D03FA0293AB9C21416F534A@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-08-19Use elog(DEBUG4) for dynahash.c statistics outputDavid Rowley
Previously this was being output to stderr. This commit adjusts things to use elog(DEBUG4). Here we also adjust the format of the message to add the hash table name and also put the message on a single line. This should make grepping the logs for this information easier. Also get rid of the global hash table statistics. This seems very dated and didn't fit very well with trying to put all the statistics for a specific hash table on a single log line. The main aim here is to allow it so we can have at least one buildfarm member build with HASH_STATISTICS to help prevent future changes from breaking things in that area. ca3891251 recently fixed some issues here. In passing, switch to using uint64 data types rather than longs for the usage counters. The long type is 32 bits on some platforms we support. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAApHDvoccvJ9CG5zx+i-EyCzJbcL5K=CzqrnL_YN59qaL5hiaw@mail.gmail.com
2025-08-18Detect buffer underflow in get_th()Peter Eisentraut
Input with zero length can result in a buffer underflow when accessing *(num + (len - 1)), as (len - 1) would produce a negative index. Add an assertion for zero-length input to prevent it. This was found by ALT Linux Team. Reviewing the call sites shows that get_th() currently cannot be applied to an empty string: it is always called on a string containing a number we've just printed. Therefore, an assertion rather than a user-facing error message is sufficient. Co-authored-by: Alexander Kuznetsov <kuznetsovam@altlinux.org> Discussion: https://www.postgresql.org/message-id/flat/e22df993-cdb4-4d0a-b629-42211ebed582@altlinux.org
2025-08-18Move SQL-callable code related to multixacts into its own fileMichael Paquier
A patch is under discussion to add more SQL capabilities related to multixacts, and this move avoids bloating the file more than necessary. This affects pg_get_multixact_members(). A side effect of this move is the requirement to add mxstatus_to_string() to multixact.h. Extracted from a larger patch by the same author, tweaked by me. Author: Naga Appani <nagnrik@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com
2025-08-18Refactor init_params() in sequence.c to not use FormData_pg_sequence_dataMichael Paquier
init_params() sets up "last_value" and "is_called" for a sequence relation holdind its metadata, based on the sequence properties in pg_sequences. "log_cnt" is the third property that can be updated in this routine for FormData_pg_sequence_data, tracking when WAL records should be generated for a sequence after nextval() iterations. This routine is called when creating or altering a sequence. This commit refactors init_params() to not depend anymore on FormData_pg_sequence_data, removing traces of it in sequence.c, making easier the manipulation of metadata related to sequences. The knowledge about "log_cnt" is replaced with a more general "reset_state" flag, to let the caller know if the sequence state should be reset. In the case of in-core sequences, this relates to WAL logging. We still need to depend on FormData_pg_sequence. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/ZWlohtKAs0uVVpZ3@paquier.xyz
2025-08-16Fix typos in comments.Masahiko Sawada
Oversight in commit fd5a1a0c3e56. Author: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNmTT3M_w4NngG=6G3mdT3iJ6DdncTqV9YnGXBPHW8XYtA@mail.gmail.com