path: root/src/backend/commands
Age | Commit message | Author
6 days | Fix lookup code for REINDEX INDEX. | Nathan Bossart

This commit adjusts RangeVarCallbackForReindexIndex() to handle an extremely unlikely race condition involving concurrent OID reuse. In short, if REINDEX INDEX is executed at the same time that the index is re-created with the same name and OID but a different parent table OID, we might lock the wrong parent table. To fix, simply detect when this happens and emit an ERROR. Unfortunately, we can't gracefully handle this situation because we will have already locked the index, and we must lock the parent table before the index to avoid deadlocks.

While at it, I've replaced all but one early return in this callback function with ERRORs that should be unreachable. While I haven't verified the presence of a live bug, the checks in question appear to be unnecessary, and the early returns seem prone to breaking the parent table locking code in subtle ways. If nothing else, this simplifies the code a bit.

This is a bug fix and could be back-patched, but given the presumed rarity of the race condition and the lack of reports, I'm not going to bother.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z8zwVmGzXyDdkAXj%40nathan
6 days | Add log_autoanalyze_min_duration | Peter Eisentraut

The log output controlled by log_autovacuum_min_duration applies to both VACUUM and ANALYZE, so it has not been possible to set separate log output thresholds for VACUUM and ANALYZE. In practice, one may want log output only for VACUUM and not for ANALYZE, which a single threshold cannot express. Therefore, separate the threshold for logging VACUUM run by autovacuum (log_autovacuum_min_duration) from the threshold for logging ANALYZE run by autovacuum (log_autoanalyze_min_duration).

Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Kasahara Tatsuhito <kasaharatt@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/flat/CAOzEurQtfV4MxJiWT-XDnimEeZAY+rgzVSLe8YsyEKhZcajzSA@mail.gmail.com
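A sketch of the intended usage, assuming the new GUC accepts a duration value just like its existing VACUUM counterpart:

    -- log autovacuum-run VACUUMs taking 10s or more, but only log
    -- autovacuum-run ANALYZEs taking 60s or more
    ALTER SYSTEM SET log_autovacuum_min_duration = '10s';
    ALTER SYSTEM SET log_autoanalyze_min_duration = '60s';
    SELECT pg_reload_conf();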
6 days | Standardize use of REFRESH PUBLICATION in code and messages. | Amit Kapila

This patch replaces ALTER SUBSCRIPTION REFRESH with ALTER SUBSCRIPTION REFRESH PUBLICATION in comments and error messages to improve clarity and support future extensibility. The change aligns with the upcoming addition of REFRESH SEQUENCES for sequence synchronization.

Author: vignesh C <vignesh21@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
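For illustration, the standardized spelling of the command (hypothetical subscription name):

    ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;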
10 days | Stop creating constraints during DETACH CONCURRENTLY | Álvaro Herrera

Commit 71f4c8c6f74b (which implemented DETACH CONCURRENTLY) added code to create a separate table constraint when a table is detached concurrently, identical to the partition constraint, on the theory that such a constraint was needed in case the optimizer had constructed any query plans that depended on the constraint being there. However, that theory was apparently bogus because any such plans would be invalidated.

For hash partitioning, those constraints are problematic, because their expressions reference the OID of the parent partitioned table, to which the detached table is no longer related; this causes all sorts of problems (such as the inability to restore a pg_dump of that table, and the table no longer working properly if the partitioned table is later dropped).

We'd like to get rid of all those constraints. In fact, for branch master, do that: no longer create any substitute constraints. However, out of fear that some users might somehow depend on these constraints for other partitioning strategies, for stable branches (back to 14, which added DETACH CONCURRENTLY), only do it for hash partitioning.

(If you repeatedly DETACH CONCURRENTLY and then ATTACH a partition, then with this constraint addition you don't need to scan the table in the ATTACH step, which presumably is good. But if users really valued this feature, they would have requested that it work for non-concurrent DETACH also.)

Author: Haiyang Li <mohen.lhy@alibaba-inc.com>
Reported-by: Fei Changhong <feichanghong@qq.com>
Reported-by: Haiyang Li <mohen.lhy@alibaba-inc.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/18371-7fef49f63de13f02@postgresql.org
Discussion: https://postgr.es/m/19070-781326347ade7c57@postgresql.org
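A sketch of the scenario the fix addresses (hypothetical names):

    CREATE TABLE ht (a int) PARTITION BY HASH (a);
    CREATE TABLE ht_p0 PARTITION OF ht FOR VALUES WITH (MODULUS 2, REMAINDER 0);
    CREATE TABLE ht_p1 PARTITION OF ht FOR VALUES WITH (MODULUS 2, REMAINDER 1);
    ALTER TABLE ht DETACH PARTITION ht_p0 CONCURRENTLY;
    -- previously ht_p0 kept a substitute CHECK constraint whose expression
    -- (satisfies_hash_partition) referenced ht's OID, breaking pg_dump
    -- restores of ht_p0; with this fix no such constraint is created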
10 days | dbase_redo: Fix Valgrind-reported memory leak | Álvaro Herrera

Introduced by my (Álvaro's) commit 9e4f914b5eba, which was itself backpatched to pg10, though only pg15 and up contain the problem because of commit 9c08aea6a309.

This isn't a particularly significant leak, but given the fix is trivial, we might as well backpatch to all branches where it applies, so do that.

Author: Nathan Bossart <nathandbossart@gmail.com>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/x4odfdlrwvsjawscnqsqjpofvauxslw7b4oyvxgt5owoyf4ysn@heafjusodrz7
12 days | Cleanup VACUUM option processing error messages | David Rowley

The processing of the PARALLEL option for VACUUM was not quite following what the DefElem code had intended. defGetInt32() already has code to handle missing parameters and returns a perfectly good error message for when that happens. Here we get rid of the ExecVacuum() error:

    ERROR: parallel option requires a value between 0 and N

and let defGetInt32() handle it, which will give:

    ERROR: parallel requires an integer value

defGetInt32() was already handling the non-integer parameter case, so it may as well handle the missing parameter case too.

Additionally, parameterize the option name to make translator work easier, and also use errhint_internal() rather than errhint() for the BUFFER_USAGE_LIMIT option since there isn't any work for a translator to do for "%s".

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAApHDvovH14tNWB+WvP6TSbfi7-=TysQ9h5tQ5AgavwyWRWKHA@mail.gmail.com
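A sketch of the behavior change (hypothetical table name):

    VACUUM (PARALLEL) t;
    -- before: ERROR:  parallel option requires a value between 0 and N
    -- after:  ERROR:  parallel requires an integer value
    VACUUM (PARALLEL 2) t;  -- explicit values work as before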
12 days | Add "ALL SEQUENCES" support to publications. | Amit Kapila

This patch adds support for the ALL SEQUENCES clause in publications, enabling synchronization/replication of all sequences, which is useful for upgrades. Publications can now include all sequences via FOR ALL SEQUENCES.

psql enhancements:
- \d shows publications for a given sequence.
- \dRp indicates if a publication includes all sequences.

ALL SEQUENCES can be combined with ALL TABLES, but not with other options like TABLE or TABLES IN SCHEMA. We can extend support for more granular clauses in future.

The view pg_publication_sequences provides information about the mapping between publications and sequences.

This patch enables publishing of sequences; subscriber-side support will be added in upcoming patches.

Author: vignesh C <vignesh21@gmail.com>
Author: Tomas Vondra <tomas@vondra.me>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
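A minimal sketch of the new syntax (the comma-separated combination with ALL TABLES is an assumption about the exact grammar):

    CREATE PUBLICATION pub_seq FOR ALL SEQUENCES;
    CREATE PUBLICATION pub_everything FOR ALL TABLES, ALL SEQUENCES;
    SELECT * FROM pg_publication_sequences;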
13 days | Add ExplainState argument to pg_plan_query() and planner(). | Robert Haas

This allows extensions to have access to any data they've stored in the ExplainState during planning. Unfortunately, it won't help when EXPLAIN EXECUTE is used, but since that case is less common, this still seems like an improvement.

Since planner() has quite a few arguments now, also add some documentation of those arguments and the return value.

Author: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
14 days | Assign each subquery a unique name prior to planning it. | Robert Haas

Previously, subqueries were given names only after they were planned, which makes it difficult to use information from a previous execution of the query to guide future planning. If, for example, you know something about how you want "InitPlan 2" to be planned, you won't know whether the subquery you're currently planning will end up being "InitPlan 2" until after you've finished planning it, by which point it's too late to use the information that you had.

To fix this, assign each subplan a unique name before we begin planning it. To improve consistency, use textual names for all subplans, rather than, as we did previously, a mix of numbers (such as "InitPlan 1") and names (such as "CTE foo"), and make sure that the same name is never assigned more than once. We adopt the somewhat arbitrary convention of using the type of sublink to set the plan name; for example, a query that previously had two expression sublinks shown as InitPlan 2 and InitPlan 1 will now end up named expr_1 and expr_2.

Because names are assigned before rather than after planning, some of the regression test outputs show the numerical part of the name switching positions: what was previously SubPlan 2 was actually the first one encountered, but we finished planning it later.

We assign names even to subqueries that aren't shown as such within the EXPLAIN output. These include subqueries that are a FROM clause item or a branch of a set operation, rather than something that will be turned into an InitPlan or SubPlan. The purpose of this is to make sure that, below the topmost query level, there's always a name for each subquery that is stable from one planning cycle to the next (assuming no changes to the query or the database schema).

Author: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: http://postgr.es/m/3641043.1758751399@sss.pgh.pa.us
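For illustration, a query with two expression sublinks (the naming convention is per this commit; the exact EXPLAIN rendering may differ):

    EXPLAIN (COSTS OFF)
    SELECT * FROM t
    WHERE a = (SELECT max(a) FROM t)
      AND b = (SELECT max(b) FROM t);
    -- the two initplans are now named expr_1 and expr_2 in the order they
    -- are encountered, rather than InitPlan 1 / InitPlan 2 assigned after
    -- planning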
2025-10-06 | Expose sequence page LSN via pg_get_sequence_data. | Amit Kapila

This patch enhances the pg_get_sequence_data function to include the page-level LSN (Log Sequence Number) of the sequence. This additional metadata will be used by upcoming patches to support synchronization of sequences during logical replication.

By exposing the LSN, we enable more accurate tracking of sequence changes, which is essential for maintaining consistency across replicated nodes.

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
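A sketch of querying the extended function; the name of the new output column is an assumption:

    CREATE SEQUENCE s;
    SELECT nextval('s');
    SELECT * FROM pg_get_sequence_data('s');
    -- returns last_value and is_called as before, plus the sequence's
    -- page-level LSN (column name assumed here) for change tracking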
2025-10-05 | Don't include access/htup_details.h in executor/tuptable.h | Álvaro Herrera

This is not at all needed; I suspect it was a simple mistake in commit 5408e233f066. It causes htup_details.h to bleed into a huge number of places via execnodes.h. Remove it and fix fallout.

Discussion: https://postgr.es/m/202510021240.ptc2zl5cvwen@alvherre.pgsql
2025-09-26 | Remove unused for_all_tables field from AlterPublicationStmt. | Masahiko Sawada

No backpatch, as the AlterPublicationStmt struct is exposed.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CAD21AoC6B6AuxWOST-TkxUbDgp8FwX=BLEJZmKLG_VrH-hfxpA@mail.gmail.com
2025-09-25 | Fix array allocation bugs in SetExplainExtensionState. | Robert Haas

If we already have an extension_state array but see a new extension_id much larger than the highest extension_id we've previously seen, the old code might have failed to expand the array to a large enough size, leading to disaster. Also, if we don't have an extension array at all and need to create one, we should make sure that it's big enough that we don't have to resize it instantly.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/2949591.1758570711@sss.pgh.pa.us
Backpatch-through: 18
2025-09-24 | Remove PointerIsValid() | Peter Eisentraut

This doesn't provide any value over the standard style of checking the pointer directly or comparing against NULL.

Also remove related:
- AllocPointerIsValid() [unused]
- IndexScanIsValid() [had one user]
- HeapScanIsValid() [unused]
- InvalidRelation [unused]

Leaving HeapTupleIsValid(), ItemIdIsValid(), PortalIsValid(), RelationIsValid() for now, to reduce code churn.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/ad50ab6b-6f74-4603-b099-1cd6382fb13d%40eisentraut.org
Discussion: https://www.postgresql.org/message-id/CA+hUKG+NFKnr=K4oybwDvT35dW=VAjAAfiuLxp+5JeZSOV3nBg@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/bccf2803-5252-47c2-9ff0-340502d5bd1c@iki.fi
2025-09-23 | Keep track of what RTIs a Result node is scanning. | Robert Haas

Result nodes now include an RTI set, which is only non-NULL when they have no subplan, and is taken from the relid set of the RelOptInfo that the Result is generating. ExplainPreScanNode now takes notice of these RTIs, which means that a few things get schema-qualified in the regression tests that previously did not. This makes the output more consistent between cases where some part of the plan tree is replaced by a Result node and those where this does not happen. Likewise, pg_overexplain's EXPLAIN (RANGE_TABLE) now displays the RTIs stored in a Result node just as it already does for other RTI-bearing node types.

Result nodes also now include a result_reason, which tells us something about why the Result node was inserted. Using that information, EXPLAIN now emits, where relevant, a "Replaces" line describing the origin of a Result node.

The purpose of these changes is to allow code that inspects a Plan tree to understand the origin of Result nodes that appear therein.

Discussion: http://postgr.es/m/CA+TgmoYeUZePZWLsSO+1FAN7UPePT_RMEZBKkqYBJVCF1s60=w@mail.gmail.com
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
2025-09-22 | Fix various incorrect filename references | David Rowley

Author: Chao Li <li.evan.chao@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2=hOBCPm-Z=F15twr_23XjHeoXSbifP5GdEdtWona97wQ@mail.gmail.com
2025-09-20 | Track the maximum possible frequency of non-MCE array elements. | Tom Lane

The lossy-counting algorithm that ANALYZE uses to identify most-common array elements has a notion of cutoff frequency: elements with frequency greater than that are guaranteed to be collected, elements with smaller frequencies are not. In cases where we find fewer MCEs than the stats target would permit us to store, the cutoff frequency provides valuable additional information, to wit that there are no non-MCEs with frequency greater than that.

What the selectivity estimation functions actually use the "minfreq" entry for is as a ceiling on the possible frequency of non-MCEs, so using the cutoff rather than the lowest stored MCE frequency provides a tighter bound and more accurate estimates. Therefore, instead of redundantly storing the minimum observed MCE frequency, store the cutoff frequency when there are fewer tracked values than we want. (When there are more, then of course we cannot assert that no non-stored elements are above the cutoff frequency, since we're throwing away some that are; so we still use the minimum stored frequency in that case.)

Notably, this works even when none of the values are common enough to be called MCEs. In such cases we previously stored nothing in the STATISTIC_KIND_MCELEM pg_statistic slot, which resulted in the selectivity functions falling back to default estimates. So in that case we want to construct a STATISTIC_KIND_MCELEM entry that contains no "values" but does have "numbers", to wit the three extra numbers that the MCELEM entry type defines.

A small obstacle is that update_attstats() has traditionally stored a null, not an empty array, when passed zero "values" for a slot. That gives rise to an MCELEM entry that get_attstatsslot() will spit up on. The least risky solution seems to be to adjust update_attstats() so that it will emit a non-null (but possibly empty) array when the passed stavalues array pointer isn't NULL, rather than conditioning that on numvalues > 0. In other existing cases I don't believe that that changes anything. For consistency, handle the stanumbers array the same way.

In passing, improve the comments in routines that use STATISTIC_KIND_MCELEM data. Particularly, explain why we use minfreq / 2 not minfreq as the estimate for non-MCE values.

Thanks to Matt Long for the suggestion that we could apply this idea even when there are more than zero MCEs.

Reported-by: Mark Frost <FROSTMAR@uk.ibm.com>
Reported-by: Matt Long <matt@mattlong.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/PH3PPF1C905D6E6F24A5C1A1A1D8345B593E16FA@PH3PPF1C905D6E6.namprd15.prod.outlook.com
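The statistics involved are visible through pg_stats; a sketch (hypothetical table):

    CREATE TABLE docs (tags text[]);
    INSERT INTO docs SELECT ARRAY['tag' || (i % 5)] FROM generate_series(1, 1000) i;
    ANALYZE docs;
    SELECT most_common_elems, most_common_elem_freqs
    FROM pg_stats
    WHERE tablename = 'docs' AND attname = 'tags';
    -- most_common_elem_freqs ends with extra values beyond the per-element
    -- frequencies: the minimum and maximum frequency (and optionally the
    -- null-element frequency); with this change the stored minimum can be
    -- the tighter lossy-counting cutoff rather than the lowest observed MCE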
2025-09-16 | Revert "Avoid race condition between "GRANT role" and "DROP ROLE"". | Tom Lane

This reverts commit 98fc31d6499163a0a781aa6f13582a07f09cd7c6.

That change allowed DROP OWNED BY to drop grants of the target role to other roles, arguing that nobody would need those privileges anymore. But that's not so: if you're not superuser, you still need admin privilege on the target role so you can drop it. It's not clear whether or how the dependency-based approach to solving the original problem can be adapted to keep these grants. Since v18 release is fast approaching, the sanest thing to do seems to be to revert this patch for now. The race-condition problem is low severity and not worth taking risks for.

I didn't force a catversion bump in 98fc31d64, so I won't do so here either.

Reported-by: Dipesh Dhameliya <dipeshdhameliya125@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CABgZEgczOFicCJoqtrH9gbYMe_BV3Hq8zzCBRcMgmU6LRsihUA@mail.gmail.com
Backpatch-through: 18
2025-09-15 | Hide duplicate names from extension views | Peter Eisentraut

If extensions of equal names were installed in different directories in the path, the views pg_available_extensions and pg_available_extension_versions would show all of them, even though only the first one was actually reachable by CREATE EXTENSION. To fix, have those views skip extensions found later in the path if they have names already found earlier.

Also add a bit of documentation that only the first extension in the path can be used.

Reported-by: Pierrick <pierrick.chovelon@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/8f5a0517-1cb8-4085-ae89-77e7454e27ba%40dalibo.com
2025-09-12 | Fix oversights in pg_event_trigger_dropped_objects() fixes. | Tom Lane

Commit a0b99fc12 caused pg_event_trigger_dropped_objects() to not fill the object_name field for schemas, which it should have; and caused it to fill the object_name field for default values, which it should not have. In addition, triggers and RLS policies really should behave the same way as we're making column defaults do; that is, they should have is_temporary = true if they belong to a temporary table.

Fix those things, and upgrade event_trigger.sql's woefully inadequate test coverage of these secondary output columns.

As before, back-patch only to v15.

Reported-by: Sergey Shinderuk <s.shinderuk@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/bd7b4651-1c26-4d30-832b-f942fabcb145@postgrespro.ru
Backpatch-through: 15
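A sketch of observing these columns from an event trigger (hypothetical names; standard pg_event_trigger_dropped_objects() API):

    CREATE FUNCTION report_drops() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    DECLARE
      obj record;
    BEGIN
      FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects() LOOP
        RAISE NOTICE '% % (is_temporary: %)',
          obj.object_type, obj.object_identity, obj.is_temporary;
      END LOOP;
    END $$;

    CREATE EVENT TRIGGER trg_report_drops ON sql_drop
      EXECUTE FUNCTION report_drops();

    CREATE TEMP TABLE tt (a int DEFAULT 0);
    DROP TABLE tt;  -- dependent objects should now report is_temporary = true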
2025-09-11 | Report the correct is_temporary flag for column defaults. | Tom Lane

pg_event_trigger_dropped_objects() would report a column default object with is_temporary = false, even if it belongs to a temporary table. This seems clearly wrong, so adjust it to report the table's temp-ness.

While here, refactor EventTriggerSQLDropAddObject to make its handling of namespace objects less messy and avoid duplication of the schema-lookup code. And add some explicit test coverage of dropped-object reports for dependencies of temp tables.

Back-patch to v15. The bug exists further back, but the GetAttrDefaultColumnAddress function this patch depends on does not, and it doesn't seem worth adjusting it to cope with the older code.

Author: Antoine Violin <violin.antuan@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFjUV9x3-hv0gihf+CtUc-1it0hh7Skp9iYFhMS7FJjtAeAptA@mail.gmail.com
Backpatch-through: 15
2025-09-08 | pg_upgrade: Transfer pg_largeobject_metadata's files when possible. | Nathan Bossart

Commit 161a3e8b68 taught pg_upgrade to use COPY for large object metadata for upgrades from v12 and newer, which is much faster to restore than the proper large object commands. For upgrades from v16 and newer, we can take this a step further and transfer the large object metadata files as if they were user tables. We can't transfer the files from older versions because the aclitem data type (needed by pg_largeobject_metadata.lomacl) changed its storage format in v16 (see commit 7b378237aa). Note that this commit is essentially a revert of commit 12a53c732c.

There are a couple of caveats. First, we still need to COPY the corresponding pg_shdepend rows for large objects. Second, we need to COPY anything in pg_largeobject_metadata with a comment or security label, else restoring those will fail. This means that an upgrade in which every large object has a comment or security label won't gain anything from this commit, but it should at least avoid making those unusual use-cases any worse.

pg_upgrade must also take care to transfer the relfilenodes of pg_largeobject_metadata and its index, as was done for pg_largeobject in commits d498e052b4 and bbe08b8869.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aJ3_Gih_XW1_O2HF%40nathan
2025-09-08 | Post-commit review fixes for 228c370868. | Amit Kapila

This commit fixes three issues:

1) When a disabled subscription is created with retain_dead_tuples set to true, the launcher is not woken up immediately, which may lead to delays in creating the conflict detection slot. Creating the conflict detection slot is essential even when the subscription is not enabled. This ensures that dead tuples are retained, which is necessary for accurately identifying the type of conflict during replication.

2) Conflict-related data was unnecessarily retained when the subscription does not have a table.

3) Conflict-relevant data could be prematurely removed before applying prepared transactions on the publisher that are in the commit critical section. This issue occurred because the backend executing COMMIT PREPARED was not accounted for during the computation of oldestXid in the commit phase on the publisher. As a result, the subscriber could advance the conflict slot's xmin without waiting for such COMMIT PREPARED transactions to complete. We fixed this issue by identifying prepared transactions that are in the commit critical section during computation of oldestXid in commit phase.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS9PR01MB16913DACB64E5721872AA5C02943BA@OS9PR01MB16913.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/OS9PR01MB16913F67856B0DA2A909788129400A@OS9PR01MB16913.jpnprd01.prod.outlook.com
2025-09-04 | Fix replica identity check for INSERT ON CONFLICT DO UPDATE. | Dean Rasheed

If an INSERT has an ON CONFLICT DO UPDATE clause, the executor must check that the target relation supports UPDATE as well as INSERT. In particular, it must check that the target relation has a REPLICA IDENTITY if it publishes updates. Formerly, it was not doing this check, which could lead to silently breaking replication.

Fix by adding such a check to CheckValidResultRel(), which requires adding a new onConflictAction argument. In back-branches, preserve ABI compatibility by introducing a wrapper function with the original signature.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/OS3PR01MB57180C87E43A679A730482DF94B62@OS3PR01MB5718.jpnprd01.prod.outlook.com
Backpatch-through: 13
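A sketch of the case the new check catches (hypothetical names):

    CREATE TABLE t (id int PRIMARY KEY, val text);
    ALTER TABLE t REPLICA IDENTITY NOTHING;
    CREATE PUBLICATION pub FOR TABLE t;
    INSERT INTO t VALUES (1, 'x')
      ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;
    -- now rejected because t publishes updates without a replica identity;
    -- previously the UPDATE path could silently break replication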
2025-09-02 | Add max_retention_duration option to subscriptions. | Amit Kapila

This commit introduces a new subscription parameter, max_retention_duration, aimed at mitigating excessive accumulation of dead tuples when retain_dead_tuples is enabled and the apply worker lags behind the publisher.

When the time spent advancing a non-removable transaction ID exceeds the max_retention_duration threshold, the apply worker will stop retaining conflict detection information. In such cases, the conflict slot's xmin will be set to InvalidTransactionId, provided that all apply workers associated with the subscription (with retain_dead_tuples enabled) confirm the retention duration has been exceeded.

To ensure retention status persists across server restarts, a new column subretentionactive has been added to the pg_subscription catalog. This prevents unnecessary reactivation of retention logic after a restart.

The conflict detection slot will not be automatically re-initialized unless a new subscription is created with retain_dead_tuples = true, or the user manually re-enables retain_dead_tuples. A future patch will introduce support for automatic slot re-initialization once at least one apply worker confirms that the retention duration is within the configured max_retention_duration.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
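A sketch of the new option; the value format shown here is an assumption (the committed patch defines the accepted units):

    CREATE SUBSCRIPTION sub
      CONNECTION 'host=pub.example.com dbname=src'
      PUBLICATION pub
      WITH (retain_dead_tuples = true, max_retention_duration = 60000);
    -- if the apply worker lags past the threshold, conflict retention stops
    -- and the conflict slot's xmin is set to InvalidTransactionId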
2025-08-29 | Remove unneeded casts of BufferGetPage() result | Peter Eisentraut

BufferGetPage() already returns type Page, so casting it to Page doesn't achieve anything. A sizable number of call sites do this casting; remove those casts.

This was already done inconsistently in the code in the first import in 1996 (but didn't exist in the pre-1995 code), and it was then apparently just copied around.

Author: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CALdSSPgFhc5=vLqHdk-zCcnztC0zEY3EU_Q6a9vPEaw7FkE9Vw@mail.gmail.com
2025-08-28 | Avoid including commands/dbcommands.h in so many places | Álvaro Herrera

This has been done historically because of get_database_name (which since commit cb98e6fb8fd4 belongs in lsyscache.c/h, so let's move it there) and get_database_oid (which is in the right place, but whose declaration should appear in pg_database.h rather than dbcommands.h). Clean this up.

Also, xlogreader.h and stringinfo.h are no longer needed by dbcommands.h since commit f1fd515b393a, so remove them.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/202508191031.5ipojyuaswzt@alvherre.pgsql
2025-08-22 | Reduce lock level for ALTER DOMAIN ... VALIDATE CONSTRAINT | Peter Eisentraut

Reduce from ShareLock to ShareUpdateExclusiveLock. Validation during ALTER DOMAIN ... ADD CONSTRAINT keeps using ShareLock.

Example:

    create domain d1 as int;
    create table t (a d1);
    alter domain d1 add constraint cc10 check (value > 10) not valid;
    begin;
    alter domain d1 validate constraint cc10;

    -- another session
    insert into t values (8);

Now we should still be able to perform DML operations on table t while the domain constraint is being validated. The equivalent works already on table constraints.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxHz92A88NLRTA2msgE2dpXpE-EoZ2QO61od76-6bfqurA%40mail.gmail.com
2025-08-20 | Minor error message enhancement | Peter Eisentraut

In refuseDupeIndexAttach(), change the errdetail from

    Another index is already attached for partition "%s".

to

    Another index "%s" is already attached for partition "%s".

so that the message identifies which index is already attached to the partition.

Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxGBfykJ_1ztk9T%2BL_gLmkOSOF%2BmL9Mn4ZPydz-rh%3DLccQ%40mail.gmail.com
2025-08-19 | Fix self-deadlock during DROP SUBSCRIPTION. | Amit Kapila

The DROP SUBSCRIPTION command performs several operations: it stops the subscription workers, removes subscription-related entries from system catalogs, and deletes the replication slot on the publisher server. Previously, this command acquired an AccessExclusiveLock on pg_subscription before initiating these steps. However, while holding this lock, the command attempts to connect to the publisher to remove the replication slot. In cases where the connection is made to a newly created database on the same server as the subscriber, the cache-building process during connection tries to acquire an AccessShareLock on pg_subscription, resulting in a self-deadlock.

To resolve this issue, we reduce the lock level on pg_subscription during DROP SUBSCRIPTION from AccessExclusiveLock to RowExclusiveLock. Earlier, the higher lock level was used to prevent the launcher from starting a new worker during the drop operation, as a restarted worker could become orphaned. Now, instead of relying on a strict lock, we acquire an AccessShareLock on the specific subscription being dropped and re-validate its existence after acquiring the lock. If the subscription is no longer valid, the worker exits gracefully. This approach avoids the deadlock while still ensuring that orphan workers are not created.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/18988-7312c868be2d467f@postgresql.org
2025-08-18 | Refactor init_params() in sequence.c to not use FormData_pg_sequence_data | Michael Paquier

init_params() sets up "last_value" and "is_called" for a sequence relation holding its metadata, based on the sequence properties in pg_sequences. "log_cnt" is the third property that can be updated in this routine for FormData_pg_sequence_data, tracking when WAL records should be generated for a sequence after nextval() iterations. This routine is called when creating or altering a sequence.

This commit refactors init_params() to not depend anymore on FormData_pg_sequence_data, removing traces of it in sequence.c and making it easier to manipulate metadata related to sequences. The knowledge about "log_cnt" is replaced with a more general "reset_state" flag, to let the caller know if the sequence state should be reset. In the case of in-core sequences, this relates to WAL logging. We still need to depend on FormData_pg_sequence.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/ZWlohtKAs0uVVpZ3@paquier.xyz
2025-08-08 | Add missing Datum conversions | Peter Eisentraut

Add various missing conversions from and to Datum. The previous code mostly relied on implicit conversions or its own explicit casts instead of using the correct DatumGet*() or *GetDatum() functions.

We think these omissions are harmless. Some actual bugs that were discovered during this process have been committed separately (80c758a2e1d, fd2ab03fea2).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
2025-08-08 | Remove useless/superfluous Datum conversions | Peter Eisentraut

Remove useless DatumGetFoo() and FooGetDatum() calls. These are places where no conversion from or to Datum was actually happening.

We think these extra calls covered here were harmless. Some actual bugs that were discovered during this process have been committed separately (80c758a2e1d, 2242b26ce47).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
2025-08-08 | Fix oversight in FindTriggerIncompatibleWithInheritance. | Etsuro Fujita

This function is called from ATExecAttachPartition/ATExecAddInherit, which prevent tables with row-level triggers with transition tables from becoming partitions or inheritance children, to check if there is such a trigger on the given table. However, it failed to check whether a found trigger is row-level, causing the caller functions to needlessly prevent a table with only a statement-level trigger with transition tables from becoming a partition or inheritance child. Repair.

Oversight in commit 501ed02cf.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/CAPmGK167mXzwzzmJ_0YZ3EZrbwiCxtM1vogH_8drqsE6PtxRYw%40mail.gmail.com
Backpatch-through: 13
2025-08-08 | Disallow collecting transition tuples from child foreign tables. | Etsuro Fujita

Commit 9e6104c66 disallowed transition tables on foreign tables, but failed to account for cases where a foreign table is a child table of a partitioned/inherited table on which transition tables exist, leading to incorrect transition tuples collected from such foreign tables for queries on the parent table triggering transition capture. This occurred not only for inherited UPDATE/DELETE but for partitioned INSERT later supported by commit 3d956d956, which should have handled it at least for the INSERT case, but didn't.

To fix, modify ExecAR*Triggers to throw an error if the given relation is a foreign table requesting transition capture. Also, this commit fixes make_modifytable so that in case of an inherited UPDATE/DELETE triggering transition capture, FDWs choose normal operations to modify child foreign tables, not DirectModify; which is needed because they would otherwise skip the calls to ExecAR*Triggers at execution, causing unexpected behavior.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CAPmGK14QJYikKzBDCe3jMbpGENnQ7popFmbEgm-XTNuk55oyHg%40mail.gmail.com
Backpatch-through: 13
2025-08-05 | Don't copy datlocale from template unless provider matches. | Jeff Davis

During CREATE DATABASE, if changing the locale provider, require that a new locale is specified rather than trying to reinterpret the template's locale using the new provider.

This only affects the behavior when the template uses the builtin provider and CREATE DATABASE specifies the ICU provider without specifying the locale. Previously, that may have succeeded due to loose validation by ICU, whereas now that will cause an error. Because it can cause an error, backport only to unreleased versions.

Discussion: https://postgr.es/m/5038b33a6dc639009f4b3d43fa6ae0c5ba9e04f7.camel@j-davis.com
Backpatch-through: 18
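A sketch of the affected case (assuming the template database uses the builtin provider):

    CREATE DATABASE newdb TEMPLATE template1 LOCALE_PROVIDER icu;
    -- previously this might have succeeded by reinterpreting the template's
    -- builtin locale under ICU; now it errors unless a locale is given:
    CREATE DATABASE newdb TEMPLATE template1
      LOCALE_PROVIDER icu ICU_LOCALE 'en';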
2025-08-05 | Throw ERROR when publish_generated_columns is specified without a value. | Amit Kapila

Previously, specifying the publication option 'publish_generated_columns' without an explicit value would incorrectly default to 'stored', which is not the intended behavior. This patch fixes the issue by raising an ERROR when no value is provided for 'publish_generated_columns', ensuring that users must explicitly specify a valid option.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Backpatch-through: 18, where it was introduced
Discussion: https://postgr.es/m/CAHut+PsCUCWiEKmB10DxhoPfXbF6jw5RD9ib2LuaQeA_XraW7w@mail.gmail.com
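A sketch of the change (hypothetical publication name):

    CREATE PUBLICATION pub FOR ALL TABLES WITH (publish_generated_columns);
    -- before: silently behaved as publish_generated_columns = 'stored'
    -- after:  ERROR, an explicit value is required
    CREATE PUBLICATION pub FOR ALL TABLES
      WITH (publish_generated_columns = 'stored');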
2025-08-04 | Fix incorrect comment regarding mod_since_analyze | David Rowley

Author: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20250804140120.280c2d6a9d2ea687cd167743@sraoss.co.jp
2025-07-31 | Rename CachedPlanType to PlannedStmtOrigin for PlannedStmt | Michael Paquier

Commit 719dcf3c42 introduced a field called CachedPlanType in PlannedStmt to allow extensions to determine whether a cached plan is generic or custom. After discussion, the concepts that we want to track are a bit wider than initially anticipated, as it is closer to knowing from which "source" or "origin" a PlannedStmt has been generated or retrieved. Custom and generic cached plans are a subset of that.

Based on the state of HEAD, we have been able to define two more origins:
- "standard", for the case where PlannedStmt is generated in standard_planner(), the most common case.
- "internal", for the fake PlannedStmt generated internally by some query patterns.

This could be tuned in the future depending on what is needed. This looks like a good starting point, at least. The default value is called "UNKNOWN", provided as a fallback value. This value is not used in the core code; the idea is to let extensions building their own PlannedStmts know about this new field.

Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aILaHupXbIGgF2wJ@paquier.xyz
2025-07-29 | Display Memoize planner estimates in EXPLAIN | David Rowley

There've been a few complaints that it can be overly difficult to figure out why the planner picked a Memoize plan. To help address that, here we adjust the EXPLAIN output to display the following additional details:

1) The estimated number of cache entries that can be stored at once
2) The estimated number of unique lookup keys that we expect to see
3) The number of lookups we expect
4) The estimated hit ratio

Technically #4 can be calculated using #1, #2 and #3, but it's not a particularly obvious calculation, so we opt to display it explicitly. The original patch by Lukas Fittl only displayed the hit ratio, but there was a fear that might lead to more questions about how that was calculated. The idea with displaying all 4 is to be transparent which may allow queries to be tuned more easily. For example, if #2 isn't correct then maybe extended statistics or a manual n_distinct estimate can be used to help fix poor plan choices.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Author: Lukas Fittl <lukas@fittl.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAP53Pky29GWAVVk3oBgKBDqhND0BRBN6yTPeguV_qSivFL5N_g%40mail.gmail.com
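A sketch of reading the new output (hypothetical tables; the exact label wording is approximate):

    EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.a = t2.a;
    -- a Memoize node now carries an extra line of the general form:
    --   Estimates: capacity=N distinct keys=N lookups=N hit percent=NN.NN%
    -- mapping to items 1-4 above; e.g. if "distinct keys" looks wrong,
    -- extended statistics or a manual n_distinct setting may help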
2025-07-24 | Introduce field tracking cached plan type in PlannedStmt | Michael Paquier

PlannedStmt gains a new field, called CachedPlanType, able to track if a given plan tree originates from the cache and if we are dealing with a generic or custom cached plan.

This field can be used for monitoring or statistical purposes, in the executor hooks, for example, based on the planned statement attached to a QueryDesc. A patch is under discussion for pg_stat_statements to provide an equivalent of the counters in pg_prepared_statements for custom and generic plans, to provide a more global view of such data, as this data is now restricted to the current session.

The concept introduced in this commit is useful on its own, and has been extracted from a larger patch by the same author.

Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0uFw8Y9GCFvafhC=OA8NnMqVZyzXPfv_EePOt+iv1T-qQ@mail.gmail.com
2025-07-23 | Preserve conflict-relevant data during logical replication. | Amit Kapila

Logical replication requires reliable conflict detection to maintain data consistency across nodes. To achieve this, we must prevent premature removal of tuples deleted by other origins and their associated commit_ts data by VACUUM, which could otherwise lead to incorrect conflict reporting and resolution.

This patch introduces a mechanism to retain deleted tuples on the subscriber during the application of concurrent transactions from remote nodes. Retaining these tuples allows us to correctly ignore concurrent updates to the same tuple. Without this, an UPDATE might be misinterpreted as an INSERT during resolutions due to the absence of the original tuple. Additionally, we ensure that origin metadata is not prematurely removed by vacuum freeze, which is essential for detecting update_origin_differs and delete_origin_differs conflicts.

To support this, a new replication slot named pg_conflict_detection is created and maintained by the launcher on the subscriber. Each apply worker tracks its own non-removable transaction ID, which the launcher aggregates to determine the appropriate xmin for the slot, thereby retaining necessary tuples.

Conflict information retention (deleted tuples and commit_ts) can be enabled per subscription via the retain_conflict_info option. This is disabled by default to avoid unnecessary overhead for configurations that do not require conflict resolution or logging.

During upgrades, if any subscription on the old cluster has retain_conflict_info enabled, a conflict detection slot will be created to protect relevant tuples from deletion when the new cluster starts.

This is foundational work to correctly detect the update_deleted conflict, which will be done in a follow-up patch.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-07-18 | Fix concurrent update trigger issues with MERGE in a CTE. | Dean Rasheed

If a MERGE inside a CTE attempts an UPDATE or DELETE on a table with BEFORE ROW triggers, and a concurrent UPDATE or DELETE happens, the merge code would fail (crashing in the case of an UPDATE action, and potentially executing the wrong action for a DELETE action). This is the same issue that 9321c79c86 attempted to fix, except now for a MERGE inside a CTE.

As noted in 9321c79c86, what needs to happen is for the trigger code to exit early, returning the TM_Result and TM_FailureData information to the merge code, if a concurrent modification is detected, rather than attempting to do an EPQ recheck. The merge code will then do its own rechecking, and rescan the action list, potentially executing a different action in light of the concurrent update. In particular, the trigger code must never call ExecGetUpdateNewTuple() for MERGE, since that is bound to fail because MERGE has its own per-action projection information.

Commit 9321c79c86 did this using estate->es_plannedstmt->commandType in the trigger code to detect that a MERGE was being executed, which is fine for a plain MERGE command, but does not work for a MERGE inside a CTE. Fix by passing that information to the trigger code as an additional parameter passed to ExecBRUpdateTriggers() and ExecBRDeleteTriggers().

Back-patch as far as v17 only, since MERGE cannot appear inside a CTE prior to that. Additionally, take care to preserve the trigger ABI in v17 (though not in v18, which is still in beta).

Bug: #18986
Reported-by: Yaroslav Syrytsia <me@ys.lc>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18986-e7a8aac3d339fa47@postgresql.org
Backpatch-through: 17
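A sketch of the now-fixed query shape (hypothetical tables; MERGE with RETURNING inside WITH requires v17+):

    WITH m AS (
      MERGE INTO target t
      USING source s ON t.id = s.id
      WHEN MATCHED THEN UPDATE SET val = s.val
      WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val)
      RETURNING merge_action(), t.*
    )
    SELECT * FROM m;
    -- with BEFORE ROW triggers on target and a concurrent UPDATE/DELETE of
    -- the same row, this could previously crash or run the wrong action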
2025-07-11 | Rename CHECKPOINT_IMMEDIATE to CHECKPOINT_FAST. | Nathan Bossart

The new name more accurately reflects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "fast" instead of "immediate", too. Likewise, references to "immediate" checkpoints in the documentation have been updated to say "fast".

This is preparatory work for a follow-up commit that will add a MODE option to the CHECKPOINT command.

Author: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de
2025-07-11 | Rename CHECKPOINT_FLUSH_ALL to CHECKPOINT_FLUSH_UNLOGGED. | Nathan Bossart

The new name more accurately reflects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "flush-unlogged" instead of "flush-all", too.

This is preparatory work for a follow-up commit that will add a FLUSH_UNLOGGED option to the CHECKPOINT command.

Author: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de
2025-07-07 | Standardize LSN formatting by zero padding | Álvaro Herrera

This commit standardizes the output format for LSNs to ensure consistent representation across various tools and messages. Previously, LSNs were inconsistently printed as `%X/%X` in some contexts, while others used zero-padding. This often led to confusion when comparing LSN values.

To address this, the LSN format is now uniformly set to `%X/%08X`, ensuring the lower 32-bit part is always zero-padded to eight hexadecimal digits.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/ME0P300MB0445CA53CA0E4B8C1879AF84B641A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
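For illustration, the same LSN rendered under the old and new formats:

    old (%X/%X):    1/9A2B3C4
    new (%X/%08X):  1/09A2B3C4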
2025-07-03 | Obtain required table lock during cross-table updates, redux. | Tom Lane

Commits 8319e5cb5 et al missed the fact that ATPostAlterTypeCleanup contains three calls to ATPostAlterTypeParse, and the other two also need protection against passing a relid that we don't yet have lock on. Add similar logic to those code paths, and add some test cases demonstrating the need for it.

In v18 and master, the test cases demonstrate that there's a behavioral discrepancy between stored generated columns and virtual generated columns: we disallow changing the expression of a stored column if it's used in any composite-type columns, but not that of a virtual column. Since the expression isn't actually relevant to either sort of composite-type usage, this prohibition seems unnecessary; but changing it is a matter for separate discussion. For now we are just documenting the existing behavior.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: CACJufxGKJtGNRRSXfwMW9SqVOPEMdP17BJ7DsBf=tNsv9pWU9g@mail.gmail.com
Backpatch-through: 13
2025-07-03 | Prevent creation of duplicate not-null constraints for domains | Álvaro Herrera

This was previously harmless, but now that we create pg_constraint rows for those, duplicates are not welcome anymore.

Backpatch to 18.

Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CACJufxFSC0mcQ82bSk58sO-WJY4P-o4N6RD2M0D=DD_u_6EzdQ@mail.gmail.com
2025-07-03 | Refactor subtype field of AlterDomainStmt | Michael Paquier

AlterDomainStmt.subtype used characters for its subtypes of commands, SET|DROP DEFAULT|NOT NULL and ADD|DROP|VALIDATE CONSTRAINT, which were hardcoded in a couple of places in the code. The code is improved by using an enum instead, with the same character values as the original code. Note that the field was documented in parsenodes.h, but the documentation forgot to mention 'V' (VALIDATE CONSTRAINT).

Author: Quan Zongliang <quanzongliang@yeah.net>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/41ff310b-16bd-44b9-a3ef-97e20f14b709@yeah.net
2025-07-03 | Support multi-line headers in COPY FROM command. | Fujii Masao

The COPY FROM command now accepts a non-negative integer for the HEADER option, allowing multiple header lines to be skipped. This is useful when the input contains multi-line headers that should be ignored during data import.

Author: Shinya Kato <shinya11.kato@gmail.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/CAOzEurRPxfzbxqeOPF_AGnAUOYf=Wk0we+1LQomPNUNtyZGBZw@mail.gmail.com
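A sketch of the new capability (hypothetical file path):

    -- skip a two-line header at the start of the input file
    COPY t FROM '/tmp/data.csv' WITH (FORMAT csv, HEADER 2);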