summaryrefslogtreecommitdiff
path: root/src/backend
AgeCommit message (Collapse)Author
2024-09-17Don't enter parallel mode when holding interrupts.Noah Misch
Doing so caused the leader to hang in wait_event=ParallelFinish, which required an immediate shutdown to resolve. Back-patch to v12 (all supported versions). Francesco Degrassi Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
2024-09-15Replace usages of xmlXPathCompile() with xmlXPathCtxtCompile().Tom Lane
In existing releases of libxml2, xmlXPathCompile can be driven to stack overflow because it fails to protect itself against too-deeply-nested input. While there is an upstream fix as of yesterday, it will take years for that to propagate into all shipping versions. In the meantime, we can protect our own usages basically for free by calling xmlXPathCtxtCompile instead. (The actual bug is that libxml2 keeps its nesting counter in the xmlXPathContext, and its parsing code was willing to just skip counting nesting levels if it didn't have a context. So if we supply a context, all is well. It seems odd actually that it works at all to not supply a context, because this means that XPath parsing does not have access to XML namespace info. Apparently libxml2 never checks namespaces until runtime? Anyway, this seems like good future-proofing even if its only immediate effect is to dodge a bug.) Sadly, this hack only offers protection with libxml2 2.9.11 and newer. Before that there are multiple similar problems, so if you are processing untrusted XML it behooves you to get a newer version. But we have some pretty old libxml2 in the buildfarm, so it seems impractical to add a regression test to verify this fix. Per bug #18617 from Jingzhou Fu. Back-patch to all supported versions. Discussion: https://postgr.es/m/18617-1cee4d2ed1f4e7ae@postgresql.org Discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/799
2024-09-13Allow _h_indexbuild() to be interrupted.Tom Lane
When we are building a hash index that is large enough to need pre-sorting (larger than either maintenance_work_mem or NBuffers), the initial sorting phase is interruptible, but the insertion phase wasn't. Add the missing CHECK_FOR_INTERRUPTS(). Per bug #18616 from Alexander Lakhin. Back-patch to all supported branches. Pavel Borisov Discussion: https://postgr.es/m/18616-acbb9e5caf41e964@postgresql.org
2024-09-11Remove incorrect Assert.Tom Lane
check_agglevels_and_constraints() asserted that if we find an aggregate function in an EXPR_KIND_FROM_SUBSELECT expression, the expression must be in a LATERAL subquery. Alexander Lakhin found a case where that's not so: because of the odd scoping rules for NEW/OLD within a rule, a reference to NEW/OLD could cause an aggregate to be considered top-level even though it's in an unmarked sub-select. The error message that would be thrown seems sufficiently on-point, so just remove the Assert. (Hence, this is not a bug for production builds.) This Assert was added by me in commit eaccfded9 (9.3 era). It looks like I put it in to cross-check that the new logic for detecting misplaced aggregates (using agglevelsup) caught the same cases that a previous check on p_lateral_active did. So there might have been some related misbehavior before eaccfded9 ... but that's very ancient history by now, so I didn't dig any deeper. Per bug #18608 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18608-48de0717508ee429@postgresql.org
2024-08-30Clarify restrict_nonsystem_relation_kind description.Masahiko Sawada
This change improves the description of the restrict_nonsystem_relation_kind parameter in guc_table.c and the documentation for better clarity. Backpatch to 12, where this GUC parameter was introduced. Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/6a96f1af-22b4-4a80-8161-1f26606b9ee2%40eisentraut.org Backpatch-through: 12
2024-08-29Disallow USING clause when altering type of generated columnPeter Eisentraut
This does not make sense. It would write the output of the USING clause into the converted column, which would violate the generation expression. This adds a check to error out if this is specified. There was a test for this, but that test errored out for a different reason, so it was not effective. Reported-by: Jian He <jian.universality@gmail.com> Reviewed-by: Yugo NAGATA <nagata@sraoss.co.jp> Discussion: https://www.postgresql.org/message-id/flat/c7083982-69f4-4b14-8315-f9ddb20b9834%40eisentraut.org
2024-08-19Explain dropdb can't use syscache because of TOASTTomas Vondra
Add a comment explaining dropdb() can't rely on syscache. The issue with flattened rows was fixed by commit 0f92b230f88b, but better to have a clear explanation why the systable scan is necessary. The other places doing in-place updates on pg_database have the same comment. Suggestion and patch by Yugo Nagata. Backpatch to 12, same as the fix. Author: Yugo Nagata Backpatch-through: 12 Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com
2024-08-19Fix regression in TLS session ticket disablingDaniel Gustafsson
Commit 274bbced disabled session tickets for TLSv1.3 on top of the already disabled TLSv1.2 session tickets, but accidentally caused a regression where TLSv1.2 session tickets were incorrectly sent. Fix by unconditionally disabling TLSv1.2 session tickets and only disable TLSv1.3 tickets when the right version of OpenSSL is used. Backpatch to all supported branches. Reported-by: Cameron Vogt <cvogt@automaticcontrols.net> Reported-by: Fire Emerald <fire.github@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/DM6PR16MB3145CF62857226F350C710D1AB852@DM6PR16MB3145.namprd16.prod.outlook.com Backpatch-through: v12
2024-08-19Fix DROP DATABASE for databases with many ACLsTomas Vondra
Commit c66a7d75e652 modified DROP DATABASE so that if interrupted, the database is known to be in an invalid state and can only be dropped. This is done by setting a flag using an in-place update, so that it's not lost in case of rollback. For databases with many ACLs, this may however fail like this: ERROR: wrong tuple length This happens because with many ACLs, the pg_database.datacl attribute gets TOASTed. The dropdb() code reads the tuple from the syscache, which means it's detoasted. But the in-place update expects the tuple length to match the on-disk tuple. Fixed by reading the tuple from the catalog directly, not from syscache. Report and fix by Ayush Tiwari. Backpatch to 12. The DROP DATABASE fix was backpatched to 11, but 11 is EOL at this point. Reported-by: Ayush Tiwari Author: Ayush Tiwari Reviewed-by: Tomas Vondra Backpatch-through: 12 Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com
2024-08-12Fix creation of partition descriptor during concurrent detach+dropAlvaro Herrera
If a partition undergoes DETACH CONCURRENTLY immediately followed by DROP, this could cause a problem for a concurrent transaction recomputing the partition descriptor when running a prepared statement, because it tries to dereference a pointer to a tuple that's not found in a catalog scan. The existing retry logic added in commit dbca3469ebf8 is sufficient to cope with the overall problem, provided we don't try to dereference a non-existant heap tuple. Arguably, the code in RelationBuildPartitionDesc() has been wrong all along, since no check was added in commit 898e5e3290a7 against receiving a NULL tuple from the catalog scan; that bug has only become user-visible with DETACH CONCURRENTLY which was added in branch 14. Therefore, even though there's no known mechanism to cause a crash because of this, backpatch the addition of such a check to all supported branches. In branches prior to 14, this would cause the code to fail with a "missing relpartbound for relation XYZ" error instead of crashing; that's okay, because there are no reports of such behavior anyway. Author: Kuntal Ghosh <kuntalghosh.2007@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/18559-b48286d2eacd9a4e@postgresql.org
2024-08-11Suppress Coverity warnings about Asserts in get_name_for_var_field.Tom Lane
Coverity thinks dpns->plan could be null at these points. That shouldn't really be possible, but it's easy enough to modify the Asserts so they'd not core-dump if it were true. These are new in b919a97a6. Back-patch to v13; the v12 version of the patch didn't have these Asserts.
2024-08-10Allow adjusting session_authorization and role in parallel workers.Tom Lane
The code intends to allow GUCs to be set within parallel workers via function SET clauses, but not otherwise. However, doing so fails for "session_authorization" and "role", because the assign hooks for those attempt to set the subsidiary "is_superuser" GUC, and that call falls foul of the "not otherwise" prohibition. We can't switch to using GUC_ACTION_SAVE for this, so instead add a new GUC variable flag GUC_ALLOW_IN_PARALLEL to mark is_superuser as being safe to set anyway. (This is okay because is_superuser has context PGC_INTERNAL and thus only hard-wired calls can change it. We'd need more thought before applying the flag to other GUCs; but maybe there are other use-cases.) This isn't the prettiest fix perhaps, but other alternatives we thought of would be much more invasive. While here, correct a thinko in commit 059de3ca4: when rejecting a GUC setting within a parallel worker, we should return 0 not -1 if the ereport doesn't longjmp. (This seems to have no consequences right now because no caller cares, but it's inconsistent.) Improve the comments to try to forestall future confusion of the same kind. Despite the lack of field complaints, this seems worth back-patching. Thanks to Nathan Bossart for the idea to invent a new flag, and for review. Discussion: https://postgr.es/m/2833457.1723229039@sss.pgh.pa.us
2024-08-09Fix "failed to find plan for subquery/CTE" errors in EXPLAIN.Tom Lane
To deparse a reference to a field of a RECORD-type output of a subquery, EXPLAIN normally digs down into the subquery's plan to try to discover exactly which anonymous RECORD type is meant. However, this can fail if the subquery has been optimized out of the plan altogether on the grounds that no rows could pass the WHERE quals, which has been possible at least since 3fc6e2d7f. There isn't anything remaining in the plan tree that would help us, so fall back to printing the field name as "fN" for the N'th column of the record. (This will actually be the right thing some of the time, since it matches the column names we assign to RowExprs.) In passing, fix a comment typo in create_projection_plan, which I noticed while experimenting with an alternative fix for this. Per bug #18576 from Vasya B. Back-patch to all supported branches. Richard Guo and Tom Lane Discussion: https://postgr.es/m/18576-9feac34e132fea9e@postgresql.org
2024-08-08Refuse ATTACH of a table referenced by a foreign keyAlvaro Herrera
Trying to attach a table as a partition which is already on the referenced side of a foreign key on the partitioned table that it is being attached to, leads to strange behavior: we try to clone the foreign key from the parent to the partition, but this new FK points to the partition itself, and the mix of pg_constraint rows and triggers doesn't behave well. Rather than trying to untangle the mess (which might be possible given sufficient time), I opted to forbid the ATTACH. This doesn't seem a problematic restriction, given that we already fail to create the foreign key if you do it the other way around, that is, having the partition first and the FK second. Backpatch to all supported branches. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
2024-08-05Restrict accesses to non-system views and foreign tables during pg_dump.Masahiko Sawada
When pg_dump retrieves the list of database objects and performs the data dump, there was possibility that objects are replaced with others of the same name, such as views, and access them. This vulnerability could result in code execution with superuser privileges during the pg_dump process. This issue can arise when dumping data of sequences, foreign tables (only 13 or later), or tables registered with a WHERE clause in the extension configuration table. To address this, pg_dump now utilizes the newly introduced restrict_nonsystem_relation_kind GUC parameter to restrict the accesses to non-system views and foreign tables during the dump process. This new GUC parameter is added to back branches too, but these changes do not require cluster recreation. Back-patch to all supported branches. Reviewed-by: Noah Misch Security: CVE-2024-7348 Backpatch-through: 12
2024-08-05Translation updatesPeter Eisentraut
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 1b16f532c5e3688b4439a2769cef003b17946667
2024-07-31Revert "Allow parallel workers to cope with a newly-created session user ID."Tom Lane
This reverts commit 216201027d90e99a0a2b2d2efba85dc0aac94c62. Some buildfarm animals are failing with "cannot change "client_encoding" during a parallel operation". It looks like assign_client_encoding is unhappy at being asked to roll back a client_encoding setting after a parallel worker encounters a failure. There must be more to it though: why didn't I see this during local testing? In any case, it's clear that moving the RestoreGUCState() call is not as side-effect-free as I thought. Given that the bug f5f30c22e intended to fix has gone unreported for years, it's not something that's urgent to fix; I'm not willing to risk messing with it further with only days to our next release wrap.
2024-07-31Allow parallel workers to cope with a newly-created session user ID.Tom Lane
Parallel workers failed after a sequence like BEGIN; CREATE USER foo; SET SESSION AUTHORIZATION foo; because check_session_authorization could not see the uncommitted pg_authid row for "foo". This is because we ran RestoreGUCState() in a separate transaction using an ordinary just-created snapshot. The same disease afflicts any other GUC that requires catalog lookups and isn't forgiving about the lookups failing. To fix, postpone RestoreGUCState() into the worker's main transaction after we've set up a snapshot duplicating the leader's. This affects check_transaction_isolation and check_transaction_deferrable, which think they should only run during transaction start. Make them act like check_transaction_read_only, which already knows it should silently accept the value when InitializingParallelWorker. Per bug #18545 from Andrey Rachitskiy. Back-patch to all supported branches, because this has been wrong for awhile. Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-07-28libpq: Use strerror_r instead of strerrorPeter Eisentraut
Commit 453c4687377 introduced a use of strerror() into libpq, but that is not thread-safe. Fix by using strerror_r() instead. In passing, update some of the code comments added by 453c4687377, as we have learned more about the reason for the change in OpenSSL that started this. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: Discussion: https://postgr.es/m/b6fb018b-f05c-4afd-abd3-318c649faf18@highgo.ca
2024-07-26Disable all TLS session ticketsDaniel Gustafsson
OpenSSL supports two types of session tickets for TLSv1.3, stateless and stateful. The option we've used only turns off stateless tickets leaving stateful tickets active. Use the new API introduced in 1.1.1 to disable all types of tickets. Backpatch to all supported versions. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20240617173803.6alnafnxpiqvlh3g@awork3.anarazel.de Backpatch-through: v12
2024-07-24Reset relhassubclass upon attaching table as a partitionAlvaro Herrera
We don't allow inheritance parents as partitions, and have checks to prevent this; but if a table _was_ in the past an inheritance parents and all their children are removed, the pg_class.relhassubclass flag may remain set, which confuses the partition pruning code (most obviously, it results in an assertion failure; in production builds it may be worse.) Fix by resetting relhassubclass on attach. Backpatch to all supported versions. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18550-d5e047e9a897a889@postgresql.org
2024-07-23Detect integer overflow in array_set_slice().Nathan Bossart
When provided an empty initial array, array_set_slice() fails to check for overflow when computing the new array's dimensions. While such overflows are ordinarily caught by ArrayGetNItems(), commands with the following form are accepted: INSERT INTO t (i[-2147483648:2147483647]) VALUES ('{}'); To fix, perform the hazardous computations using overflow-detecting arithmetic routines. As with commit 18b585155a, the added test cases generate errors that include a platform-dependent value, so we again use psql's VERBOSITY parameter to suppress printing the message text. Reported-by: Alexander Lakhin Author: Joseph Koshakow Reviewed-by: Jian He Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com Backpatch-through: 12
2024-07-20Correctly check updatability of columns targeted by INSERT...DEFAULT.Tom Lane
If a view has some updatable and some non-updatable columns, we failed to verify updatability of any columns for which an INSERT or UPDATE on the view explicitly specifies a DEFAULT item (unless the view has a declared default for that column, which is rare anyway, and one would almost certainly not write one for a non-updatable column). This would lead to an unexpected "attribute number N not found in view targetlist" error rather than the intended error. Per bug #18546 from Alexander Lakhin. This bug is old, so back-patch to all supported branches. Discussion: https://postgr.es/m/18546-84a292e759a9361d@postgresql.org
2024-07-19Add overflow checks to money type.Nathan Bossart
None of the arithmetic functions for the the money type handle overflow. This commit introduces several helper functions with overflow checking and makes use of them in the money type's arithmetic functions. Fixes bug #18240. Reported-by: Alexander Lakhin Author: Joseph Koshakow Discussion: https://postgr.es/m/18240-c5da758d7dc1ecf0%40postgresql.org Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com Backpatch-through: 12
2024-07-15Fix bad indentation introduced in 43cd30bcd1cAndres Freund
Oops. Reported-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/ZpVZB9rH5tHllO75@nathan Backpatch: 12-, like 43cd30bcd1c
2024-07-15Fix type confusion in guc_var_compare()Andres Freund
Before this change guc_var_compare() cast the input arguments to const struct config_generic *. That's not quite right however, as the input on one side is often just a char * on one side. Instead just use char *, the first field in config_generic. This fixes a -Warray-bounds warning with some versions of gcc. While the warning is only known to be triggered for <= 15, the issue the warning points out seems real, so apply the fix everywhere. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reported-by: Erik Rijkers <er@xs4all.nl> Suggested-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/a74a1a0d-0fd2-3649-5224-4f754e8f91aa%40xs4all.nl
2024-07-14Avoid unhelpful internal error for incorrect recursive-WITH queries.Tom Lane
checkWellFormedRecursion would issue "missing recursive reference" if a WITH RECURSIVE query contained a single self-reference but that self-reference was inside a top-level WITH, ORDER BY, LIMIT, etc, rather than inside the second arm of the UNION as expected. We already intended to throw more-on-point errors for such cases, but those error checks must be done before examining the UNION arm in order to have the desired results. So this patch need only move some code (and improve the comments). Per bug #18536 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18536-0a342ec07901203e@postgresql.org
2024-07-13Fix lost Windows socket EOF events.Thomas Munro
Winsock only signals an FD_CLOSE event once if the other end of the socket shuts down gracefully. Because each WaitLatchOrSocket() call constructs and destroys a new event handle every time, with unlucky timing we can lose it and hang. We get away with this only if the other end disconnects non-gracefully, because FD_CLOSE is repeatedly signaled in that case. To fix this design flaw in our Windows socket support fundamentally, we'd probably need to rearchitect it so that a single event handle exists for the lifetime of a socket, or switch to completely different multiplexing or async I/O APIs. That's going to be a bigger job and probably wouldn't be back-patchable. This brute force kludge closes the race by explicitly polling with MSG_PEEK before sleeping. Back-patch to all supported releases. This should hopefully clear up some random build farm and CI hang failures reported over the years. It might also allow us to try using graceful shutdown in more places again (reverted in commit 29992a6) to fix instability in the transmission of FATAL error messages, but that isn't done by this commit. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Tested-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/176008.1715492071%40sss.pgh.pa.us
2024-07-12Fix ALTER TABLE DETACH for inconsistent indexesAlvaro Herrera
When a partitioned table has an index that doesn't support a constraint, but a partition has an equivalent index that does, then a DETACH operation would misbehave: a crash in assertion-enabled systems (because we fail to find the constraint in the parent that we expect to), or a broken coninhcount value (-1) in production systems (because we blindly believe that we've successfully detached the parent). While we should reject an ATTACH of a partition with such an index, we have failed to do so in existing releases, so adding an error in stable releases might break the (unlikely) existing applications that rely on this behavior. At this point I don't even want to reject them in master, because it'd break pg_upgrade if such databases exist, and there would be no easy way to fix existing databases without expensive index rebuilds. (Later on we could add ALTER TABLE ... ADD CONSTRAINT USING INDEX to partitioned tables, which would allow the user to fix such patterns. At that point we could add more restrictions to prevent the problem from its root.) Also, add a test case that leaves one table in this condition, so that we can verify that pg_upgrade continues to work if we later decide to change the policy on the master branch. Backpatch to all supported branches. Co-authored-by: Tender Wang <tndrwang@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/18500-62948b6fe5522f56@postgresql.org
2024-07-11Fix possibility of logical decoding partial transaction changes.Masahiko Sawada
When creating and initializing a logical slot, the restart_lsn is set to the latest WAL insertion point (or the latest replay point on standbys). Subsequently, WAL records are decoded from that point to find the start point for extracting changes in the DecodingContextFindStartpoint() function. Since the initial restart_lsn could be in the middle of a transaction, the start point must be a consistent point where we won't see the data for partial transactions. Previously, when not building a full snapshot, serialized snapshots were restored, and the SnapBuild jumps to the consistent state even while finding the start point. Consequently, the slot's restart_lsn and confirmed_flush could be set to the middle of a transaction. This could lead to various unexpected consequences. Specifically, there were reports of logical decoding decoding partial transactions, and assertion failures occurred because only subtransactions were decoded without decoding their top-level transaction until decoding the commit record. To resolve this issue, the changes prevent restoring the serialized snapshot and jumping to the consistent state while finding the start point. On v17 and HEAD, a flag indicating whether snapshot restores should be skipped has been added to the SnapBuild struct, and SNAPBUILD_VERSION has been bumpded. On backbranches, the flag is stored in the LogicalDecodingContext instead, preserving on-disk compatibility. Backpatch to all supported versions. Reported-by: Drew Callahan Reviewed-by: Amit Kapila, Hayato Kuroda Discussion: https://postgr.es/m/2444AA15-D21B-4CCE-8052-52C7C2DAFE5C%40amazon.com Backpatch-through: 12
2024-07-10Make our back branches compatible with libxml2 2.13.x.Tom Lane
This back-patches HEAD commits 066e8ac6e, 6082b3d5d, e7192486d, and 896cd266f into supported branches. Changes: * Use xmlAddChildList not xmlAddChild in XMLSERIALIZE (affects v16 and up only). This was a flat-out coding mistake that we got away with due to lax checking in previous versions of xmlAddChild. * Use xmlParseInNodeContext not xmlParseBalancedChunkMemory. This is to dodge a bug in xmlParseBalancedChunkMemory in libxm2 releases 2.13.0-2.13.2. While that bug is now fixed upstream and will probably never be seen in any production-oriented distro, it is currently a problem on some more-bleeding-edge-friendly platforms. * Suppress "chunk is not well balanced" errors from libxml2, unless it is the only error. This eliminates an error-reporting discrepancy between 2.13 and older releases. This error is almost always redundant with previous errors, if not flat-out inappropriate, which is why 2.13 changed the behavior and why nobody's likely to miss it. Erik Wienhold and Tom Lane, per report from Frank Streitzig. Discussion: https://postgr.es/m/trinity-b0161630-d230-4598-9ebc-7a23acdb37cb-1720186432160@3c-app-gmx-bap25 Discussion: https://postgr.es/m/trinity-361ba18b-541a-4fe7-bc63-655ae3a7d599-1720259822452@3c-app-gmx-bs01
2024-07-08Fix scale clamping in numeric round() and trunc().Dean Rasheed
The numeric round() and trunc() functions clamp the scale argument to the range between +/- NUMERIC_MAX_RESULT_SCALE (2000), which is much smaller than the actual allowed range of type numeric. As a result, they return incorrect results when asked to round/truncate more than 2000 digits before or after the decimal point. Fix by using the correct upper and lower scale limits based on the actual allowed (and documented) range of type numeric. While at it, use the new NUMERIC_WEIGHT_MAX constant instead of SHRT_MAX in all other overflow checks, and fix a comment thinko in power_var() introduced by e54a758d24 -- the minimum value of ln_dweight is -NUMERIC_DSCALE_MAX (-16383), not -SHRT_MAX, though this doesn't affect the point being made in the comment, that the resulting local_rscale value may exceed NUMERIC_MAX_DISPLAY_SCALE (1000). Back-patch to all supported branches. Dean Rasheed, reviewed by Joel Jacobson. Discussion: https://postgr.es/m/CAEZATCXB%2BrDTuMjhK5ZxcouufigSc-X4tGJCBTMpZ3n%3DxxQuhg%40mail.gmail.com
2024-07-01Preserve CurrentMemoryContext across notify and sinval interrupts.Tom Lane
ProcessIncomingNotify is called from the main processing loop that normally runs in MessageContext. That outer-loop code assumes that whatever it allocates will be cleaned up when we're done processing the current client message --- but if we service a notify interrupt, then whatever gets allocated before the next switch into MessageContext will be permanently leaked in TopMemoryContext, because CommitTransactionCommand sets CurrentMemoryContext to TopMemoryContext. There are observable leaks associated with (at least) encoding conversion of incoming queries and parameters attached to Bind messages. sinval catchup interrupts have a similar problem. There might be others, but I've not identified any other clear cases. To fix, take care to save and restore CurrentMemoryContext across the Start/CommitTransactionCommand calls in these functions. Per bug #18512 from wizardbrony. Commit to back branches only; in HEAD, this was dealt with by the riskier but more thoroughgoing approach in commit 1afe31f03. Discussion: https://postgr.es/m/3478884.1718656625@sss.pgh.pa.us
2024-06-27Cope with inplace update making catcache stale during TOAST fetch.Noah Misch
This extends ad98fb14226ae6456fbaed7990ee7591cbe5efd2 to invals of inplace updates. Trouble requires an inplace update of a catalog having a TOAST table, so only pg_database was at risk. (The other catalog on which core code performs inplace updates, pg_class, has no TOAST table.) Trouble would require something like the inplace-inval.spec test. Consider GRANT ... ON DATABASE fetching a stale row from cache and discarding a datfrozenxid update that vac_truncate_clog() has already relied upon. Back-patch to v12 (all supported versions). Reviewed (in an earlier version) by Robert Haas. Discussion: https://postgr.es/m/20240114201411.d0@rfd.leadboat.com Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-06-27AccessExclusiveLock new relations just after assigning the OID.Noah Misch
This has no user-visible, important consequences, since other sessions' catalog scans can't find the relation until we commit. However, this unblocks introducing a rule about locks required to heap_update() a pg_class row. CREATE TABLE has been acquiring this lock eventually, but it can heap_update() pg_class.relchecks earlier. create_toast_table() has been acquiring only ShareLock. Back-patch to v12 (all supported versions), the plan for the commit relying on the new rule. Reviewed (in an earlier version) by Robert Haas. Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com
2024-06-27Lock before setting relhassubclass on RELKIND_PARTITIONED_INDEX.Noah Misch
Commit 5b562644fec696977df4a82790064e8287927891 added a comment that SetRelationHasSubclass() callers must hold this lock. When commit 17f206fbc824d2b4b14480199ca9ff7dea417eda extended use of this column to partitioned indexes, it didn't take the lock. As the latter commit message mentioned, we currently never reset a partitioned index to relhassubclass=f. That largely avoids harm from the lock omission. The cause for fixing this now is to unblock introducing a rule about locks required to heap_update() a pg_class row. This might cause more deadlocks. It gives minor user-visible benefits: - If an ALTER INDEX SET TABLESPACE runs concurrently with ALTER TABLE ATTACH PARTITION or CREATE PARTITION OF, one transaction blocks instead of failing with "tuple concurrently updated". (Many cases of DDL concurrency still fail that way.) - Match ALTER INDEX ATTACH PARTITION in choosing to lock the index. While not user-visible today, we'll need this if we ever make something set the flag to false for a partitioned index, like ANALYZE does today for tables. Back-patch to v12 (all supported versions), the plan for the commit relying on the new rule. In back branches, add LockOrStrongerHeldByMe() instead of adding a LockHeldByMe() parameter. Reviewed (in an earlier version) by Robert Haas. Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com
2024-06-27Expand comments and add an assertion in nodeModifyTable.c.Noah Misch
Most comments concern RELKIND_VIEW. One addresses the ExecUpdate() "tupleid" parameter. A later commit will rely on these facts, but they hold already. Back-patch to v12 (all supported versions), the plan for that commit. Reviewed (in an earlier version) by Robert Haas. Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-06-27Avoid crashing when a JIT-inlined backend function throws an error.Tom Lane
errfinish() assumes that the __FUNC__ and __FILE__ arguments it's passed are compile-time constant strings that can just be pointed to rather than physically copied. However, it's possible for LLVM to generate code in which those pointers point into a dynamically loaded code segment. If that segment gets unloaded before we're done with the ErrorData struct, we have dangling pointers that will lead to SIGSEGV. In simple cases that won't happen, because we won't unload LLVM code before end of transaction. But it's possible to happen if the error is thrown within end-of-transaction code run by _SPI_commit or _SPI_rollback, because since commit 2e517818f those functions clean up by ending the transaction and starting a new one. Rather than fixing this by adding pstrdup() overhead to every elog/ereport sequence, let's fix it by copying the risky pointers in CopyErrorData(). That solves it for _SPI_commit/_SPI_rollback because they use that function to preserve the error data across the transaction end/restart sequence; and it seems likely that any other code doing something similar would need to do that too. I'm suspicious that this behavior amounts to an LLVM bug (or a bug in our use of it?), because it implies that string constant references that should be pointer-equal according to a naive understanding of C semantics will sometimes not be equal. However, even if it is a bug and someday gets fixed, we'll have to cope with the current behavior for a long time to come. Report and patch by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/1565654.1719425368@sss.pgh.pa.us
2024-06-27Fix MVCC bug with prepared xact with subxacts on standbyHeikki Linnakangas
We did not recover the subtransaction IDs of prepared transactions when starting a hot standby from a shutdown checkpoint. As a result, such subtransactions were considered as aborted, rather than in-progress. That would lead to hint bits being set incorrectly, and the subtransactions suddenly becoming visible to old snapshots when the prepared transaction was committed. To fix, update pg_subtrans with prepared transactions's subxids when starting hot standby from a shutdown checkpoint. The snapshots taken from that state need to be marked as "suboverflowed", so that we also check the pg_subtrans. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
2024-06-20Don't throw an error if a queued AFTER trigger no longer exists.Tom Lane
afterTriggerInvokeEvents and AfterTriggerExecute have always treated it as an error if the trigger OID mentioned in a queued after-trigger event can't be found. However, that fails to account for the edge case where the trigger's been dropped in the current transaction since queueing the event. There seems no very good reason to disallow that case, so instead silently do nothing if the trigger OID can't be found. This does give up a little bit of bug-detection ability, but I don't recall that these error messages have ever actually revealed a bug, so it seems mostly theoretical. Alternatives such as marking pending events DONE at the time of dropping a trigger would be complicated and perhaps introduce bugs of their own. Per bug #18517 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18517-af2d19882240902c@postgresql.org
2024-06-17Fix insertion of SP-GiST REDIRECT tuples during REINDEX CONCURRENTLY.Tom Lane
Reconstruction of an SP-GiST index by REINDEX CONCURRENTLY may insert some REDIRECT tuples. This will typically happen in a transaction that lacks an XID, which leads either to assertion failure in spgFormDeadTuple or to insertion of a REDIRECT tuple with zero xid. The latter's not good either, since eventually VACUUM will apply GlobalVisTestIsRemovableXid() to the zero xid, resulting in either an assertion failure or a garbage answer. In practice, since REINDEX CONCURRENTLY locks out index scans till it's done, it doesn't matter whether it inserts REDIRECTs or PLACEHOLDERs; and likewise it doesn't matter how soon VACUUM reduces such a REDIRECT to a PLACEHOLDER. So in non-assert builds there's no observable problem here, other than perhaps a little index bloat. But it's not behaving as intended. To fix, remove the failing Assert in spgFormDeadTuple, acknowledging that we might sometimes insert a zero XID; and guard VACUUM's GlobalVisTestIsRemovableXid() call with a test for valid XID, ensuring that we'll reduce such a REDIRECT the first time VACUUM sees it. (Versions before v14 use TransactionIdPrecedes here, which won't fail on zero xid, so they really have no bug at all in non-assert builds.) Another solution could be to not create REDIRECTs at all during REINDEX CONCURRENTLY, making the relevant code paths treat that case like index build (which likewise knows that no concurrent index scans can be happening). That would allow restoring the Assert in spgFormDeadTuple, but we'd still need the VACUUM change because redirection tuples with zero xid may be out there already. But there doesn't seem to be a nice way for spginsert() to tell that it's being called in REINDEX CONCURRENTLY without some API changes, so we'll leave that as a possible future improvement. In HEAD, also rename the SpGistState.myXid field to redirectXid, which seems less misleading (since it might not in fact be our transaction's XID) and is certainly less uninformatively generic. Per bug #18499 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18499-8a519c280f956480@postgresql.org
2024-06-14Clean out column-level pg_init_privs entries when dropping tables.Tom Lane
DeleteInitPrivs did not get the memo about how, when dropping a whole object (with subid == 0), you should drop entries relating to its sub-objects too. This is visible in the test_pg_dump test case if one drops the extension at the end: the entry for GRANT SELECT(col1) ON regress_pg_dump_table TO public; was still present in pg_init_privs afterwards, although it was pointing to a dangling table OID. Noted while fooling with a fix for REASSIGN OWNED for pg_init_privs entries. This bug is aboriginal in the pg_init_privs feature though, and there seems no reason not to back-patch the fix.
2024-06-13Fix parsing of ignored operators in websearch_to_tsquery().Tom Lane
The manual says clearly that punctuation in the input of websearch_to_tsquery() is ignored, except for the special cases of dashes and quotes. However, this failed for cases like "(foo bar) or something", or in general an ISOPERATOR character in front of the "or". We'd switch back to WAITOPERAND state, then ignore the operator character while remaining in that state, and then reach the "or" in WAITOPERAND state which (intentionally) makes us treat it as data. The fix is simple enough: if we see an ISOPERATOR character while in WAITOPERATOR state, we have to skip it while staying in that state. (We don't need to worry about other punctuation characters: those will be consumed as though they were words, but then rejected by lexizing.) In v14 and up (since commit eb086056f) we can simplify the code a bit more too, because there is no longer a reason for the WAITOPERAND state to distinguish between quoted and unquoted operands. Per bug #18479 from Manos Emmanouilidis. Back-patch to all supported branches. Discussion: https://postgr.es/m/18479-d9b46e2fc242c33e@postgresql.org
2024-06-13Clamp result of MultiXactMemberFreezeThresholdHeikki Linnakangas
The purpose of the function is to reduce the effective autovacuum_multixact_freeze_max_age if the multixact members SLRU is approaching wraparound, to make multixid freezing more aggressive. The returned value should therefore never be greater than plain autovacuum_multixact_freeze_max_age. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/85fb354c-f89f-4d47-b3a2-3cbd461c90a3@iki.fi Backpatch-through: 12, all supported versions
2024-06-11Fix infer_arbiter_indexes() to not assume resultRelation is 1.Tom Lane
infer_arbiter_indexes failed to renumber varnos in index expressions or predicates that it got from the catalogs. This escaped detection up to now because the stored varnos in such trees will be 1, and an INSERT's result relation is usually the first rangetable entry, so that that was fine. However, in cases such as inserting through an updatable view, it's not fine, leading to failure to match the expressions to the query with ensuing "there is no unique or exclusion constraint matching the ON CONFLICT specification" errors. Fix by copy-and-paste from get_relation_info(). Per bug #18502 from Michael Wang. Back-patch to all supported versions. Discussion: https://postgr.es/m/18502-545b53f5b81e54e0@postgresql.org
2024-06-07Reject modifying a temp table of another session with ALTER TABLE.Tom Lane
Normally this case isn't even reachable by non-superusers, since permissions checks prevent naming such a table. However, it is possible to make it happen by altering a parent table whose child is another session's temp table. We definitely can't support any such ALTER that requires modifying the contents of such a table, since we lack access to the other session's temporary-buffer pool. But there seems no good reason to allow it even if it'd only require changing catalog contents. One reason not to allow it is that we'd rather not expose the implementation-dependent behavior of whether a specific ALTER requires touching the table contents. Another is that there may be (in future, even if not today) optimizations that assume that a session's own temp tables won't be modified by other sessions. Hence, add a RELATION_IS_OTHER_TEMP() check to all the places where ALTER TABLE currently does CheckTableNotInUse(). (I looked through all other callers of CheckTableNotInUse(), and they seem OK already.) Per bug #18492 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18492-c7a2634bf4968763@postgresql.org
2024-06-07Fix behavior of stable functions called from a CALL's argument list.Tom Lane
If the CALL is within an atomic context (e.g. there's an outer transaction block), _SPI_execute_plan should acquire a fresh snapshot to execute any such functions with. We failed to do that and instead passed them the Portal snapshot, which had been acquired at the start of the current SQL command. This'd lead to seeing stale values of rows modified since the start of the command. This is arguably a bug in 84f5c2908: I failed to see that "are we in non-atomic mode" needs to be defined the same way as it is further down in _SPI_execute_plan, i.e. check !_SPI_current->atomic not just options->allow_nonatomic. Alternatively the blame could be laid on plpgsql, which is unconditionally passing allow_nonatomic = true for CALL/DO even when it knows it's in an atomic context. However, fixing it in spi.c seems like a better idea since that will also fix the problem for any extensions that may have copied plpgsql's coding pattern. While here, update an obsolete comment about _SPI_execute_plan's snapshot management. Per report from Victor Yegorov. Back-patch to all supported versions. Discussion: https://postgr.es/m/CAGnEboiRe+fG2QxuBO2390F7P8e2MQ6UyBjZSL_w1Cej+E4=Vw@mail.gmail.com
2024-05-18Account for optimized MinMax aggregates during SS_finalize_plan.Tom Lane
We are capable of optimizing MIN() and MAX() aggregates on indexed columns into subqueries that exploit the index, rather than the normal thing of scanning the whole table. When we do this, we replace the Aggref node(s) with Params referencing subquery outputs. Such Params really ought to be included in the per-plan-node extParam/allParam sets computed by SS_finalize_plan. However, we've never done so up to now because of an ancient implementation choice to perform that substitution during set_plan_references, which runs after SS_finalize_plan, so that SS_finalize_plan never sees these Params. The cleanest fix would be to perform a separate tree walk to do these substitutions before SS_finalize_plan runs. That seems unattractive, first because a whole-tree mutation pass is expensive, and second because we lack infrastructure for visiting expression subtrees in a Plan tree, so that we'd need a new function knowing as much as SS_finalize_plan knows about that. I also considered swapping the order of SS_finalize_plan and set_plan_references, but that fell foul of various assumptions that seem tricky to fix. So the approach adopted here is to teach SS_finalize_plan itself to check for such Aggrefs. I refactored things a bit in setrefs.c to avoid having three copies of the code that does that. Back-patch of v17 commits d0d44049d and 779ac2c74. When d0d44049d went in, there was no evidence that it was fixing a reachable bug, so I refrained from back-patching. Now we have such evidence. Per bug #18465 from Hal Takahara. Back-patch to all supported branches. Discussion: https://postgr.es/m/18465-2fae927718976b22@postgresql.org Discussion: https://postgr.es/m/2391880.1689025003@sss.pgh.pa.us
2024-05-16Fix documentation about DROP DATABASE FORCE process termination rights.Noah Misch
Specifically, it terminates a background worker even if the caller couldn't terminate the background worker with pg_terminate_backend(). Commit 3a9b18b3095366cd0c4305441d426d04572d88c1 neglected to update this. Back-patch to v13, which introduced DROP DATABASE FORCE. Reviewed by Amit Kapila. Reported by Kirill Reshke. Discussion: https://postgr.es/m/20240429212756.60.nmisch@google.com
2024-05-14Fix handling of polymorphic output arguments for procedures.Tom Lane
Most of the infrastructure for procedure arguments was already okay with polymorphic output arguments, but it turns out that CallStmtResultDesc() was a few bricks shy of a load here. It thought all it needed to do was call build_function_result_tupdesc_t, but that function specifically disclaims responsibility for resolving polymorphic arguments. Failing to handle that doesn't seem to be a problem for CALL in plpgsql, but CALL from plain SQL would get errors like "cannot display a value of type anyelement", or even crash outright. In v14 and later we can simply examine the exposed types of the CallStmt.outargs nodes to get the right type OIDs. But it's a lot more complicated to fix in v12/v13, because those versions don't have CallStmt.outargs, nor do they do expand_function_arguments until ExecuteCallStmt runs. We have to duplicatively run expand_function_arguments, and then re-determine which elements of the args list are output arguments. Per bug #18463 from Drew Kimball. Back-patch to all supported versions, since it's busted in all of them. Discussion: https://postgr.es/m/18463-f8cd77e12564d8a2@postgresql.org