user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2024-02-01	Update time zone data files to tzdata release 2024a.	Tom Lane
	DST law changes in Ittoqqortoormiit, Greenland (America/Scoresbysund), Kazakhstan (Asia/Almaty and Asia/Qostanay) and Palestine; as well as updates for the Antarctic stations Casey and Vostok. Historical corrections for Vietnam, Toronto, and Miquelon.
2024-02-01	Avoid package qualification of $windows_os	Andrew Dunstan
	Further fallout from commit 6ee26c6a4b. To keep code in sync and avoid issues on older releases with different package names, simply use the unqualified name like many other places in our code.
2024-02-01	Apply band-aid fix for an oversight in reparameterize_path_by_child.	Tom Lane
	The path we wish to reparameterize is not a standalone object: in particular, it implicitly references baserestrictinfo clauses in the associated RelOptInfo, and if it's a SampleScan path then there is also the TableSampleClause in the RTE to worry about. Both of those could contain lateral references to the join partner relation, which would need to be modified to refer to its child. Since we aren't doing that, affected queries can give wrong answers, or odd failures such as "variable not found in subplan target list", or executor crashes. But we can't just summarily modify those expressions, because they are shared with other paths for the rel. We'd break things if we modify them and then end up using some non-partitioned-join path. In HEAD, we plan to fix this by postponing reparameterization until create_plan(), when we know that those other paths are no longer of interest, and then adjusting those expressions along with the ones in the path itself. That seems like too big a change for stable branches however. In the back branches, let's just detect whether any troublesome lateral references actually exist in those expressions, and fail reparameterization if so. This will result in not performing a partitioned join in such cases. Given the lack of field complaints, nobody's likely to miss the optimization. Report and patch by Richard Guo. Apply to 12-16 only, since the intended fix for HEAD looks quite different. We're not quite ready to push the HEAD fix, but with back-branch releases coming up soon, it seems wise to get this stopgap fix in place there. Discussion: https://postgr.es/m/CAMbWs496+N=UAjOc=rcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw@mail.gmail.com
2024-01-31	Fix various issues with ALTER TEXT SEARCH CONFIGURATION	Michael Paquier
	This commit addresses a set of issues when changing token type mappings in a text search configuration when using duplicated token names: - ADD MAPPING would fail on insertion because of a constraint failure after inserting the same mapping. - ALTER MAPPING with an "overridden" configuration failed with "tuple already updated by self" when the token mappings are removed. - DROP MAPPING failed with "tuple already updated by self", like previously, but in a different code path. The code is refactored so the token names (with their numbers) are handled as a List with unique members rather than an array with numbers, ensuring that no duplicates mess up with the catalog inserts, updates and deletes. The list is generated by getTokenTypes(), with the same error handling as previously while duplicated tokens are discarded from the list used to work on the catalogs. Regression tests are expanded to cover much more ground for the cases fixed by this commit, as there was no coverage for the code touched in this commit. A bit more is done regarding the fact that a token name not supported by a configuration's parser should result in an error even if IF EXISTS is used in a DROP MAPPING clause. This is implied in the code but there was no coverage for that, and it was very easy to miss. These issues exist since at least their introduction in core with 140d4ebcb46e, so backpatch all the way down. Reported-by: Alexander Lakhin Author: Tender Wang, Michael Paquier Discussion: https://postgr.es/m/18310-1eb233c5908189c8@postgresql.org Backpatch-through: 12
2024-01-30	Fix 003_extrafiles.pl test for the Windows	Andrew Dunstan
	File::Find converts backslashes to slashes in the newer Perl versions. See: https://github.com/Perl/perl5/commit/414f14df98cb1c9a20f92c5c54948b67c09f072d So, do the same conversion for Windows before comparing paths. To support all Perl versions, always convert them on Windows regardless of the Perl's version. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Backpatch to all live branches
2024-01-30	Doc: mention foreign keys can reference unique indexes	David Rowley
	We seem to have only documented a foreign key can reference the columns of a primary key or unique constraint. Here we adjust the documentation to mention columns in a non-partial unique index can be mentioned too. The header comment for transformFkeyCheckAttrs() also didn't mention unique indexes, so fix that too. In passing make that header comment reflect reality in the various other aspects where it deviated from it. Bug: 18295 Reported-by: Gilles PARC Author: Laurenz Albe, David Rowley Discussion: https://www.postgresql.org/message-id/18295-0ed0fac5c9f7b17b%40postgresql.org Backpatch-through: 12
2024-01-29	Fix incompatibilities with libxml2 >= 2.12.0.	Tom Lane
	libxml2 changed the required signature of error handler callbacks to make the passed xmlError struct "const". This is causing build failures on buildfarm member caiman, and no doubt will start showing up in the field quite soon. Add a version check to adjust the declaration of xml_errorHandler() according to LIBXML_VERSION. 2.12.x also produces deprecation warnings for contrib/xml2/xpath.c's assignment to xmlLoadExtDtdDefaultValue. I see no good reason for that to still be there, seeing that we disabled external DTDs (at a lower level) years ago for security reasons. Let's just remove it. Back-patch to all supported branches, since they might all get built with newer libxml2 once it gets a bit more popular. (The back branches produce another deprecation warning about xpath.c's use of xmlSubstituteEntitiesDefault(). We ought to consider whether to back-patch all or part of commit 65c5864d7 to silence that. It's less urgent though, since it won't break the buildfarm.) Discussion: https://postgr.es/m/1389505.1706382262@sss.pgh.pa.us
2024-01-29	Fix locking when fixing an incomplete split of a GIN internal page	Heikki Linnakangas
	ginFinishSplit() expects the caller to hold an exclusive lock on the buffer, but when finishing an earlier "leftover" incomplete split of an internal page, the caller held a shared lock. That caused an assertion failure in MarkBufferDirty(). Without assertions, it could lead to corruption if two backends tried to complete the split at the same time. On master, add a test case using the new injection point facility. Report and analysis by Fei Changhong. Backpatch the fix to all supported versions. Reviewed-by: Fei Changhong, Michael Paquier Discussion: https://www.postgresql.org/message-id/tencent_A3CE810F59132D8E230475A5F0F7A08C8307@qq.com
2024-01-29	Fix catalog lookup due to wrong snapshot for subtransactions during decoding.	Amit Kapila
	In commit 272248a0c, we fixed the catalog lookup due to the wrong snapshot for transactions and subtransactions during decoding. We failed to consider the case where top-level xact is already marked as containing catalog change but its subtransaction is not yet marked as containing catalog change even though it contained such a change. This can happen when during decoding, none of the WAL records from the subtransaction was decoded and top-level xact contains a DDL. We fix it by marking the transaction and all its subtransactions as containing catalog changes if the top-level xact contains any catalog change and it is present in the initial running xacts array. This fix is required only for 14 and 15 because in prior branches we already always mark the transaction and all its subtransactions as containing catalog changes in the same case. In 16 and above, we preserve the list of transaction IDs and sub-transaction IDs, that have modified catalogs and are running during snapshot serialization, to the serialized snapshot (see commit 7f13ac8123). Author: Fei Changhong Reviewed-by: Amit Kapila, Hayato Kuroda, Andy Fan Discussion: https://postgr.es/m/18280-4c8060178cb41750@postgresql.org
2024-01-26	Detect Julian-date overflow in timestamp[tz]_pl_interval.	Tom Lane
	We perform addition of the days field of an interval via arithmetic on the Julian-date representation of the timestamp's date. This step is subject to int32 overflow, and we also should not let the Julian date become very negative, for fear of weird results from j2date. (In the timestamptz case, allow a Julian date of -1 to pass, since it might convert back to zero after timezone rotation.) The additions of the months and microseconds fields could also overflow, of course. However, I believe we need no additional checks there; the existing range checks should catch such cases. The difficulty here is that j2date's magic modular arithmetic could produce something that looks like it's in-range. Per bug #18313 from Christian Maurer. This has been wrong for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/18313-64d2c8952d81e84b@postgresql.org
2024-01-25	Track LLVM 18 changes.	Thomas Munro
	A function was given a newly standard name from C++20 in LLVM 16. Then LLVM 18 added a deprecation warning for the old name, and it is about to ship, so it's time to adjust that. Back-patch to all supported releases. Discussion: https://www.postgresql.org/message-id/CA+hUKGLbuVhH6mqS8z+FwAn4=5dHs0bAWmEMZ3B+iYHWKC4-ZA@mail.gmail.com
2024-01-24	Fix ALTER TABLE .. ADD COLUMN with complex inheritance trees	Michael Paquier
	This command, when used to add a column on a parent table with a complex inheritance tree, tried to update multiple times the same tuple in pg_attribute for a child table when incrementing attinhcount, causing failures with "tuple already updated by self" because of a missing CommandCounterIncrement() between two updates. This exists for a rather long time, so backpatch all the way down. Reported-by: Alexander Lakhin Author: Tender Wang Reviewed-by: Richard Guo Discussion: https://postgr.es/m/18297-b04cd83a55b51e35@postgresql.org Backpatch-through: 12
2024-01-22	Abort pgbench if script end is reached with an open pipeline	Alvaro Herrera
	When a pipeline is opened with \startpipeline and not closed, pgbench will either error on the next transaction with a "already in pipeline mode" error or successfully end if this was the last transaction -- despite not sending anything that was piped in the pipeline. Make it an error to reach end of script is reached while there's an open pipeline. Backpatch to 14, where pgbench got support for pipelines. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Za4IObZkDjrO4TcS@paquier.xyz
2024-01-18	Fix plpgsql to allow new-style SQL CREATE FUNCTION as a SQL command.	Tom Lane
	plpgsql fails on new-style CREATE FUNCTION/PROCEDURE commands within a routine or DO block, because make_execsql_stmt believes that a semicolon token always terminates a SQL command. Now, that's actually been wrong since the day it was written, because CREATE RULE has long allowed multiple rule actions separated by semicolons. But there are few enough people using multi-action rules that there was never an attempt to fix it. New-style SQL functions, though, are popular. psql has this same problem of "does this semicolon really terminate the command?". It deals with CREATE RULE by counting parenthesis nesting depth: a semicolon within parens doesn't end a command. Commits e717a9a18 and 029c5ac03 created a similar heuristic to count matching BEGIN/END pairs (but only within CREATEs, so as not to be fooled by plain BEGIN). That's survived several releases now without trouble reports, so let's just absorb those heuristics into plpgsql. Per report from Samuel Dussault. Back-patch to v14 where new-style SQL function syntax came in. Discussion: https://postgr.es/m/YT2PR01MB88552C3E9AD40A6C038774A781722@YT2PR01MB8855.CANPRD01.PROD.OUTLOOK.COM
2024-01-18	Improve handling of dropped partitioned indexes for REINDEX INDEX	Michael Paquier
	A REINDEX INDEX done on a partitioned index builds a list of the indexes to work on before processing its partitions in individual transactions. When combined with a DROP of the partitioned index, there was a window where it was possible to see some unexpected "could not open relation with OID", synonym of relation lookup error. The code was robust enough to handle the case where the parent relation is missing, but not the case where an index would be gone missing. This is similar to 1d65416661bb. Support for REINDEX on partitioned relations has been introduced in a6642b3ae060, so backpatch down to 14. Author: Fei Changhong Discussion: https://postgr.es/m/tencent_6A52106095ACDE55333E3AD33F304C0C3909@qq.com Backpatch-through: 14
2024-01-18	Add try_index_open(), conditional variant of index_open()	Michael Paquier
	try_index_open() is able to open an index if its relkind fits, except that it would return NULL instead of generated an error if the relation does not exist. This new routine will be used by an upcoming patch to make REINDEX on partitioned relations more robust when an index in a partition tree is dropped. Extracted from a larger patch by the same author. Author: Fei Changhong Discussion: https://postgr.es/m/tencent_6A52106095ACDE55333E3AD33F304C0C3909@qq.com Backpatch-through: 14
2024-01-18	lwlock: Fix quadratic behavior with very long wait lists	Andres Freund
	Until now LWLockDequeueSelf() sequentially searched the list of waiters to see if the current proc is still is on the list of waiters, or has already been removed. In extreme workloads, where the wait lists are very long, this leads to a quadratic behavior. #backends iterating over a list #backends long. Additionally, the likelihood of needing to call LWLockDequeueSelf() in the first place also increases with the increased length of the wait queue, as it becomes more likely that a lock is released while waiting for the wait list lock, which is held for longer during lock release. Due to the exponential back-off in perform_spin_delay() this is surprisingly hard to detect. We should make that easier, e.g. by adding a wait event around the pg_usleep() - but that's a separate patch. The fix is simple - track whether a proc is currently waiting in the wait list or already removed but waiting to be woken up in PGPROC->lwWaiting. In some workloads with a lot of clients contending for a small number of lwlocks (e.g. WALWriteLock), the fix can substantially increase throughput. This has been originally fixed for 16~ with a4adc31f6902 without a backpatch, and we have heard complaints from users impacted by this quadratic behavior in older versions as well. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/20221027165914.2hofzp4cvutj6gin@awork3.anarazel.de Discussion: https://postgr.es/m/CALj2ACXktNbG=K8Xi7PSqbofTZozavhaxjatVc14iYaLu4Maag@mail.gmail.com Backpatch-through: 12
2024-01-17	Fix incorrect comment on how BackendStatusArray is indexed	Heikki Linnakangas
	The comment was copy-pasted from the call to ProcSignalInit() in AuxiliaryProcessMain(), which uses a similar scheme of having reserved slots for aux processes after MaxBackends slots for backends. However, ProcSignalInit() indexing starts from 1, whereas BackendStatusArray starts from 0. The code is correct, but the comment was wrong. Discussion: https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0@iki.fi Backpatch-through: v14
2024-01-17	Close socket in case of errors in setting non-blocking	Daniel Gustafsson
	If configuring the newly created socket non-blocking fails we error out and return INVALID_SOCKET, but the socket that had been created wasn't closed. Fix by issuing closesocket in the errorpath. Backpatch to all supported branches. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQApmU5CrKefH85VbNYE2y8H=-qqEJbg6RAPU65+vCe+89A@mail.gmail.com Backpatch-through: v12
2024-01-16	Don't test already-referenced pointer for nullness	Alvaro Herrera
	Commit b8ba7344e9eb added in PQgetResult a derefence to a pointer returned by pqPrepareAsyncResult(), before some other code that was already testing that pointer for nullness. But since commit 618c16707a6d (in Postgres 15), pqPrepareAsyncResult() doesn't ever return NULL (a statically-allocated result is returned if OOM). So in branches 15 and up, we can remove the redundant pointer check with no harm done. However, in branch 14, pqPrepareAsyncResult() can indeed return NULL if it runs out of memory. Fix things there by adding a null pointer check before dereferencing the pointer. This should hint Coverity that the preexisting check is not redundant but necessary. Backpatch to 14, like b8ba7344e9eb. Per Coverity.
2024-01-14	Prevent access to an unpinned buffer in BEFORE ROW UPDATE triggers.	Tom Lane
	When ExecBRUpdateTriggers switches to a new target tuple as a result of the EvalPlanQual logic, it must form a new proposed update tuple. Since commit 86dc90056, that tuple (the result of ExecGetUpdateNewTuple) has been a virtual tuple that might contain pointers to by-ref fields of the new target tuple (in "oldslot"). However, immediately after that we materialize oldslot, causing it to drop its buffer pin, whereupon the by-ref pointers are unsafe to use. This is a live bug only when the new target tuple is in a different page than the original target tuple, since we do still hold a pin on the original one. (Before 86dc90056, there was no bug because the EPQ plantree would hold a pin on the new target tuple; but now that's not assured.) To fix, forcibly materialize the new tuple before we materialize oldslot. This costs nothing since we would have done that shortly anyway. The real-world impact of this is probably minimal. A visible failure could occur if the new target tuple's buffer were recycled for some other page in the short interval before we materialize newslot within the trigger-calling loop; but that's quite unlikely given that we'd just touched that page. There's a larger hazard that some other process could prune and repack that page within the window. We have lock on the new target tuple, but that wouldn't prevent it being moved on the page. Alexander Lakhin and Tom Lane, per bug #17798 from Alexander Lakhin. Back-patch to v14 where 86dc90056 came in. Discussion: https://postgr.es/m/17798-0907404928dcf0dd@postgresql.org
2024-01-13	Re-pgindent catcache.c after previous commit.	Tom Lane
	Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
2024-01-13	Cope with catcache entries becoming stale during detoasting.	Tom Lane
	We've long had a policy that any toasted fields in a catalog tuple should be pulled in-line before entering the tuple in a catalog cache. However, that requires access to the catalog's toast table, and we'll typically do AcceptInvalidationMessages while opening the toast table. So it's possible that the catalog tuple is outdated by the time we finish detoasting it. Since no cache entry exists yet, we can't mark the entry stale during AcceptInvalidationMessages, and instead we'll press forward and build an apparently-valid cache entry. The upshot is that we have a race condition whereby an out-of-date entry could be made in a backend's catalog cache, and persist there indefinitely causing indeterminate misbehavior. To fix, use the existing systable_recheck_tuple code to recheck whether the catalog tuple is still up-to-date after we finish detoasting it. If not, loop around and restart the process of searching the catalog and constructing cache entries from the top. The case is rare enough that this shouldn't create any meaningful performance penalty, even in the SearchCatCacheList case where we need to tear down and reconstruct the whole list. Indeed, the case is so rare that AFAICT it doesn't occur during our regression tests, and there doesn't seem to be any easy way to build a test that would exercise it reliably. To allow testing of the retry code paths, add logic (in USE_ASSERT_CHECKING builds only) that randomly pretends that the recheck failed about one time out of a thousand. This is enough to ensure that we'll pass through the retry paths during most regression test runs. By adding an extra level of looping, this commit creates a need to reindent most of SearchCatCacheMiss and SearchCatCacheList. I'll do that separately, to allow putting those changes in .git-blame-ignore-revs. Patch by me; thanks to Alexander Lakhin for having built a test case to prove the bug is real, and to Xiaoran Wang for review. Back-patch to all supported branches. Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
2024-01-12	pg_regress: Disable autoruns for cmd.exe on Windows	Michael Paquier
	This is similar to 9886744a361b, to prevent the execution of other programs due to autorun configurations which could influence the postmaster startup. This was originally applied on HEAD as of 83c75ac7fb69 without a backpatch, but the patch has survived CI and buildfarm cycles. I have checked that cmd /d exists down to Windows XP, which should make this change work correctly in the oldest branches still supported. Discussion: https://postgr.es/m/20230922.161551.320043332510268554.horikyota.ntt@gmail.com Backpatch-through: 12
2024-01-12	pg_ctl: Disable autoruns for cmd.exe on Windows	Michael Paquier
	On Windows, cmd.exe is used to launch the postmaster process to ease its redirection setup. However, cmd.exe may execute other programs at startup due to autorun configurations, which could influence the postmaster startup. This patch adds /D flag to the launcher cmd.exe command line to disable autorun settings written in the registry. This was originally applied on HEAD as of 9886744a361b without a backpatch, but the patch has survived CI and buildfarm cycles. I have checked that cmd /d exists down to Windows XP, which should make this change work correctly in the oldest branches still supported. Reported-by: Hayato Kuroda Author: Kyotaro Horiguchi Reviewed-by: Robert Haas, Michael Paquier Discussion: https://postgr.es/m/20230922.161551.320043332510268554.horikyota.ntt@gmail.com Backpatch-through: 12
2024-01-11	Allow subquery pullup to wrap a PlaceHolderVar in another one.	Tom Lane
	The code for wrapping subquery output expressions in PlaceHolderVars believed that if the expression already was a PlaceHolderVar, it was never necessary to wrap that in another one. That's wrong if the expression is underneath an outer join and involves a lateral reference to outside that scope: failing to add an additional PHV risks evaluating the expression at the wrong place and hence not forcing it to null when the outer join should do so. This is an oversight in commit 9e7e29c75, which added logic to forcibly wrap lateral-reference Vars in PlaceHolderVars, but didn't see that the adjacent case for PlaceHolderVars needed the same treatment. The test case we have for this doesn't fail before 4be058fe9, but now that I see the problem I wonder if it is possible to demonstrate related errors before that. That's moot though, since all such branches are out of support. Per bug #18284 from Holger Reise. Back-patch to all supported branches. Discussion: https://postgr.es/m/18284-47505a20c23647f8@postgresql.org
2024-01-08	Fix indentation in ExecParallelHashIncreaseNumBatches()	Alexander Korotkov
	Backpatch-through: 12
2024-01-07	Fix oversized memory allocation in Parallel Hash Join	Alexander Korotkov
	During the calculations of the maximum for the number of buckets, take into account that later we round that to the next power of 2. Reported-by: Karen Talarico Bug: #16925 Discussion: https://postgr.es/m/16925-ec96d83529d0d629%40postgresql.org Author: Thomas Munro, Andrei Lepikhov, Alexander Korotkov Reviewed-by: Alena Rybakina Backpatch-through: 12
2024-01-03	Avoid masking EOF (no-password-supplied) conditions in auth.c.	Tom Lane
	CheckPWChallengeAuth() would return STATUS_ERROR if the user does not exist or has no password assigned, even if the client disconnected without responding to the password challenge (as libpq often will, for example). We should return STATUS_EOF in that case, and the lower-level functions do, but this code level got it wrong since the refactoring done in 7ac955b34. This breaks the intent of not logging anything for EOF cases (cf. comments in auth_failed()) and might also confuse users of ClientAuthentication_hook. Per report from Liu Lang. Back-patch to all supported versions. Discussion: https://postgr.es/m/b725238c-539d-cb09-2bff-b5e6cb2c069c@esgyn.cn
2023-12-29	In pg_dump, don't dump a stats object unless dumping underlying table.	Tom Lane
	If the underlying table isn't being dumped, it's useless to dump an extended statistics object; it'll just cause errors at restore. We have always applied similar policies to, say, indexes. (When and if we get cross-table stats objects, it might be profitable to think a little harder about what to do with them. But for now there seems no point in considering a stats object as anything but an appendage of its table.) Rian McGuire and Tom Lane, per report from Rian McGuire. Back-patch to supported branches. Discussion: https://postgr.es/m/7075d3aa-3f05-44a5-b68f-47dc6a8a0550@buildkite.com
2023-12-26	Fix failure to verify PGC_[SU_]BACKEND GUCs in pg_file_settings view.	Tom Lane
	set_config_option() bails out early if it detects that the option to be set is PGC_BACKEND or PGC_SU_BACKEND class and we're reading the config file in a postmaster child; we don't want to apply any new value in such a case. That's fine as far as it goes, but it fails to consider the requirements of the pg_file_settings view: for that, we need to check validity of the value even though we have no intention to apply it. Because we didn't, even very silly values for affected GUCs would be reported as valid by the view. There are only half a dozen such GUCs, which perhaps explains why this got overlooked for so long. Fix by continuing when changeVal is false; this parallels the logic in some other early-exit paths. Also, the check added by commit 924bcf4f1 to prevent GUC changes in parallel workers seems a few bricks shy of a load: it's evidently assuming that ereport(elevel, ...) won't return. Make sure we bail out if it does. The lack of trouble reports suggests that this is only a latent bug, i.e. parallel workers don't actually reach here with elevel < ERROR. (Per the code coverage report, we never reach here at all in the regression suite.) But we clearly don't want to risk proceeding if that does happen. Per report from Rıdvan Korkmaz. These are ancient bugs, so back-patch to all supported branches. Discussion: https://postgr.es/m/2089235.1703617353@sss.pgh.pa.us
2023-12-26	Hide warnings from Python headers when using gcc-compatible compiler.	Tom Lane
	Like commit 388e80132, use "#pragma GCC system_header" to silence warnings appearing within the Python headers, since newer Python versions no longer worry about some restrictions we still use like -Wdeclaration-after-statement. This patch improves on 388e80132 by inventing a separate wrapper header file, allowing the pragma to be tightly scoped to just the Python headers and not other stuff we have laying about in plpython.h. I applied the same technique to plperl for the same reason: the original patch suppressed warnings for a good deal of our own code, not only the Perl headers. Like the previous commit, back-patch to supported branches. Peter Eisentraut and Tom Lane Discussion: https://postgr.es/m/ae523163-6d2a-4b81-a875-832e48dec502@eisentraut.org
2023-12-21	Avoid trying to fetch metapage of an SPGist partitioned index.	Tom Lane
	This is necessary when spgcanreturn() is invoked on a partitioned index, and the failure might be reachable in other scenarios as well. The rest of what spgGetCache() does is perfectly sensible for a partitioned index, so we should allow it to go through. I think the main takeaway from this is that we lack sufficient test coverage for non-btree partitioned indexes. Therefore, I added simple test cases for brin and gin as well as spgist (hash and gist AMs were covered already in indexing.sql). Per bug #18256 from Alexander Lakhin. Although the known test case only fails since v16 (3c569049b), I've got no faith at all that there aren't other ways to reach this problem; so back-patch to all supported branches. Discussion: https://postgr.es/m/18256-0b0e1b6e4a620f1b@postgresql.org
2023-12-15	Fix bugs in manipulation of large objects.	Tom Lane
	In v16 and up (since commit afbfc0298), large object ownership checking has been broken because object_ownercheck() didn't take care of the discrepancy between our object-address representation of large objects (classId == LargeObjectRelationId) and the catalog where their ownership info is actually stored (LargeObjectMetadataRelationId). This resulted in failures such as "unrecognized class ID: 2613" when trying to update blob properties as a non-superuser. Poking around for related bugs, I found that AlterObjectOwner_internal would pass the wrong classId to the PostAlterHook in the no-op code path where the large object already has the desired owner. Also, recordExtObjInitPriv checked for the wrong classId; that bug is only latent because the stanza is dead code anyway, but as long as we're carrying it around it should be less wrong. These bugs are quite old. In HEAD, we can reduce the scope for future bugs of this ilk by changing AlterObjectOwner_internal's API to let the translation happen inside that function, rather than requiring callers to know about it. A more bulletproof fix, perhaps, would be to start using LargeObjectMetadataRelationId as the dependency and object-address classId for blobs. However that has substantial risk of breaking third-party code; even within our own code, it'd create hassles for pg_dump which would have to cope with a version-dependent representation. For now, keep the status quo. Discussion: https://postgr.es/m/2650449.1702497209@sss.pgh.pa.us
2023-12-12	Prevent tuples to be marked as dead in subtransactions on standbys	Michael Paquier
	Dead tuples are ignored and are not marked as dead during recovery, as it can lead to MVCC issues on a standby because its xmin may not match with the primary. This information is tracked by a field called "xactStartedInRecovery" in the transaction state data, switched on when starting a transaction in recovery. Unfortunately, this information was not correctly tracked when starting a subtransaction, because the transaction state used for the subtransaction did not update "xactStartedInRecovery" based on the state of its parent. This would cause index scans done in subtransactions to return inconsistent data, depending on how the xmin of the primary and/or the standby evolved. This is broken since the introduction of hot standby in efc16ea52067, so backpatch all the way down. Author: Fei Changhong Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com Backpatch-through: 12
2023-12-12	Fix typo in comment	Daniel Gustafsson
	Commit 98e675ed7af accidentally mistyped IDENTIFY_SYSTEM as IDENTIFY_SERVER. Backpatch to all supported branches. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/68138521-5345-8780-4390-1474afdcba1f@gmail.com
2023-12-11	Be more wary about OpenSSL not setting errno on error.	Tom Lane
	OpenSSL will sometimes return SSL_ERROR_SYSCALL without having set errno; this is apparently a reflection of recv(2)'s habit of not setting errno when reporting EOF. Ensure that we treat such cases the same as read EOF. Previously, we'd frequently report them like "could not accept SSL connection: Success" which is confusing, or worse report them with an unrelated errno left over from some previous syscall. To fix, ensure that errno is zeroed immediately before the call, and report its value only when it's not zero afterwards; otherwise report EOF. For consistency, I've applied the same coding pattern in libpq's pqsecure_raw_read(). Bare recv(2) shouldn't really return -1 without setting errno, but in case it does we might as well cope. Per report from Andres Freund. Back-patch to all supported versions. Discussion: https://postgr.es/m/20231208181451.deqnflwxqoehhxpe@awork3.anarazel.de
2023-12-11	Fix an undetected deadlock due to apply worker.	Amit Kapila
	The apply worker needs to update the state of the subscription tables to 'READY' during the synchronization phase which requires locking the corresponding subscription. The apply worker also waits for the subscription tables to reach the 'SYNCDONE' state after holding the locks on the subscription and the wait is done using WaitLatch. The 'SYNCDONE' state is changed by tablesync workers again by locking the corresponding subscription. Both the state updates use AccessShareLock mode to lock the subscription, so they can't block each other. However, a backend can simultaneously try to acquire a lock on the same subscription using AccessExclusiveLock mode to alter the subscription. Now, the backend's wait on a lock can sneak in between the apply worker and table sync worker causing deadlock. In other words, apply_worker waits for tablesync worker which waits for backend, and backend waits for apply worker. This is not detected by the deadlock detector because apply worker uses WaitLatch. The fix is to release existing locks in apply worker before it starts to wait for tablesync worker to change the state. Reported-by: Tomas Vondra Author: Shlok Kyal Reviewed-by: Amit Kapila, Peter Smith Backpatch-through: 12 Discussion: https://postgr.es/m/d291bb50-12c4-e8af-2af2-7bb9bb4d8e3e@enterprisedb.com
2023-12-06	Fix compilation on Windows with WAL_DEBUG	Michael Paquier
	This has been broken since b060dbe0001a that has reworked the callback mechanism of XLogReader, most likely unnoticed because any form of development involving WAL happens on platforms where this compiles fine. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVF14WKQMFwcJ=3okVDhiXpuK5f7YdT+BdYXbbypMHqWA@mail.gmail.com Backpatch-through: 13
2023-12-05	Fix incorrect error message for IDENTIFY_SYSTEM	Daniel Gustafsson
	Commit 5a991ef8692e accidentally reversed the order of the tuples and fields parameters, making the error message incorrectly refer to 3 tuples with 1 field when IDENTIFY_SYSTEM returns 1 tuple and 3 or 4 fields. Fix by changing the order of the parameters. This also adds a comment describing why we check for < 3 when postgres since 9.4 has been sending 4 fields. Backpatch all the way since the bug is almost a decade old. Author: Tomonari Katsumata <t.katsumata1122@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Bug: #18224 Backpatch-through: v12
2023-12-05	Fix handling of errors in libpq pipelines	Alvaro Herrera
	The logic to keep the libpq command queue in sync with queries that have been processed had a bug when errors were returned for reasons other than problems in queries -- for example, when a connection is lost. We incorrectly consumed an element from the command queue every time, but this is wrong and can lead to the queue becoming empty ahead of time, leading to later malfunction: PQgetResult would return nothing, potentially causing the calling application to enter a busy loop. Fix by making the SYNC queue element a barrier that can only be consumed when a SYNC message is received. Backpatch to 14. Reported by: Иван Трофимов (Ivan Trofimov) <i.trofimow@yandex.ru> Discussion: https://postgr.es/m/17948-fcace7557e449957@postgresql.org
2023-12-04	Don't use pgbench -j in tests	Alvaro Herrera
	It draws an unnecessary error in builds compiled without thread support. Added by commit 038f586d5f1d, which was backpatched to 14; though in branch master we no longer support such builds, there's no reason to have this there, so remove it in all branches since 14. Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/ZW2G9Ix4nBKLcSSO@paquier.xyz
2023-12-01	Check collation when creating partitioned index	Peter Eisentraut
	When creating a partitioned index, the partition key must be a subset of the index's columns. But this currently doesn't check that the collations between the partition key and the index definition match. So you can construct a unique index that fails to enforce uniqueness. (This would most likely involve a nondeterministic collation, so it would have to be crafted explicitly and is not something that would just happen by accident.) This patch adds the required collation check. As a result, any previously allowed unique index that has a collation mismatch would no longer be allowed to be created. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/3327cb54-f7f1-413b-8fdb-7a9dceebb938%40eisentraut.org
2023-11-28	Use BIO_{get,set}_app_data instead of BIO_{get,set}_data.	Tom Lane
	We should have done it this way all along, but we accidentally got away with using the wrong BIO field up until OpenSSL 3.2. There, the library's BIO routines that we rely on use the "data" field for their own purposes, and our conflicting use causes assorted weird behaviors up to and including core dumps when SSL connections are attempted. Switch to using the approved field for the purpose, i.e. app_data. While at it, remove our configure probes for BIO_get_data as well as the fallback implementation. BIO_{get,set}_app_data have been there since long before any OpenSSL version that we still support, even in the back branches. Also, update src/test/ssl/t/001_ssltests.pl to allow for a minor change in an error message spelling that evidently came in with 3.2. Tristan Partin and Bo Andreson. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAN55FZ1eDDYsYaL7mv+oSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ@mail.gmail.com
2023-11-28	Fix assertions with RI triggers in heap_update and heap_delete.	Heikki Linnakangas
	If the tuple being updated is not visible to the crosscheck snapshot, we return TM_Updated but the assertions would not hold in that case. Move them to before the cross-check. Fixes bug #17893. Backpatch to all supported versions. Author: Alexander Lakhin Backpatch-through: 12 Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org
2023-11-27	Fix race condition with BIO methods initialization in libpq with threads	Michael Paquier
	The libpq code in charge of creating per-connection SSL objects was prone to a race condition when loading the custom BIO methods needed by my_SSL_set_fd(). As BIO methods are stored as a static variable, the initialization of a connection could fail because it could be possible to have one thread refer to my_bio_methods while it is being manipulated by a second concurrent thread. This error has been introduced by 8bb14cdd33de, that has removed ssl_config_mutex around the call of my_SSL_set_fd(), that itself sets the custom BIO methods used in libpq. Like previously, the BIO method initialization is now protected by the existing ssl_config_mutex, itself initialized earlier for WIN32. While on it, document that my_bio_methods is protected by ssl_config_mutex, as this can be easy to miss. Reported-by: Willi Mann Author: Willi Mann, Michael Paquier Discussion: https://postgr.es/m/e77abc4c-4d03-4058-a9d7-ef0035657e04@celonis.com Backpatch-through: 12
2023-11-23	Fix timing-dependent failure in GSSAPI data transmission.	Tom Lane
	When using GSSAPI encryption in non-blocking mode, libpq sometimes failed with "GSSAPI caller failed to retransmit all data needing to be retried". The cause is that pqPutMsgEnd rounds its transmit request down to an even multiple of 8K, and sometimes that can lead to not requesting a write of data that was requested to be written (but reported as not written) earlier. That can upset pg_GSS_write's logic for dealing with not-yet-written data, since it's possible the data in question had already been incorporated into an encrypted packet that we weren't able to send during the previous call. We could fix this with a one-or-two-line hack to disable pqPutMsgEnd's round-down behavior, but that seems like making the caller work around a behavior that pg_GSS_write shouldn't expose in this way. Instead, adjust pg_GSS_write to never report a partial write: it either reports a complete write, or reflects the failure of the lower-level pqsecure_raw_write call. The requirement still exists for the caller to present at least as much data as on the previous call, but with the caller-visible write start point not moving there is no temptation for it to present less. We lose some ability to reclaim buffer space early, but I doubt that that will make much difference in practice. This also gets rid of a rather dubious assumption that "any interesting failure condition (from pqsecure_raw_write) will recur on the next try". We've not seen failure reports traceable to that, but I've never trusted it particularly and am glad to remove it. Make the same adjustments to the equivalent backend routine be_gssapi_write(). It is probable that there's no bug on the backend side, since we don't have a notion of nonblock mode there; but we should keep the logic the same to ease future maintenance. Per bug #18210 from Lars Kanis. Back-patch to all supported branches. Discussion: https://postgr.es/m/18210-4c6d0b14627f2eb8@postgresql.org
2023-11-23	Fix resource leak when a FDW's ForeignAsyncRequest function fails	Heikki Linnakangas
	If an error is thrown after calling CreateWaitEventSet(), the memory of a WaitEventSet is free'd as it's allocated in the short-lived memory context, but the file descriptor (on epoll- or kqueue-based systems) or handles (on Windows) that it contains are leaked. Use PG_TRY-FINALLY to ensure it gets freed. (On master, I will apply a better fix, using ResourceOwners to track the WaitEventSet, but that's not backpatchable.) The added test doesn't check for leaking resources, so it passed even before this commit. But at least it covers the code path. In the passing, fix misleading comment on what the 'nevents' argument to WaitEventSetWait means. Report by Alexander Lakhin, analysis and suggestion for the fix by Tom Lane. Fixes bug #17828. Backpatch to v14 where async execution was introduced, but master gets a different fix. Discussion: https://www.postgresql.org/message-id/17828-122da8cba23236be@postgresql.org Discussion: https://www.postgresql.org/message-id/472235.1678387869@sss.pgh.pa.us
2023-11-22	Fix query checking consistency of table amhandlers in opr_sanity.sql	Michael Paquier
	As written, the query checked for an access method of type 's', which is not an AM type supported in the core code. Error introduced by 8586bf7ed888. As this query is not checking what it should, backpatch all the way down. Reviewed-by: Aleksander Alekseev Discussion: https://postgr.es/m/ZVxJkAJrKbfHETiy@paquier.xyz Backpatch-through: 12
2023-11-19	Lock table in DROP STATISTICS	Tomas Vondra
	The DROP STATISTICS code failed to properly lock the table, leading to ERROR: tuple concurrently deleted when executed concurrently with ANALYZE. Fixed by modifying RemoveStatisticsById() to acquire the same lock as ANALYZE. This function is called only by DROP STATISTICS, as ANALYZE calls RemoveStatisticsDataById() directly. Reported by Justin Pryzby, fix by me. Backpatch through 12. The code was like this since it was introduced in 10, but older releases are EOL. Reported-by: Justin Pryzby Reviewed-by: Tom Lane Backpatch-through: 12 Discussion: https://postgr.es/m/ZUuk-8CfbYeq6g_u@pryzbyj2023