summaryrefslogtreecommitdiff
path: root/contrib/postgres_fdw
AgeCommit message (Collapse)Author
2025-05-30Fix memory leakage in postgres_fdw's DirectModify code path.Tom Lane
postgres_fdw tries to use PG_TRY blocks to ensure that it will eventually free the PGresult created by the remote modify command. However, it's fundamentally impossible for this scheme to work reliably when there's RETURNING data, because the query could fail in between invocations of postgres_fdw's DirectModify methods. There is at least one instance of exactly this situation in the regression tests, and the ensuing session-lifespan leak is visible under Valgrind. We can improve matters by using a memory context reset callback attached to the ExecutorState context. That ensures that the PGresult will be freed when the ExecutorState context is torn down, even if control never reaches postgresEndDirectModify. I have little faith that there aren't other potential PGresult leakages in the backend modules that use libpq. So I think it'd be a good idea to apply this concept universally by creating infrastructure that attaches a reset callback to every PGresult generated in the backend. However, that seems too invasive for v18 at this point, let alone the back branches. So for the moment, apply this narrow fix that just makes DirectModify safe. I have a patch in the queue for the more general idea, but it will have to wait for v19. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us Backpatch-through: 13
2025-01-29Handle default NULL insertion a little better.Tom Lane
If a column is omitted in an INSERT, and there's no column default, the code in preptlist.c generates a NULL Const to be inserted. Furthermore, if the column is of a domain type, we wrap the Const in CoerceToDomain, so as to throw a run-time error if the domain has a NOT NULL constraint. That's fine as far as it goes, but there are two problems: 1. We're being sloppy about the type/typmod that the Const is labeled with. It really should have the domain's base type/typmod, since it's the input to CoerceToDomain not the output. This can result in coerce_to_domain inserting a useless length-coercion function (useless because it's being applied to a null). The coercion would typically get const-folded away later, but it'd be better not to create it in the first place. 2. We're not applying expression preprocessing (specifically, eval_const_expressions) to the resulting expression tree. The planner's primary expression-preprocessing pass already happened, so that means the length coercion step and CoerceToDomain node miss preprocessing altogether. This is at the least inefficient, since it means the length coercion and CoerceToDomain will actually be executed for each inserted row, though they could be const-folded away in most cases. Worse, it seems possible that missing preprocessing for the length coercion could result in an invalid plan (for example, due to failing to perform default-function-argument insertion). I'm not aware of any live bug of that sort with core datatypes, and it might be unreachable for extension types as well because of restrictions of CREATE CAST, but I'm not entirely convinced that it's unreachable. Hence, it seems worth back-patching the fix (although I only went back to v14, as the patch doesn't apply cleanly at all in v13). There are several places in the rewriter that are building null domain constants the same way as preptlist.c. While those are before the planner and hence don't have any reachable bug, they're still applying a length coercion that will be const-folded away later, uselessly wasting cycles. Hence, make a utility routine that all of these places can call to do it right. Making this code more careful about the typmod assigned to the generated NULL constant has visible but cosmetic effects on some of the plans shown in contrib/postgres_fdw's regression tests. Discussion: https://postgr.es/m/1865579.1738113656@sss.pgh.pa.us Backpatch-through: 14
2024-10-07Fix Y2038 issues with MyStartTime.Nathan Bossart
Several places treat MyStartTime as a "long", which is only 32 bits wide on some platforms. In reality, MyStartTime is a pg_time_t, i.e., a signed 64-bit integer. This will lead to interesting bugs on the aforementioned systems in 2038 when signed 32-bit integers are no longer sufficient to store Unix time (e.g., "pg_ctl start" hanging). To fix, ensure that MyStartTime is handled as a 64-bit value everywhere. (Of course, users will need to ensure that time_t is 64 bits wide on their system, too.) Co-authored-by: Max Johnson Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com Backpatch-through: 12
2024-08-05Restrict accesses to non-system views and foreign tables during pg_dump.Masahiko Sawada
When pg_dump retrieves the list of database objects and performs the data dump, there was possibility that objects are replaced with others of the same name, such as views, and access them. This vulnerability could result in code execution with superuser privileges during the pg_dump process. This issue can arise when dumping data of sequences, foreign tables (only 13 or later), or tables registered with a WHERE clause in the extension configuration table. To address this, pg_dump now utilizes the newly introduced restrict_nonsystem_relation_kind GUC parameter to restrict the accesses to non-system views and foreign tables during the dump process. This new GUC parameter is added to back branches too, but these changes do not require cluster recreation. Back-patch to all supported branches. Reviewed-by: Noah Misch Security: CVE-2024-7348 Backpatch-through: 12
2024-07-19postgres_fdw: Avoid "cursor can only scan forward" error.Etsuro Fujita
Commit d844cd75a disallowed rewind in a non-scrollable cursor to resolve anomalies arising from such a cursor operation. However, this failed to take into account the assumption in postgres_fdw that when rescanning a foreign relation, it can rewind the cursor created for scanning the foreign relation without specifying the SCROLL option, regardless of its scrollability, causing this error when it tried to do such a rewind in a non-scrollable cursor. Fix by modifying postgres_fdw to instead recreate the cursor, regardless of its scrollability, when rescanning the foreign relation. (If we had a way to check its scrollability, we could improve this by rewinding it if it is scrollable and recreating it if not, but we do not have it, so this commit modifies it to recreate it in any case.) Per bug #17889 from Eric Cyr. Devrim Gunduz also reported this problem. Back-patch to v15 where that commit enforced the prohibition. Reviewed by Tom Lane. Discussion: https://postgr.es/m/17889-e8c39a251d258dda%40postgresql.org Discussion: https://postgr.es/m/b415ac3255f8352d1ea921cf3b7ba39e0587768a.camel%40gunduz.org
2024-06-07postgres_fdw: Refuse to send FETCH FIRST WITH TIES to remote servers.Etsuro Fujita
Previously, when considering LIMIT pushdown, postgres_fdw failed to check whether the query has this clause, which led to pushing false LIMIT clauses, causing incorrect results. This clause has been supported since v13, so we need to do a remote-version check before deciding that it will be safe to push such a clause, but we do not currently have a way to do the check (without accessing the remote server); disable pushing such a clause for now. Oversight in commit 357889eb1. Back-patch to v13, where that commit added the support. Per bug #18467 from Onder Kalaci. Patch by Japin Li, per a suggestion from Tom Lane, with some changes to the comments by me. Review by Onder Kalaci, Alvaro Herrera, and me. Discussion: https://postgr.es/m/18467-7bb89084ff03a08d%40postgresql.org
2024-04-21Make postgres_fdw request remote time zone 'GMT' not 'UTC'.Tom Lane
This should have the same results for all practical purposes. The advantage of selecting 'GMT' is that it's guaranteed to work even when the remote system's timezone database is missing entries, because pg_tzset() hard-wires handling of that, at least in 9.2 and later. (It seems like it would be a good idea to similarly hard-wire correct handling of 'UTC', but that'll be a little more invasive than I want to consider back-patching. Leave that for another day when we're not in feature freeze.) Per trouble report from Adnan Dautovic. Back-patch to all supported branches. Discussion: https://postgr.es/m/465248.1712211585@sss.pgh.pa.us
2024-04-04Fix bogus coding in ExecAppendAsyncEventWait().Etsuro Fujita
No configured-by-FDW events would result in "return" directly out of a PG_TRY block, making the exception stack dangling. Repair. Oversight in commit 501cfd07d; back-patch to v14, like that commit, but as we do not have this issue in HEAD (cf. commit 50c67c201), no need to apply this patch to it. In passing, improve a comment about the handling of in-process requests in a postgres_fdw.c function called from this function. Alexander Pyhalov, with comment adjustment/improvement by me. Discussion: https://postgr.es/m/425fa29a429b21b0332737c42a4fdc70%40postgrespro.ru
2024-03-11Fix deparsing of Consts in postgres_fdw ORDER BYDavid Rowley
For UNION ALL queries where a union child query contained a foreign table, if the targetlist of that query contained a constant, and the top-level query performed an ORDER BY which contained the column for the constant value, then postgres_fdw would find the EquivalenceMember with the Const and then try to produce an ORDER BY containing that Const. This caused problems with INT typed Consts as these could appear to be requests to order by an ordinal column position rather than the constant value. This could lead to either an error such as: ERROR: ORDER BY position <int const> is not in select list or worse, if the constant value is a valid column, then we could just sort by the wrong column altogether. Here we fix this issue by just not including these Consts in the ORDER BY clause. In passing, add a new section for testing ORDER BY in the postgres_fdw tests and move two existing tests which were misplaced in the WHERE clause testing section into it. Reported-by: Michał Kłeczek Reviewed-by: Ashutosh Bapat, Richard Guo Bug: #18381 Discussion: https://postgr.es/m/0714C8B8-8D82-4ABB-9F8D-A0C3657E7B6E%40kleczek.org Discussion: https://postgr.es/m/18381-137456acd168bf93%40postgresql.org Backpatch-through: 12, oldest supported version
2023-11-23Fix resource leak when a FDW's ForeignAsyncRequest function failsHeikki Linnakangas
If an error is thrown after calling CreateWaitEventSet(), the memory of a WaitEventSet is free'd as it's allocated in the short-lived memory context, but the file descriptor (on epoll- or kqueue-based systems) or handles (on Windows) that it contains are leaked. Use PG_TRY-FINALLY to ensure it gets freed. (On master, I will apply a better fix, using ResourceOwners to track the WaitEventSet, but that's not backpatchable.) The added test doesn't check for leaking resources, so it passed even before this commit. But at least it covers the code path. In the passing, fix misleading comment on what the 'nevents' argument to WaitEventSetWait means. Report by Alexander Lakhin, analysis and suggestion for the fix by Tom Lane. Fixes bug #17828. Backpatch to v14 where async execution was introduced, but master gets a different fix. Discussion: https://www.postgresql.org/message-id/17828-122da8cba23236be@postgresql.org Discussion: https://www.postgresql.org/message-id/472235.1678387869@sss.pgh.pa.us
2023-08-30postgres_fdw: Fix test for parameterized foreign scan.Etsuro Fujita
Commit e4106b252 should have updated this test, but did not; back-patch to all supported branches. Reviewed by Richard Guo. Discussion: http://postgr.es/m/CAPmGK15nR0NXLSCKQAcqbZbTzrzd5MozowWnTnGfPkayndF43Q%40mail.gmail.com
2023-07-28Disallow replacing joins with scans in problematic cases.Etsuro Fujita
Commit e7cb7ee14, which introduced the infrastructure for FDWs and custom scan providers to replace joins with scans, failed to add support handling of pseudoconstant quals assigned to replaced joins in createplan.c, leading to an incorrect plan without a gating Result node when postgres_fdw replaced a join with such a qual. To fix, we could add the support by 1) modifying the ForeignPath and CustomPath structs to store the list of RestrictInfo nodes to apply to the join, as in JoinPaths, if they represent foreign and custom scans replacing a join with a scan, and by 2) modifying create_scan_plan() in createplan.c to use that list in that case, instead of the baserestrictinfo list, to get pseudoconstant quals assigned to the join; but #1 would cause an ABI break. So fix by modifying the infrastructure to just disallow replacing joins with such quals. Back-patch to all supported branches. Reported by Nishant Sharma. Patch by me, reviewed by Nishant Sharma and Richard Guo. Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com
2023-07-03Remove expensive test of postgres_fdw batch insertsTomas Vondra
The test inserted 70k rows into a foreign table, in order to verify correct behavior with more than 65535 parameters, and was added in response to a bug report. However, this is rather expensive, especially when running the tests under valgrind, CLOBBER_CACHE_ALWAYS etc. It doesn't seem worth it to keep running the test, so remove it from all branches (14+). Backpatch-through: 14 Discussion: https://postgr.es/m/2131017.1623451468@sss.pgh.pa.us
2023-04-25Fix buffer refcount leak with FDW bulk insertsMichael Paquier
The leak would show up when using batch inserts with foreign tables included in a partition tree, as the slots used in the batch were not reset once processed. In order to fix this problem, some ExecClearTuple() are added to clean up the slots used once a batch is filled and processed, mapping with the number of slots currently in use as tracked by the counter ri_NumSlots. This buffer refcount leak has been introduced in b676ac4 with the addition of the executor facility to improve bulk inserts for FDWs, so backpatch down to 14. Alexander has provided the patch (slightly modified by me). The test for postgres_fdw comes from me, based on the test case that the author has sent in the report. Author: Alexander Pyhalov Discussion: https://postgr.es/m/b035780a740efd38dc30790c76927255@postgrespro.ru Backpatch-through: 14
2023-02-27Drop test view when done with it.Tom Lane
The view just added by commit 53fe7e6cb decompiles differently in v15 than HEAD (presumably as a consequence of 47bb9db75). That causes failures in cross-version upgrade testing. We could teach AdjustUpgrade.pm to compensate for that, but it seems less painful to just drop the view after we're done with it. Per buildfarm.
2023-02-27Harden postgres_fdw tests against unexpected cache flushes.Tom Lane
postgres_fdw will close its remote session if an sinval cache reset occurs, since it's possible that that means some FDW parameters changed. We had two tests that were trying to ensure that the session remains alive by setting debug_discard_caches = 0; but that's not sufficient. Even though the tests seem stable enough in the buildfarm, they flap a lot under CI. In the first test, which is checking the ability to recover from a lost connection, we can stabilize the results by just not caring whether pg_terminate_backend() finds a victim backend. If a reset did happen, there won't be a session to terminate anymore, but the test can proceed anyway. (Arguably, we are then not testing the unintentional-disconnect case, but as long as that scenario is exercised in most runs I think it's fine; testing the reset-driven case is of value too.) In the second test, which is trying to verify the application_name displayed in pg_stat_activity by a remote session, we had a race condition in that the remote session might go away before we can fetch its pg_stat_activity entry. We can close that race and make the test more certainly test what it intends to by arranging things so that the remote session itself fetches its pg_stat_activity entry (based on PID rather than a somewhat-circular assumption about the application name). Both tests now demonstrably pass under debug_discard_caches = 1, so we can remove that hack. Back-patch into relevant back branches. Discussion: https://postgr.es/m/20230226194340.u44bkfgyz64c67i6@awork3.anarazel.de
2023-02-21Fix handling of escape sequences in postgres_fdw.application_nameMichael Paquier
postgres_fdw.application_name relies on MyProcPort to define the data that should be added to escape sequences %u (user name) or %d (database name). However this code could be run in processes that lack a MyProcPort, like an autovacuum process, causing crashes. The code generating the application name is made more flexible with this commit, so as it now generates no data for %u and %d if MyProcPort is missing, and a simple "unknown" if MyProcPort exists, but the expected fields are not set. Reported-by: Alexander Lakhin Author: Kyotaro Horiguchi, Michael Paquier Reviewed-by: Hayato Kuroda, Masahiko Sawada Discussion: https://postgr.es/m/17789-8b31c5a4672b74d9@postgresql.org Backpatch-through: 15
2023-01-05Fix calculation of which GENERATED columns need to be updated.Tom Lane
We were identifying the updatable generated columns of inheritance children by transposing the calculation made for their parent. However, there's nothing that says a traditional-inheritance child can't have generated columns that aren't there in its parent, or that have different dependencies than are in the parent's expression. (At present it seems that we don't enforce that for partitioning either, which is likely wrong to some degree or other; but the case clearly needs to be handled with traditional inheritance.) Hence, drop the very-klugy-anyway "extraUpdatedCols" RTE field in favor of identifying which generated columns depend on updated columns during executor startup. In HEAD we can remove extraUpdatedCols altogether; in back branches, it's still there but always empty. Another difference between the HEAD and back-branch versions of this patch is that in HEAD we can add the new bitmap field to ResultRelInfo, but that would cause an ABI break in back branches. Like 4b3e37993, add a List field at the end of struct EState instead. Back-patch to v13. The bogus calculation is also being made in v12, but it doesn't have the same visible effect because we don't use it to decide which generated columns to recalculate; as a consequence of which the patch doesn't apply easily. I think that there might still be a demonstrable bug associated with trigger firing conditions, but that's such a weird corner-case usage that I'm content to leave it unfixed in v12. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqFshLKNvQUd1DgwJ-7tsTp=dwv7KZqXC4j2wYBV1aCDUA@mail.gmail.com Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
2022-11-25Fix handling of pending inserts in nodeModifyTable.c.Etsuro Fujita
Commit b663a4136, which allowed FDWs to INSERT rows in bulk, added to nodeModifyTable.c code to flush pending inserts to the foreign-table result relation(s) before completing processing of the ModifyTable node, but the code failed to take into account the case where the INSERT query has modifying CTEs, leading to incorrect results. Also, that commit failed to flush pending inserts before firing BEFORE ROW triggers so that rows are visible to such triggers. In that commit we scanned through EState's es_tuple_routing_result_relations or es_opened_result_relations list to find the foreign-table result relations to which pending inserts are flushed, but that would be inefficient in some cases. So to fix, 1) add a List member to EState to record the insert-pending result relations, and 2) modify nodeModifyTable.c so that it adds the foreign-table result relation to the list in ExecInsert() if appropriate, and flushes pending inserts properly using the list where needed. While here, fix a copy-and-pasteo in a comment in ExecBatchInsert(), which was added by that commit. Back-patch to v14 where that commit appeared. Discussion: https://postgr.es/m/CAPmGK16qutyCmyJJzgQOhfBq%3DNoGDqTB6O0QBZTihrbqre%2BoxA%40mail.gmail.com
2022-10-18Rename SetSingleFuncCall() to InitMaterializedSRF()Michael Paquier
Per discussion, the existing routine name able to initialize a SRF function with materialize mode is unpopular, so rename it. Equally, the flags of this function are renamed, as of: - SRF_SINGLE_USE_EXPECTED -> MAT_SRF_USE_EXPECTED_DESC - SRF_SINGLE_BLESS -> MAT_SRF_BLESS The previous function and flags introduced in 9e98583 are kept around for compatibility purposes, so as any extension code already compiled with v15 continues to work as-is. The declarations introduced here for compatibility will be removed from HEAD in a follow-up commit. The new names have been suggested by Andres Freund and Melanie Plageman. Discussion: https://postgr.es/m/20221013194820.ciktb2sbbpw7cljm@awork3.anarazel.de Backpatch-through: 15
2022-10-15Disallow MERGE cleanly for foreign partitionsAlvaro Herrera
While directly targetting a foreign table with MERGE was already expressly forbidden, we failed to catch the case of a partitioned table that has a foreign table as a partition; and the result if you try is an incomprehensible error. Fix that by adding a specific check. Backpatch to 15. Reported-by: Tatsuhiro Nakamori <bt22nakamorit@oss.nttdata.com> Discussion: https://postgr.es/m/bt22nakamorit@oss.nttdata.com
2022-10-03Revert "Optimize order of GROUP BY keys".Tom Lane
This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and several follow-on fixes. The idea of making a cost-based choice of the order of the sorting columns is not fundamentally unsound, but it requires cost information and data statistics that we don't really have. For example, relying on procost to distinguish the relative costs of different sort comparators is pretty pointless so long as most such comparator functions are labeled with cost 1.0. Moreover, estimating the number of comparisons done by Quicksort requires more than just an estimate of the number of distinct values in the input: you also need some idea of the sizes of the larger groups, if you want an estimate that's good to better than a factor of three or so. That's data that's often unknown or not very reliable. Worse, to arrive at estimates of the number of calls made to the lower-order-column comparison functions, the code needs to make estimates of the numbers of distinct values of multiple columns, which are necessarily even less trustworthy than per-column stats. Even if all the inputs are perfectly reliable, the cost algorithm as-implemented cannot offer useful information about how to order sorting columns beyond the point at which the average group size is estimated to drop to 1. Close inspection of the code added by db0d67db2 shows that there are also multiple small bugs. These could have been fixed, but there's not much point if we don't trust the estimates to be accurate in-principle. Finally, the changes in cost_sort's behavior made for very large changes (often a factor of 2 or so) in the cost estimates for all sorting operations, not only those for multi-column GROUP BY. That naturally changes plan choices in many situations, and there's precious little evidence to show that the changes are for the better. Given the above doubts about whether the new estimates are really trustworthy, it's hard to summon much confidence that these changes are better on the average. Since we're hard up against the release deadline for v15, let's revert these changes for now. We can always try again later. Note: in v15, I left T_PathKeyInfo in place in nodes.h even though it's unreferenced. Removing it would be an ABI break, and it seems a bit late in the release cycle for that. Discussion: https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com
2022-09-14postgres_fdw: Avoid 'variable not found in subplan target list' error.Etsuro Fujita
The tlist of the EvalPlanQual outer plan for a ForeignScan node is adjusted to produce a tuple whose descriptor matches the scan tuple slot for the ForeignScan node. But in the case where the outer plan contains an extra Sort node, if the new tlist contained columns required only for evaluating PlaceHolderVars or columns required only for evaluating local conditions, this would cause setrefs.c to fail with the error. The cause of this is that when creating the outer plan by injecting the Sort node into an alternative local join plan that could emit such extra columns as well, we fail to arrange for the outer plan to propagate them up through the Sort node, causing setrefs.c to fail to match up them in the new tlist to what is available from the outer plan. Repair. Per report from Alexander Pyhalov. Richard Guo and Etsuro Fujita, reviewed by Alexander Pyhalov and Tom Lane. Backpatch to all supported versions. Discussion: http://postgr.es/m/cfb17bf6dfdf876467bd5ef533852d18%40postgrespro.ru
2022-08-05postgres_fdw: Disable batch insertion when there are WCO constraints.Etsuro Fujita
When inserting a view referencing a foreign table that has WITH CHECK OPTION constraints, in single-insert mode postgres_fdw retrieves the data that was actually inserted on the remote side so that the WITH CHECK OPTION constraints are enforced with the data locally, but in batch-insert mode it cannot currently retrieve the data (except for the row first inserted through the view), resulting in enforcing the WITH CHECK OPTION constraints with the data passed from the core (except for the first-inserted row), which led to incorrect results when inserting into a view referencing a foreign table in which a remote BEFORE ROW INSERT trigger changes the rows inserted through the view so that they violate the view's WITH CHECK OPTION constraint. Also, the query inserting into the view caused an assertion failure in assert-enabled builds. Fix these by disabling batch insertion when inserting into such a view. Back-patch to v14 where batch insertion was added. Discussion: https://postgr.es/m/CAPmGK17LpbTZs4m4a_6THP54UBeK9fHvX8aVVA%2BC6yEZDZwQcg%40mail.gmail.com
2022-07-22postgres_fdw: Fix bug in checking of return value of PQsendQuery().Fujii Masao
When postgres_fdw begins an asynchronous data fetch, it submits FETCH query by using PQsendQuery(). If PQsendQuery() fails and returns 0, postgres_fdw should report an error. But, previously, postgres_fdw reported an error only when the return value is less than 0, though PQsendQuery() never return the values other than 0 and 1. Therefore postgres_fdw could not handle the failure to send FETCH query in an asynchronous data fetch. This commit fixes postgres_fdw so that it reports an error when PQsendQuery() returns 0. Back-patch to v14 where asynchronous execution was supported in postgres_fdw. Author: Fujii Masao Reviewed-by: Japin Li, Tom Lane Discussion: https://postgr.es/m/b187a7cf-d4e3-5a32-4d01-8383677797f3@oss.nttdata.com
2022-07-20Tweak detail and hint messages to be consistent with project policyMichael Paquier
Detail and hint messages should be full sentences and should end with a period, but some of the messages newly-introduced in v15 did not follow that. Author: Justin Pryzby Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/20220719120948.GF12702@telsasoft.com Backpatch-through: 15
2022-07-17postgres_fdw: set search_path to 'pg_catalog' while deparsing constants.Tom Lane
The motivation for this is to ensure successful transmission of the values of constants of regconfig and other reg* types. The remote will be reading them with search_path = 'pg_catalog', so schema qualification is necessary when referencing objects in other schemas. Per bug #17483 from Emmanuel Quincerot. Back-patch to all supported versions. (There's some other stuff to do here, but it's less back-patchable.) Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
2022-07-07postgres_fdw: Fix grammar.Etsuro Fujita
Oversight in commit 4036bcbbb; back-patch to v15 where that appeared.
2022-05-12Pre-beta mechanical code beautification.Tom Lane
Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
2022-05-12postgres_fdw: Update comments in make_new_connection().Etsuro Fujita
Expand the comment about the parallel_commit option to mention that the default is false. Also, since the comment about alteration of the keep_connections option, which was located above the expanded comment, holds true for the parallel_commit option, rewrite it to reflect this, and move it to after the expanded comment. Follow-up for commit 04e706d42. Discussion: https://postgr.es/m/CAPmGK16Kg2Bf90sqzcZ4YM5cN_G-4h7wFUS01qQpqNB%2B2BG5_w%40mail.gmail.com
2022-04-28Disable asynchronous execution if using gating Result nodes.Etsuro Fujita
mark_async_capable_plan(), which is called from create_append_plan() to determine whether subplans are async-capable, failed to take into account that the given subplan created from a given subpath might include a gating Result node if the subpath is a SubqueryScanPath or ForeignPath, causing a segmentation fault there when the subplan created from a SubqueryScanPath includes the Result node, or causing ExecAsyncRequest() to throw an error about an unrecognized node type when the subplan created from a ForeignPath includes the Result node, because in the latter case the Result node was unintentionally considered as async-capable, but we don't currently support executing Result nodes asynchronously. Fix by modifying mark_async_capable_plan() to disable asynchronous execution in such cases. Also, adjust code in the ProjectionPath case in mark_async_capable_plan(), for consistency with other cases, and adjust/improve comments there. is_async_capable_path() added in commit 27e1f1456, which was rewritten to mark_async_capable_plan() in a later commit, has the same issue, causing the error at execution mentioned above, so back-patch to v14 where the aforesaid commit went in. Per report from Justin Pryzby. Etsuro Fujita, reviewed by Zhihong Yu and Justin Pryzby. Discussion: https://postgr.es/m/20220408124338.GK24419%40telsasoft.com
2022-04-21postgres_fdw: Disable batch insert when BEFORE ROW INSERT triggers exist.Etsuro Fujita
Previously, we allowed this, but such triggers might query the table to insert into and act differently if the tuples that have already been processed and prepared for insertion are not there, so disable it in such cases. Back-patch to v14 where batch insert was added. Discussion: https://postgr.es/m/CAPmGK16_uPqsmgK0-LpLSUk54_BoK13bPrhxhfjSoSTVz414hA%40mail.gmail.com
2022-04-13Remove extraneous blank lines before block-closing bracesAlvaro Herrera
These are useless and distracting. We wouldn't have written the code with them to begin with, so there's no reason to keep them. Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch
2022-04-06Allow asynchronous execution in more cases.Etsuro Fujita
In commit 27e1f1456, create_append_plan() only allowed the subplan created from a given subpath to be executed asynchronously when it was an async-capable ForeignPath. To extend coverage, this patch handles cases when the given subpath includes some other Path types as well that can be omitted in the plan processing, such as a ProjectionPath directly atop an async-capable ForeignPath, allowing asynchronous execution in partitioned-scan/partitioned-join queries with non-Var tlist expressions and more UNION queries. Andrey Lepikhov and Etsuro Fujita, reviewed by Alexander Pyhalov and Zhihong Yu. Discussion: https://postgr.es/m/659c37a8-3e71-0ff2-394c-f04428c76f08%40postgrespro.ru
2022-03-31Fix postgres_fdw to check shippability of sort clauses properly.Tom Lane
postgres_fdw would push ORDER BY clauses to the remote side without verifying that the sort operator is safe to ship. Moreover, it failed to print a suitable USING clause if the sort operator isn't default for the sort expression's type. The net result of this is that the remote sort might not have anywhere near the semantics we expect, which'd be disastrous for locally-performed merge joins in particular. We addressed similar issues in the context of ORDER BY within an aggregate function call in commit 7012b132d, but failed to notice that query-level ORDER BY was broken. Thus, much of the necessary logic already existed, but it requires refactoring to be usable in both cases. Back-patch to all supported branches. In HEAD only, remove the core code's copy of find_em_expr_for_rel, which is no longer used and really should never have been pushed into equivclass.c in the first place. Ronan Dunklau, per report from David Rowley; reviews by David Rowley, Ranier Vilela, and myself Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
2022-03-31Optimize order of GROUP BY keysTomas Vondra
When evaluating a query with a multi-column GROUP BY clause using sort, the cost may be heavily dependent on the order in which the keys are compared when building the groups. Grouping does not imply any ordering, so we're allowed to compare the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg, we simply compared keys in the order as specified in the query. This commit explores alternative ordering of the keys, trying to find a cheaper one. In principle, we might generate grouping paths for all permutations of the keys, and leave the rest to the optimizer. But that might get very expensive, so we try to pick only a couple interesting orderings based on both local and global information. When planning the grouping path, we explore statistics (number of distinct values, cost of the comparison function) for the keys and reorder them to minimize comparison costs. Intuitively, it may be better to perform more expensive comparisons (for complex data types etc.) last, because maybe the cheaper comparisons will be enough. Similarly, the higher the cardinality of a key, the lower the probability we’ll need to compare more keys. The patch generates and costs various orderings, picking the cheapest ones. The ordering of group keys may interact with other parts of the query, some of which may not be known while planning the grouping. E.g. there may be an explicit ORDER BY clause, or some other ordering-dependent operation, higher up in the query, and using the same ordering may allow using either incremental sort or even eliminate the sort entirely. The patch generates orderings and picks those minimizing the comparison cost (for various pathkeys), and then adds orderings that might be useful for operations higher up in the plan (ORDER BY, etc.). Finally, it always keeps the ordering specified in the query, on the assumption the user might have additional insights. This introduces a new GUC enable_group_by_reordering, so that the optimization may be disabled if needed. The original patch was proposed by Teodor Sigaev, and later improved and reworked by Dmitry Dolgov. Reviews by a number of people, including me, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu. Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
2022-03-25postgres_fdw: Minor cleanup for pgfdw_abort_cleanup().Etsuro Fujita
Commit 85c696112 introduced this function to deduplicate code in the transaction callback functions, but the SQL command passed as an argument to it was useless when it returned before aborting a remote transaction using the command. Modify pgfdw_abort_cleanup() so that it constructs the command when/if necessary, as before, removing the argument from it. Also update comments in pgfdw_abort_cleanup() and one of the calling functions. Etsuro Fujita, reviewed by David Zhang. Discussion: https://postgr.es/m/CAPmGK158hrd%3DZfXmgkmNFHivgh18e4oE2Gz151C2Q4OBDjZ08A%40mail.gmail.com
2022-03-08Simplify SRFs using materialize mode in contrib/ modulesMichael Paquier
9e98583 introduced a helper to centralize building their needed state (tuplestore, tuple descriptors, etc.), checking for any errors. This commit updates all places of contrib/ that can be switched to use SetSingleFuncCall() as a drop-in replacement, resulting in the removal of a lot of boilerplate code in all the modules updated by this commit. Per analysis, some places remain as they are: - pg_logdir_ls() in adminpack/ uses historically TYPEFUNC_RECORD as return type, and I suspect that changing it may cause issues at run-time with some of its past versions, down to 1.0. - dblink/ uses a wrapper function doing exactly the work of SetSingleFuncCall(). Here the switch should be possible, but rather invasive so it does not seem the extra backpatch maintenance cost. - tablefunc/, similarly, uses multiple helper functions with portions of SetSingleFuncCall() spread across the code paths of this module. Author: Melanie Plageman Discussion: https://postgr.es/m/CAAKRu_bvDPJoL9mH6eYwvBpPtTGQwbDzfJbCM-OjkSZDu5yTPg@mail.gmail.com
2022-02-24postgres_fdw: Add support for parallel commit.Etsuro Fujita
postgres_fdw commits remote (sub)transactions opened on remote server(s) in a local (sub)transaction one by one when the local (sub)transaction commits. This patch allows it to commit the remote (sub)transactions in parallel to improve performance. This is enabled by the server option "parallel_commit". The default is false. Etsuro Fujita, reviewed by Fujii Masao and David Zhang. Discussion: http://postgr.es/m/CAPmGK17dAZCXvwnfpr1eTfknTGdt%3DhYTV9405Gt5SqPOX8K84w%40mail.gmail.com
2022-02-21Disallow setting bogus GUCs within an extension's reserved namespace.Tom Lane
Commit 75d22069e tried to throw a warning for setting a custom GUC whose prefix belongs to a previously-loaded extension, if there is no such GUC defined by the extension. But that caused unstable behavior with parallel workers, because workers don't necessarily load extensions and GUCs in the same order their leader did. To make that work safely, we have to completely disallow the case. We now actually remove any such GUCs at the time of initial extension load, and then throw an error not just a warning if you try to add one later. While this might create a compatibility issue for a few people, the improvement in error-detection capability seems worth it; it's hard to believe that there's any good use-case for choosing such GUC names. This also un-reverts 5609cc01c (Rename EmitWarningsOnPlaceholders() to MarkGUCPrefixReserved()), since that function's old name is now even more of a misnomer. Florin Irion and Tom Lane Discussion: https://postgr.es/m/1902182.1640711215@sss.pgh.pa.us
2022-02-18postgres_fdw: Make postgres_fdw.application_name support more escape sequences.Fujii Masao
Commit 6e0cb3dec1 allowed postgres_fdw.application_name to include escape sequences %a (application name), %d (database name), %u (user name) and %p (pid). In addition to them, this commit makes it support the escape sequences for session ID (%c) and cluster name (%C). These are helpful to investigate where each remote transactions came from. Author: Fujii Masao Reviewed-by: Ryohei Takahashi, Kyotaro Horiguchi Discussion: https://postgr.es/m/1041dc9a-c976-049f-9f14-e7d94c29c4b2@oss.nttdata.com
2022-02-17Remove all traces of tuplestore_donestoring() in the C codeMichael Paquier
This routine is a no-op since dd04e95 from 2003, with a macro kept around for compatibility purposes. This has led to the same code patterns being copy-pasted around for no effect, sometimes in confusing ways like in pg_logical_slot_get_changes_guts() from logical.c where the code was actually incorrect. This issue has been discussed on two different threads recently, so rather than living with this legacy, remove any uses of this routine in the C code to simplify things. The compatibility macro is kept to avoid breaking any out-of-core modules that depend on it. Reported-by: Tatsuhito Kasahara, Justin Pryzby Author: Tatsuhito Kasahara Discussion: https://postgr.es/m/20211217200419.GQ17618@telsasoft.com Discussion: https://postgr.es/m/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
2022-01-27postgres_fdw: Fix handling of a pending asynchronous request in ↵Etsuro Fujita
postgresReScanForeignScan(). Commit 27e1f1456 failed to process a pending asynchronous request made for a given ForeignScan node in postgresReScanForeignScan() (if any) in cases where we would only reset the next_tuple counter in that function, contradicting the assumption that there should be no pending asynchronous requests that have been made for async-capable subplans for the parent Append node after ReScan. This led to an assert failure in an assert-enabled build. I think this would also lead to mis-rewinding the cursor in that function in the case where we have already fetched one batch for the ForeignScan node and the asynchronous request has been made for the second batch, because even in that case we would just reset the counter when called from that function, so we would fail to execute MOVE BACKWARD ALL. To fix, modify that function to process the asynchronous request before restarting the scan. While at it, add a comment to a function to match other places. Per bug #17344 from Alexander Lakhin. Back-patch to v14 where the aforesaid commit came in. Patch by me. Test case by Alexander Lakhin, adjusted by me. Reviewed and tested by Alexander Lakhin and Dmitry Dolgov. Discussion: https://postgr.es/m/17344-226b78b00de73a7e@postgresql.org
2022-01-21postgres_fdw: Fix subabort cleanup of connections used in asynchronous ↵Etsuro Fujita
execution. Commit 27e1f1456 resets the per-connection states of connections used to scan foreign tables asynchronously during abort cleanup at main transaction end, but it failed to do so during subabort cleanup at subtransaction end, leading to a segmentation fault when re-executing an asynchronous-foreign-table-scan query in a transaction that was cancelled in a subtransaction of it. Fix by modifying pgfdw_abort_cleanup() to reset the per-connection state of a given connection also when called for subabort cleanup. Also, modify that function to do the reset in both the abort-cleanup and subabort-cleanup cases if necessary, to save cycles, and improve a comment on it a little bit. Back-patch to v14 where the aforesaid commit came in. Reviewed by Alexander Pyhalov Discussion: https://postgr.es/m/CAPmGK14cCV-JA7kNsyt2EUTKvZ4xkr2LNRthi1U1C3cqfGppAw@mail.gmail.com
2022-01-17Add Boolean nodePeter Eisentraut
Before, SQL-level boolean constants were represented by a string with a cast, and internal Boolean values in DDL commands were usually represented by Integer nodes. This takes the place of both of these uses, making the intent clearer and having some amount of type safety. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
2022-01-07Update copyright for 2022Bruce Momjian
Backpatch-through: 10
2022-01-07postgres_fdw: Add regression test for postgres_fdw.application_name GUC.Fujii Masao
Commit 449ab63505 added postgres_fdw.application_name GUC that specifies a value for application_name configuration parameter used when postgres_fdw establishes a connection to a foreign server. Also commit 6e0cb3dec1 allowed it to include escape sequences. Both commits added the regression tests for the GUC, but those tests were reverted by commits 98dbef90eb and 5e64ad3697 because they were unstable and caused some buildfarm members to report the failure. This is the third try to add the regression test for postgres_fdw.application_name GUC. One of issues to make the test unstable was to have used postgres_fdw_disconnect_all() to close the existing remote connections. The test expected that the remote connection and its corresponding backend at the remote server disappeared just after postgres_fdw_disconnect_all() was executed, but it could take a bit time for them to disappear. To make sure that they exit, this commit makes the test use pg_terminate_backend() with the timeout at the remote server, instead. If the timeout is set to greater than zero, this function waits until they are actually terminated (or until the given time has passed). Another issue was that the test didn't take into consideration the case where postgres_fdw.application_name containing some escape sequences was converted to the string larger than NAMEDATALEN. In this case it was truncated to less than NAMEDATALEN when it's passed to the remote server, but the test expected wrongly that full string of application_name was always viewable. This commit changes the test so that it can handle that case. Author: Fujii Masao Reviewed-by: Masahiko Sawada, Hayato Kuroda, Kyotaro Horiguchi Discussion: https://postgr.es/m/3220909.1631054766@sss.pgh.pa.us Discussion: https://postgr.es/m/20211224.180006.2247635208768233073.horikyota.ntt@gmail.com Discussion: https://postgr.es/m/e7b61420-a97b-8246-77c4-a0d48fba5a45@oss.nttdata.com
2021-12-27Revert changes about warnings/errors for placeholders.Tom Lane
Revert commits 5609cc01c, 2ed8a8cc5, and 75d22069e until we have a less broken idea of how this should work in parallel workers. Per buildfarm. Discussion: https://postgr.es/m/1640909.1640638123@sss.pgh.pa.us
2021-12-27Rename EmitWarningsOnPlaceholders() to MarkGUCPrefixReserved().Tom Lane
This seems like a clearer name for what it does now. Provide a compatibility macro so that extensions don't have to convert to the new name right away. Discussion: https://postgr.es/m/116024.1640111629@sss.pgh.pa.us
2021-12-24postgres_fdw: Revert unstable tests for postgres_fdw.application_name.Fujii Masao
Commit 6e0cb3dec1 added the tests that check that escape sequences in postgres_fdw.application_name setting are replaced with status information expectedly. But they were unstable and caused some buildfarm members to report the failure. This commit reverts those unstable tests.