summaryrefslogtreecommitdiff
path: root/contrib/postgres_fdw
AgeCommit message (Collapse)Author
2024-12-23postgres_fdw: re-issue cancel requests a few times if necessary.Tom Lane
Despite the best efforts of commit 0e5c82380, we're still seeing occasional failures of postgres_fdw's query_cancel test in the buildfarm. Investigation suggests that its 100ms timeout is still not enough to reliably ensure that the remote side starts the query before receiving the cancel request --- and if it hasn't, it will just discard the request because it's idle. We discussed allowing a cancel request to kill the next-received query, but that would have wide and perhaps unpleasant side-effects. What seems safer is to make postgres_fdw do what a human user would likely do, which is issue another cancel request if the first one didn't seem to do anything. We'll keep the same overall 30 second grace period before concluding things are broken, but issue additional cancel requests after 1 second, then 2 more seconds, then 4, then 8. (The next one in series is 16 seconds, but we'll hit the 30 second timeout before that.) Having done that, revert the timeout in query_cancel.sql to 10 ms. That will still be enough on most machines, most of the time, for the remote query to start; but now we're intentionally risking the race condition occurring sometimes in the buildfarm, so that the repeat-cancel code path will get some testing. As before, back-patch to v17. We might eventually contemplate back-patching this further, and/or adding similar logic to dblink. But given the lack of field complaints to date, this feels like mostly an exercise in test case stabilization, so v17 is enough. Discussion: https://postgr.es/m/colnv3lzzmc53iu5qoawynr6qq7etn47lmggqr65ddtpjliq5d@glkveb4m6nop
2024-12-20Introduce CompactAttribute array in TupleDesc, take 2David Rowley
The new compact_attrs array stores a few select fields from FormData_pg_attribute in a more compact way, using only 16 bytes per column instead of the 104 bytes that FormData_pg_attribute uses. Using CompactAttribute allows performance-critical operations such as tuple deformation to be performed without looking at the FormData_pg_attribute element in TupleDesc which means fewer cacheline accesses. For some workloads, tuple deformation can be the most CPU intensive part of processing the query. Some testing with 16 columns on a table where the first column is variable length showed around a 10% increase in transactions per second for an OLAP type query performing aggregation on the 16th column. However, in certain cases, the increases were much higher, up to ~25% on one AMD Zen4 machine. This also makes pg_attribute.attcacheoff redundant. A follow-on commit will remove it, thus shrinking the FormData_pg_attribute struct by 4 bytes. Author: David Rowley Reviewed-by: Andres Freund, Victor Yegorov Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
2024-12-11Enable BUFFERS with EXPLAIN ANALYZE by defaultDavid Rowley
The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option has come up a few times over the past few years. In many ways, doing this seems like a good idea as it may be more obvious to users why a given query is running more slowly than they might expect. Also, from my own (David's) personal experience, I've seen users posting to the mailing lists with two identical plans, one slow and one fast asking why their query is sometimes slow. In many cases, this is due to additional reads. Having BUFFERS on by default may help reduce some of these questions, and if not, make it more obvious to the user before they post, or save a round-trip to the mailing list when additional I/O effort is the cause of the slowness. The general consensus is that we want BUFFERS on by default with ANALYZE. However, there were more than zero concerns raised with doing so. The primary reason against is the additional verbosity, making it harder to read large plans. Another concern was that buffer information isn't always useful so may not make sense to have it on by default. It's currently December, so let's commit this to see if anyone comes forward with a strong objection against making this change. We have over half a year remaining in the v18 cycle where we could still easily consider reverting this if someone were to come forward with a convincing enough reason as to why doing this is a bad idea. There were two patches independently submitted to achieve this goal, one by me and the other by Guillaume. This commit is a mix of both of these patches with some additional work done by me to adjust various additional places in the documentation which include EXPLAIN ANALYZE output. Author: Guillaume Lelarge, David Rowley Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
2024-11-28Remove useless casts to (void *)Peter Eisentraut
Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
2024-11-22Add INT64_HEX_FORMAT and UINT64_HEX_FORMAT to c.h.Nathan Bossart
Like INT64_FORMAT and UINT64_FORMAT, these macros produce format strings for 64-bit integers. However, INT64_HEX_FORMAT and UINT64_HEX_FORMAT generate the output in hexadecimal instead of decimal. Besides introducing these macros, this commit makes use of them in several places. This was originally intended to be part of commit 5d6187d2a2, but I left it out because I felt there was a nonzero chance that back-patching these new macros into c.h could cause problems with third-party code. We tend to be less cautious with such changes in new major versions. Note that UINT64_HEX_FORMAT was originally added in commit ee1b30f128, but it was placed in test_radixtree.c, so it wasn't widely available. This commit moves UINT64_HEX_FORMAT to c.h. Discussion: https://postgr.es/m/ZwQvtUbPKaaRQezd%40nathan
2024-10-28Remove unused #include's from contrib, pl, test .c filesPeter Eisentraut
as determined by IWYU Similar to commit dbbca2cf299, but for contrib, pl, and src/test/. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-07Fix Y2038 issues with MyStartTime.Nathan Bossart
Several places treat MyStartTime as a "long", which is only 32 bits wide on some platforms. In reality, MyStartTime is a pg_time_t, i.e., a signed 64-bit integer. This will lead to interesting bugs on the aforementioned systems in 2038 when signed 32-bit integers are no longer sufficient to store Unix time (e.g., "pg_ctl start" hanging). To fix, ensure that MyStartTime is handled as a 64-bit value everywhere. (Of course, users will need to ensure that time_t is 64 bits wide on their system, too.) Co-authored-by: Max Johnson Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com Backpatch-through: 12
2024-09-18postgres_fdw: Extend postgres_fdw_get_connections to return user name.Fujii Masao
This commit adds a "user_name" output column to the postgres_fdw_get_connections function, returning the name of the local user mapped to the foreign server for each connection. If a public mapping is used, it returns "public." This helps identify postgres_fdw connections more easily, such as determining which connections are invalid, closed, or used within the current transaction. No extension version bump is needed, as commit c297a47c5f already handled it for v18~. Author: Hayato Kuroda Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/b492a935-6c7e-8c08-e485-3c1d64d7d10f@oss.nttdata.com
2024-08-30Make postgres_fdw's query_cancel test less flaky.Tom Lane
This test occasionally shows +WARNING: could not get result of cancel request due to timeout which appears to be because the cancel request is sometimes unluckily sent to the remote session between queries, and then it's ignored. This patch tries to make that less probable in three ways: 1. Use a test query that does not involve remote estimates, so that no EXPLAINs are sent. 2. Make sure that the remote session is ready-to-go (transaction started, SET commands sent) before we start the timer. 3. Increase the statement_timeout to 100ms, to give the local session enough time to plan and issue the query. We might have to go higher than 100ms to make this adequately stable in the buildfarm, but let's see how it goes. Back-patch to v17 where this test was introduced. Jelte Fennema-Nio and Tom Lane Discussion: https://postgr.es/m/578934.1725045685@sss.pgh.pa.us
2024-08-21Treat number of disabled nodes in a path as a separate cost metric.Robert Haas
Previously, when a path type was disabled by e.g. enable_seqscan=false, we either avoided generating that path type in the first place, or more commonly, we added a large constant, called disable_cost, to the estimated startup cost of that path. This latter approach can distort planning. For instance, an extremely expensive non-disabled path could seem to be worse than a disabled path, especially if the full cost of that path node need not be paid (e.g. due to a Limit). Or, as in the regression test whose expected output changes with this commit, the addition of disable_cost can make two paths that would normally be distinguishible in cost seem to have fuzzily the same cost. To fix that, we now count the number of disabled path nodes and consider that a high-order component of both the startup cost and the total cost. Hence, the path list is now sorted by disabled_nodes and then by total_cost, instead of just by the latter, and likewise for the partial path list. It is important that this number is a count and not simply a Boolean; else, as soon as we're unable to respect disabled path types in all portions of the path, we stop trying to avoid them where we can. Because the path list is now sorted by the number of disabled nodes, the join prechecks must compute the count of disabled nodes during the initial cost phase instead of postponing it to final cost time. Counts of disabled nodes do not cross subquery levels; at present, there is no reason for them to do so, since the we do not postpone path selection across subquery boundaries (see make_subplan). Reviewed by Andres Freund, Heikki Linnakangas, and David Rowley. Discussion: http://postgr.es/m/CA+TgmoZ_+MS+o6NeGK2xyBv-xM+w1AfFVuHE4f_aq6ekHv7YSQ@mail.gmail.com
2024-08-05Restrict accesses to non-system views and foreign tables during pg_dump.Masahiko Sawada
When pg_dump retrieves the list of database objects and performs the data dump, there was possibility that objects are replaced with others of the same name, such as views, and access them. This vulnerability could result in code execution with superuser privileges during the pg_dump process. This issue can arise when dumping data of sequences, foreign tables (only 13 or later), or tables registered with a WHERE clause in the extension configuration table. To address this, pg_dump now utilizes the newly introduced restrict_nonsystem_relation_kind GUC parameter to restrict the accesses to non-system views and foreign tables during the dump process. This new GUC parameter is added to back branches too, but these changes do not require cluster recreation. Back-patch to all supported branches. Reviewed-by: Noah Misch Security: CVE-2024-7348 Backpatch-through: 12
2024-07-27postgres_fdw: Fix bug in connection status check.Fujii Masao
The buildfarm member "hake" reported a failure in the regression test added by commit 857df3cef7, where postgres_fdw_get_connections(true) returned unexpected results. The function postgres_fdw_get_connections(true) checks if a connection is closed by using POLLRDHUP in the requested events and calling poll(). Previously, the function only considered POLLRDHUP or 0 as valid returned events. However, poll() can also return POLLHUP, POLLERR, and/or POLLNVAL. So if any of these events were returned, postgres_fdw_get_connections(true) would report incorrect results. postgres_fdw_get_connections(true) failed to account for these return events. This commit updates postgres_fdw_get_connections(true) to correctly report a closed connection when poll() returns not only POLLRDHUP but also POLLHUP, POLLERR, or POLLNVAL. Discussion: https://postgr.es/m/fd8f6186-9e1e-4b9a-92c5-e71e3697d381@oss.nttdata.com
2024-07-26postgres_fdw: Add connection status check to postgres_fdw_get_connections().Fujii Masao
This commit extends the postgres_fdw_get_connections() function to check if connections are closed. This is useful for detecting closed postgres_fdw connections that could prevent successful transaction commits. Users can roll back transactions immediately upon detecting closed connections, avoiding unnecessary processing of failed transactions. This feature is available only on systems supporting the non-standard POLLRDHUP extension to the poll system call, including Linux. Author: Hayato Kuroda Reviewed-by: Shinya Kato, Zhihong Yu, Kyotaro Horiguchi, Andres Freund Reviewed-by: Onder Kalaci, Takamichi Osumi, Vignesh C, Tom Lane, Ted Yu Reviewed-by: Katsuragi Yuta, Peter Smith, Shubham Khanna, Fujii Masao Discussion: https://postgr.es/m/TYAPR01MB58662809E678253B90E82CE5F5889@TYAPR01MB5866.jpnprd01.prod.outlook.com
2024-07-26postgres_fdw: Add "used_in_xact" column to postgres_fdw_get_connections().Fujii Masao
This commit extends the postgres_fdw_get_connections() function to include a new used_in_xact column, indicating whether each connection is used in the current transaction. This addition is particularly useful for the upcoming feature that will check if connections are closed. By using those information, users can verify if postgres_fdw connections used in a transaction remain open. If any connection is closed, the transaction cannot be committed successfully. In this case users can roll back it immediately without waiting for transaction end. The SQL API for postgres_fdw_get_connections() is updated by this commit and may change in the future. To handle compatibility with older SQL declarations, an API versioning system is introduced, allowing the function to behave differently based on the API version. Author: Hayato Kuroda Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/be9382f7-5072-4760-8b3f-31d6dffa8d62@oss.nttdata.com
2024-07-22postgres_fdw: Split out the query_cancel test to its own fileAlvaro Herrera
This allows us to skip it in Cygwin, where it's reportedly flaky because of platform bugs or something. Backpatch to 17, where the test was introduced by commit 2466d6654f85. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/e4d0cb33-6be5-e4d5-ae49-9eac3ff2b005@gmail.com
2024-07-19postgres_fdw: Avoid "cursor can only scan forward" error.Etsuro Fujita
Commit d844cd75a disallowed rewind in a non-scrollable cursor to resolve anomalies arising from such a cursor operation. However, this failed to take into account the assumption in postgres_fdw that when rescanning a foreign relation, it can rewind the cursor created for scanning the foreign relation without specifying the SCROLL option, regardless of its scrollability, causing this error when it tried to do such a rewind in a non-scrollable cursor. Fix by modifying postgres_fdw to instead recreate the cursor, regardless of its scrollability, when rescanning the foreign relation. (If we had a way to check its scrollability, we could improve this by rewinding it if it is scrollable and recreating it if not, but we do not have it, so this commit modifies it to recreate it in any case.) Per bug #17889 from Eric Cyr. Devrim Gunduz also reported this problem. Back-patch to v15 where that commit enforced the prohibition. Reviewed by Tom Lane. Discussion: https://postgr.es/m/17889-e8c39a251d258dda%40postgresql.org Discussion: https://postgr.es/m/b415ac3255f8352d1ea921cf3b7ba39e0587768a.camel%40gunduz.org
2024-07-15Check lateral references within PHVs for memoize cache keysRichard Guo
If we intend to generate a Memoize node on top of a path, we need cache keys of some sort. Currently we search for the cache keys in the parameterized clauses of the path as well as the lateral_vars of its parent. However, it turns out that this is not sufficient because there might be lateral references derived from PlaceHolderVars, which we fail to take into consideration. This oversight can cause us to miss opportunities to utilize the Memoize node. Moreover, in some plans, failing to recognize all the cache keys could result in performance regressions. This is because without identifying all the cache keys, we would need to purge the entire cache every time we get a new outer tuple during execution. This patch fixes this issue by extracting lateral Vars from within PlaceHolderVars and subsequently including them in the cache keys. In passing, this patch also includes a comment clarifying that Memoize nodes are currently not added on top of join relation paths. This explains why this patch only considers PlaceHolderVars that are due to be evaluated at baserels. Author: Richard Guo Reviewed-by: Tom Lane, David Rowley, Andrei Lepikhov Discussion: https://postgr.es/m/CAMbWs48jLxn0pAPZpJ50EThZ569Xrw+=4Ac3QvkpQvNszbeoNg@mail.gmail.com
2024-07-05Support "Right Semi Join" plan shapesRichard Guo
Hash joins can support semijoin with the LHS input on the right, using the existing logic for inner join, combined with the assurance that only the first match for each inner tuple is considered, which can be achieved by leveraging the HEAP_TUPLE_HAS_MATCH flag. This can be very useful in some cases since we may now have the option to hash the smaller table instead of the larger. Merge join could likely support "Right Semi Join" too. However, the benefit of swapping inputs tends to be small here, so we do not address that in this patch. Note that this patch also modifies a test query in join.sql to ensure it continues testing as intended. With this patch the original query would result in a right-semi-join rather than semi-join, compromising its original purpose of testing the fix for neqjoinsel's behavior for semi-joins. Author: Richard Guo Reviewed-by: wenhui qiu, Alena Rybakina, Japin Li Discussion: https://postgr.es/m/CAMbWs4_X1mN=ic+SxcyymUqFx9bB8pqSLTGJ-F=MHy4PW3eRXw@mail.gmail.com
2024-06-07postgres_fdw: Refuse to send FETCH FIRST WITH TIES to remote servers.Etsuro Fujita
Previously, when considering LIMIT pushdown, postgres_fdw failed to check whether the query has this clause, which led to pushing false LIMIT clauses, causing incorrect results. This clause has been supported since v13, so we need to do a remote-version check before deciding that it will be safe to push such a clause, but we do not currently have a way to do the check (without accessing the remote server); disable pushing such a clause for now. Oversight in commit 357889eb1. Back-patch to v13, where that commit added the support. Per bug #18467 from Onder Kalaci. Patch by Japin Li, per a suggestion from Tom Lane, with some changes to the comments by me. Review by Onder Kalaci, Alvaro Herrera, and me. Discussion: https://postgr.es/m/18467-7bb89084ff03a08d%40postgresql.org
2024-05-21Re-allow planner to use Merge Append to efficiently implement UNION.Robert Haas
This reverts commit 7204f35919b7e021e8d1bc9f2d76fd6bfcdd2070, thus restoring 66c0185a3 (Allow planner to use Merge Append to efficiently implement UNION) as well as the follow-on commits d5d2205c8, 3b1a7eb28, 7487044d6. Per further discussion on pgsql-release, we wish to ship beta1 with this feature, and patch the bug that was found just before wrap, rather than shipping beta1 with the feature reverted.
2024-05-20Revert commit 66c0185a3 and follow-on patches.Tom Lane
This reverts 66c0185a3 (Allow planner to use Merge Append to efficiently implement UNION) as well as the follow-on commits d5d2205c8, 3b1a7eb28, 7487044d6. In addition to those, 07746a8ef had to be removed then re-applied in a different place, because 66c0185a3 moved the relevant code. The reason for this last-minute thrashing is that depesz found a case in which the patched code creates a completely wrong plan that silently gives incorrect query results. It's unclear what the cause is or how many cases are affected, but with beta1 wrap staring us in the face, there's no time for closer investigation. After we figure that out, we can decide whether to un-revert this for beta2 or hold it for v18. Discussion: https://postgr.es/m/Zktzf926vslR35Fv@depesz.com (also some private discussion among pgsql-release)
2024-04-21Make postgres_fdw request remote time zone 'GMT' not 'UTC'.Tom Lane
This should have the same results for all practical purposes. The advantage of selecting 'GMT' is that it's guaranteed to work even when the remote system's timezone database is missing entries, because pg_tzset() hard-wires handling of that, at least in 9.2 and later. (It seems like it would be a good idea to similarly hard-wire correct handling of 'UTC', but that'll be a little more invasive than I want to consider back-patching. Leave that for another day when we're not in feature freeze.) Per trouble report from Adnan Dautovic. Back-patch to all supported branches. Discussion: https://postgr.es/m/465248.1712211585@sss.pgh.pa.us
2024-04-11postgres_fdw: Improve comment about handling of asynchronous requests.Etsuro Fujita
We updated this comment in back branches (see commit f6f61a4bd et al); let's do so in HEAD as well for consistency. Discussion: https://postgr.es/m/CAPmGK142V1kqDfjo2H%2Bb54JTn2woVBrisFq%2B%3D9jwXwxr0VvbgA%40mail.gmail.com
2024-04-10Fixup various StringInfo function usagesDavid Rowley
This adjusts various appendStringInfo* function calls to use a more appropriate and efficient function with the same behavior. For example, use appendStringInfoChar() when appending a single character rather than appendStringInfo() and appendStringInfoString() when no formatting is required rather than using appendStringInfo(). All adjustments made here are in code that's new to v17, so it makes sense to fix these now rather than wait a few years and make backpatching harder. Discussion: https://postgr.es/m/CAApHDvojY2UvMiO+9_55ArTj10P1LBNJyyoGB+C65BLDNT0GsQ@mail.gmail.com Reviewed-by: Nathan Bossart, Tom Lane
2024-04-05Make libpqsrv_cancel's return const char *, not char *Alvaro Herrera
Per headerscheck's C++ check. Discussion: https://postgr.es/m/372769.1712179784@sss.pgh.pa.us
2024-04-04postgres_fdw: Remove useless ternary expression.Etsuro Fujita
There is no case where we would call pgfdw_exec_cleanup_query or pgfdw_exec_cleanup_query_{begin,end} with a NULL query string, so this expression is pointless; remove it and instead add to the latter functions an assertion ensuring the given query string is not NULL. Thinko in commit 815d61fcd. Discussion: https://postgr.es/m/CAPmGK14mm%2B%3DUjyjoWj_Hu7c%2BQqX-058RFfF%2BqOkcMZ_Nj52v-A%40mail.gmail.com
2024-03-30Stabilize postgres_fdw testAlvaro Herrera
The test fails when RESET statement_timeout takes longer than 10ms. Avoid the problem by using SET LOCAL instead. Overall, this test is not ideal: 10ms could be shorter than the time to have sent the query to the "remote" server, so it's possible that on some machines this test doesn't actually witness a remote query being cancelled. We may want to improve on this someday by using some other testing technique, but for now it's better than nothing. I verified manually that one round of remote cancellation occurs when this runs on my machine. Discussion: https://postgr.es/m/CAGECzQRsdWnj=YaaPCnA8d7E1AdbxRPBYmyBQRMPUijR2MpM_w@mail.gmail.com
2024-03-28libpq-be-fe-helpers.h: wrap new cancel APIsAlvaro Herrera
Commit 61461a300c1c introduced new functions to libpq for cancelling queries. This commit introduces a helper function that backend-side libraries and extensions can use to invoke those. This function takes a timeout and can itself be interrupted while it is waiting for a cancel request to be sent and processed, instead of being blocked. This replaces the usage of the old functions in postgres_fdw and dblink. Finally, it also adds some test coverage for the cancel support in postgres_fdw. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/CAGECzQT_VgOWWENUqvUV9xQmbaCyXjtRRAYO8W07oqashk_N+g@mail.gmail.com
2024-03-25Allow planner to use Merge Append to efficiently implement UNIONDavid Rowley
Until now, UNION queries have often been suboptimal as the planner has only ever considered using an Append node and making the results unique by either using a Hash Aggregate, or by Sorting the entire Append result and running it through the Unique operator. Both of these methods always require reading all rows from the union subqueries. Here we adjust the union planner so that it can request that each subquery produce results in target list order so that these can be Merge Appended together and made unique with a Unique node. This can improve performance significantly as the union child can make use of the likes of btree indexes and/or Merge Joins to provide the top-level UNION with presorted input. This is especially good if the top-level UNION contains a LIMIT node that limits the output rows to a small subset of the unioned rows as cheap startup plans can be used. Author: David Rowley Reviewed-by: Richard Guo, Andy Fan Discussion: https://postgr.es/m/CAApHDvpb_63XQodmxKUF8vb9M7CxyUyT4sWvEgqeQU-GB7QFoQ@mail.gmail.com
2024-03-19Improve EXPLAIN's display of SubPlan nodes and output parameters.Tom Lane
Historically we've printed SubPlan expression nodes as "(SubPlan N)", which is pretty uninformative. Trying to reproduce the original SQL for the subquery is still as impractical as before, and would be mighty verbose as well. However, we can still do better than that. Displaying the "testexpr" when present, and adding a keyword to indicate the SubLinkType, goes a long way toward showing what's really going on. In addition, this patch gets rid of EXPLAIN's use of "$n" to represent subplan and initplan output Params. Instead we now print "(SubPlan N).colX" or "(InitPlan N).colX" to represent the X'th output column of that subplan. This eliminates confusion with the use of "$n" to represent PARAM_EXTERN Params, and it's useful for the first part of this change because it eliminates needing some other indication of which subplan is referenced by a SubPlan that has a testexpr. In passing, this adds simple regression test coverage of the ROWCOMPARE_SUBLINK code paths, which were entirely unburdened by testing before. Tom Lane and Dean Rasheed, reviewed by Aleksander Alekseev. Thanks to Chantal Keller for raising the question of whether this area couldn't be improved. Discussion: https://postgr.es/m/2838538.1705692747@sss.pgh.pa.us
2024-03-13Make the order of the header file includes consistentPeter Eisentraut
Similar to commit 7e735035f20. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
2024-03-11Fix deparsing of Consts in postgres_fdw ORDER BYDavid Rowley
For UNION ALL queries where a union child query contained a foreign table, if the targetlist of that query contained a constant, and the top-level query performed an ORDER BY which contained the column for the constant value, then postgres_fdw would find the EquivalenceMember with the Const and then try to produce an ORDER BY containing that Const. This caused problems with INT typed Consts as these could appear to be requests to order by an ordinal column position rather than the constant value. This could lead to either an error such as: ERROR: ORDER BY position <int const> is not in select list or worse, if the constant value is a valid column, then we could just sort by the wrong column altogether. Here we fix this issue by just not including these Consts in the ORDER BY clause. In passing, add a new section for testing ORDER BY in the postgres_fdw tests and move two existing tests which were misplaced in the WHERE clause testing section into it. Reported-by: Michał Kłeczek Reviewed-by: Ashutosh Bapat, Richard Guo Bug: #18381 Discussion: https://postgr.es/m/0714C8B8-8D82-4ABB-9F8D-A0C3657E7B6E%40kleczek.org Discussion: https://postgr.es/m/18381-137456acd168bf93%40postgresql.org Backpatch-through: 12, oldest supported version
2024-02-15Pull up ANY-SUBLINK with the necessary lateral support.Alexander Korotkov
For ANY-SUBLINK, we adopted a two-stage pull-up approach to handle different types of scenarios. In the first stage, the sublink is pulled up as a subquery. Because of this, when writing this code, we did not have the ability to perform lateral joins, and therefore, we were unable to pull up Var with varlevelsup=1. Now that we have the ability to use lateral joins, we can eliminate this limitation. Author: Andy Fan <zhihui.fan1213@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
2024-01-23Add better handling of redundant IS [NOT] NULL qualsDavid Rowley
Until now PostgreSQL has not been very smart about optimizing away IS NOT NULL base quals on columns defined as NOT NULL. The evaluation of these needless quals adds overhead. Ordinarily, anyone who came complaining about that would likely just have been told to not include the qual in their query if it's not required. However, a recent bug report indicates this might not always be possible. Bug 17540 highlighted that when we optimize Min/Max aggregates the IS NOT NULL qual that the planner adds to make the rewritten plan ignore NULLs can cause issues with poor index choice. That particular case demonstrated that other quals, especially ones where no statistics are available to allow the planner a chance at estimating an approximate selectivity for can result in poor index choice due to cheap startup paths being prefered with LIMIT 1. Here we take generic approach to fixing this by having the planner check for NOT NULL columns and just have the planner remove these quals (when they're not needed) for all queries, not just when optimizing Min/Max aggregates. Additionally, here we also detect IS NULL quals on a NOT NULL column and transform that into a gating qual so that we don't have to perform the scan at all. This also works for join relations when the Var is not nullable by any outer join. This also helps with the self-join removal work as it must replace strict join quals with IS NOT NULL quals to ensure equivalence with the original query. Author: David Rowley, Richard Guo, Andy Fan Reviewed-by: Richard Guo, David Rowley Discussion: https://postgr.es/m/CAApHDvqg6XZDhYRPz0zgOcevSMo0d3vxA9DvHrZtKfqO30WTnw@mail.gmail.com Discussion: https://postgr.es/m/17540-7aa1855ad5ec18b4%40postgresql.org
2024-01-08Make dblink interruptible, via new libpqsrv APIs.Noah Misch
This replaces dblink's blocking libpq calls, allowing cancellation and allowing DROP DATABASE (of a database not involved in the query). Apart from explicit dblink_cancel_query() calls, dblink still doesn't cancel the remote side. The replacement for the blocking calls consists of new, general-purpose query execution wrappers in the libpqsrv facility. Out-of-tree extensions should adopt these. Use them in postgres_fdw, replacing a local implementation from which the libpqsrv implementation derives. This is a bug fix for dblink. Code inspection identified the bug at least thirteen years ago, but user complaints have not appeared. Hence, no back-patch for now. Discussion: https://postgr.es/m/20231122012945.74@rfd.leadboat.com
2024-01-03Update copyright for 2024Bruce Momjian
Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
2024-01-02Fix typos in comments and in one isolation test.Robert Haas
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Some subtractions by me. Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
2023-12-05Add support for deparsing semi-joins to contrib/postgres_fdwAlexander Korotkov
SEMI-JOIN is deparsed as the EXISTS subquery. It references outer and inner relations, so it should be evaluated as the condition in the upper-level WHERE clause. The signatures of deparseFromExprForRel() and deparseRangeTblRef() are revised so that they can add conditions to the upper level. PgFdwRelationInfo now has a hidden_subquery_rels field, referencing the relids used in the inner parts of semi-join. They can't be referred to from upper relations and should be used internally for equivalence member searches. The planner can create semi-join, which refers to inner rel vars in its target list. However, we deparse semi-join as an exists() subquery. So we skip the case when the target list references to inner rel of semi-join. Author: Alexander Pyhalov Reviewed-by: Ashutosh Bapat, Ian Lawrence Barwick, Yuuki Fujii, Tomas Vondra Discussion: https://postgr.es/m/c9e2a757cf3ac2333714eaf83a9cc184@postgrespro.ru
2023-11-30Improve "user mapping not found" error messagePeter Eisentraut
Display the name of the foreign server for which the user mapping was not found. Author: Ian Lawrence Barwick <barwick@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CAB8KJ=jFzNaeyFtLcTZNOc6fd1+F93pGVLFa-wyt31wn7VNxqQ@mail.gmail.com
2023-11-23Use ResourceOwner to track WaitEventSets.Heikki Linnakangas
A WaitEventSet holds file descriptors or event handles (on Windows). If FreeWaitEventSet is not called, those fds or handles are leaked. Use ResourceOwners to track WaitEventSets, to clean those up automatically on error. This was a live bug in async Append nodes, if a FDW's ForeignAsyncRequest function failed. (In back branches, I will apply a more localized fix for that based on PG_TRY-PG_FINALLY.) The added test doesn't check for leaking resources, so it passed even before this commit. But at least it covers the code path. In the passing, fix misleading comment on what the 'nevents' argument to WaitEventSetWait means. Report by Alexander Lakhin, analysis and suggestion for the fix by Tom Lane. Fixes bug #17828. Reviewed-by: Alexander Lakhin, Thomas Munro Discussion: https://www.postgresql.org/message-id/472235.1678387869@sss.pgh.pa.us
2023-11-03Stabilize postgres_fdw tests on 32-bit machinesDavid Rowley
cac169d68 adjusted DEFAULT_FDW_TUPLE_COST and that seems to have caused a test to become unstable on 32-bit machines. 4b14e1871 tried to fix this as originally the plan was flipping between a Nested Loop and Hash Join. That commit forced the Nested Loop, but there's still flexibility to push or not push the sort to the remote server and 32-bit seems to prefer to push and on 64-bit, the costs prefer not to. Here let's just turn off enable_sort to significantly encourage the sort to take place on the remote server. Reported-by: Michael Paquier, Richard Guo Discussion: https://postgr.es/m/ZUM2IhA8X2lrG50K@paquier.xyz
2023-11-02Attempt to stabilize postgres_fdw testsDavid Rowley
cac169d68 adjusted DEFAULT_FDW_TUPLE_COST and that seems to have caused a test to become unstable on 32-bit machines. Try to make it stable again. Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZUM2IhA8X2lrG50K@paquier.xyz
2023-11-02Increase DEFAULT_FDW_TUPLE_COST from 0.01 to 0.2David Rowley
0.01 was unrealistically low as it's the same as the default cpu_tuple_cost and 10x cheaper than the default parallel_tuple_cost. It's hard to imagine a situation where fetching a tuple from a foreign server would be cheaper than fetching one from a parallel worker. After some experimentation on a loopback server, somewhere between 0.15 and 0.3 seems more realistic. Here we split the difference and set it to 0.2. This will cause operations that reduce the number of tuples (e.g. aggregation) to be more likely to take place on the foreign server. Adjusting this causes some plan changes in the postgres_fdw regression tests. This is because penalizing each Path with the additional tuple costs causes some dilution of the costs of the other operations being charged for and results in various paths appearing to be closer to the same costs such that add_path's STD_FUZZ_FACTOR is more likely to see two paths as costing (fuzzily) the same. This isn't ideal, but it shouldn't be reason enough to use artificially low costs. Discussion: https://postgr.es/m/CAApHDvopVjjfh5c1Ed2HRvDdfom2dEpMwwiu5-f1AnmYprJngA@mail.gmail.com
2023-10-26Add trailing commas to enum definitionsPeter Eisentraut
Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
2023-10-05postgres_fdw: Replace WAIT_EVENT_EXTENSION with custom wait eventsMichael Paquier
Three custom wait events are added here: - "PostgresFdwCleanupResult", waiting while cleaning up PQgetResult() on transaction abort. - "PostgresFdwConnect", waiting to establish a connection to a remote server. - "PostgresFdwGetResult", waiting to receive a result from a remote server. Author: Masahiro Ikeda Discussion: https://postgr.es/m/197bce267fa691a0ac62c86c4ab904c4@oss.nttdata.com
2023-08-30postgres_fdw: Fix test for parameterized foreign scan.Etsuro Fujita
Commit e4106b252 should have updated this test, but did not; back-patch to all supported branches. Reviewed by Richard Guo. Discussion: http://postgr.es/m/CAPmGK15nR0NXLSCKQAcqbZbTzrzd5MozowWnTnGfPkayndF43Q%40mail.gmail.com
2023-08-15Fix code indentation vioaltion introduced in commit 9e9931d2b.Etsuro Fujita
Per buildfarm member koel
2023-08-15Re-allow FDWs and custom scan providers to replace joins with pseudoconstant ↵Etsuro Fujita
quals. This was disabled in commit 6f80a8d9c due to the lack of support for handling of pseudoconstant quals assigned to replaced joins in createplan.c. To re-allow it, this patch adds the support by 1) modifying the ForeignPath and CustomPath structs so that if they represent foreign and custom scans replacing a join with a scan, they store the list of RestrictInfo nodes to apply to the join, as in JoinPaths, and by 2) modifying create_scan_plan() in createplan.c so that it uses that list in that case, instead of the baserestrictinfo list, to get pseudoconstant quals assigned to the join, as mentioned in the commit message for that commit. Important item for the release notes: this is non-backwards-compatible since it modifies the ForeignPath and CustomPath structs, as mentioned above, and changes the argument lists for FDW helper functions create_foreignscan_path(), create_foreign_join_path(), and create_foreign_upper_path(). Richard Guo, with some additional changes by me, reviewed by Nishant Sharma, Suraj Kharage, and Richard Guo. Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com
2023-07-28Disallow replacing joins with scans in problematic cases.Etsuro Fujita
Commit e7cb7ee14, which introduced the infrastructure for FDWs and custom scan providers to replace joins with scans, failed to add support handling of pseudoconstant quals assigned to replaced joins in createplan.c, leading to an incorrect plan without a gating Result node when postgres_fdw replaced a join with such a qual. To fix, we could add the support by 1) modifying the ForeignPath and CustomPath structs to store the list of RestrictInfo nodes to apply to the join, as in JoinPaths, if they represent foreign and custom scans replacing a join with a scan, and by 2) modifying create_scan_plan() in createplan.c to use that list in that case, instead of the baserestrictinfo list, to get pseudoconstant quals assigned to the join; but #1 would cause an ABI break. So fix by modifying the infrastructure to just disallow replacing joins with such quals. Back-patch to all supported branches. Reported by Nishant Sharma. Patch by me, reviewed by Nishant Sharma and Richard Guo. Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com
2023-07-03Remove expensive test of postgres_fdw batch insertsTomas Vondra
The test inserted 70k rows into a foreign table, in order to verify correct behavior with more than 65535 parameters, and was added in response to a bug report. However, this is rather expensive, especially when running the tests under valgrind, CLOBBER_CACHE_ALWAYS etc. It doesn't seem worth it to keep running the test, so remove it from all branches (14+). Backpatch-through: 14 Discussion: https://postgr.es/m/2131017.1623451468@sss.pgh.pa.us