summaryrefslogtreecommitdiff
path: root/contrib/postgres_fdw
AgeCommit message (Collapse)Author
2024-04-21Make postgres_fdw request remote time zone 'GMT' not 'UTC'.Tom Lane
This should have the same results for all practical purposes. The advantage of selecting 'GMT' is that it's guaranteed to work even when the remote system's timezone database is missing entries, because pg_tzset() hard-wires handling of that, at least in 9.2 and later. (It seems like it would be a good idea to similarly hard-wire correct handling of 'UTC', but that'll be a little more invasive than I want to consider back-patching. Leave that for another day when we're not in feature freeze.) Per trouble report from Adnan Dautovic. Back-patch to all supported branches. Discussion: https://postgr.es/m/465248.1712211585@sss.pgh.pa.us
2024-03-11Fix deparsing of Consts in postgres_fdw ORDER BYDavid Rowley
For UNION ALL queries where a union child query contained a foreign table, if the targetlist of that query contained a constant, and the top-level query performed an ORDER BY which contained the column for the constant value, then postgres_fdw would find the EquivalenceMember with the Const and then try to produce an ORDER BY containing that Const. This caused problems with INT typed Consts as these could appear to be requests to order by an ordinal column position rather than the constant value. This could lead to either an error such as: ERROR: ORDER BY position <int const> is not in select list or worse, if the constant value is a valid column, then we could just sort by the wrong column altogether. Here we fix this issue by just not including these Consts in the ORDER BY clause. In passing, add a new section for testing ORDER BY in the postgres_fdw tests and move two existing tests which were misplaced in the WHERE clause testing section into it. Reported-by: Michał Kłeczek Reviewed-by: Ashutosh Bapat, Richard Guo Bug: #18381 Discussion: https://postgr.es/m/0714C8B8-8D82-4ABB-9F8D-A0C3657E7B6E%40kleczek.org Discussion: https://postgr.es/m/18381-137456acd168bf93%40postgresql.org Backpatch-through: 12, oldest supported version
2023-08-30postgres_fdw: Fix test for parameterized foreign scan.Etsuro Fujita
Commit e4106b252 should have updated this test, but did not; back-patch to all supported branches. Reviewed by Richard Guo. Discussion: http://postgr.es/m/CAPmGK15nR0NXLSCKQAcqbZbTzrzd5MozowWnTnGfPkayndF43Q%40mail.gmail.com
2023-07-28Disallow replacing joins with scans in problematic cases.Etsuro Fujita
Commit e7cb7ee14, which introduced the infrastructure for FDWs and custom scan providers to replace joins with scans, failed to add support handling of pseudoconstant quals assigned to replaced joins in createplan.c, leading to an incorrect plan without a gating Result node when postgres_fdw replaced a join with such a qual. To fix, we could add the support by 1) modifying the ForeignPath and CustomPath structs to store the list of RestrictInfo nodes to apply to the join, as in JoinPaths, if they represent foreign and custom scans replacing a join with a scan, and by 2) modifying create_scan_plan() in createplan.c to use that list in that case, instead of the baserestrictinfo list, to get pseudoconstant quals assigned to the join; but #1 would cause an ABI break. So fix by modifying the infrastructure to just disallow replacing joins with such quals. Back-patch to all supported branches. Reported by Nishant Sharma. Patch by me, reviewed by Nishant Sharma and Richard Guo. Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com
2022-12-15postgres_fdw: Fix assertion in estimate_path_cost_size().Etsuro Fujita
Commit 08d2d58a2 added this assertion assuming that the retrieved_rows estimate for a foreign relation is set to at least one row in estimate_path_cost_size(), but the assumption isn't correct because if the relation is a foreign table with tuples=0, the estimate would be set to 0 in there when using local stats, and if the query's WHERE clause has a NULL condition, the estimate could be set to 0 in there when using remote estimates. (Note: even in the latter case the assertion could be reachable when costing foreign final paths.) Repair. Per bug #17713 from Robins Tharakan. This patch was already applied to v13 or later; apply it to v12 as well where the aforementioned commit was added. Thanks to Richard Guo for investigation. Discussion: http://postgr.es/m/17713-92cce66de7e81c04%40postgresql.org Discussion: http://postgr.es/m/CAEP4nAza%2B0fTCLkgkKYux3JDo3tUBTQORehP%2BaCxSNURpSFpHw%40mail.gmail.com
2022-09-14postgres_fdw: Avoid 'variable not found in subplan target list' error.Etsuro Fujita
The tlist of the EvalPlanQual outer plan for a ForeignScan node is adjusted to produce a tuple whose descriptor matches the scan tuple slot for the ForeignScan node. But in the case where the outer plan contains an extra Sort node, if the new tlist contained columns required only for evaluating PlaceHolderVars or columns required only for evaluating local conditions, this would cause setrefs.c to fail with the error. The cause of this is that when creating the outer plan by injecting the Sort node into an alternative local join plan that could emit such extra columns as well, we fail to arrange for the outer plan to propagate them up through the Sort node, causing setrefs.c to fail to match up them in the new tlist to what is available from the outer plan. Repair. Per report from Alexander Pyhalov. Richard Guo and Etsuro Fujita, reviewed by Alexander Pyhalov and Tom Lane. Backpatch to all supported versions. Discussion: http://postgr.es/m/cfb17bf6dfdf876467bd5ef533852d18%40postgrespro.ru
2022-07-17postgres_fdw: set search_path to 'pg_catalog' while deparsing constants.Tom Lane
The motivation for this is to ensure successful transmission of the values of constants of regconfig and other reg* types. The remote will be reading them with search_path = 'pg_catalog', so schema qualification is necessary when referencing objects in other schemas. Per bug #17483 from Emmanuel Quincerot. Back-patch to all supported versions. (There's some other stuff to do here, but it's less back-patchable.) Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
2022-03-31Fix postgres_fdw to check shippability of sort clauses properly.Tom Lane
postgres_fdw would push ORDER BY clauses to the remote side without verifying that the sort operator is safe to ship. Moreover, it failed to print a suitable USING clause if the sort operator isn't default for the sort expression's type. The net result of this is that the remote sort might not have anywhere near the semantics we expect, which'd be disastrous for locally-performed merge joins in particular. We addressed similar issues in the context of ORDER BY within an aggregate function call in commit 7012b132d, but failed to notice that query-level ORDER BY was broken. Thus, much of the necessary logic already existed, but it requires refactoring to be usable in both cases. Back-patch to all supported branches. In HEAD only, remove the core code's copy of find_em_expr_for_rel, which is no longer used and really should never have been pushed into equivclass.c in the first place. Ronan Dunklau, per report from David Rowley; reviews by David Rowley, Ranier Vilela, and myself Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
2021-12-03postgres_fdw: Fix unexpected reporting of empty message.Fujii Masao
pgfdw_report_error() in postgres_fdw gets a message from PGresult or PGconn to report an error received from a remote server. Previously if it could get a message from neither of them, it reported empty message unexpectedly. The cause of this issue was that pgfdw_report_error() didn't handle properly the case where no message could be obtained and its local variable message_primary was set to '\0'. This commit improves pgfdw_report_error() so that it reports the message "could not obtain ..." when it gets no message and message_primary is set to '\0'. This is the same behavior as when message_primary is NULL. dblink_res_error() in dblink has the same issue, so this commit also improves it in the same way. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Bharath Rupireddy Discussion: https://postgr.es/m/477c16c8-7ea4-20fc-38d5-ed3a77ed616c@oss.nttdata.com
2021-10-13postgres_fdw: Move comments about elog level in (sub)abort cleanup.Etsuro Fujita
The comments were misplaced when adding postgres_fdw. Fix that by moving the comments to more appropriate functions. Author: Etsuro Fujita Backpatch-through: 9.6 Discussion: https://postgr.es/m/CAPmGK164sAXQtC46mDFyu6d-T25Mzvh5qaRNkit06VMmecYnOA%40mail.gmail.com
2021-10-06Fix null-pointer crash in postgres_fdw's conversion_error_callback.Tom Lane
Commit c7b7311f6 adjusted conversion_error_callback to always use information from the query's rangetable, to avoid doing catalog lookups in an already-failed transaction. However, as a result of the utterly inadequate documentation for make_tuple_from_result_row, I failed to realize that fsstate could be NULL in some contexts. That led to a crash if we got a conversion error in such a context. Fix by falling back to the previous coding when fsstate is NULL. Improve the commentary, too. Per report from Andrey Borodin. Back-patch to 9.6, like the previous patch. Discussion: https://postgr.es/m/08916396-55E4-4D68-AB3A-BD6066F9E5C0@yandex-team.ru
2021-08-05postgres_fdw: Fix issues with generated columns in foreign tables.Etsuro Fujita
postgres_fdw imported generated columns from the remote tables as plain columns, and caused failures like "ERROR: cannot insert a non-DEFAULT value into column "foo"" when inserting into the foreign tables, as it tried to insert values into the generated columns. To fix, we do the following under the assumption that generated columns in a postgres_fdw foreign table are defined so that they represent generated columns in the underlying remote table: * Send DEFAULT for the generated columns to the foreign server on insert or update, not generated column values computed on the local server. * Add to postgresImportForeignSchema() an option "import_generated" to include column generated expressions in the definitions of foreign tables imported from a foreign server. The option is true by default. The assumption seems reasonable, because that would make a query of the postgres_fdw foreign table return values for the generated columns that are consistent with the generated expression. While here, fix another issue in postgresImportForeignSchema(): it tried to include column generated expressions as column default expressions in the foreign table definitions when the import_default option was enabled. Per bug #16631 from Daniel Cherniy. Back-patch to v12 where generated columns were added. Discussion: https://postgr.es/m/16631-e929fe9db0ffc7cf%40postgresql.org
2021-07-28Avoid using ambiguous word "non-negative" in error messages.Fujii Masao
The error messages using the word "non-negative" are confusing because it's ambiguous about whether it accepts zero or not. This commit improves those error messages by replacing it with less ambiguous word like "greater than zero" or "greater than or equal to zero". Also this commit added the note about the word "non-negative" to the error message style guide, to help writing the new error messages. When postgres_fdw option fetch_size was set to zero, previously the error message "fetch_size requires a non-negative integer value" was reported. This error message was outright buggy. Therefore back-patch to all supported versions where such buggy error message could be thrown. Reported-by: Hou Zhijie Author: Bharath Rupireddy Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/OS0PR01MB5716415335A06B489F1B3A8194569@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-07-06Avoid doing catalog lookups in postgres_fdw's conversion_error_callback.Tom Lane
As in 50371df26, this is a bad idea since the callback can't really know what error is being thrown and thus whether or not it is safe to attempt catalog accesses. Rather than pushing said accesses into the mainline code where they'd usually be a waste of cycles, we can look at the query's rangetable instead. This change does mean that we'll be printing query aliases (if any were used) rather than the table or column's true name. But that doesn't seem like a bad thing: it's certainly a more useful definition in self-join cases, for instance. In any case, it seems unlikely that any applications would be depending on this detail, so it seems safe to change. Patch by me. Original complaint by Andres Freund; Bharath Rupireddy noted the connection to conversion_error_callback. Discussion: https://postgr.es/m/20210106020229.ne5xnuu6wlondjpe@alap3.anarazel.de
2021-06-02Fix planner's row-mark code for inheritance from a foreign table.Tom Lane
Commit 428b260f8 broke planning of cases where row marks are needed (SELECT FOR UPDATE, etc) and one of the query's tables is a foreign table that has regular table(s) as inheritance children. We got the reverse case right, but apparently were thinking that foreign tables couldn't be inheritance parents. Not so; so we need to be able to add a CTID junk column while adding a new child, not only a wholerow junk column. Back-patch to v12 where the faulty code came in. Amit Langote Discussion: https://postgr.es/m/CA+HiwqEmo3FV1LAQ4TVyS2h1WM=kMkZUmbNuZSCnfHvMcUcPeA@mail.gmail.com
2021-03-30Update obsolete comment.Etsuro Fujita
Back-patch to all supported branches. Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK17DwzaSf%2BB71dhL2apXdtG-OmD6u2AL9Cq2ZmAR0%2BzapQ%40mail.gmail.com
2021-02-08Fix permission checks on constraint violation errors on partitions.Heikki Linnakangas
If a cross-partition UPDATE violates a constraint on the target partition, and the columns in the new partition are in different physical order than in the parent, the error message can reveal columns that the user does not have SELECT permission on. A similar bug was fixed earlier in commit 804b6b6db4. The cause of the bug is that the callers of the ExecBuildSlotValueDescription() function got confused when constructing the list of modified columns. If the tuple was routed from a parent, we converted the tuple to the parent's format, but the list of modified columns was grabbed directly from the child's RTE entry. ExecUpdateLockMode() had a similar issue. That lead to confusion on which columns are key columns, leading to wrong tuple lock being taken on tables referenced by foreign keys, when a row is updated with INSERT ON CONFLICT UPDATE. A new isolation test is added for that corner case. With this patch, the ri_RangeTableIndex field is no longer set for partitions that don't have an entry in the range table. Previously, it was set to the RTE entry of the parent relation, but that was confusing. NOTE: This modifies the ResultRelInfo struct, replacing the ri_PartitionRoot field with ri_RootResultRelInfo. That's a bit risky to backpatch, because it breaks any extensions accessing the field. The change that ri_RangeTableIndex is not set for partitions could potentially break extensions, too. The ResultRelInfos are visible to FDWs at least, and this patch required small changes to postgres_fdw. Nevertheless, this seem like the least bad option. I don't think these fields widely used in extensions; I don't think there are FDWs out there that uses the FDW "direct update" API, other than postgres_fdw. If there is, you will get a compilation error, so hopefully it is caught quickly. Backpatch to 11, where support for both cross-partition UPDATEs, and unique indexes on partitioned tables, were added. Reviewed-by: Amit Langote Security: CVE-2021-3393
2021-01-21Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.Tom Lane
Previously, pull_varnos() took the relids of a PlaceHolderVar as being equal to the relids in its contents, but that fails to account for the possibility that we have to postpone evaluation of the PHV due to outer joins. This could result in a malformed plan. The known cases end up triggering the "failed to assign all NestLoopParams to plan nodes" sanity check in createplan.c, but other symptoms may be possible. The right value to use is the join level we actually intend to evaluate the PHV at. We can get that from the ph_eval_at field of the associated PlaceHolderInfo. However, there are some places that call pull_varnos() before the PlaceHolderInfos have been created; in that case, fall back to the conservative assumption that the PHV will be evaluated at its syntactic level. (In principle this might result in missing some legal optimization, but I'm not aware of any cases where it's an issue in practice.) Things are also a bit ticklish for calls occurring during deconstruct_jointree(), but AFAICS the ph_eval_at fields should have reached their final values by the time we need them. The main problem in making this work is that pull_varnos() has no way to get at the PlaceHolderInfos. We can fix that easily, if a bit tediously, in HEAD by passing it the planner "root" pointer. In the back branches that'd cause an unacceptable API/ABI break for extensions, so leave the existing entry points alone and add new ones with the additional parameter. (If an old entry point is called and encounters a PHV, it'll fall back to using the syntactic level, again possibly missing some valid optimization.) Back-patch to v12. The computation is surely also wrong before that, but it appears that we cannot reach a bad plan thanks to join order restrictions imposed on the subquery that the PlaceHolderVar came from. The error only became reachable when commit 4be058fe9 allowed trivial subqueries to be collapsed out completely, eliminating their join order restrictions. Per report from Stephan Springl. Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2020-12-28postgres_fdw: Fix connection leak.Fujii Masao
In postgres_fdw, the cached connections to foreign servers will not be closed until the local session exits if the user mappings or foreign servers that those connections depend on are dropped. Those connections can be leaked. To fix that connection leak issue, after a change to a pg_foreign_server or pg_user_mapping catalog entry, this commit makes postgres_fdw close the connections depending on that entry immediately if current transaction has not used those connections yet. Otherwise, mark those connections as invalid and then close them at the end of current transaction, since they cannot be closed in the midst of the transaction using them. Closed connections will be remade at the next opportunity if necessary. Back-patch to all supported branches. Author: Bharath Rupireddy Reviewed-by: Zhihong Yu, Zhijie Hou, Fujii Masao Discussion: https://postgr.es/m/CALj2ACVNcGH_6qLY-4_tXz8JLvA+4yeBThRfxMz7Oxbk1aHcpQ@mail.gmail.com
2020-11-10Fix and simplify some usages of TimestampDifference().Tom Lane
Introduce TimestampDifferenceMilliseconds() to simplify callers that would rather have the difference in milliseconds, instead of the select()-oriented seconds-and-microseconds format. This gets rid of at least one integer division per call, and it eliminates some apparently-easy-to-mess-up arithmetic. Two of these call sites were in fact wrong: * pg_prewarm's autoprewarm_main() forgot to multiply the seconds by 1000, thus ending up with a delay 1000X shorter than intended. That doesn't quite make it a busy-wait, but close. * postgres_fdw's pgfdw_get_cleanup_result() thought it needed to compute microseconds not milliseconds, thus ending up with a delay 1000X longer than intended. Somebody along the way had noticed this problem but misdiagnosed the cause, and imposed an ad-hoc 60-second limit rather than fixing the units. This was relatively harmless in context, because we don't care that much about exactly how long this delay is; still, it's wrong. There are a few more callers of TimestampDifference() that don't have a direct need for seconds-and-microseconds, but can't use TimestampDifferenceMilliseconds() either because they do need microsecond precision or because they might possibly deal with intervals long enough to overflow 32-bit milliseconds. It might be worth inventing another API to improve that, but that seems outside the scope of this patch; so those callers are untouched here. Given the fact that we are fixing some bugs, and the likelihood that future patches might want to back-patch code that uses this new API, back-patch to all supported branches. Alexey Kondratov and Tom Lane Discussion: https://postgr.es/m/3b1c053a21c07c1ed5e00be3b2b855ef@postgrespro.ru
2020-11-09In security-restricted operations, block enqueue of at-commit user code.Noah Misch
Specifically, this blocks DECLARE ... WITH HOLD and firing of deferred triggers within index expressions and materialized view queries. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. One can work around the vulnerability by disabling autovacuum and not manually running ANALYZE, CLUSTER, REINDEX, CREATE INDEX, VACUUM FULL, or REFRESH MATERIALIZED VIEW. (Don't restore from pg_dump, since it runs some of those commands.) Plain VACUUM (without FULL) is safe, and all commands are fine when a trusted user owns the target object. Performance may degrade quickly under this workaround, however. Back-patch to 9.5 (all supported versions). Reviewed by Robert Haas. Reported by Etienne Stalmans. Security: CVE-2020-25695
2020-01-26In postgres_fdw, don't try to ship MULTIEXPR updates to remote server.Tom Lane
In a statement like "UPDATE remote_tab SET (x,y) = (SELECT ...)", we'd conclude that the statement could be directly executed remotely, because the sub-SELECT is in a resjunk tlist item that's not examined for shippability. Currently that ends up crashing if the sub-SELECT contains any remote Vars. Prevent the crash by deeming MULTIEXEC Params to be unshippable. This is a bit of a brute-force solution, since if the sub-SELECT *doesn't* contain any remote Vars, the current execution technology would work; but that's not a terribly common use-case for this syntax, I think. In any case, we generally don't try to ship sub-SELECTs, so it won't surprise anybody that this doesn't end up as a remote direct update. I'd be inclined to see if that general limitation can be fixed before worrying about this case further. Per report from Lukáš Sobotka. Back-patch to 9.6. 9.5 had MULTIEXPR, but we didn't try to perform remote direct updates then, so the case didn't arise anyway. Discussion: https://postgr.es/m/CAJif3k+iA_ekBB5Zw2hDBaE1wtiQa4LH4_JUXrrMGwTrH0J01Q@mail.gmail.com
2019-12-20libpq should expose GSS-related parameters even when not implemented.Tom Lane
We realized years ago that it's better for libpq to accept all connection parameters syntactically, even if some are ignored or restricted due to lack of the feature in a particular build. However, that lesson from the SSL support was for some reason never applied to the GSSAPI support. This is causing various buildfarm members to have problems with a test case added by commit 6136e94dc, and it's just a bad idea from a user-experience standpoint anyway, so fix it. While at it, fix some places where parameter-related infrastructure was added with the aid of a dartboard, or perhaps with the aid of the anti-pattern "add new stuff at the end". It should be safe to rearrange the contents of struct pg_conn even in released branches, since that's private to libpq (and we'd have to move some fields in some builds to fix this, anyway). Back-patch to all supported branches. Discussion: https://postgr.es/m/11297.1576868677@sss.pgh.pa.us
2019-12-10Fix handling of multiple AFTER ROW triggers on a foreign table.Etsuro Fujita
AfterTriggerExecute() retrieves a fresh tuple or pair of tuples from a tuplestore and then stores the tuple(s) in the passed-in slot(s) if AFTER_TRIGGER_FDW_FETCH, while it uses the most-recently-retrieved tuple(s) stored in the slot(s) if AFTER_TRIGGER_FDW_REUSE. This was done correctly before 12, but commit ff11e7f4b broke it by mistakenly clearing the tuple(s) stored in the slot(s) in that function, leading to an assertion failure as reported in bug #16139 from Alexander Lakhin. Also, fix some other issues with the aforementioned commit in passing: * For tg_newslot, which is a slot added to the TriggerData struct by the commit to store new updated tuples, it didn't ensure the slot was NULL if there was no such tuple. * The commit failed to update the documentation about the trigger interface. Author: Etsuro Fujita Backpatch-through: 12 Discussion: https://postgr.es/m/16139-94f9ccf0db6119ec%40postgresql.org
2019-11-08postgres_fdw: Fix error message for PREPARE TRANSACTION.Etsuro Fujita
Currently, postgres_fdw does not support preparing a remote transaction for two-phase commit even in the case where the remote transaction is read-only, but the old error message appeared to imply that that was not supported only if the remote transaction modified remote tables. Change the message so as to include the case where the remote transaction is read-only. Also fix a comment above the message. Also add a note about the lack of supporting PREPARE TRANSACTION to the postgres_fdw documentation. Reported-by: Gilles Darold Author: Gilles Darold and Etsuro Fujita Reviewed-by: Michael Paquier and Kyotaro Horiguchi Backpatch-through: 9.4 Discussion: https://postgr.es/m/08600ed3-3084-be70-65ba-279ab19618a5%40darold.net
2019-07-01pgindent run prior to branching v12.Tom Lane
pgperltidy and reformat-dat-files too, though the latter didn't find anything to change.
2019-07-01Fix many typos and inconsistenciesMichael Paquier
Author: Alexander Lakhin Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com
2019-06-14postgres_fdw: Fix costing of pre-sorted foreign paths with local stats.Etsuro Fujita
Commit aa09cd242 modified estimate_path_cost_size() so that it reuses cached costs of a basic foreign path for a given foreign-base/join relation when costing pre-sorted foreign paths for that relation, but it incorrectly re-computed retrieved_rows, an estimated number of rows fetched from the remote side, which is needed for costing both the basic and pre-sorted foreign paths. To fix, handle retrieved_rows the same way as the cached costs: store in that relation's fpinfo the retrieved_rows estimate computed for costing the basic foreign path, and reuse it when costing the pre-sorted foreign paths. Also, reuse the rows/width estimates stored in that relation's fpinfo when costing the pre-sorted foreign paths, to make the code consistent. In commit ffab494a4, to extend the costing mentioned above to the foreign-grouping case, I made a change to add_foreign_grouping_paths() to store in a given foreign-grouped relation's RelOptInfo the rows estimate for that relation for reuse, but this patch makes that change unnecessary since we already store the row estimate in that relation's fpinfo, which this patch reuses when costing a foreign path for that relation with the sortClause ordering; remove that change. In passing, fix thinko in commit 7012b132d: in estimate_path_cost_size(), the width estimate for a given foreign-grouped relation to be stored in that relation's fpinfo was reset incorrectly when costing a basic foreign path for that relation with local stats. Apply the patch to HEAD only to avoid destabilizing existing plan choices. Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK17jaJLPDEkgnP2VmkOg=5wT8YQ1CqssU8JRpZ_NSE+dqQ@mail.gmail.com
2019-06-13postgres_fdw: Account for triggers in non-direct remote UPDATE planning.Etsuro Fujita
Previously, in postgresPlanForeignModify, we planned an UPDATE operation on a foreign table so that we transmit only columns that were explicitly targets of the UPDATE, so as to avoid unnecessary data transmission, but if there were BEFORE ROW UPDATE triggers on the foreign table, those triggers might change values for non-target columns, in which case we would miss sending changed values for those columns. Prevent optimizing away transmitting all columns if there are BEFORE ROW UPDATE triggers on the foreign table. This is an oversight in commit 7cbe57c34 which added triggers on foreign tables, so apply the patch all the way back to 9.4 where that came in. Author: Shohei Mochizuki Reviewed-by: Amit Langote Discussion: https://postgr.es/m/201905270152.x4R1q3qi014550@toshiba.co.jp
2019-06-11postgres_fdw: Reorder C includes.Etsuro Fujita
Reorder header files in postgres_fdw.c and connection.c in alphabetical order. Author: Etsuro Fujita Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/CAPmGK17ZmNb-EELqu8LmMh2t2uFdbfWNVDEfDO5-bpejHPONMQ@mail.gmail.com
2019-05-26Fix typos.Amit Kapila
Reported-by: Alexander Lakhin Author: Alexander Lakhin Reviewed-by: Amit Kapila and Tom Lane Discussion: https://postgr.es/m/7208de98-add8-8537-91c0-f8b089e2928c@gmail.com
2019-05-22Phase 2 pgindent run for v12.Tom Lane
Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22Initial pgindent run for v12.Tom Lane
This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
2019-05-13postgres_fdw: Fix typo in comment.Etsuro Fujita
2019-05-09postgres_fdw: Fix cost estimation for aggregate pushdown.Etsuro Fujita
In commit 7012b132d0, which added support for aggregate pushdown in postgres_fdw, the expense of evaluating the final scan/join target computed by make_group_input_target() was not accounted for at all in costing aggregate pushdown paths with local statistics. The right fix for this would be to have a separate upper stage to adjust the final scan/join relation (see comments for apply_scanjoin_target_to_paths()); but for now, fix by adding the tlist eval cost when costing aggregate pushdown paths with local statistics. Apply this to HEAD only to avoid destabilizing existing plan choices. Author: Etsuro Fujita Reviewed-By: Antonin Houska Discussion: https://postgr.es/m/5C66A056.60007%40lab.ntt.co.jp
2019-04-27Avoid postgres_fdw crash for a targetlist entry that's just a Param.Tom Lane
foreign_grouping_ok() is willing to put fairly arbitrary expressions into the targetlist of a remote SELECT that's doing grouping or aggregation on the remote side, including expressions that have no foreign component to them at all. This is possibly a bit dubious from an efficiency standpoint; but it rises to the level of a crash-causing bug if the expression is just a Param or non-foreign Var. In that case, the expression will necessarily also appear in the fdw_exprs list of values we need to send to the remote server, and then setrefs.c's set_foreignscan_references will mistakenly replace the fdw_exprs entry with a Var referencing the targetlist result. The root cause of this problem is bad design in commit e7cb7ee14: it put logic into set_foreignscan_references that IMV is postgres_fdw-specific, and yet this bug shows that it isn't postgres_fdw-specific enough. The transformation being done on fdw_exprs assumes that fdw_exprs is to be evaluated with the fdw_scan_tlist as input, which is not how postgres_fdw uses it; yet it could be the right thing for some other FDW. (In the bigger picture, setrefs.c has no business assuming this for the other expression fields of a ForeignScan either.) The right fix therefore would be to expand the FDW API so that the FDW could inform setrefs.c how it intends to evaluate these various expressions. We can't change that in the back branches though, and we also can't just summarily change setrefs.c's behavior there, or we're likely to break external FDWs. As a stopgap, therefore, hack up postgres_fdw so that it won't attempt to send targetlist entries that look exactly like the fdw_exprs entries they'd produce. In most cases this actually produces a superior plan, IMO, with less data needing to be transmitted and returned; so we probably ought to think harder about whether we should ship tlist expressions at all when they don't contain any foreign Vars or Aggs. But that's an optimization not a bug fix so I left it for later. One case where this produces an inferior plan is where the expression in question is actually a GROUP BY expression: then the restriction prevents us from using remote grouping. It might be possible to work around that (since that would reduce to group-by-a-constant on the remote side); but it seems like a pretty unlikely corner case, so I'm not sure it's worth expending effort solely to improve that. In any case the right long-term answer is to fix the API as sketched above, and then revert this hack. Per bug #15781 from Sean Johnston. Back-patch to v10 where the problem was introduced. Discussion: https://postgr.es/m/15781-2601b1002bad087c@postgresql.org
2019-04-24postgres_fdw: Fix incorrect handling of row movement for remote partitions.Etsuro Fujita
Commit 3d956d9562 added support for update row movement in postgres_fdw. This patch fixes the following issues introduced by that commit: * When a remote partition chosen to insert routed rows into was also an UPDATE subplan target rel that would be updated later, the UPDATE that used a direct modification plan modified those routed rows incorrectly because those routed rows were visible to the later UPDATE command. The right fix for this would be to have some way in postgres_fdw in which the later UPDATE command ignores those routed rows, but it seems hard to do so with the current infrastructure. For now throw an error in that case. * When a remote partition chosen to insert routed rows into was also an UPDATE subplan target rel, fmstate created for the UPDATE that used a non-direct modification plan was mistakenly overridden by another fmstate created for inserting those routed rows into the partition. This caused 1) server crash when the partition would be updated later, and 2) resource leak when the partition had been already updated. To avoid that, adjust the treatment of the fmstate for the inserting. As for #1, since we would also have the incorrectness issue as mentioned above, error out in that case as well. Update the docs to mention that postgres_fdw currently does not handle the case where a remote partition chosen to insert a routed row into is also an UPDATE subplan target rel that will be updated later. Author: Amit Langote and Etsuro Fujita Reviewed-by: Amit Langote Backpatch-through: 11 where row movement in postgres_fdw was added Discussion: https://postgr.es/m/21e7eaa4-0d4d-20c2-a1f7-c7e96f4ce440@lab.ntt.co.jp
2019-04-19Fix two memory leaks around force-storing tuples in slots.Andres Freund
As reported by Tom, when ExecStoreMinimalTuple() had to perform a conversion to store the minimal tuple in the slot, it forgot to respect the shouldFree flag, and leaked the tuple into the current memory context if true. Fix that by freeing the tuple in that case. Looking at the relevant code made me (Andres) realize that not having the shouldFree parameter to ExecForceStoreHeapTuple() was a bad idea. Some callers had to locally implement the necessary logic, and in one case it was missing, creating a potential per-group leak in non-hashed aggregation. The choice to not free the tuple in ExecComputeStoredGenerated() is not pretty, but not introduced by this commit - I'll start a separate discussion about it. Reported-By: Tom Lane Discussion: https://postgr.es/m/366.1555382816@sss.pgh.pa.us
2019-04-06Add support TCP user timeout in libpq and the backend serverMichael Paquier
Similarly to the set of parameters for keepalive, a connection parameter for libpq is added as well as a backend GUC, called tcp_user_timeout. Increasing the TCP user timeout is useful to allow a connection to survive extended periods without end-to-end connection, and decreasing it allows application to fail faster. By default, the parameter is 0, which makes the connection use the system default, and follows a logic close to the keepalive parameters in its handling. When connecting through a Unix-socket domain, the parameters have no effect. Author: Ryohei Nagaura Reviewed-by: Fabien Coelho, Robert Haas, Kyotaro Horiguchi, Kirk Jamison, Mikalai Keida, Takayuki Tsunakawa, Andrei Yahorau Discussion: https://postgr.es/m/EDA4195584F5064680D8130B1CA91C45367328@G01JPEXMBYT04
2019-04-02postgres_fdw: Perform the (FINAL, NULL) upperrel operations remotely.Etsuro Fujita
The upper-planner pathification allows FDWs to arrange to push down different types of upper-stage operations to the remote side. This commit teaches postgres_fdw to do it for the (FINAL, NULL) upperrel, which is responsible for doing LockRows, LIMIT, and/or ModifyTable. This provides the ability for postgres_fdw to handle SELECT commands so that it 1) skips the LockRows step (if any) (note that this is safe since it performs early locking) and 2) pushes down the LIMIT and/or OFFSET restrictions (if any) to the remote side. This doesn't handle the INSERT/UPDATE/DELETE cases. Author: Etsuro Fujita Reviewed-By: Antonin Houska and Jeff Janes Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-04-02postgres_fdw: Modify regression tests for EPQ-related planning problems.Etsuro Fujita
This prevents the tests added by commit 4bbf6edfbd and adjusted by commit 99f6a17dd6 from being useless by plan changes created by an upcoming commit. Author: Etsuro Fujita Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-04-02postgres_fdw: Perform the (ORDERED, NULL) upperrel operations remotely.Etsuro Fujita
The upper-planner pathification allows FDWs to arrange to push down different types of upper-stage operations to the remote side. This commit teaches postgres_fdw to do it for the (ORDERED, NULL) upperrel, which is responsible for evaluating the query's ORDER BY ordering. Since postgres_fdw is already able to evaluate that ordering remotely for foreign baserels and foreign joinrels (see commit aa09cd242f et al.), this adds support for that for foreign grouping relations. Author: Etsuro Fujita Reviewed-By: Antonin Houska and Jeff Janes Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-03-30Speed up planning when partitions can be pruned at plan time.Tom Lane
Previously, the planner created RangeTblEntry and RelOptInfo structs for every partition of a partitioned table, even though many of them might later be deemed uninteresting thanks to partition pruning logic. This incurred significant overhead when there are many partitions. Arrange to postpone creation of these data structures until after we've processed the query enough to identify restriction quals for the partitioned table, and then apply partition pruning before not after creation of each partition's data structures. In this way we need not open the partition relations at all for partitions that the planner has no real interest in. For queries that can be proven at plan time to access only a small number of partitions, this patch improves the practical maximum number of partitions from under 100 to perhaps a few thousand. Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen, Yoshikazu Imai, and David Rowley Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
2019-03-30Generated columnsPeter Eisentraut
This is an SQL-standard feature that allows creating columns that are computed from expressions rather than assigned, similar to a view or materialized view but on a column basis. This implements one kind of generated column: stored (computed on write). Another kind, virtual (computed on read), is planned for the future, and some room is left for it. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com
2019-03-26Get rid of duplicate child RTE for a partitioned table.Tom Lane
We've been creating duplicate RTEs for partitioned tables just because we do so for regular inheritance parent tables. But unlike regular-inheritance parents which are themselves regular tables and thus need to be scanned, partitioned tables don't need the extra RTE. This makes the conditions for building a child RTE the same as those for building an AppendRelInfo, allowing minor simplification in expand_single_inheritance_child. Since the planner's actual processing is driven off the AppendRelInfo list, nothing much changes beyond that, we just have one fewer useless RTE entry. Amit Langote, reviewed and hacked a bit by me Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
2019-03-25Suppress Append and MergeAppend plan nodes that have a single child.Tom Lane
If there's only one child relation, the Append or MergeAppend isn't doing anything useful, and can be elided. It does have a purpose during planning though, which is to serve as a buffer between parent and child Var numbering. Therefore we keep it all the way through to setrefs.c, and get rid of it only after fixing references in the plan level(s) above it. This works largely the same as setrefs.c's ancient hack to get rid of no-op SubqueryScan nodes, and can even share some code with that. Note the change to make setrefs.c use apply_tlist_labeling rather than ad-hoc code. This has the effect of propagating the child's resjunk and ressortgroupref labels, which formerly weren't propagated when removing a SubqueryScan. Doing that is demonstrably necessary for the [Merge]Append cases, and seems harmless for SubqueryScan, if only because trivial_subqueryscan is afraid to collapse cases where the resjunk marking differs. (I suspect that restriction could now be removed, though it's unclear that it'd make any new matches possible, since the outer query can't have references to a child resjunk column.) David Rowley, reviewed by Alvaro Herrera and Tomas Vondra Discussion: https://postgr.es/m/CAKJS1f_7u8ATyJ1JGTMHFoKDvZdeF-iEBhs+sM_SXowOr9cArg@mail.gmail.com
2019-02-26Use slots in trigger infrastructure, except for the actual invocation.Andres Freund
In preparation for abstracting table storage, convert trigger.c to track tuples in slots. Which also happens to make code calling triggers simpler. As the calling interface for triggers themselves is not changed in this patch, HeapTuples still are extracted from the slot at that time. But that's handled solely inside trigger.c, not visible to callers. It's quite likely that we'll want to revise the external trigger interface, but that's a separate large project. As part of this work the slots used for old/new/return tuples are moved from EState into ResultRelInfo, as different updated tables might need different slots. The slots are now also now created on-demand, which is good both from an efficiency POV, but also makes the modifying code simpler. Author: Andres Freund, Amit Khandekar and Ashutosh Bapat Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-02-16Allow user control of CTE materialization, and change the default behavior.Tom Lane
Historically we've always materialized the full output of a CTE query, treating WITH as an optimization fence (so that, for example, restrictions from the outer query cannot be pushed into it). This is appropriate when the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE query is non-recursive and side-effect-free, there's no hazard of changing the query results by pushing restrictions down. Another argument for materialization is that it can avoid duplicate computation of an expensive WITH query --- but that only applies if the WITH query is called more than once in the outer query. Even then it could still be a net loss, if each call has restrictions that would allow just a small part of the WITH query to be computed. Hence, let's change the behavior for WITH queries that are non-recursive and side-effect-free. By default, we will inline them into the outer query (removing the optimization fence) if they are called just once. If they are called more than once, we will keep the old behavior by default, but the user can override this and force inlining by specifying NOT MATERIALIZED. Lastly, the user can force the old behavior by specifying MATERIALIZED; this would mainly be useful when the query had deliberately been employing WITH as an optimization fence to prevent a poor choice of plan. Andreas Karlsson, Andrew Gierth, David Fetter Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
2019-02-09Build out the planner support function infrastructure.Tom Lane
Add support function requests for estimating the selectivity, cost, and number of result rows (if a SRF) of the target function. The lack of a way to estimate selectivity of a boolean-returning function in WHERE has been a recognized deficiency of the planner since Berkeley days. This commit finally fixes it. In addition, non-constant estimates of cost and number of output rows are now possible. We still fall back to looking at procost and prorows if the support function doesn't service the request, of course. To make concrete use of the possibility of estimating output rowcount for SRFs, this commit adds support functions for array_unnest(anyarray) and the integer variants of generate_series; the lack of plausible rowcount estimates for those, even when it's obvious to a human, has been a repeated subject of complaints. Obviously, much more could now be done in this line, but I'm mostly just trying to get the infrastructure in place. Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-07Split create_foreignscan_path() into three functions.Tom Lane
Up to now postgres_fdw has been using create_foreignscan_path() to generate not only base-relation paths, but also paths for foreign joins and foreign upperrels. This is wrong, because create_foreignscan_path() calls get_baserel_parampathinfo() which will only do the right thing for baserels. It accidentally fails to fail for unparameterized paths, which are the only ones postgres_fdw (thought it) was handling, but we really need different APIs for the baserel and join cases. In HEAD, the best thing to do seems to be to split up the baserel, joinrel, and upperrel cases into three functions so that they can have different APIs. I haven't actually given create_foreign_join_path a different API in this commit: we should spend a bit of time thinking about just what we want to do there, since perhaps FDWs would want to do something different from the build-up-a-join-pairwise approach that get_joinrel_parampathinfo expects. In the meantime, since postgres_fdw isn't prepared to generate parameterized joins anyway, just give it a defense against trying to plan joins with lateral refs. In addition (and this is what triggered this whole mess) fix bug #15613 from Srinivasan S A, by teaching file_fdw and postgres_fdw that plain baserel foreign paths still have outer refs if the relation has lateral_relids. Add some assertions in relnode.c to catch future occurrences of the same error --- in particular, to catch other FDWs doing that, but also as backstop against core-code mistakes like the one fixed by commit bdd9a99aa. Bug #15613 also needs to be fixed in the back branches, but the appropriate fix will look quite a bit different there, since we don't want to assume that existing FDWs get the word right away. Discussion: https://postgr.es/m/15613-092be1be9576c728@postgresql.org