summaryrefslogtreecommitdiff
path: root/contrib/postgres_fdw
AgeCommit message (Collapse)Author
2021-07-28Avoid using ambiguous word "non-negative" in error messages.Fujii Masao
The error messages using the word "non-negative" are confusing because it's ambiguous about whether it accepts zero or not. This commit improves those error messages by replacing it with less ambiguous word like "greater than zero" or "greater than or equal to zero". Also this commit added the note about the word "non-negative" to the error message style guide, to help writing the new error messages. When postgres_fdw option fetch_size was set to zero, previously the error message "fetch_size requires a non-negative integer value" was reported. This error message was outright buggy. Therefore back-patch to all supported versions where such buggy error message could be thrown. Reported-by: Hou Zhijie Author: Bharath Rupireddy Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/OS0PR01MB5716415335A06B489F1B3A8194569@OS0PR01MB5716.jpnprd01.prod.outlook.com
2021-07-06Avoid doing catalog lookups in postgres_fdw's conversion_error_callback.Tom Lane
As in 50371df26, this is a bad idea since the callback can't really know what error is being thrown and thus whether or not it is safe to attempt catalog accesses. Rather than pushing said accesses into the mainline code where they'd usually be a waste of cycles, we can look at the query's rangetable instead. This change does mean that we'll be printing query aliases (if any were used) rather than the table or column's true name. But that doesn't seem like a bad thing: it's certainly a more useful definition in self-join cases, for instance. In any case, it seems unlikely that any applications would be depending on this detail, so it seems safe to change. Patch by me. Original complaint by Andres Freund; Bharath Rupireddy noted the connection to conversion_error_callback. Discussion: https://postgr.es/m/20210106020229.ne5xnuu6wlondjpe@alap3.anarazel.de
2021-03-30Update obsolete comment.Etsuro Fujita
Back-patch to all supported branches. Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK17DwzaSf%2BB71dhL2apXdtG-OmD6u2AL9Cq2ZmAR0%2BzapQ%40mail.gmail.com
2020-12-28postgres_fdw: Fix connection leak.Fujii Masao
In postgres_fdw, the cached connections to foreign servers will not be closed until the local session exits if the user mappings or foreign servers that those connections depend on are dropped. Those connections can be leaked. To fix that connection leak issue, after a change to a pg_foreign_server or pg_user_mapping catalog entry, this commit makes postgres_fdw close the connections depending on that entry immediately if current transaction has not used those connections yet. Otherwise, mark those connections as invalid and then close them at the end of current transaction, since they cannot be closed in the midst of the transaction using them. Closed connections will be remade at the next opportunity if necessary. Back-patch to all supported branches. Author: Bharath Rupireddy Reviewed-by: Zhihong Yu, Zhijie Hou, Fujii Masao Discussion: https://postgr.es/m/CALj2ACVNcGH_6qLY-4_tXz8JLvA+4yeBThRfxMz7Oxbk1aHcpQ@mail.gmail.com
2020-11-10Fix and simplify some usages of TimestampDifference().Tom Lane
Introduce TimestampDifferenceMilliseconds() to simplify callers that would rather have the difference in milliseconds, instead of the select()-oriented seconds-and-microseconds format. This gets rid of at least one integer division per call, and it eliminates some apparently-easy-to-mess-up arithmetic. Two of these call sites were in fact wrong: * pg_prewarm's autoprewarm_main() forgot to multiply the seconds by 1000, thus ending up with a delay 1000X shorter than intended. That doesn't quite make it a busy-wait, but close. * postgres_fdw's pgfdw_get_cleanup_result() thought it needed to compute microseconds not milliseconds, thus ending up with a delay 1000X longer than intended. Somebody along the way had noticed this problem but misdiagnosed the cause, and imposed an ad-hoc 60-second limit rather than fixing the units. This was relatively harmless in context, because we don't care that much about exactly how long this delay is; still, it's wrong. There are a few more callers of TimestampDifference() that don't have a direct need for seconds-and-microseconds, but can't use TimestampDifferenceMilliseconds() either because they do need microsecond precision or because they might possibly deal with intervals long enough to overflow 32-bit milliseconds. It might be worth inventing another API to improve that, but that seems outside the scope of this patch; so those callers are untouched here. Given the fact that we are fixing some bugs, and the likelihood that future patches might want to back-patch code that uses this new API, back-patch to all supported branches. Alexey Kondratov and Tom Lane Discussion: https://postgr.es/m/3b1c053a21c07c1ed5e00be3b2b855ef@postgrespro.ru
2020-11-09In security-restricted operations, block enqueue of at-commit user code.Noah Misch
Specifically, this blocks DECLARE ... WITH HOLD and firing of deferred triggers within index expressions and materialized view queries. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. One can work around the vulnerability by disabling autovacuum and not manually running ANALYZE, CLUSTER, REINDEX, CREATE INDEX, VACUUM FULL, or REFRESH MATERIALIZED VIEW. (Don't restore from pg_dump, since it runs some of those commands.) Plain VACUUM (without FULL) is safe, and all commands are fine when a trusted user owns the target object. Performance may degrade quickly under this workaround, however. Back-patch to 9.5 (all supported versions). Reviewed by Robert Haas. Reported by Etienne Stalmans. Security: CVE-2020-25695
2020-01-26In postgres_fdw, don't try to ship MULTIEXPR updates to remote server.Tom Lane
In a statement like "UPDATE remote_tab SET (x,y) = (SELECT ...)", we'd conclude that the statement could be directly executed remotely, because the sub-SELECT is in a resjunk tlist item that's not examined for shippability. Currently that ends up crashing if the sub-SELECT contains any remote Vars. Prevent the crash by deeming MULTIEXEC Params to be unshippable. This is a bit of a brute-force solution, since if the sub-SELECT *doesn't* contain any remote Vars, the current execution technology would work; but that's not a terribly common use-case for this syntax, I think. In any case, we generally don't try to ship sub-SELECTs, so it won't surprise anybody that this doesn't end up as a remote direct update. I'd be inclined to see if that general limitation can be fixed before worrying about this case further. Per report from Lukáš Sobotka. Back-patch to 9.6. 9.5 had MULTIEXPR, but we didn't try to perform remote direct updates then, so the case didn't arise anyway. Discussion: https://postgr.es/m/CAJif3k+iA_ekBB5Zw2hDBaE1wtiQa4LH4_JUXrrMGwTrH0J01Q@mail.gmail.com
2019-12-20libpq should expose GSS-related parameters even when not implemented.Tom Lane
We realized years ago that it's better for libpq to accept all connection parameters syntactically, even if some are ignored or restricted due to lack of the feature in a particular build. However, that lesson from the SSL support was for some reason never applied to the GSSAPI support. This is causing various buildfarm members to have problems with a test case added by commit 6136e94dc, and it's just a bad idea from a user-experience standpoint anyway, so fix it. While at it, fix some places where parameter-related infrastructure was added with the aid of a dartboard, or perhaps with the aid of the anti-pattern "add new stuff at the end". It should be safe to rearrange the contents of struct pg_conn even in released branches, since that's private to libpq (and we'd have to move some fields in some builds to fix this, anyway). Back-patch to all supported branches. Discussion: https://postgr.es/m/11297.1576868677@sss.pgh.pa.us
2019-11-08postgres_fdw: Fix error message for PREPARE TRANSACTION.Etsuro Fujita
Currently, postgres_fdw does not support preparing a remote transaction for two-phase commit even in the case where the remote transaction is read-only, but the old error message appeared to imply that that was not supported only if the remote transaction modified remote tables. Change the message so as to include the case where the remote transaction is read-only. Also fix a comment above the message. Also add a note about the lack of supporting PREPARE TRANSACTION to the postgres_fdw documentation. Reported-by: Gilles Darold Author: Gilles Darold and Etsuro Fujita Reviewed-by: Michael Paquier and Kyotaro Horiguchi Backpatch-through: 9.4 Discussion: https://postgr.es/m/08600ed3-3084-be70-65ba-279ab19618a5%40darold.net
2019-06-13postgres_fdw: Account for triggers in non-direct remote UPDATE planning.Etsuro Fujita
Previously, in postgresPlanForeignModify, we planned an UPDATE operation on a foreign table so that we transmit only columns that were explicitly targets of the UPDATE, so as to avoid unnecessary data transmission, but if there were BEFORE ROW UPDATE triggers on the foreign table, those triggers might change values for non-target columns, in which case we would miss sending changed values for those columns. Prevent optimizing away transmitting all columns if there are BEFORE ROW UPDATE triggers on the foreign table. This is an oversight in commit 7cbe57c34 which added triggers on foreign tables, so apply the patch all the way back to 9.4 where that came in. Author: Shohei Mochizuki Reviewed-by: Amit Langote Discussion: https://postgr.es/m/201905270152.x4R1q3qi014550@toshiba.co.jp
2019-05-13postgres_fdw: Fix typo in comment.Etsuro Fujita
2019-04-27Avoid postgres_fdw crash for a targetlist entry that's just a Param.Tom Lane
foreign_grouping_ok() is willing to put fairly arbitrary expressions into the targetlist of a remote SELECT that's doing grouping or aggregation on the remote side, including expressions that have no foreign component to them at all. This is possibly a bit dubious from an efficiency standpoint; but it rises to the level of a crash-causing bug if the expression is just a Param or non-foreign Var. In that case, the expression will necessarily also appear in the fdw_exprs list of values we need to send to the remote server, and then setrefs.c's set_foreignscan_references will mistakenly replace the fdw_exprs entry with a Var referencing the targetlist result. The root cause of this problem is bad design in commit e7cb7ee14: it put logic into set_foreignscan_references that IMV is postgres_fdw-specific, and yet this bug shows that it isn't postgres_fdw-specific enough. The transformation being done on fdw_exprs assumes that fdw_exprs is to be evaluated with the fdw_scan_tlist as input, which is not how postgres_fdw uses it; yet it could be the right thing for some other FDW. (In the bigger picture, setrefs.c has no business assuming this for the other expression fields of a ForeignScan either.) The right fix therefore would be to expand the FDW API so that the FDW could inform setrefs.c how it intends to evaluate these various expressions. We can't change that in the back branches though, and we also can't just summarily change setrefs.c's behavior there, or we're likely to break external FDWs. As a stopgap, therefore, hack up postgres_fdw so that it won't attempt to send targetlist entries that look exactly like the fdw_exprs entries they'd produce. In most cases this actually produces a superior plan, IMO, with less data needing to be transmitted and returned; so we probably ought to think harder about whether we should ship tlist expressions at all when they don't contain any foreign Vars or Aggs. But that's an optimization not a bug fix so I left it for later. One case where this produces an inferior plan is where the expression in question is actually a GROUP BY expression: then the restriction prevents us from using remote grouping. It might be possible to work around that (since that would reduce to group-by-a-constant on the remote side); but it seems like a pretty unlikely corner case, so I'm not sure it's worth expending effort solely to improve that. In any case the right long-term answer is to fix the API as sketched above, and then revert this hack. Per bug #15781 from Sean Johnston. Back-patch to v10 where the problem was introduced. Discussion: https://postgr.es/m/15781-2601b1002bad087c@postgresql.org
2019-02-07Ensure that foreign scans with lateral refs are planned correctly.Tom Lane
As reported in bug #15613 from Srinivasan S A, file_fdw and postgres_fdw neglected to mark plain baserel foreign paths as parameterized when the relation has lateral_relids. Other FDWs have surely copied this mistake, so rather than just patching those two modules, install a band-aid fix in create_foreignscan_path to rectify the mistake centrally. Although the band-aid is enough to fix the visible symptom, correct the calls in file_fdw and postgres_fdw anyway, so that they are valid examples for external FDWs. Also, since the band-aid isn't enough to make this work for parameterized foreign joins, throw an elog(ERROR) if such a case is passed to create_foreignscan_path. This shouldn't pose much of a problem for existing external FDWs, since it's likely they aren't trying to make such paths anyway (though some of them may need a defense against joins with lateral_relids, similar to the one this patch installs into postgres_fdw). Add some assertions in relnode.c to catch future occurrences of the same error --- in particular, as backstop against core-code mistakes like the one fixed by commit bdd9a99aa. Discussion: https://postgr.es/m/15613-092be1be9576c728@postgresql.org
2018-12-12Repair bogus EPQ plans generated for postgres_fdw foreign joins.Tom Lane
postgres_fdw's postgresGetForeignPlan() assumes without checking that the outer_plan it's given for a join relation must have a NestLoop, MergeJoin, or HashJoin node at the top. That's been wrong at least since commit 4bbf6edfb (which could cause insertion of a Sort node on top) and it seems like a pretty unsafe thing to Just Assume even without that. Through blind good fortune, this doesn't seem to have any worse consequences today than strange EXPLAIN output, but it's clearly trouble waiting to happen. To fix, test the node type explicitly before touching Join-specific fields, and avoid jamming the new tlist into a node type that can't do projection. Export a new support function from createplan.c to avoid building low-level knowledge about the latter into FDWs. Back-patch to 9.6 where the faulty coding was added. Note that the associated regression test cases don't show any changes before v11, apparently because the tests back-patched with 4bbf6edfb don't actually exercise the problem case before then (there's no top-level Sort in those plans). Discussion: https://postgr.es/m/8946.1544644803@sss.pgh.pa.us
2018-11-28C comment: remove extra '*'Bruce Momjian
Reported-by: Etsuro Fujita Discussion: https://postgr.es/m/5BFE34DE.1080404@lab.ntt.co.jp Author: Etsuro Fujita Backpatch-through: 10
2018-08-28postgres_fdw: don't push ORDER BY with no vars (bug #15352)Andrew Gierth
Commit aa09cd242 changed a condition in find_em_expr_for_rel from being a bms_equal comparison of relids to bms_is_subset, in order to support order by clauses on foreign joins. But this also allows through the degenerate case of expressions with no Vars at all (and hence empty relids), including integer constants which will be parsed unexpectedly on the remote (viz. "ERROR: ORDER BY position 0 is not in select list" as in the bug report). Repair by adding an additional !bms_is_empty test. Backpatch through to 9.6 where the aforementioned change was made. Per bug #15352 from Maksym Boguk; analysis and patch by me. Discussion: https://postgr.es/m/153518420278.1478.14875560810251994661@wrigleys.postgresql.org
2018-07-14Fix hashjoin costing mistake introduced with inner_unique optimization.Tom Lane
In final_cost_hashjoin(), commit 9c7f5229a allowed inner_unique cases to follow a code path previously used only for SEMI/ANTI joins; but it neglected to fix an if-test within that path that assumed SEMI and ANTI were the only possible cases. This resulted in a wrong value for hashjointuples, and an ensuing bad cost estimate, for inner_unique normal joins. Fortunately, for inner_unique normal joins we can assume the number of joined tuples is the same as for a SEMI join; so there's no need for more code, we just have to invert the test to check for ANTI not SEMI. It turns out that in two contrib tests in which commit 9c7f5229a changed the plan expected for a query, the change was actually wrong and induced by this estimation error, not by any real improvement. Hence this patch also reverts those changes. Per report from RK Korlapati. Backpatch to v10 where the error was introduced. David Rowley Discussion: https://postgr.es/m/CA+SNy03bhq0fodsfOkeWDCreNjJVjsdHwUsb7AG=jpe0PtZc_g@mail.gmail.com
2018-07-09Prevent accidental linking of system-supplied copies of libpq.so etc.Tom Lane
Back-patch commit dddfc4cb2, which broke LDFLAGS and related Makefile variables into two parts, one for within-build-tree library references and one for external libraries, to ensure that the order of -L flags has all of the former before all of the latter. This turns out to fix a problem recently noted on buildfarm member peripatus, that we attempted to incorporate code from libpgport.a into a shared library. That will fail on platforms that are sticky about putting non-PIC code into shared libraries. (It's quite surprising we hadn't seen such failures before, since the code in question has been like that for a long time.) I think that peripatus' problem could have been fixed with just a subset of this patch; but since the previous issue of accidentally linking to the wrong copy of a Postgres shlib seems likely to bite people in the field, let's just back-patch the whole change. Now that commit dddfc4cb2 has survived some beta testing, I'm less afraid to back-patch it than I was at the time. This also fixes undesired inclusion of "-DFRONTEND" in pg_config's CPPFLAGS output (in 9.6 and up) and undesired inclusion of "-L../../src/common" in its LDFLAGS output (in all supported branches). Back-patch to v10 and older branches; this is already in v11. Discussion: https://postgr.es/m/20180704234304.bq2dxispefl65odz@ler-imac.local
2018-04-20Change more places to be less trusting of RestrictInfo.is_pushed_down.Tom Lane
On further reflection, commit e5d83995e didn't go far enough: pretty much everywhere in the planner that examines a clause's is_pushed_down flag ought to be changed to use the more complicated behavior where we also check the clause's required_relids. Otherwise we could make incorrect decisions about whether, say, a clause is safe to use as a hash clause. Some (many?) of these places are safe as-is, either because they are never reached while considering a parameterized path, or because there are additional checks that would reject a pushed-down clause anyway. However, it seems smarter to just code them all the same way rather than rely on easily-broken reasoning of that sort. In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should be used in place of direct tests on the is_pushed_down flag. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-01-30Fix test case for 'outer pathkeys do not match mergeclauses' fix.Robert Haas
Commit 4bbf6edfbd5d03743ff82dda2f00c738fb3208f5 added a test case, but it turns out that the test case doesn't reliably test for the bug, and in the context of the regression test suite did not because ANALYZE had not been run. Report and patch by Etsuro Fujita. I added a comment along lines previously suggested by Tom Lane. Discussion: http://postgr.es/m/5A6195D8.8060206@lab.ntt.co.jp
2018-01-17postgres_fdw: Avoid 'outer pathkeys do not match mergeclauses' error.Robert Haas
When pushing down a join to a foreign server, postgres_fdw constructs an alternative plan to be used for any EvalPlanQual rechecks that prove to be necessary. This plan is stored as the outer subplan of the Foreign Scan implementing the pushed-down join. Previously, this alternative plan could have a different nominal sort ordering than its parent, which seemed OK since there will only be one tuple per base table anyway in the case of an EvalPlanQual recheck. Actually, though, it caused a problem if that path was used as a building block for the EvalPlanQual recheck plan of a higher-level foreign join, because we could end up with a merge join one of whose inputs was not labelled with the correct sort order. Repair by injecting an extra Sort node into the EvalPlanQual recheck plan whenever it would otherwise fail to be sorted at least as well as its parent Foreign Scan. Report by Jeff Janes. Patch by me, reviewed by Tom Lane, who also provided the test case and comment text. Discussion: http://postgr.es/m/CAMkU=1y2G8VOVBHv3iXU2TMAj7-RyBFFW1uhkr5sm9LQ2=X35g@mail.gmail.com
2018-01-12Fix postgres_fdw to cope with duplicate GROUP BY entries.Tom Lane
Commit 7012b132d, which added the ability to push down aggregates and grouping to the remote server, wasn't careful to ensure that the remote server would have the same idea we do about which columns are the grouping columns, in cases where there are textually identical GROUP BY expressions. Such cases typically led to "targetlist item has multiple sortgroupref labels" errors. To fix this reliably, switch over to using "GROUP BY column-number" syntax rather than "GROUP BY expression" in transmitted queries, and adjust foreign_grouping_ok() to be more careful about duplicating the sortgroupref labeling of the local pathtarget. Per bug #14890 from Sean Johnston. Back-patch to v10 where the buggy code was introduced. Jeevan Chalke, reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/20171107134948.1508.94783@wrigleys.postgresql.org
2017-11-27Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE.Tom Lane
rewriteTargetListUD's processing is dependent on the relkind of the query's target table. That was fine at the time it was made to act that way, even for queries on inheritance trees, because all tables in an inheritance tree would necessarily be plain tables. However, the 9.5 feature addition allowing some members of an inheritance tree to be foreign tables broke the assumption that rewriteTargetListUD's output tlist could be applied to all child tables with nothing more than column-number mapping. This led to visible failures if foreign child tables had row-level triggers, and would also break in cases where child tables belonged to FDWs that used methods other than CTID for row identification. To fix, delay running rewriteTargetListUD until after the planner has expanded inheritance, so that it is applied separately to the (already mapped) tlist for each child table. We can conveniently call it from preprocess_targetlist. Refactor associated code slightly to avoid the need to heap_open the target relation multiple times during preprocess_targetlist. (The APIs remain a bit ugly, particularly around the point of which steps scribble on parse->targetList and which don't. But avoiding such scribbling would require a change in FDW callback APIs, which is more pain than it's worth.) Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when we transition from rows providing a CTID to rows that don't. (That's really an independent bug, but it manifests in much the same cases.) Add a regression test checking one manifestation of this problem, which was that row-level triggers on a foreign child table did not work right. Back-patch to 9.5 where the problem was introduced. Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
2017-08-17Remove bogus line from comment.Robert Haas
Spotted by Tom Lane Discussion: http://postgr.es/m/27897.1502901074@sss.pgh.pa.us
2017-07-24When WCOs are present, disable direct foreign table modification.Robert Haas
If the user modifies a view that has CHECK OPTIONs and this gets translated into a modification to an underlying relation which happens to be a foreign table, the check options should be enforced. In the normal code path, that was happening properly, but it was not working properly for "direct" modification because the whole operation gets pushed to the remote side in that case and we never have an option to enforce the constraint against individual tuples. Fix by disabling direct modification when there is a need to enforce CHECK OPTIONs. Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me. Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp
2017-07-21Stabilize postgres_fdw regression tests.Tom Lane
The new test cases added in commit 8bf58c0d9 turn out to have output that can vary depending on the lc_messages setting prevailing on the test server. Hide the remote end's error messages to ensure stable output. This isn't a terribly desirable solution; we'd rather know that the connection failed for the expected reason and not some other one. But there seems little choice for the moment. Per buildfarm. Discussion: https://postgr.es/m/18419.1500658570@sss.pgh.pa.us
2017-07-21Re-establish postgres_fdw connections after server or user mapping changes.Tom Lane
Previously, postgres_fdw would keep on using an existing connection even if the user did ALTER SERVER or ALTER USER MAPPING commands that should affect connection parameters. Teach it to watch for catcache invals on these catalogs and re-establish connections when the relevant catalog entries change. Per bug #14738 from Michal Lis. In passing, clean up some rather crufty decisions in commit ae9bfc5d6 about where fields of ConnCacheEntry should be reset. We now reset all the fields whenever we open a new connection. Kyotaro Horiguchi, reviewed by Ashutosh Bapat and myself. Back-patch to 9.3 where postgres_fdw appeared. Discussion: https://postgr.es/m/20170710113917.7727.10247@wrigleys.postgresql.org
2017-06-30Fix typo in commentPeter Eisentraut
Author: Albe Laurenz <laurenz.albe@wien.gv.at>
2017-06-22postgres_fdw: Move function prototype to correct section.Robert Haas
Etsuro Fujita, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/93a9c487-9920-a38f-da96-503422c50f59@lab.ntt.co.jp
2017-06-21Phase 3 of pgindent updates.Tom Lane
Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21Phase 2 of pgindent updates.Tom Lane
Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-15Fix low-probability leaks of PGresult objects in the backend.Tom Lane
We had three occurrences of essentially the same coding pattern wherein we tried to retrieve a query result from a libpq connection without blocking. In the case where PQconsumeInput failed (typically indicating a lost connection), all three loops simply gave up and returned, forgetting to clear any previously-collected PGresult object. Since those are malloc'd not palloc'd, the oversight results in a process-lifespan memory leak. One instance, in libpqwalreceiver, is of little significance because the walreceiver process would just quit anyway if its connection fails. But we might as well fix it. The other two instances, in postgres_fdw, are somewhat more worrisome because at least in principle the scenario could be repeated, allowing the amount of memory leaked to build up to something worth worrying about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS calls, as well as other calls that could potentially elog(ERROR), providing another way to exit without having cleared the PGresult. Here we need to add PG_TRY logic similar to what exists in quite a few other places in postgres_fdw. Coverity noted the libpqwalreceiver bug; I found the other two cases by checking all calls of PQconsumeInput. Back-patch to all supported versions as appropriate (9.2 lacks postgres_fdw, so this is really quite unexciting for that branch). Discussion: https://postgr.es/m/22620.1497486981@sss.pgh.pa.us
2017-06-13psql: Use more consistent capitalization of some output headingsPeter Eisentraut
2017-06-07postgres_fdw: Allow cancellation of transaction control commands.Robert Haas
Commit f039eaac7131ef2a4cf63a10cf98486f8bcd09d2, later back-patched with commit 1b812afb0eafe125b820cc3b95e7ca03821aa675, allowed many of the queries issued by postgres_fdw to fetch remote data to respond to cancel interrupts in a timely fashion. However, it didn't do anything about the transaction control commands, which remained noninterruptible. Improve the situation by changing do_sql_command() to retrieve query results using pgfdw_get_result(), which uses the asynchronous interface to libpq so that it can check for interrupts every time libpq returns control. Since this might result in a situation where we can no longer be sure that the remote transaction state matches the local transaction state, add a facility to force all levels of the local transaction to abort if we've lost track of the remote state; without this, an apparently-successful commit of the local transaction might fail to commit changes made on the remote side. Also, add a 60-second timeout for queries issue during transaction abort; if that expires, give up and mark the state of the connection as unknown. Drop all such connections when we exit the local transaction. Together, these changes mean that if we're aborting the local toplevel transaction anyway, we can just drop the remote connection in lieu of waiting (possibly for a very long time) for it to complete an abort. This still leaves quite a bit of room for improvement. PQcancel() has no asynchronous interface, so if we get stuck sending the cancel request we'll still hang. Also, PQsetnonblocking() is not used, which means we could block uninterruptibly when sending a query. There might be some other optimizations possible as well. Nonetheless, this allows us to escape a wait for an unresponsive remote server quickly in many more cases than previously. Report by Suraj Kharage. Patch by me and Rafia Sabih. Review and testing by Amit Kapila and Tushar Ahuja. Discussion: http://postgr.es/m/CAF1DzPU8Kx+fMXEbFoP289xtm3bz3t+ZfxhmKavr98Bh-C0TqQ@mail.gmail.com
2017-05-18Don't explicitly mark range partitioning columns NOT NULL.Robert Haas
This seemed like a good idea originally because there's no way to mark a range partition as accepting NULL, but that now seems more like a current limitation than something we want to lock down for all time. For example, there's a proposal to add the notion of a default partition which accepts all rows not otherwise routed, which directly conflicts with the idea that a range-partitioned table should never allow nulls anywhere. So let's change this while we still can, by putting the NOT NULL test into the partition constraint instead of changing the column properties. Amit Langote and Robert Haas, reviewed by Amit Kapila Discussion: http://postgr.es/m/8e2dd63d-c6fb-bb74-3c2b-ed6d63629c9d@lab.ntt.co.jp
2017-05-17Post-PG 10 beta1 pgindent runBruce Momjian
perltidy run not included.
2017-04-24postgres_fdw: Fix join push down with extensionsPeter Eisentraut
Objects in an extension are shippable to a foreign server if the extension is part of the foreign server definition's shippable extensions list. But this was not properly considered in some cases when checking whether a join condition can be pushed to a foreign server and the join condition uses an object from a shippable extension. So the join would never be pushed down in those cases. So, the list of extensions needs to be made available in fpinfo of the relation being considered to be pushed down before any expressions are assessed for being shippable. Fix foreign_join_ok() to do that for a join relation. The code to save FDW options in fpinfo is scattered at multiple places. Bring all of that together into functions apply_server_options(), apply_table_options(), and merge_fdw_options(). David Rowley and Ashutosh Bapat, per report from David Rowley
2017-04-11Simplify handling of remote-qual pass-forward in postgres_fdw.Tom Lane
Commit 0bf3ae88a encountered a need to pass the finally chosen remote qual conditions forward from postgresGetForeignPlan to postgresPlanDirectModify. It solved that by sticking them into the plan node's fdw_private list, which in hindsight was a pretty bad idea. In the first place, there's no use for those qual trees either in EXPLAIN or execution; indeed they could never safely be used for any post-planning purposes, because they would not get processed by setrefs.c. So they're just dead weight to carry around in the finished plan tree, plus being an attractive nuisance for somebody who might get the idea that they could be used that way. Secondly, because those qual trees (sometimes) contained RestrictInfos, they created a plan-transmission hazard for parallel query, which is how come we noticed a problem. We dealt with that symptom in commit 28b047875, but really a more straightforward and more efficient fix is to pass the data through in a new field of struct PgFdwRelationInfo. So do it that way. (There's no need to revert 28b047875, as it has sufficient reason to live anyway.) Per fuzz testing by Andreas Seltenreich. Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
2017-04-11Handle restriction clause lists more uniformly in postgres_fdw.Tom Lane
Clauses in the lists retained by postgres_fdw during planning were sometimes bare boolean clauses, sometimes RestrictInfos, and sometimes a mixture of the two in the same list. The comment about that situation didn't come close to telling the full truth, either. Aside from being confusing, this had a couple of bad practical consequences: * waste of planning cycles due to inability to cache per-clause selectivity and cost estimates; * sometimes, RestrictInfos would sneak into the fdw_private list of a finished Plan node, causing failures if, for example, we tried to ship the Plan tree to a parallel worker. (It may well be that it's a bug in the parallel-query logic that we would ever try to ship such a plan to a parallel worker, but in any case this deserves to be cleaned up.) To fix, rearrange so that clause lists in PgFdwRelationInfo are always lists of RestrictInfos, and then strip the RestrictInfos at the last minute when making a Plan node. In passing do a bit of refactoring and comment cleanup in postgresGetForeignPlan and foreign_join_ok. Although the messiness here dates back at least to 9.6, there's no evidence that it causes anything worse than wasted planning cycles in 9.6, so no back-patch for now. Per fuzz testing by Andreas Seltenreich. Tom Lane and Ashutosh Bapat Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
2017-04-10Improve castNode notation by introducing list-extraction-specific variants.Tom Lane
This extends the castNode() notation introduced by commit 5bcab1114 to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-07Optimize joins when the inner relation can be proven unique.Tom Lane
If there can certainly be no more than one matching inner row for a given outer row, then the executor can move on to the next outer row as soon as it's found one match; there's no need to continue scanning the inner relation for this outer row. This saves useless scanning in nestloop and hash joins. In merge joins, it offers the opportunity to skip mark/restore processing, because we know we have not advanced past the first possible match for the next outer row. Of course, the devil is in the details: the proof of uniqueness must depend only on joinquals (not otherquals), and if we want to skip mergejoin mark/restore then it must depend only on merge clauses. To avoid adding more planning overhead than absolutely necessary, the present patch errs in the conservative direction: there are cases where inner_unique or skip_mark_restore processing could be used, but it will not do so because it's not sure that the uniqueness proof depended only on "safe" clauses. This could be improved later. David Rowley, reviewed and rather heavily editorialized on by me Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com
2017-04-06Reset API of clause_selectivity()Simon Riggs
Discussion: https://postgr.es/m/CAKJS1f9yurJQW9pdnzL+rmOtsp2vOytkpXKGnMFJEO-qz5O5eA@mail.gmail.com
2017-04-05Collect and use multi-column dependency statsSimon Riggs
Follow on patch in the multi-variate statistics patch series. CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t; ANALYZE; will collect dependency stats on (a, b) and then use the measured dependency in subsequent query planning. Commit 7b504eb282ca2f5104b5c00b4f05a3ef6bb1385b added CREATE STATISTICS with n-distinct coefficients. These are now specified using the mutually exclusive option WITH (ndistinct). Author: Tomas Vondra, David Rowley Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas and many other comments and contributions Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com
2017-04-05Capitalize names of PLs consistentlyPeter Eisentraut
Author: Daniel Gustafsson <daniel@yesql.se>
2017-04-03Abstract logic to allow for multiple kinds of child rels.Robert Haas
Currently, the only type of child relation is an "other member rel", which is the child of a baserel, but in the future joins and even upper relations may have child rels. To facilitate that, introduce macros that test to test for particular RelOptKind values, and use them in various places where they help to clarify the sense of a test. (For example, a test may allow RELOPT_OTHER_MEMBER_REL either because it intends to allow child rels, or because it intends to allow simple rels.) Also, remove find_childrel_top_parent, which will not work for a child rel that is not a baserel. Instead, add a new RelOptInfo member top_parent_relids to track the same kind of information in a more generic manner. Ashutosh Bapat, slightly tweaked by me. Review and testing of the patch set from which this was taken by Rajkumar Raghuwanshi and Rafia Sabih. Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com
2017-03-31postgres_fdw: Teach IMPORT FOREIGN SCHEMA about partitioning.Robert Haas
Don't import partitions. Do import partitioned tables which are not themselves partitions. Report by Stephen Frost. Design and patch by Michael Paquier, reviewed by Amit Langote. Documentation revised by me. Discussion: http://postgr.es/m/20170309141531.GD9812@tamriel.snowman.net
2017-03-27Support hashed aggregation with grouping sets.Andrew Gierth
This extends the Aggregate node with two new features: HashAggregate can now run multiple hashtables concurrently, and a new strategy MixedAggregate populates hashtables while doing sorted grouping. The planner will now attempt to save as many sorts as possible when planning grouping sets queries, while not exceeding work_mem for the estimated combined sizes of all hashtables used. No SQL-level changes are required. There should be no user-visible impact other than the new EXPLAIN output and possible changes to result ordering when ORDER BY was not used (which affected a few regression tests). The enable_hashagg option is respected. Author: Andrew Gierth Reviewers: Mark Dilger, Andres Freund Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
2017-03-25Faster expression evaluation and targetlist projection.Andres Freund
This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation. This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier. The speed gains primarily come from: - non-recursive implementation reduces stack usage / overhead - simple sub-expressions are implemented with a single jump, without function calls - sharing some state between different sub-expressions - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation. The new framework allows for significant future optimizations. E.g.: - basic infrastructure for to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable. - optimizing the generated "code". A number of proposals for potential work has already been made. - optimizing the interpreter. Similarly a number of proposals have been made here too. The move of logic into the expression initialization step leads to some backward-incompatible changes: - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks. - The set of domain constraints to be checked, is now evaluated once during expression initialization, previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which caches ExprStates, the old set could stick around longer. The behavior around might still change. Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas Reviewed-By: Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-16postgres_fdw: Push down FULL JOINs with restriction clauses.Robert Haas
The previous deparsing logic wasn't smart enough to produce subqueries when deparsing; make it smart enough to do that. However, we only do it that way when necessary, because it generates more complicated SQL which will be harder for any humans reading the queries to understand. Etsuro Fujita, reviewed by Ashutosh Bapat Discussion: http://postgr.es/m/c449261a-b033-dc02-9254-2fe5b7044795@lab.ntt.co.jp
2017-03-09Fix hard-coded relkind constants in assorted other files.Tom Lane
Although it's reasonable to expect that most of these constants will never change, that does not make it good programming style to hard-code the value rather than using the RELKIND_FOO macros. I think I've now gotten all the hard-coded references in C code. Unfortunately there's no equally convenient way to parameterize SQL files ... Discussion: https://postgr.es/m/11145.1488931324@sss.pgh.pa.us