From e7cb7ee14555cc9c5773e2c102efd6371f6f2005 Mon Sep 17 00:00:00 2001
From: Robert Haas
Date: Fri, 1 May 2015 08:50:35 -0400
Subject: Allow FDWs and custom scan providers to replace joins with scans.

Foreign data wrappers can use this capability for so-called "join
pushdown"; that is, instead of executing two separate foreign scans
and then joining the results locally, they can generate a path which
performs the join on the remote server and then is scanned locally.
This commit does not extend postgres_fdw to take advantage of this
capability; it just provides the infrastructure.

Custom scan providers can use this in a similar way. Previously,
it was only possible for a custom scan provider to scan a single
relation. Now, it can scan an entire join tree, provided of course
that it knows how to produce the same results that the join would
have produced if executed normally.

KaiGai Kohei, reviewed by Shigeru Hanada, Ashutosh Bapat, and me.
---
 doc/src/sgml/custom-scan.sgml | 40 ++++++++++++++++++++++++++++++++++++++++
 doc/src/sgml/fdwhandler.sgml  | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

(limited to 'doc/src')

diff --git a/doc/src/sgml/custom-scan.sgml b/doc/src/sgml/custom-scan.sgml
index 8a4a3dfcfeb..9fd1db6fde4 100644
--- a/doc/src/sgml/custom-scan.sgml
+++ b/doc/src/sgml/custom-scan.sgml
@@ -81,6 +81,28 @@ typedef struct CustomPath
    detailed below.
+
+   A custom scan provider can also add join paths; in this case, the scan
+   must produce the same output as would normally be produced by the join
+   it replaces. To do this, the join provider should set the following hook.
+   This hook may be invoked repeatedly for the same pair of relations, with
+   different combinations of inner and outer relations; it is the
+   responsibility of the hook to minimize duplicated work.
+
+typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
+                                             RelOptInfo *joinrel,
+                                             RelOptInfo *outerrel,
+                                             RelOptInfo *innerrel,
+                                             List *restrictlist,
+                                             JoinType jointype,
+                                             SpecialJoinInfo *sjinfo,
+                                             SemiAntiJoinFactors *semifactors,
+                                             Relids param_source_rels,
+                                             Relids extra_lateral_rels);
+extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
+
+
+
   Custom Path Callbacks
@@ -124,7 +146,9 @@ typedef struct CustomScan
     Scan      scan;
     uint32    flags;
     List     *custom_exprs;
+    List     *custom_ps_tlist;
     List     *custom_private;
+    List     *custom_relids;
     const CustomScanMethods *methods;
 } CustomScan;
@@ -141,11 +165,27 @@ typedef struct CustomScan
    is only used by the custom scan provider itself. Plan trees must be able
    to be duplicated using copyObject, so all the data stored
    within these two fields must consist of nodes that function can handle.
+   custom_relids is set by the core code to the set of relations
+   which this scan node must handle; except when this scan is replacing a
+   join, it will have only one member.
    methods must point to a (usually statically allocated)
    object implementing the required custom scan methods, which are further
    detailed below.
+
+   When a CustomScan scans a single relation,
+   scan.scanrelid should be the range table index of the table
+   to be scanned, and custom_ps_tlist should be
+   NULL. When it replaces a join, scan.scanrelid
+   should be zero, and custom_ps_tlist should be a list of
+   TargetEntry nodes. This is necessary because, when a join
+   is replaced, the target list cannot be constructed from the table
+   definition. At execution time, this list will be used to initialize the
+   tuple descriptor of the TupleTableSlot. It will also be
+   used by EXPLAIN, when deparsing.
+
+
   Custom Scan Callbacks
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 5af41318e5c..04f3c224331 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -598,6 +598,42 @@ IsForeignRelUpdatable (Relation rel);
+
+    FDW Routines For Remote Joins
+
+
+void
+GetForeignJoinPaths(PlannerInfo *root,
+                    RelOptInfo *joinrel,
+                    RelOptInfo *outerrel,
+                    RelOptInfo *innerrel,
+                    List *restrictlist,
+                    JoinType jointype,
+                    SpecialJoinInfo *sjinfo,
+                    SemiAntiJoinFactors *semifactors,
+                    Relids param_source_rels,
+                    Relids extra_lateral_rels);
+
+     Create possible access paths for a join of two foreign tables managed
+     by the same foreign data wrapper.
+     This optional function is called during query planning.
+
+
+     This function allows the FDW to add ForeignScan paths for the
+     supplied joinrel. Typically, the FDW will send the whole
+     join to the remote server as a single query, as performing the join
+     remotely rather than locally is typically much more efficient.
+
+
+     Since we cannot construct the slot descriptor for a remote join from
+     the catalogs, the FDW should set the scanrelid of the
+     ForeignScan to zero and fdw_ps_tlist
+     to an appropriate list of TargetEntry nodes.
+     Junk entries will be ignored, but can be present for the benefit of
+     deparsing performed by EXPLAIN.
+
+
+
    FDW Routines for <command>EXPLAIN</>
-- 
cgit v1.2.3
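
Note on the join hook: the custom-scan text says the hook may be invoked repeatedly for the same pair of relations with different inner/outer combinations, and that the hook must minimize duplicated work. The following is only a rough, standalone sketch of one way a provider might do that, assuming its join-replacing path does not depend on which side is outer. It does not use the PostgreSQL headers: RelOptInfo and Relids are simplified stand-ins, the hook signature is cut down to three arguments, and provider_state, npaths, join_hook, install_join_hook and demo_paths_added are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-ins (assumptions for this sketch). The real
 * RelOptInfo and Relids come from the PostgreSQL headers and carry
 * far more state; provider_state is a hypothetical per-joinrel slot.
 */
typedef unsigned int Relids;        /* bitmask of range table indexes */

typedef struct RelOptInfo
{
    Relids  relids;                 /* relations covered by this rel */
    bool    provider_state;         /* hypothetical: already considered? */
    int     npaths;                 /* paths this provider has added */
} RelOptInfo;

/* Cut-down hook signature; the real hook takes ten arguments. */
typedef void (*join_hook_type) (RelOptInfo *joinrel,
                                RelOptInfo *outerrel,
                                RelOptInfo *innerrel);

join_hook_type join_hook = NULL;    /* stands in for set_join_pathlist_hook */
static join_hook_type prev_join_hook = NULL;

/* Hook body: chain to any earlier hook, then add at most one path. */
void
my_join_pathlist(RelOptInfo *joinrel, RelOptInfo *outerrel,
                 RelOptInfo *innerrel)
{
    assert(joinrel != NULL && outerrel != NULL && innerrel != NULL);

    if (prev_join_hook)
        prev_join_hook(joinrel, outerrel, innerrel);

    /* Same joinrel seen before with another inner/outer split: skip. */
    if (joinrel->provider_state)
        return;
    joinrel->provider_state = true;

    /* The expensive costing and path construction would happen here. */
    joinrel->npaths++;
}

/* Install the hook once, remembering whatever was installed before. */
void
install_join_hook(void)
{
    if (join_hook == my_join_pathlist)
        return;                     /* already installed; avoid self-chaining */
    prev_join_hook = join_hook;
    join_hook = my_join_pathlist;
}

/* Call the hook for A JOIN B in both orders; return the paths added. */
int
demo_paths_added(void)
{
    RelOptInfo a = {0x1, false, 0};
    RelOptInfo b = {0x2, false, 0};
    RelOptInfo ab = {0x3, false, 0};

    install_join_hook();
    join_hook(&ab, &a, &b);         /* first split: does the work */
    join_hook(&ab, &b, &a);         /* swapped split: skipped */
    return ab.npaths;
}
```

In this sketch the second invocation returns early because the join relation is already marked, so the costly step runs once per join relation rather than once per inner/outer combination. A provider whose path does depend on the split (for example, one that can only handle some combinations) would need a finer-grained check than a single flag.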